oss-sec mailing list archives

Update on the distro-backdoor-scanner effort


From: Hank Leininger <hlein () korelogic com>
Date: Fri, 26 Apr 2024 14:06:16 -0600

tl;dr: We've pursued a number of avenues, with plans for more; so far
no "smoking gun" of other backdoors of a similar vein; help wanted.

So: what is this, where are we, what's next, what do we need, credits.

What is this?

- Ongoing work to look for backdoors similar to those found in 
  xz-utils, or using vectors that were discussed in its aftermath, 
  that have made their way into Linux distributions' build pipelines;
  see https://marc.info/?l=oss-security&m=171208242904550&w=4

- All the below is also covered in code, READMEs, or GH issues at
  https://github.com/hlein/distro-backdoor-scanner; the tools are
  intended to be bog-standard and work on any of the supported 
  distros, and to document what they need so our work should be
  repeatable by anybody.

Where we are: main things investigated:

- Similar exploitation toolkits / operator-behavior in other packages?

  - Unpack and scan all packages in multiple distribution families
    looking for the "fist" of the operator: similar stage0 / stage1 /
    stage2 loaders, characteristic command-line switch combinations,
    etc.; can we find earlier generations of any of the widgets used
    (and now burned) in other packages previously backdoored? See the
    patterns in bin/package_scan_all.sh

  - Unpacked and scanned (generally post-distro-patches, to focus on
    as-used-by-distros, not just rescan the same upstreams 4 times):

    - ~11k EndeavourOS/Arch packages
    - ~40k Debian packages
    - ~19k Gentoo packages
    - ~9k Rocky/RPM packages

  - Output is manageable; able to rule out all hits not part of the
    actual xz-utils backdoors as false positives.
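
  The scanning idea above can be sketched roughly as below. This is
  not the code in bin/package_scan_all.sh; the two patterns, their
  names, and the scan_tree() helper are made up for illustration,
  loosely modeled on "odd xz switch combinations" and "tr piped into
  xz -d". The real pattern list lives in the repo.

```python
import re
from pathlib import Path

# Hypothetical indicators, for illustration only. The real patterns are
# in bin/package_scan_all.sh and are not reproduced here.
PATTERNS = {
    "xz-raw-format": re.compile(rb"xz\s[^\n]*-F\s*raw"),
    "tr-into-xz": re.compile(rb"\btr\s[^\n]*\|\s*xz\s+-d"),
}


def scan_tree(root):
    """Return sorted (relative path, pattern name) pairs for every hit."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        try:
            data = path.read_bytes()
        except OSError:
            continue  # unreadable file: skip rather than abort the scan
        for name, rx in PATTERNS.items():
            if rx.search(data):
                hits.append((path.relative_to(root).as_posix(), name))
    return sorted(hits)
```

  Pointing scan_tree() at an unpacked, post-patch source tree gives a
  list that a human still has to triage; a hit is a lead, not a
  verdict.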

- Examine the provenance of every .m4 in every package unpacked above

  - Which m4 macro files are unique to, or introduced by, a project?
    Which are recognizable, but updated/modified in some way from
    every revision ever shipped by an upstream (GNU autotools, etc.)?
    m4 files have serial numbers, and the xz-utils backdoor used a big
    jump in its backdoor .m4; maybe an attempt to keep from getting
    clobbered after upstream upgrades?

  - Turns out serial numbers are made up and the points don't matter.
    But still, this author appears to have _thought_ they were
    important. So if they'd done similar somewhere else, should stand
    out there too.

  - Analyzing about 50k m4 files found about 5k that didn't match an
    upstream. Around 1k of those had a near-match so we can diff them.
    That's still too many to digest manually. 3 had big serial jumps.
    2 of those seem benign; the third is, of course, the trojan in
    xz-utils.

  - Big TODOs here are to implement fuzzy hashing when we don't have
    a perfect match, so that we can pick the best knowngood candidate
    to offer a diff against and to group the unknowns amongst
    themselves, and something to facilitate tracking of diff-review
    (CSV or another sqlite DB that tracks review status?), and then
    to actually read all the diffs (currently only spot-checked).
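
  A minimal sketch of the serial-jump check and the "pick the best
  knowngood candidate" TODO, under loud assumptions: it uses Python's
  difflib similarity ratio as a stand-in for real fuzzy hashing, the
  serial regex only covers "# serial N" / "dnl serial N" style lines,
  and the threshold of 10 is arbitrary. All function names are
  hypothetical, not taken from the repo.

```python
import difflib
import re

# Assumed serial-line shapes: "# serial 3", "#serial 3", "dnl serial 3".
SERIAL_RE = re.compile(r"^\s*(?:#|dnl)\s*serial\s+(\d+)", re.MULTILINE)


def parse_serial(text):
    """Return the first serial number found in an m4 file, or None."""
    m = SERIAL_RE.search(text)
    return int(m.group(1)) if m else None


def best_match(unknown, knowngood):
    """Return (label, ratio) of the known-good text most similar to unknown.

    knowngood maps a label (e.g. upstream revision) to file contents.
    """
    best = (None, 0.0)
    for label, text in knowngood.items():
        ratio = difflib.SequenceMatcher(None, unknown, text).ratio()
        if ratio > best[1]:
            best = (label, ratio)
    return best


def serial_jump(unknown, knowngood, threshold=10):
    """True if unknown's serial exceeds every known-good serial by > threshold."""
    serials = [s for s in map(parse_serial, knowngood.values()) if s is not None]
    mine = parse_serial(unknown)
    if mine is None or not serials:
        return False
    return mine - max(serials) > threshold
```

  difflib compares every pair in full, so it would be slow across
  ~5k unknowns times all upstream revisions; that cost is one reason
  a proper fuzzy hash is still the TODO.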

- Compare decompression of xz-utils vs other compatible tools

  - Just to check for some obvious Thompsonesque weird machine where
    xz injects malicious .c code into a tarball it unpacks, etc. Very
    unlikely to find anything.

  - Found nothing except some minor bugs in other decompressors (will
    submit upstream bugs, but low priority).

  - Still plan to add more different decompressors for completeness.
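
  The comparison harness amounts to "same input, several
  decompressors, compare output hashes". A rough sketch follows;
  note that both decompressors here are Python's lzma module (so both
  sit on liblzma) and are only stand-ins to make the example
  runnable. A real run would plug in independent implementations.
  compare_outputs() and the helper names are hypothetical.

```python
import hashlib
import lzma


def decompress_oneshot(data):
    """Decompress a whole .xz blob in one call."""
    return lzma.decompress(data)


def decompress_streaming(data, chunk=4096):
    """Decompress a single-stream .xz blob by feeding fixed-size chunks."""
    dec = lzma.LZMADecompressor()
    out = bytearray()
    for i in range(0, len(data), chunk):
        out += dec.decompress(data[i:i + chunk])
    return bytes(out)


def compare_outputs(data, decompressors):
    """Run every decompressor on data; return (digests, all_agree)."""
    digests = {}
    for name, fn in decompressors.items():
        digests[name] = hashlib.sha256(fn(data)).hexdigest()
    return digests, len(set(digests.values())) == 1
```

  Any callable taking bytes and returning bytes fits in the dict, so
  wrapping an external tool is just another entry.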

What's next: rough notions only, not yet implemented:

- Analyze IFUNC real-world use. They're dodgy and weird and useful for
  backdoors like this one. Removing IFUNC support from glibc has been
  floated: https://marc.info/?l=glibc-alpha&m=171389592724184&w=4
  But that'll get hung up on "but what if users". AFAWK nobody knows.
  So let's find out: survey sources & binaries from major distros and
  get some actual numbers. Also thegrugq made an interesting
  observation: it'd be telling which projects recently _added_ IFUNC
  use, if any. See
  https://github.com/hlein/distro-backdoor-scanner/issues/16
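
  For the source side of such a survey, a first pass could simply
  look for the GCC/Clang ifunc attribute. A sketch, assuming the
  attribute is written out literally; projects that hide it behind
  their own macros would be missed, and binaries need a separate
  check. find_ifunc_resolvers() is a hypothetical name.

```python
import re

# Matches __attribute__((ifunc("resolver"))) and the __ifunc__ spelling.
# The lookbehind avoids hits on identifiers that merely end in "ifunc".
IFUNC_ATTR_RE = re.compile(
    r'(?<![A-Za-z0-9_])(?:__)?ifunc(?:__)?\s*\(\s*"([^"]+)"\s*\)'
)


def find_ifunc_resolvers(source_text):
    """Return the sorted, de-duplicated resolver names named in ifunc attributes."""
    return sorted(set(IFUNC_ATTR_RE.findall(source_text)))
```

  Running this over two versions of the same package and diffing the
  results would be one cheap way to spot recently _added_ IFUNC use.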

- Check for irregular contents in .pc files, inspired by Vegard
  Nossum's oss-security post
  https://marc.info/?l=oss-security&m=171335763115933&w=4
  It seems it'd be pretty easy to look for known bads. Starting
  notes: https://github.com/hlein/distro-backdoor-scanner/issues/7
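
  A sketch of what a known-bads check on .pc files might look like.
  The list of "irregular" things below is our own guess, not taken
  from the post linked above: shell command substitution, and the
  GCC options -fplugin=, -specs= and -wrapper showing up in Cflags
  or Libs fields. check_pc() and the rule names are hypothetical.

```python
import re

# Assumed indicators; a normal .pc file expands variables as ${var},
# so backticks or $( in a flags field are unexpected.
SUSPICIOUS = {
    "command-substitution": re.compile(r"`|\$\("),
    "compiler-plugin": re.compile(r"(?<!\S)-fplugin="),
    "spec-override": re.compile(r"(?<!\S)-specs="),
    "compiler-wrapper": re.compile(r"(?<!\S)-wrapper\b"),
}
FIELD_RE = re.compile(r"^(Cflags|Libs)(?:\.private)?\s*:\s*(.*)$", re.IGNORECASE)


def check_pc(text):
    """Return (line number, rule name) for each suspicious flags field."""
    findings = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        m = FIELD_RE.match(line)
        if not m:
            continue
        for name, rx in SUSPICIOUS.items():
            if rx.search(m.group(2)):
                findings.append((lineno, name))
    return findings
```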

- Systematically compare git-tagged versions of software to release
  artifacts for that same version. What differs, and why? There are
  often minor differences for what seem like good releng reasons. But
  in the xz-utils case, the backdoor author was able to get access to
  post Release assets even w/o commit/merge access; their backdoor was
  injected in files/contents that didn't match the Git repo contents.
  See https://github.com/hlein/distro-backdoor-scanner/issues/17
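
  The core of the tag-vs-tarball comparison is a tree diff by content
  hash. A minimal sketch, assuming both trees are already unpacked
  side by side (say, an export of the git tag and the extracted
  release tarball); fetching and unpacking are left out, and
  compare_trees() is a hypothetical name.

```python
import hashlib
from pathlib import Path


def tree_digests(root):
    """Map each file's relative path under root to its SHA-256 digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def compare_trees(git_tree, release_tree):
    """Report files added, missing, or changed in the release tree."""
    git = tree_digests(git_tree)
    rel = tree_digests(release_tree)
    return {
        "only_in_release": sorted(rel.keys() - git.keys()),
        "only_in_git": sorted(git.keys() - rel.keys()),
        "modified": sorted(k for k in git.keys() & rel.keys() if git[k] != rel[k]),
    }
```

  Expect plenty of "only_in_release" noise from generated autotools
  output; the interesting part is explaining each entry.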

What do we need:

- Testers, especially on other distros in a family we support but
  have only tested one member of so far.

- Reproducers to rerun our analysis yourself and make sure you concur
  with our conclusions.

- Contributors to the currently outstanding issues/tools.

- Analysis help on the m4 diffs (once we have fuzzy-matching to choose
  best-fit diff comparison targets).

- Brainstorming to come up with the next big items to put on the list.

Credits:

  Most of this work has been done by Sam James of the Gentoo team and
  Hank Leininger (me), partially sponsored by KoreLogic. Thanks also
  to folks who helped us get a handle on a lot of different distros'
  ecosystems, especially Solar Designer (Rocky/RPM family) and
  brocellous (Arch family).
