oss-sec mailing list archives

Detecting code injections in packages through debug infos


From: Adrien Nader <adrien () notk org>
Date: Wed, 3 Apr 2024 09:33:16 +0200

Hi,

Following the xz-utils backdoor, I realized that such backdoors have
goals which are at odds with distributions: the code will be foreign,
probably obfuscated, maybe compiled with a different toolchain or
different settings, ...

Can we take advantage of that for detection? Or at least for making it
more difficult for attackers to go unnoticed? I believe "Jia Tan"
actually had trouble carrying this out: they managed to do it, but it
wasn't a walk in the park either. Additional small hindrances may prove
useful to make others stumble and be noticed.

Below is a short example. I've simply pulled packages of debug symbols
from Debian or Ubuntu for this[1][2]. Below I will be using the files
from [2].

Start by downloading the deb and ddeb for a relevant package and
extracting both into a directory "d":
  mkdir d
  wget/curl ...
  dpkg -x foo.deb d    # repeat for the ddeb

(Note that debuginfo servers could provide the symbols, but when I last
looked I couldn't find an easy way to download the files outside of gdb
or similar tools.)

Then, run "eu-unstrip" as in

  eu-unstrip -o z.so d/usr/lib/x86_64-linux-gnu/liblzma.so.5 \
    d/usr/lib/debug/.build-id/f6/f3a4b96c06ffaa26772f42297cfcc4f1cb4a32.debug

After that, use "nm -lU" to print the symbols defined by the library
together with their source files and line numbers:

  nm -lU z.so

Now, since I've already explored this, filter that with "grep -C 1
stdin" (and prettify the output by hand):

  t lzma_cputhreads_522 [SNIP]/../../../../src/liblzma/common/hardware_cputhreads.c:30
  i lzma_crc32          [SNIP]/<stdin>:145
  r lzma_crc32_table    [SNIP]/../../../../src/liblzma/check/crc32_table_le.h:5
  i lzma_crc64          [SNIP]/<stdin>:108
  r lzma_crc64_table    [SNIP]/../../../../src/liblzma/check/crc64_table_le.h:5

Addresses have been removed and [SNIP] is
"/usr/src/xz-utils-5.6.0-0.2/debian/normal-build/src/liblzma".

As you can see, the filename is not really available for the two
problematic symbols: "<stdin>" appears where a source file should be.
Moreover, their directory path differs from that of the other symbols.
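
The check above can be sketched in a few lines of Python. This is only
an illustration, not a vetted tool: the names "Symbol", "parse_nm_line"
and "review" are made up for this example, and the expected source root
passed to "review" is an assumption that each package would have to
supply. It parses lines in the "nm -l" layout shown above and flags
symbols with no source location, with a pseudo-file such as "<stdin>",
or with a path outside the expected source tree.

```python
import posixpath
import re
from typing import Iterable, List, NamedTuple, Optional, Tuple


class Symbol(NamedTuple):
    kind: str               # nm type letter, e.g. "t", "r", "i"
    name: str
    source: Optional[str]   # file reported by "nm -l", None if absent
    line: Optional[int]


# One line of "nm -l" output: "[address] type name[<whitespace>file:line]".
# The address is optional because it was stripped in the listing above.
_NM_LINE = re.compile(
    r"^(?:[0-9a-fA-F]{8,16}\s+)?(\S)\s+(\S+)(?:\s+(.+):(\d+))?$"
)


def parse_nm_line(text: str) -> Optional[Symbol]:
    """Parse one line of nm output; return None if it does not match."""
    m = _NM_LINE.match(text.strip())
    if m is None:
        return None
    kind, name, source, line = m.groups()
    return Symbol(kind, name, source, int(line) if line else None)


def review(symbols: Iterable[Symbol],
           source_root: str) -> List[Tuple[Symbol, str]]:
    """Return (symbol, reason) pairs for symbols that deserve a look.

    source_root is the directory the sources are expected to live in;
    it is an assumption supplied by the caller, not something nm knows.
    """
    root = posixpath.normpath(source_root)
    findings = []
    for sym in symbols:
        if sym.source is None:
            findings.append((sym, "no source location"))
            continue
        # Collapse the "../../../.." segments before comparing paths.
        path = posixpath.normpath(sym.source)
        if posixpath.basename(path).startswith("<"):
            findings.append((sym, "pseudo-file such as <stdin>"))
        elif not path.startswith(root + "/"):
            findings.append((sym, "outside expected source tree"))
    return findings
```

With the listing above and "/usr/src/xz-utils-5.6.0-0.2/src" as the
assumed source root, only lzma_crc32 and lzma_crc64 would be reported.
Legitimately generated sources living in the build directory would also
be flagged, so the result is a list of things to look at rather than a
verdict.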

Of course, this could be tampered with, but a) here it wasn't, and
b) there are many other potential checks (not all of them using debug
infos, actually), for instance:
- toolchain and compilation options used,
- availability of debug symbols,
- a checksum of the object code if that exists,
- symbolic matching of object code and debug infos,
- historical data (5.5.2beta as provided by Jia Tan was basically
  identical to 5.6, but had no backdoor),
- cross-distro (anti-)correlation maybe,
- expected variability with compiler changes (object code shall differ
  between LLVM, GCC and across their various versions)
- and I'm sure many others
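
As an illustration of the first item in the list, here is a rough
Python sketch that tallies the DW_AT_producer strings (compiler name,
version and, with GCC, the options used) found in the text printed by
"readelf --debug-dump=info". It assumes readelf's usual textual layout
for that attribute, which may vary between binutils versions; the
function names "producers" and "outliers" are made up for this example.

```python
import re
from collections import Counter
from typing import List

# Matches e.g.
#   <c>   DW_AT_producer    : (indirect string, offset: 0x1b): GNU C17 ...
# The parenthesised form description is optional.
_PRODUCER = re.compile(r"DW_AT_producer\s*:\s*(?:\([^)]*\):\s*)?(.+)")


def producers(dump_text: str) -> Counter:
    """Count each distinct producer string in a readelf info dump."""
    counts: Counter = Counter()
    for line in dump_text.splitlines():
        m = _PRODUCER.search(line)
        if m:
            counts[m.group(1).strip()] += 1
    return counts


def outliers(counts: Counter) -> List[str]:
    """Return every producer string other than the most common one."""
    if not counts:
        return []
    most_common = counts.most_common(1)[0][0]
    return sorted(p for p in counts if p != most_common)
```

A compilation unit built with a different compiler or different flags
than the rest of the library would show up in "outliers". Mixed
producers can be legitimate (assembly files, for instance), so this is
again a hint for a human reviewer and not a verdict.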

Has that path already been explored? I can easily come up with ways
around the ideas above but it's all about raising the bar with stuff
that is cheap to implement (besides symbolic matching ;p ).

By the way, I started with llvm-dwarfdump and switched to dwarfdump, but
their output is unfortunately really a dump rather than some convenient
serialization format like JSON. I think there are some libraries to go
through DWARF infos, but I wanted to stick to command-line tools in
order not to spend most of my time deciding which library to use. A
production-ready implementation will probably need to walk the various
infos properly, so input about this is definitely welcome.

PS: I work at Canonical, as part of the Ubuntu Foundations team, but the
above was revenge-driven hobby work and I doubt it would be considered
one of my work duties or topics.

[1] https://launchpad.net/ubuntu/+source/xz-utils/5.6.0-0.2/+build/27848538
[2] https://snapshot.debian.org/package/xz-utils/5.6.1-1/

-- 
Adrien Nader
Conned by Jia Tan in 2024

