funsec mailing list archives

Re: Consumer Reports Slammed for Creating 'Test' Viruses


From: Drsolly <drsollyp () drsolly com>
Date: Sun, 20 Aug 2006 15:42:18 +0100 (BST)

On Sun, 20 Aug 2006, Peter Kosinar wrote:

Why don't people think that dentists cause cavities and surgeons cause
hernias?

Maybe because the dentists and surgeons don't make you visit them every 
day, just to perform some update of their work? ;-)

I have to visit my dentist every six months, for exactly that reason.
 
Blue Boar is right in assuming that AVs rely on updates (the distinction
between the "scanning engine", "signatures", and other parts of the AV is
disappearing over time).

It disappeared in 1991.

It might just be a matter of wording but my opinion is different. In my
view, the scanning engine is the code which is directly executed on the
processor. Signatures are the (in most cases short) pieces of data which 
are used by the engine to determine if the file is infected or not.

Maybe you haven't written an antivirus recently. Nor have I, actually - 
not for several years, in fact. 

What you call a "signature" can actually be a signature, or it could be
an interpreted language such as Virtran that is executed by the CPU, or it
can even be assembly code that is executed by the CPU.

Here is my view of the history; I may be wrong:

In the -very- beginning, there were a few really simple viruses. The 
antivirus essentially consisted of a piece of code which read the file (or 
boot sector or MBR or whatever it was scanning) and a set of strings which 
the code attempted to locate in the file directly -- i.e. akin to the strstr()
function. The code was called the "scanning engine", the strings were
"signatures".

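As a rough illustration of the early engine/signature split described above, here is a minimal sketch in Python. The signature names and byte strings are invented for the example and do not correspond to any real virus or product; the `in` operator on bytes plays the role of strstr().

```python
# Hypothetical signature table: names and byte strings are made up for illustration.
SIGNATURES = {
    "Example.A": bytes.fromhex("deadbeef0102"),
    "Example.B": b"\x90\x90\xeb\xfe",
}

def scan(data: bytes) -> list:
    """The "engine": report every signature that occurs anywhere in data."""
    return sorted(name for name, sig in SIGNATURES.items() if sig in data)

clean = b"MZ" + bytes(64)
infected = b"MZ" + bytes(16) + bytes.fromhex("deadbeef0102") + bytes(16)

print(scan(clean))
print(scan(infected))
```

Reacting to a new virus in this model means adding one entry to the table; the engine code does not change.
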
Right. That was true until about 1990 or 1991. At that point, I developed 
Virtran, an interpreted language. Findvirus was the interpreter for 
Virtran.
 
This separation of engine and signatures allowed the AVers to react to new
malware with very little effort -- a new virus? Bang, we'll just pick a
10-byte signature from its destructive payload and we're done! Great!

Somewhat later, encrypted viruses started to appear (still, in very small 
numbers). As they were encrypted differently in each infected file, it was 
no longer possible to pick a signature for them (to be exact, you could 
pick a signature from the decryptor, as it used to be quite static at 
first). So, one had to write a specific detection routine (=code) for this 
particular virus. And then for another one. And another one... As the 
number of encrypted viruses increased, this process was more and more 
tedious.

Very true, I remember the process distinctly. And some viruses were really 
difficult to detect reliably.

In other words, the engine was growing too fat, while the old-style 
signatures were no longer very useful. Thus, the obvious trick was to add 
some decryption-like abilities to the engine, which would first decrypt 
the actual body of the virus and only then run it through the signatures.

Actually, I wrote an emulator, so I could emulate a PC while running on a
PC.

Once again, dealing with new viruses was simple -- just a matter of adding 
a signature. Again, these signatures were not executable code, they were 
just bunches of bytes that were matched against the data provided by the 
engine. However, the data was no longer the raw data obtained from the 
file/boot sector -- it was preprocessed by the engine.
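
The "preprocess, then match" idea can be sketched as follows. This assumes a toy scheme in which the virus body is XORed with a single-byte key that differs per infection; real engines used far more capable decryption or emulation, and the marker string here is hypothetical.

```python
from typing import Optional

BODY_SIGNATURE = b"HYPOTHETICAL-VIRUS-BODY"  # made-up plaintext marker

def xor_bytes(data: bytes, key: int) -> bytes:
    """Apply a single-byte XOR to every byte of data."""
    return bytes(b ^ key for b in data)

def scan_encrypted(data: bytes) -> Optional[int]:
    """Try every single-byte key; return the first key whose output contains the signature."""
    for key in range(256):
        if BODY_SIGNATURE in xor_bytes(data, key):
            return key
    return None

# The same plaintext encrypted under key 0x5A, as it might sit in an infected file.
sample = xor_bytes(b"host code" + BODY_SIGNATURE, 0x5A)
print(scan_encrypted(sample))
print(scan_encrypted(b"just an ordinary program"))
```

The signature itself is still plain data; only the engine's preprocessing step had to learn about the encryption.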

Not quite. You have to give the emulator instructions for each virus.
 
And so the saga continued... When a new anti-anti-virus technology 
appeared in one sample virus, it was easier to consider it a special case 
(i.e. add a specific detection code just for it). If it started to be used 
more often, it was more reasonable to create some shortcut for creating 
new detections -- i.e. add some kind of new "signatures" for a particular
type of technology or some kind of workaround which re-enabled the older
kind of signatures to be used.

In modern AVs, the "signatures" can be almost anything -- ranging from the 
basic "substrings" described above, through expert-system-like conditions 
like: "If the program contains this, this and this at most that far from 
the first occurrence of this, then it is Bagle" to small programs written
in a meta-language interpreted by the AV on-the-fly.
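
A condition of the "this, this and this at most that far from the first occurrence of this" kind might look like the sketch below. The function name, the patterns and the distance are hypothetical; the rule here requires every listed pattern to lie entirely within `max_distance` bytes before or after the first occurrence of the anchor.

```python
def matches_rule(data: bytes, anchor: bytes, required: list, max_distance: int) -> bool:
    """True if every required pattern lies within max_distance bytes of the first anchor."""
    start = data.find(anchor)
    if start == -1:
        return False
    # Window reaching max_distance bytes before and after the anchor.
    window = data[max(0, start - max_distance): start + len(anchor) + max_distance]
    return all(pattern in window for pattern in required)

data = b"AAAA" + b"ANCHOR" + b"xx" + b"ONE" + b"yy" + b"TWO" + bytes(100) + b"THREE"

print(matches_rule(data, b"ANCHOR", [b"ONE", b"TWO"], 16))
print(matches_rule(data, b"ANCHOR", [b"ONE", b"THREE"], 16))
```

The first call matches because both patterns sit close to the anchor; the second does not, because the third pattern is more than 100 bytes away.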

In general, the detection of known malware by name is almost always based 
on some kind of "signatures", whereas the proactive detection (i.e. 
attempts to detect as-of-yet unknown malware) is based on heuristic
methods. However, nothing prevents signatures from being used in proactive
detection as well -- one might, for example, perform some kind of
fuzzy-matching of the signature and the actual code...
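
One simple reading of "fuzzy-matching" is to tolerate a small number of differing bytes when sliding the signature over the data. The sketch below takes that reading (a Hamming-distance threshold); it is an assumption for illustration, not a description of how any particular product does it.

```python
def fuzzy_find(data: bytes, signature: bytes, max_mismatches: int) -> int:
    """Return the first offset where signature matches with at most
    max_mismatches differing bytes, or -1 if there is no such offset."""
    n = len(signature)
    for offset in range(len(data) - n + 1):
        mismatches = 0
        for a, b in zip(data[offset:offset + n], signature):
            if a != b:
                mismatches += 1
                if mismatches > max_mismatches:
                    break  # too different at this offset, try the next one
        else:
            return offset  # inner loop finished without exceeding the limit
    return -1

sig = b"\x01\x02\x03\x04\x05\x06\x07\x08"
variant = b"\xff\xff" + b"\x01\x02\x99\x04\x05\x06\x07\x08" + b"\xff"  # one byte changed

print(fuzzy_find(variant, sig, 1))
print(fuzzy_find(variant, sig, 0))
```

Raising the tolerance catches more modified variants but also raises the risk of false alarms, which is the usual trade-off with proactive detection.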

Needless to say, the bad guys are always going to be one step ahead,
for they can tune their piece of code until -no- available antivirus (and
practically usable <- to rule out products which flag everything as being
malicious ;-) ) detects it. Then, the quality of the engines and
signatures determines how long it will take a particular AV to add a
detection for it.
 
And the quality control.

Why do people so frequently forget about QC? When you update a product, 
you have to put it through an extensive set of tests, to check that it:

A) doesn't crash a whole bunch of computers (that's happened in the past)

B) detects all the viruses it's supposed to detect (that test isn't as 
easy as it sounds - with polymorphics, you need to have a lot of 
instances), and you need to repeat the test on all platforms, and with a 
good variety of configuration parameters.

C) doesn't give major false alarms - that's also really difficult.

I can tell you - creating a driver to detect a new virus, even a horribly
complex polymorphic virus, would take me maybe ten minutes when I was
still doing that (the emulator was the most powerful tool for that). It's 
the testing that's time consuming.
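
Checks B and C above can be pictured as a small test harness: run the updated detector over many instances of the polymorphic virus and over a corpus of clean files, and count misses and false alarms. Everything in this sketch is hypothetical (the toy "polymorphic" generator, the detector, the corpus sizes); a real QC run would use real samples on every supported platform and configuration.

```python
import random

BODY = b"HYPOTHETICAL-VIRUS-BODY"  # made-up marker, not a real virus

def make_instance(rng: random.Random) -> bytes:
    """Build one toy 'polymorphic' sample: random filler plus the body under a random XOR key."""
    key = rng.randrange(1, 256)
    filler = bytes(rng.randrange(256) for _ in range(32))
    return filler + bytes(b ^ key for b in BODY)

def detect(data: bytes) -> bool:
    """Toy detector under test: try every single-byte key and look for the body."""
    return any(BODY in bytes(b ^ key for b in data) for key in range(256))

def run_qc(samples, clean_files):
    """Return (missed detections, false alarms) for the detector."""
    missed = sum(1 for s in samples if not detect(s))
    false_alarms = sum(1 for c in clean_files if detect(c))
    return missed, false_alarms

rng = random.Random(1991)  # fixed seed so the run is repeatable
samples = [make_instance(rng) for _ in range(200)]
clean_files = [bytes(rng.randrange(256) for _ in range(64)) for _ in range(200)]

missed, false_alarms = run_qc(samples, clean_files)
print(missed, false_alarms)
```

Any non-zero count in either column would block the update from shipping; with polymorphics, the number of instances needed to trust a zero is what makes the test slow.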

_______________________________________________
Fun and Misc security discussion for OT posts.
https://linuxbox.org/cgi-bin/mailman/listinfo/funsec
Note: funsec is a public and open mailing list.

