IDS mailing list archives

Re: Intrusion Detection Evaluation Datasets


From: "Stuart Staniford" <sstaniford () FireEye com>
Date: Thu, 19 Mar 2009 11:42:23 -0700


On Mar 19, 2009, at 10:29 AM, Stefano Zanero wrote:


> I can't see why they should be "long sequences of the same characters",
> which is (evidently) a way to stress the limitations of regular
> expressions.
>
> Unless there's a clear case for such expressivity being useful IRL, I
> don't see why we should worry about it.

The original example was artificial, but the issue is very real.

Let me give you some more realistic examples of the kind of thing I
see on the wire every day in web attacks.

A common obfuscation technique in JavaScript (more common a year or
two ago) is to have something like:

<scripttaghere>
eval(unescape(%ab%23...

<omit several pages of escaped stuff here>

));
</scripttaghere>

(I have intentionally used incorrect script tags to avoid some mail
clients trying to interpret it)

The exact hex contents may well be polymorphic - different in every
instance of the attack.

A current obfuscation technique I'm seeing a lot of is:

<scripttaghere>
var x = "123.78.42...

<omit several pages of dot separated numbers>
...";

<now a few lines of code to decode the above variable>

</scripttaghere>

Again, both the name of the variable and the long string of dot
separated numbers are polymorphic.
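
To make the polymorphism concrete, here is a rough sketch of how one payload can produce a different blob in every instance. It rests on assumptions: the decoder is not shown above, so treating each number as a character code shifted by a per-instance key is only a guess at the general idea, and `encode_payload` / `decode_payload` are hypothetical names. It is written in Python rather than JavaScript purely for illustration.

```python
def encode_payload(script: str, key: int) -> str:
    """Hypothetical encoder: shift each character code by `key`
    and join the results with dots."""
    return ".".join(str(ord(ch) + key) for ch in script)


def decode_payload(blob: str, key: int) -> str:
    """Inverse of encode_payload: split on dots, undo the shift."""
    return "".join(chr(int(tok) - key) for tok in blob.split("."))


payload = "document.write('hello')"   # stand-in for the real script
instance_a = encode_payload(payload, key=7)
instance_b = encode_payload(payload, key=19)

# The two blobs differ on the wire but decode to the same script.
assert instance_a != instance_b
assert decode_payload(instance_a, 7) == payload
assert decode_payload(instance_b, 19) == payload
```

Under this assumption no fixed byte string from the blob is shared between instances, so there is nothing stable in it for a literal signature to key on.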

A simple string-matching signature mechanism is useless here. You can
alert on things like "eval(unescape(", and some IDSes do, but you will
false positive like crazy, since legitimate pages also use the idiom.
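
As a rough illustration of the difference, the sketch below compares a literal-string check with a check on the shape of the blob (a very long run of escapes or of dot-separated numbers). The function names, the patterns, and the threshold of 200 repetitions are all assumptions made up for this example. It is not a description of how any particular IDS works, and a real sensor would also have to deal with streaming, reassembly and matching cost.

```python
import re

NAIVE_SIGNATURE = "eval(unescape("

# Hypothetical threshold: 200 or more consecutive items counts as "long".
ESCAPED_RUN = re.compile(r"(?:%[0-9A-Fa-f]{2}){200,}")
DOTTED_RUN = re.compile(r"(?:\d{1,3}\.){200,}\d{1,3}")


def naive_match(page: str) -> bool:
    """Literal-string signature: fires on any use of the idiom."""
    return NAIVE_SIGNATURE in page


def structural_match(page: str) -> bool:
    """Fires on a long run of %xx escapes or dot-separated numbers,
    whatever the actual values or variable names are."""
    return bool(ESCAPED_RUN.search(page) or DOTTED_RUN.search(page))


# A legitimate page using the idiom on a short string.
benign = 'eval(unescape("%7B%22a%22%3A1%7D"));'
# Stand-ins for the two obfuscated forms described above.
escaped_attack = "eval(unescape('" + "%ab%23" * 500 + "'));"
dotted_attack = 'var q7 = "' + ".".join(["123", "78", "42"] * 300) + '";'

assert naive_match(benign) and not structural_match(benign)
assert structural_match(escaped_attack)
assert structural_match(dotted_attack) and not naive_match(dotted_attack)
```

The literal check fires on the benign page and misses the second form entirely, while the shape check handles all three cases here. It would of course have its own false positives on pages that legitimately carry long encoded data, so the threshold is a tuning knob rather than an answer.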

Stuart.



