IDS mailing list archives

Re: Intrusion Detection Evaluation Datasets


From: Stefano Zanero <zanero () elet polimi it>
Date: Fri, 13 Mar 2009 12:03:12 +0100

Stuart Staniford wrote:

> There's a number of things about the framing of this discussion that are
> bugging me

Me too, but nevertheless I find this to be one of the best threads on
this list in the last few months :)

> again.  So the main nuisances on the wire keep changing, and any dataset
> is necessarily going to get stale very quickly.

Very true, so if any dataset is to be built at all, this has to be kept in mind.

> doing things.  For us, the main focus is "What are the bad guys doing
> now?" and "What features do we need to detect what they are now doing".
> Usually, if you have good features with high discrimination, most
> algorithms can be tweaked to do ok.

True, up to a point. On the other hand, many algorithms can and should
be safely discarded (many of them get published instead ;-) on the
grounds that they are theoretically unable to handle some types of
features correctly.
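
To make the "high discrimination" point concrete, here is a minimal
sketch (Python, with purely synthetic numbers standing in for labeled
traffic, not taken from any real dataset) of scoring a single flow
feature by the ROC AUC it achieves on its own; anything well above 0.5
means the feature carries signal regardless of which algorithm later
consumes it:

import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
# Hypothetical per-flow feature (say, bytes per second); 1 = attack, 0 = benign.
benign = rng.normal(loc=10.0, scale=3.0, size=1000)
attack = rng.normal(loc=25.0, scale=5.0, size=100)
feature = np.concatenate([benign, attack])
labels = np.concatenate([np.zeros(1000), np.ones(100)])

# AUC around 0.5: the feature alone carries no signal; close to 1.0: it
# discriminates well before any detection algorithm is even chosen.
print("single-feature AUC:", roc_auc_score(labels, feature))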

> So forget looking for a dataset.  Look for a wire.
> [...]
> I think the problem of producing regular timely datasets that can be
> safely published is probably just about intractable

You have a point here. On the other hand, this poses a huge challenge to
the replicability of results, and therefore to the scientific vetting of
data before publication.

So I am convinced that you need to validate your ideas against data
coming from the wire, but on the other hand, some means of comparing
different approaches must be established.

Stefano Zanero


