IDS mailing list archives

Re: Intrusion Detection Evaluation Datasets

From: "Zow" Terry Brugger <zow () acm org>
Date: Thu, 12 Mar 2009 08:40:04 -0700

Stefano,

>> An overwhelming majority of network based IDSs use only spatial
>> information present in packet headers.
>
> "spatial" information? if you mean "IP addresses", then

I took "spatial" information to mean connection or packet header data
-- more than just IP addresses, but lacking the unstructured data
portions.

> 1) your statement is definitely not true and


Actually, I think it is: the majority of unique NIDSs that I am
familiar with were built to use the KDD Cup '99 dataset. I pray none
of those systems are actually used in production anywhere. Let's face
it, only a handful of signature based network intrusion detectors were
ever built. After Marty released Snort to the community, there really
hasn't been a need to build another. Sure, a couple have been so that
they wouldn't be "encumbered" by the open source license, but there
really haven't been any major changes to signature based detection in
the past decade (just thousands of tweaks). Most anomaly or machine
learning based detectors will only work with structured data, so they
limit themselves to the header portions of the packets or connection
records.

> 2) such IDSs "work" only because of the artifacts in the evaluation datasets


We can't really say that conclusively. At this point we can only say
that any successes demonstrated by those systems have been due to flaws
in the evaluation datasets. For lack of good evaluation datasets, we
have no idea how those systems might perform in real world
environments. More importantly, for any system which requires training
data we must question how portable it is across different networks;
should it require unique training data for a given network, is it
feasible that such training data will ever be available?

I see a lot of people saying (correctly) that advanced (non-signature
based) NIDS can't be researched until we have good evaluation
datasets, and I see a lot of people ignoring them and doing it anyway.
Is anyone (else) actually working on fixing the data problem?

Cheers,
Terry
