Snort mailing list archives

Re: Multi Pattern Search Engine Plugin


From: "Russ Combs (rucombs) via Snort-devel" <snort-devel () lists snort org>
Date: Fri, 19 Apr 2024 14:57:43 +0000

Hey Vlad,

I built your patched lowmem and got the same results as unpatched.

I have the pcap but haven't tried a full reproduction. Please narrow it down to make it easier to focus on the problem:
just the minimum diff from the default config, your command line, and the specific rule or rules required to
reproduce.

Call the match function for each match in the order your algorithm generates them. Snort will figure it out from there.
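
A minimal sketch of what reporting in generation order could look like. It assumes a callback shaped like Snort's `MpseMatch`; the `MatchFn` alias, the `Pattern` struct, and `search_in_order` are illustrative names rather than Snort API, and treating a positive return value as "stop searching" is also an assumption to check against your Snort version:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Stand-in for Snort's MpseMatch callback type (assumed shape, verify in search_common.h).
using MatchFn = int (*)(void* user, void* tree, int index, void* context, void* neg_list);

// Hypothetical pattern record: the bytes plus the opaque handles passed to the callback.
struct Pattern {
    std::vector<uint8_t> bytes;
    void* user;
    void* tree;
};

// Scans T left to right and reports each hit the moment it is found,
// without sorting or deduplicating. `index` is the first offset past the match.
int search_in_order(const std::vector<Pattern>& pats, const uint8_t* T, int n,
                    MatchFn match, void* context) {
    assert(n >= 0 && (T != nullptr || n == 0));
    int matches = 0;
    for (int start = 0; start < n; start++) {
        for (const Pattern& p : pats) {
            const int m = static_cast<int>(p.bytes.size());
            if (m == 0 || start + m > n) continue;  // never read past T + n
            if (std::equal(p.bytes.begin(), p.bytes.end(), T + start)) {
                matches++;
                // Assumption: a positive return asks the engine to stop early.
                if (match(p.user, p.tree, start + m, context, nullptr) > 0) return matches;
            }
        }
    }
    return matches;
}
```

The engine does no ordering work of its own here: whatever order the scan produces is the order the callback sees.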

Russ

________________________________
From: Vlad Ulmeanu <vlad.ulmeanu01 () gmail com>
Sent: Thursday, April 11, 2024 1:35 PM
To: Russ Combs (rucombs) <rucombs () cisco com>
Cc: snort-devel () lists snort org <snort-devel () lists snort org>
Subject: Re: [Snort-devel] Multi Pattern Search Engine Plugin

Hi Russ, thank you very much for the quick answer.

If I didn't overlook anything, I never interact with the null terminator. This is how I store and access the 
patterns/text:

```
int add_pattern(const uint8_t* P, unsigned m, const PatternDescriptor& desc, void* user) override {
    // Store a copy of exactly m bytes; no terminator is read or appended.
    patterns.emplace_back(std::vector<uint8_t>(P, P + m), ...);

    ...
}

...

int _search(const uint8_t* T, int n, MpseMatch match, void* context, int* current_state) override {
    ...

    // Lowercased copy of exactly n bytes of the text.
    std::vector<uint8_t> T_nocase(T, T + n);
    for (int i = 0; i < n; i++) {
        T_nocase[i] = tolower(T_nocase[i]);
    }

    ...

    // Debug print of the text:
    printf("(n = %d) T = ", n);
    for (int i = 0; i < n; i++) printf("%d ", T[i]);
    printf("\n");

    ...
}
```

The reproduction is a bit involved:

* Ruleset: `snortrules-snapshot-3170.tar.gz` from here<https://snort.org/downloads/registered/snortrules-snapshot-3170.tar.gz>.
* Pcap: `Friday-WorkingHours.pcap` from here<https://www.unb.ca/cic/datasets/ids-2017.html>, or directly from here<http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/CIC-IDS-2017/PCAPs/> (post-registration).
* Config files: here<https://github.com/vlad-ulmeanu01/ExpoSizeStringSearch/tree/main/snort_benchmark/snort3_demo/tests/search_engines/bruteforce>. The `search_method` parameter can be changed on line `251` of `snort.lua`.

If the setup is correct, `1300` `Mpse` instances should be spawned under the same process. The diverging `_search`
result doesn't occur very early, so we need to remember some previous `_search` return values for the instance that
handles that specific test:

```
dbg_match_want_dq = std::deque<int>({35, 3, 32, 7, 10, 58, 1});
```
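
One way to read the deque above is as a fingerprint: the instance of interest is the one whose most recent `_search` return values equal that sequence. A small sketch of such a tracker follows; `SearchHistory` and `record` are hypothetical names, and the actual debug code in the linked gist may work differently:

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical helper: keeps the last want.size() return values of _search
// and reports when they equal the wanted sequence.
struct SearchHistory {
    std::deque<int> want;    // e.g. {35, 3, 32, 7, 10, 58, 1}
    std::deque<int> recent;  // sliding window over the latest return values

    explicit SearchHistory(std::deque<int> w) : want(std::move(w)) {
        assert(!want.empty());
    }

    // Record one return value; true once the window equals `want`.
    bool record(int ret) {
        recent.push_back(ret);
        if (recent.size() > want.size()) recent.pop_front();
        return recent == want;
    }
};
```

Each `Mpse` instance would own one tracker and switch on verbose logging only once `record` returns true, which keeps the output small despite the large number of instances.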

The modified versions of `lowmem.cc`, `sfksearch.cc`, and `sfksearch.h` can be found here<https://gist.github.com/vlad-ulmeanu01/38f6c8d097805b17c5077aa4c914306b>.

Diffs can be found here:
* (`lowmem.cc`) diff<https://www.diffchecker.com/RcC4FpKJ/>
* (`sfksearch.cc`) diff<https://www.diffchecker.com/yLFPQv4Y/>
* (`sfksearch.h`) diff<https://www.diffchecker.com/wewjsweI/>

Lines `310 .. 318` of `snort.out` should look like this (they should be produced in under `10`s):

```
LOWMEM KTriePrefixMatch entry, n = 24
LOWMEM: user 0x557eca5995b0, pattern len 4, index 4, nocase 1, negative 0, n 20. P nocase = 0 0 0 0
LOWMEM KTriePrefixMatch entry, n = 23
LOWMEM KTriePrefixMatch entry, n = 22
LOWMEM KTriePrefixMatch entry, n = 21
LOWMEM: user 0x557eca5995b0, pattern len 4, index 25, nocase 1, negative 0, n 17. P nocase = 0 0 0 0
search finished: matches = 2
number of patterns: 63
T = 0 0 0 0 243 127 95 75 189 112 255 71 180 46 93 169 167 197 0 248 21 0 0 0
```

Also, I want to ask if there is any specific order for calling the `match` function. If I understand correctly, we 
prioritize matches that begin earlier in `T`. As a tie-break, we prefer the pattern that was added earlier. Is this a 
concrete rule?

Thank you again for looking into this.
Vlad

On Thu, Apr 11, 2024 at 17:22, Russ Combs (rucombs) <rucombs () cisco com> wrote:
Hey Vlad,

Sounds like you are making progress.

lowmem is caseless, which helps reduce memory. The exact match is checked during signature evaluation unless the content
is nocase.

ac_bnfa and ac_full are also caseless. The hyperscan MPSE is case sensitive. Your algorithm can be either.
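
To illustrate the two-stage idea (a caseless prefilter, then an exact check where the content is case sensitive), here is a small standalone sketch. It is not lowmem's code: `to_lower_copy`, `caseless_find`, and `confirm_exact` are hypothetical helpers, and in Snort the second stage happens later, during signature evaluation, rather than inside the MPSE:

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <vector>

// Lowercased copy of exactly len bytes (no terminator involved).
std::vector<uint8_t> to_lower_copy(const uint8_t* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    std::vector<uint8_t> out(data, data + len);
    for (uint8_t& b : out) b = static_cast<uint8_t>(std::tolower(b));
    return out;
}

// Stage 1: caseless search. Returns the offset of the first hit, or -1.
int caseless_find(const uint8_t* P, std::size_t m, const uint8_t* T, std::size_t n) {
    if (m == 0 || m > n) return -1;
    const std::vector<uint8_t> p = to_lower_copy(P, m);
    const std::vector<uint8_t> t = to_lower_copy(T, n);
    auto it = std::search(t.begin(), t.end(), p.begin(), p.end());
    return it == t.end() ? -1 : static_cast<int>(it - t.begin());
}

// Stage 2: exact-case confirmation at a candidate offset.
bool confirm_exact(const uint8_t* P, std::size_t m, const uint8_t* T, std::size_t n,
                   std::size_t offset) {
    return offset + m <= n && std::equal(P, P + m, T + offset);
}
```

With pattern `GET` and text `a get b`, the caseless stage reports a hit at offset 2 and the exact stage rejects it; a case-sensitive engine would never have reported the candidate at all.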

I'm not able to reproduce the match off the end of the buffer. Is it possible that your input includes a null 
terminator with a length of 25? If you want to send a pcap and config I'll take a look.

This list or a GitHub issue is the best way to get support.

Russ

________________________________
From: Vlad Ulmeanu <vlad.ulmeanu01 () gmail com>
Sent: Wednesday, April 10, 2024 3:38 PM
To: Russ Combs (rucombs) <rucombs () cisco com>
Cc: snort-devel () lists snort org <snort-devel () lists snort org>
Subject: Re: [Snort-devel] Multi Pattern Search Engine Plugin

Hi, back with some bugs:

* lowmem seems to treat every pattern as if it had `nocase == true`. All calls to `KTriePrefixMatch` pass `Tnocase`
as the useful parameter. Is this intended?

* For the following text (in `uint8_t` format):

```
(n = 24)
T = 0 0 0 0 243 127 95 75 189 112 255 71 180 46 93 169 167 197 0 248 21 0 0 0
```

Tested against the following dictionary<https://pastebin.com/raET1dJR> (originally `63` entries, of which only `50` are
distinct when considering the pattern bytes and the `nocase` and `negated` tags).

My solution<https://github.com/vlad-ulmeanu01/ExpoSizeStringSearch/blob/main/snort_benchmark/snort3_extra/src/search_engines/bruteforce/bruteforce.cc>
finds only one match: `pat = 0 0 0 0`, with the first index after the match being `4` (counting from `0`).

Lowmem<https://github.com/snort3/snort3_extra/blob/b81e2e4f9296d9ae724e8d1b409371a3715fc2cc/src/search_engines/lowmem/sfksearch.cc#L579>
finds two matches:

```
LOWMEM index 4
LOWMEM: user 0x55a5a90e9540, pattern len 4, index 4, nocase 1, negative 0, n 20. P nocase = 0 0 0 0
LOWMEM index 25
LOWMEM: user 0x55a5a90e9540, pattern len 4, index 25, nocase 1, negative 0, n 17. P nocase = 0 0 0 0
search finished: matches = 2
```

The first one is the same as mine. The second one matches the same pattern but ends outside of `T`, as if it also
compared an extra `\0`. I do not know why the value of `n` is so high for the second match; I believe it should have
been `-1`, since `index + (remaining) n` should equal the original value of `n`, `24`.

Is this behaviour correct? How can it be explained?
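
To make the bounds argument concrete, here is a minimal brute-force sketch over the exact text above. It is not lowmem's algorithm, and `match_end_indices` and `report_text` are hypothetical helper names. It only shows which input would yield an end index of `25`; it does not establish that lowmem actually reads a 25th byte:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns, for every in-bounds occurrence of P in T, the first index past the match.
std::vector<int> match_end_indices(const std::vector<uint8_t>& P, const std::vector<uint8_t>& T) {
    assert(!P.empty());
    std::vector<int> ends;
    for (std::size_t start = 0; start + P.size() <= T.size(); start++) {
        if (std::equal(P.begin(), P.end(), T.begin() + static_cast<std::ptrdiff_t>(start)))
            ends.push_back(static_cast<int>(start + P.size()));
    }
    return ends;
}

// The 24-byte text from the report, copied verbatim.
std::vector<uint8_t> report_text() {
    return {0, 0, 0, 0, 243, 127, 95, 75, 189, 112, 255, 71,
            180, 46, 93, 169, 167, 197, 0, 248, 21, 0, 0, 0};
}
```

For the pattern `0 0 0 0`, the 24-byte text gives a single end index, `4`. The text ends in only three zero bytes (offsets `21` to `23`), so a second match ending at `25` is possible only if a fourth zero is read at offset `24`: appending one zero byte to the text makes the same function return `4` and `25`.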

Also, is there any faster way to communicate?

Thank you.

On Fri, Mar 15, 2024 at 14:11, Vlad Ulmeanu <vlad.ulmeanu01 () gmail com> wrote:
Thank you very much for your answer!

I called build_tree for each pattern, and passed the current pattern's associated tree object to the match function. 
The unit test passes now.

I have some follow-up questions:
* What are the literal, multi_match and flags variables from the PatternDescriptor<https://github.com/snort3/snort3/blob/bd6cbf1bbd3dcad9cd09261786b664d819357d94/src/framework/mpse.h#L65>
supposed to do? lowmem ignores them. Does literal mean one-byte characters? Does multi_match determine the order in
which patterns must be tried?

* Locally, snort spawns 18 instances of LowmemMpse<https://github.com/snort3/snort3_extra/blob/b81e2e4f9296d9ae724e8d1b409371a3715fc2cc/src/search_engines/lowmem/lowmem.cc#L42>.
It calls add_pattern(s), then prep_patterns and get_pattern_count on all of them, but only calls _search on a single
one. Is this related to the first paragraph here<https://github.com/snort3/snort3/blob/master/src/detection/dev_notes.txt>
(creating an MPSE instance for each ...)? No further _search calls are made if the first one doesn't return a match.

* Am I supposed to interact with int *current_state from _search?

Thank you again!

On Thu, Mar 14, 2024 at 13:48, Russ Combs (rucombs) <rucombs () cisco com> wrote:
Vlad,

rule_tree_queue is the only implementation of MpseMatch, which is a callback provided so that your MPSE can report
matches to the detection engine. It is not a side effect and it is not intended to be overridden.

If you break in rule_tree_queue, you will see that KTriePrefixMatch is calling rule_tree_queue on the match via the 
match callback which was set in fp_partial (called by fp_full).

There is scant documentation on this so we will improve that.

Also, that demo is a little off. The README says it is high performance, but lowmem is decidedly low performance. It
also throws an unrelated 119:43 for no good reason, so ignore that. 1:1 is the one you are after.

Hope that helps.
Russ


________________________________
From: Snort-devel <snort-devel-bounces () lists snort org> on behalf of Vlad Ulmeanu via Snort-devel <snort-devel () lists snort org>
Sent: Sunday, March 10, 2024 4:57 AM
To: snort-devel () lists snort org <snort-devel () lists snort org>
Subject: [Snort-devel] Multi Pattern Search Engine Plugin

Hi all,

I'm trying to plug my Multi Pattern Search Engine<https://github.com/vlad-ulmeanu01/ExpoSizeStringSearch> into
snort3 and run some benchmarks. I have run into some
problems<https://stackoverflow.com/questions/78121441/snort3-where-is-the-default-implementation-for-mpsematch> with
the setup. I tried to rewrite the lowmem<https://github.com/snort3/snort3_extra/tree/master/src/search_engines/lowmem>
example in snort3_extra<https://github.com/snort3/snort3_extra>, but there is a side effect occurring in lowmem's
_search<https://github.com/snort3/snort3_extra/blob/b81e2e4f9296d9ae724e8d1b409371a3715fc2cc/src/search_engines/lowmem/lowmem.cc#L65C9-L65C16>
function (that triggers another
"allow<https://github.com/snort3/snort3_demo/blob/3fdada8224f8ec5ecea4649fdad144edec7a9c9e/tests/search_engines/ac_bnfa/expected#L2>"
in the snort3_demo example<https://github.com/snort3/snort3_demo/tree/master/tests/search_engines/ac_bnfa>) when calling
match<https://github.com/snort3/snort3_extra/blob/b81e2e4f9296d9ae724e8d1b409371a3715fc2cc/src/search_engines/lowmem/sfksearch.cc#L579>.
The call chain is:

MpseMatch<https://github.com/snort3/snort3/blob/be0977a3a8a98632e5cd1238c1d0da6dc2693b5f/src/search_engines/search_common.h#L39>
 -> rule_tree_queue<https://github.com/snort3/snort3/blob/be0977a3a8a98632e5cd1238c1d0da6dc2693b5f/src/detection/fp_detect.cc#L865>
    (I suppose this is the default implementation of MpseMatch that lowmem ends up using)
 -> MpseStash::push<https://github.com/snort3/snort3/blob/be0977a3a8a98632e5cd1238c1d0da6dc2693b5f/src/detection/fp_detect.cc#L773>
 -> MpseStash::process<https://github.com/snort3/snort3/blob/be0977a3a8a98632e5cd1238c1d0da6dc2693b5f/src/detection/fp_detect.cc#L832>
 -> rule_tree_match<https://github.com/snort3/snort3/blob/be0977a3a8a98632e5cd1238c1d0da6dc2693b5f/src/detection/fp_detect.cc#L375>

Unfortunately, things get quite complicated, and I couldn't pinpoint the reason for the side effect.

How can I deal with this side effect? I assume that I should call match with a non-nullptr argument for tree, but I 
don't really understand its meaning. Also, where can I find a good documentation source for snort3_extra? The best I 
could find is this<https://fossies.org/dox/snort3_extra-3.1.78.0/classLowmemMpse.html>.

Thank you,
Vlad Ulmeanu