Snort mailing list archives

Re: Multi Pattern Search Engine Plugin


From: Vlad Ulmeanu via Snort-devel <snort-devel () lists snort org>
Date: Thu, 11 Apr 2024 20:35:04 +0300

Hi Russ, thank you very much for the quick answer.

If I didn't overlook anything, I never interact with the null terminator.
This is how I store and access the patterns/text:

```
// Patterns are stored with an explicit length, so no terminator is involved.
int add_pattern(const uint8_t* P, unsigned m, const PatternDescriptor& desc, void* user) override {
        patterns.emplace_back(std::vector<uint8_t>(P, P + m), ...);

        ...
}

...

int _search(const uint8_t* T, int n, MpseMatch match, void* context, int* current_state) override {
        ...

        // Lowercased copy of exactly n bytes of the text.
        std::vector<uint8_t> T_nocase(T, T + n);
        for (int i = 0; i < n; i++) {
            T_nocase[i] = tolower(T_nocase[i]);
        }

        ...

        // Debug print of the text:
        printf("(n = %d) T = ", n);
        for (int i = 0; i < n; i++) printf("%d ", T[i]);
        printf("\n");
```

The reproduction is a bit involved:

* Ruleset: `snortrules-snapshot-3170.tar.gz`, from here
<https://snort.org/downloads/registered/snortrules-snapshot-3170.tar.gz>.
* Pcap: `Friday-WorkingHours.pcap`, from here
<https://www.unb.ca/cic/datasets/ids-2017.html>, or directly from here
<http://205.174.165.80/CICDataset/CIC-IDS-2017/Dataset/CIC-IDS-2017/PCAPs/>
(post-registration).
* Config files: here
<https://github.com/vlad-ulmeanu01/ExpoSizeStringSearch/tree/main/snort_benchmark/snort3_demo/tests/search_engines/bruteforce>.
The `search_method` parameter can be changed on line `251` of `snort.lua`.

If the setup is OK, `1300` `Mpse` instances should be spawned under the
same process. The diverging `_search` result doesn't show up early on, so
we need to track some of the preceding `_search` return values to identify
the instance that handles that specific test:

```
dbg_match_want_dq = std::deque<int>({35, 3, 32, 7, 10, 58, 1});
```
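One way such a deque could be used is sketched below. This is an assumption about the intent, not the code from the linked gist: the class name `SearchHistoryProbe` and its methods are hypothetical. Each instance feeds its `_search` return values into the probe, and the probe reports when the expected sequence has been seen in order, which singles out the instance of interest.

```cpp
#include <cstddef>
#include <deque>
#include <utility>

// Hypothetical helper: recognises an Mpse instance by the sequence of values
// its _search calls have returned so far.
class SearchHistoryProbe {
public:
    explicit SearchHistoryProbe(std::deque<int> want) : want_(std::move(want)) {}

    // Feed one _search return value. Returns true once the whole expected
    // sequence has been observed consecutively.
    bool observe(int result) {
        if (done()) return true;
        if (result == want_[pos_]) {
            ++pos_;
        } else {
            // Simple restart; enough for a debugging aid, not a general matcher.
            pos_ = (result == want_[0]) ? 1 : 0;
        }
        return done();
    }

    bool done() const { return pos_ == want_.size(); }

private:
    std::deque<int> want_;
    std::size_t pos_ = 0;
};
```

Once `observe` returns true, extra debug printing can be switched on for that instance only, which keeps the output of the other instances out of the log.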

The modified versions for `lowmem.cc`, `sfksearch.cc`, `sfksearch.h` can be
found here
<https://gist.github.com/vlad-ulmeanu01/38f6c8d097805b17c5077aa4c914306b>.

Diffs can be found here:
* (`lowmem.cc`) diff <https://www.diffchecker.com/RcC4FpKJ/>
* (`sfksearch.cc`) diff <https://www.diffchecker.com/yLFPQv4Y/>
* (`sfksearch.h`) diff <https://www.diffchecker.com/wewjsweI/>

Lines `310 .. 318` of `snort.out` should look like this (they should be
produced in under `10` s):

```
LOWMEM KTriePrefixMatch entry, n = 24
LOWMEM: user 0x557eca5995b0, pattern len 4, index 4, nocase 1, negative 0, n 20. P nocase = 0 0 0 0
LOWMEM KTriePrefixMatch entry, n = 23
LOWMEM KTriePrefixMatch entry, n = 22
LOWMEM KTriePrefixMatch entry, n = 21
LOWMEM: user 0x557eca5995b0, pattern len 4, index 25, nocase 1, negative 0, n 17. P nocase = 0 0 0 0
search finished: matches = 2
number of patterns: 63
T = 0 0 0 0 243 127 95 75 189 112 255 71 180 46 93 169 167 197 0 248 21 0 0 0
```

Also, I want to ask if there is any specific order for calling the `match`
function. If I understand correctly, we prioritize matches that begin
earlier in `T`. As a tie-break, we prefer the pattern that was added
earlier. Is this a concrete rule?
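To make the question concrete, here is a minimal sketch of the ordering described above: matches sorted by start offset in `T`, with ties broken by insertion order. Whether Snort actually requires this order is exactly what is being asked, so treat it as an illustration of the assumption. The `Hit` struct and `find_hits` function are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical record of one match.
struct Hit {
    std::size_t start;       // offset of the first matched byte in T
    std::size_t pattern_id;  // position of the pattern in insertion order
};

// Brute-force scan. The outer loop walks start offsets and the inner loop
// walks patterns in insertion order, so hits come out ordered by start
// offset first and by insertion order second.
std::vector<Hit> find_hits(const std::vector<std::vector<uint8_t>>& patterns,
                           const std::vector<uint8_t>& text) {
    std::vector<Hit> hits;
    for (std::size_t start = 0; start < text.size(); ++start) {
        for (std::size_t id = 0; id < patterns.size(); ++id) {
            const std::vector<uint8_t>& p = patterns[id];
            // Skip empty patterns and patterns that would run past the end.
            if (p.empty() || p.size() > text.size() - start) continue;
            if (std::equal(p.begin(), p.end(), text.begin() + start))
                hits.push_back({start, id});
        }
    }
    return hits;
}
```

For the patterns `bc`, `ab`, `abc` (added in that order) and the text `abc`, this reports `ab` and `abc` at offset `0` before `bc` at offset `1`.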

Thank you again for looking into this.
Vlad

On Thu, 11 Apr 2024 at 17:22, Russ Combs (rucombs) <rucombs () cisco com>
wrote:

Hey Vlad,

Sounds like you are making progress.

lowmem is caseless which helps reduce memory. The exact match is checked
during signature evaluation unless the content is nocase.

ac_bnfa and ac_full are also caseless. The hyperscan MPSE is case
sensitive. Your algorithm can be either.
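The two-stage idea described here can be sketched as follows. This is a simplified illustration, not Snort's detection code: `Pattern`, `equal_at`, `candidate` and `confirmed` are hypothetical names. A caseless engine reports a candidate, and a later step re-checks the exact bytes unless the content is nocase.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Pattern {
    std::vector<uint8_t> bytes;
    bool nocase;  // true when the rule content is marked nocase
};

// Compares the pattern against the text starting at offset `at`.
// With fold_case set, both sides are lowercased before comparing.
bool equal_at(const Pattern& p, const std::vector<uint8_t>& text,
              std::size_t at, bool fold_case) {
    if (at > text.size() || p.bytes.size() > text.size() - at) return false;
    for (std::size_t i = 0; i < p.bytes.size(); ++i) {
        int a = p.bytes[i];
        int b = text[at + i];
        if (fold_case) {
            a = std::tolower(a);
            b = std::tolower(b);
        }
        if (a != b) return false;
    }
    return true;
}

// Stage 1: what a caseless search engine would report as a candidate.
bool candidate(const Pattern& p, const std::vector<uint8_t>& text,
               std::size_t at) {
    return equal_at(p, text, at, true);
}

// Stage 2: the later check, which compares exact bytes unless nocase is set.
bool confirmed(const Pattern& p, const std::vector<uint8_t>& text,
               std::size_t at) {
    if (!candidate(p, text, at)) return false;
    return p.nocase || equal_at(p, text, at, false);
}
```

With the text `get /`, a case-sensitive pattern `GET` is a candidate at offset `0` but is not confirmed, while the same pattern marked nocase is confirmed.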

I'm not able to reproduce the match off the end of the buffer. Is it
possible that your input includes a null terminator with a length of 25? If
you want to send a pcap and config I'll take a look.

This list or a GitHub issue is the best way to get support.

Russ

------------------------------
*From:* Vlad Ulmeanu <vlad.ulmeanu01 () gmail com>
*Sent:* Wednesday, April 10, 2024 3:38 PM
*To:* Russ Combs (rucombs) <rucombs () cisco com>
*Cc:* snort-devel () lists snort org <snort-devel () lists snort org>
*Subject:* Re: [Snort-devel] Multi Pattern Search Engine Plugin

Hi, back with some bugs:

* lowmem seems to treat every pattern as if it had `nocase == true`.
All calls to `KTriePrefixMatch` pass `Tnocase` as the useful parameter.
Is this intended?

* For the following text (in `uint8_t` format):

```
(n = 24)
T = 0 0 0 0 243 127 95 75 189 112 255 71 180 46 93 169 167 197 0 248 21 0 0 0
```

Tested against the following dictionary <https://pastebin.com/raET1dJR>
(originally `63` entries, of which only `50` are distinct when considering
the pattern bytes together with the `nocase` and `negated` flags):

My solution
<https://github.com/vlad-ulmeanu01/ExpoSizeStringSearch/blob/main/snort_benchmark/snort3_extra/src/search_engines/bruteforce/bruteforce.cc>
finds only one match: `pat = 0 0 0 0`, with the first index that doesn't
match being `4` (counting from `0`).

Lowmem
<https://github.com/snort3/snort3_extra/blob/b81e2e4f9296d9ae724e8d1b409371a3715fc2cc/src/search_engines/lowmem/sfksearch.cc#L579>
finds two matches:

```
LOWMEM index 4
LOWMEM: user 0x55a5a90e9540, pattern len 4, index 4, nocase 1, negative 0, n 20. P nocase = 0 0 0 0
LOWMEM index 25
LOWMEM: user 0x55a5a90e9540, pattern len 4, index 25, nocase 1, negative 0, n 17. P nocase = 0 0 0 0
search finished: matches = 2
```

The first one is the same as mine. The second one matches the same
pattern but ends outside of `T`, as if it also compared an extra `\0`. I
do not know why the value of `n` is so high for the second match; I
believe it should have been `-1`, since `index + (remaining) n` should
equal the original value of `n`, `24`.
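The arithmetic can be checked with a small sketch. It assumes that `index` is the offset of the first byte after the match, which fits the first report (`index 4` for a 4-byte pattern at offset `0`). Under that reading, `index 25` would correspond to a match starting at offset `21` that needs `T[24]`, one byte past the end of the buffer. The function names `match_ends` and `consistent` are hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns, for every occurrence of `pattern` that lies fully inside `text`,
// the offset of the first byte after the match.
std::vector<std::size_t> match_ends(const std::vector<uint8_t>& text,
                                    const std::vector<uint8_t>& pattern) {
    std::vector<std::size_t> ends;
    if (pattern.empty() || pattern.size() > text.size()) return ends;
    for (std::size_t start = 0; start + pattern.size() <= text.size(); ++start) {
        if (std::equal(pattern.begin(), pattern.end(), text.begin() + start))
            ends.push_back(start + pattern.size());
    }
    return ends;
}

// The bookkeeping rule from the message: the reported index plus the
// remaining length should add up to the original length of the text.
bool consistent(int index, int remaining, int original_n) {
    return index + remaining == original_n;
}
```

For the 24-byte text and the pattern `0 0 0 0` quoted in this message, `match_ends` yields only `4`: the trailing run holds three zero bytes, so no fourth in-bounds match exists. `consistent(4, 20, 24)` holds, while `consistent(25, 17, 24)` does not, since `25 + 17 = 42`.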

Is this behaviour correct? How can it be explained?

Also, is there any faster way to communicate?

Thank you.

On Fri, 15 Mar 2024 at 14:11, Vlad Ulmeanu <vlad.ulmeanu01 () gmail com>
wrote:

Thank you very much for your answer!

I called build_tree for each pattern, and passed the current pattern's
associated tree object to the match function. The unit test passes now.

I have some follow-up questions:
* What are the literal, multi_match and flags variables from the
PatternDescriptor
<https://github.com/snort3/snort3/blob/bd6cbf1bbd3dcad9cd09261786b664d819357d94/src/framework/mpse.h#L65>
supposed to do? lowmem ignores them. Does literal mean one-byte
characters? Does multi_match determine the order in which patterns must
be tried?

* Locally, snort spawns 18 instances of LowmemMpse
<https://github.com/snort3/snort3_extra/blob/b81e2e4f9296d9ae724e8d1b409371a3715fc2cc/src/search_engines/lowmem/lowmem.cc#L42>.
It calls add_pattern(s), then prep_patterns and get_pattern_count on all
of them, but only calls _search on a single one. Is this related to the
first paragraph in here
<https://github.com/snort3/snort3/blob/master/src/detection/dev_notes.txt>
(creating an MPSE instance for each ...)? No further _search calls are
made if the first one doesn't return a match.

* Am I supposed to interact with int *current_state from _search?

Thank you again!

On Thu, 14 Mar 2024 at 13:48, Russ Combs (rucombs) <rucombs () cisco com>
wrote:

Vlad,

rule_tree_queue is the only implementation of MpseMatch, which is a
callback provided so that your MPSE can report matches to the detection
engine. It is not a side effect and it is not intended to be overridden.

If you break in rule_tree_queue, you will see that KTriePrefixMatch is
calling rule_tree_queue on the match via the match callback which was set
in fp_partial (called by fp_full).
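The callback pattern described here can be sketched in isolation. The `MatchCallback` alias below is a hypothetical stand-in written for this example; Snort's real `MpseMatch` typedef is declared in `search_common.h` and may take different arguments. `StoredPattern` and `search_and_report` are likewise illustrative names, and the stop-on-nonzero convention is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical callback shape for this sketch only.
using MatchCallback = int (*)(void* user, void* tree, int index, void* context);

struct StoredPattern {
    std::vector<uint8_t> bytes;
    void* user;  // opaque pointer handed over when the pattern was added
    void* tree;  // opaque per-pattern object passed back on a match
};

// Scans T[0 .. n-1] and reports every occurrence through `match`.
// Returns the number of matches reported.
int search_and_report(const std::vector<StoredPattern>& patterns,
                      const uint8_t* T, int n,
                      MatchCallback match, void* context) {
    int reported = 0;
    for (int start = 0; start < n; ++start) {
        for (const StoredPattern& p : patterns) {
            const int m = static_cast<int>(p.bytes.size());
            if (m == 0 || m > n - start) continue;  // never read past T[n - 1]
            if (!std::equal(p.bytes.begin(), p.bytes.end(), T + start)) continue;
            ++reported;
            // Assumption: a nonzero return value asks the engine to stop early.
            if (match(p.user, p.tree, start + m, context) != 0) return reported;
        }
    }
    return reported;
}
```

The engine never interprets `user`, `tree` or `context`; it only stores them and hands them back, so the caller decides what a match means.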

There is scant documentation on this so we will improve that.

Also, that demo is a little off. The README says it is high performance,
but lowmem is decidedly low performance. It also throws an unrelated
119:43 for no good reason, so ignore that. 1:1 is the one you are after.

Hope that helps.
Russ


------------------------------
*From:* Snort-devel <snort-devel-bounces () lists snort org> on behalf of
Vlad Ulmeanu via Snort-devel <snort-devel () lists snort org>
*Sent:* Sunday, March 10, 2024 4:57 AM
*To:* snort-devel () lists snort org <snort-devel () lists snort org>
*Subject:* [Snort-devel] Multi Pattern Search Engine Plugin

Hi all,

I'm trying to plug in my Multi Pattern Search Engine
<https://github.com/vlad-ulmeanu01/ExpoSizeStringSearch> into snort3 and
run some benchmarks. I have run into some problems
<https://stackoverflow.com/questions/78121441/snort3-where-is-the-default-implementation-for-mpsematch> with
the setup: I tried to rewrite the lowmem
<https://github.com/snort3/snort3_extra/tree/master/src/search_engines/lowmem> example
in snort3_extra <https://github.com/snort3/snort3_extra>, but there is a
side effect occurring in lowmem's _search
<https://github.com/snort3/snort3_extra/blob/b81e2e4f9296d9ae724e8d1b409371a3715fc2cc/src/search_engines/lowmem/lowmem.cc#L65C9-L65C16>
function (that triggers another "allow
<https://github.com/snort3/snort3_demo/blob/3fdada8224f8ec5ecea4649fdad144edec7a9c9e/tests/search_engines/ac_bnfa/expected#L2>"
in the snort3_demo
<https://github.com/snort3/snort3_demo/tree/master/tests/search_engines/ac_bnfa>
 example
<https://github.com/snort3/snort3_demo/tree/master/tests/search_engines/ac_bnfa>)
when calling match
<https://github.com/snort3/snort3_extra/blob/b81e2e4f9296d9ae724e8d1b409371a3715fc2cc/src/search_engines/lowmem/sfksearch.cc#L579>
 (MpseMatch
<https://github.com/snort3/snort3/blob/be0977a3a8a98632e5cd1238c1d0da6dc2693b5f/src/search_engines/search_common.h#L39>
 -> rule_tree_queue
<https://github.com/snort3/snort3/blob/be0977a3a8a98632e5cd1238c1d0da6dc2693b5f/src/detection/fp_detect.cc#L865> (I
suppose this is the default implementation of MpseMatch that lowmem ends
up using) -> MpseStash::push
<https://github.com/snort3/snort3/blob/be0977a3a8a98632e5cd1238c1d0da6dc2693b5f/src/detection/fp_detect.cc#L773>
-> MpseStash::process
<https://github.com/snort3/snort3/blob/be0977a3a8a98632e5cd1238c1d0da6dc2693b5f/src/detection/fp_detect.cc#L832>
 -> rule_tree_match
<https://github.com/snort3/snort3/blob/be0977a3a8a98632e5cd1238c1d0da6dc2693b5f/src/detection/fp_detect.cc#L375>).
Unfortunately, things get quite complicated, and I couldn't pinpoint the
reason for the side effect.

How can I deal with this side effect? I assume that I should call match
with a non-nullptr argument for tree, but I don't really understand its
meaning. Also, where can I find good documentation for
snort3_extra? The best I could find is this
<https://fossies.org/dox/snort3_extra-3.1.78.0/classLowmemMpse.html>.

Thank you,
Vlad Ulmeanu


_______________________________________________
Snort-devel mailing list
Snort-devel () lists snort org
https://lists.snort.org/mailman/listinfo/snort-devel
