Snort mailing list archives

Re: Optimized implementation of AC and AC_Q pattern matching algorithms


From: Pablo Cantos <pablocantos () gmail com>
Date: Mon, 28 Jan 2013 21:40:31 +0100

Hi Abed,

2013/1/28 abed mohammad kamaluddin <abedamu () gmail com>

Hi Pablo,

I tried you modification on OCTEON today and was disappointed that
there was a drop in performance on it, however on Intel there is
significant improvement in perf. In comparison to unmodified snort
using Network traffic and ac-q:

                                      cross-compiling (Octeon)
      gcc (Intel)
Using your modification - 15% down
 21% up
Using mine                  -  11% up
       12% up

I supposed it. I mean, when I have been working on getting a better
performance for the MPSE a Intel machine was used and I found that in many
cases a better performance was obtained using gcc, but a worse performance
was obtained with icc. This time, probably we won't be able to perform
these tests on the Intel machine until next week.


On Intel, we can remove sindex altogether:

       state = ps[2 + xlatcase[T[0]]]

Yes, I removed it too and I got very similar results, maybe because the
compiler (gcc) removed it by itself.


Maybe it could be because I am cross compiling ; there were lot of
wasted cycles in my case. Unless I introduce an intermediate state, I
am not getting any improvement. I had earlier analyzed this routine in
detail and each cycle reduced in this macro gave significant perf
improvements.

Could you please mention the following:
- Which compiler have you used (gcc/icc version).
- If you can provide the pcap file you are using for the tests it would
help.

I have used gcc 4.6.1-9ubuntu3.
One of them (1gb) was downloaded from here:
https://www.itoc.usma.edu/research/dataset/ but I don't remember which it
is exactly.
I will send you tomorrow a private link to download them. Anyway, if
someone else wants to download the pcap files just send me an email and
I'll provide him/her the link.

I will try your modifications on a bigger Intel machine sometime this
week and report the findings.

I'll be waiting for your news


Thanks,
Abed M K



On Sun, Jan 27, 2013 at 12:27 AM, abed mohammad kamaluddin
<abedamu () gmail com> wrote:

Hi Pablo,

That looks neater :) I will try your modifications on our system on
Monday
when I return to work. I used zero-alert traffic to measure perf which
gave
me 11% up on 2.9.0.4.

PS: I have noticed a more than 15% overall performance drop on 2.9.4 as
compare to 2.9.3.1 when compiled with exactly same options. I haven't
analyzed it so far, however this particular optimization has the same
relative improvement on all versions (2.9.0.4, 2.9.3.1 and 2.9.4) .

Thanks,
Abed



Message: 2

Date: Sat, 26 Jan 2013 16:50:37 +0100
From: Pablo Cantos <pablocantos () gmail com>
Subject: Re: [Snort-devel] Optimized implementation of AC and AC_Q
        pattern matching algorithms
To: snort-devel () lists sourceforge net
Message-ID:

<CADQcQ1OzJDso9FEQhG77dGtSTEiX=J80bxRDMv9t8Yz7WhH0-g () mail gmail com>
Content-Type: text/plain; charset="iso-8859-1"

Hi Abed,

First of all, thanks for your contribution.

I have checked your proposal in snort 2.9.1 by using two different pcap
files, one of them (612MB-sized) throws around 900k alerts and the other
one (1GB-sized) throws just 55 alerts. I have used an AMD Turion II
dual-core mobile M520 at 2.3GHz, 512KB cache L2 by each core and 4GB
RAM.

These are the performance jumps that I have obtained:

612MB file -> 4.24% for MPSE, 5.93% for ac-q
1GB file -> 7.93% for MPSE, 8.03% for ac-q

These results were obtained by measuring the times taken to analyze each
pcap file by the MPSE and AC, separately.

I have been working for several months with Snort to get improvements in
other areas of the MPSE, I have also studied the AC_SEARCH macros and I
didn't find them completely efficients but I didn't go further. Now,
after
seeing your proposal and checking the AC_SEARCH_Q again I thought that
taking small changes it could work even better.

You have suggested to pre-fetch the new state before was used it:

#define AC_SEARCH_Q \
+   *acstate_t new_state;* \
    for (; T < Tend; T++) \
    { \
        ps = NextState[state]; \
        sindex = xlatcase[T[0]]; \
+       *new_state = ps[2 + sindex];*  \
        if (ps[1]) \
        { \
            if (MatchList[state]) \
            { \
                if (_add_queue(&acsm->q,MatchList[state])) \
                { \
                    if (_process_queue(&acsm->q, Match,data)) \
                    { \
                        *current_state = state; \
                        return 1; \
                    } \
                } \
            } \
        } \
*-       state = ps[2 + sindex]; \
*
+       *state = new_state;* \
    }
 #endif

But I think the routine could be more efficient too if we just fetch the
new state in the end, and we move down one line:

#define AC_SEARCH_Q \
    for (; T < Tend; T++) \
    { \
        ps = NextState[state]; \
*-       sindex = xlatcase[T[0]]; \*
        if (ps[1]) \
        { \
            if (MatchList[state]) \
            { \
                if (_add_queue(&acsm->q,MatchList[state])) \
                { \
                    if (_process_queue(&acsm->q, Match,data)) \
                    { \
                        *current_state = state; \
                        return 1; \
                    } \
                } \
            } \
        } \
*+       sindex = xlatcase[T[0]]; \*
        state = ps[2 + sindex]; \
    }
#endif

In this way, I think the number of instructions could be reduced too.

And these are the results that I have obtained:

612MB file -> 15.66% for MPSE, 22.13% for ac-q
1GB file -> 34.49% for MPSE, 35.13% for ac-q

I will try to repeat these tests using other more powerful computers. In
any case, we will give more information on Monday.
-------------- next part --------------
An HTML attachment was scrubbed...

------------------------------

Message: 3
Date: Sat, 26 Jan 2013 11:26:55 -0500
From: Joel Esler <jesler () sourcefire com>
Subject: Re: [Snort-devel] Optimized implementation of AC and AC_Q
        pattern matching algorithms
To: Pablo Cantos <pablocantos () gmail com>
Cc: "snort-devel () lists sourceforge net"
        <snort-devel () lists sourceforge net>
Message-ID: <2B25430D-9736-49BB-9A81-EC9AF163C28F () sourcefire com>
Content-Type: text/plain; charset="utf-8"

Pablo,

Great stuff. We need to make sure we are testing and modifying against
2.9.4 though.

--
Joel Esler
Sent from my iPhone ?

On Jan 26, 2013, at 10:50 AM, Pablo Cantos <pablocantos () gmail com>
wrote:

Hi Abed,

First of all, thanks for your contribution.

I have checked your proposal in snort 2.9.1 by using two different
pcap
files, one of them (612MB-sized) throws around 900k alerts and the
other one
(1GB-sized) throws just 55 alerts. I have used an AMD Turion II
dual-core
mobile M520 at 2.3GHz, 512KB cache L2 by each core and 4GB RAM.

These are the performance jumps that I have obtained:

612MB file -> 4.24% for MPSE, 5.93% for ac-q
1GB file -> 7.93% for MPSE, 8.03% for ac-q

These results were obtained by measuring the times taken to analyze
each pcap file by the MPSE and AC, separately.

I have been working for several months with Snort to get improvements
in other areas of the MPSE, I have also studied the AC_SEARCH macros
and I
didn't find them completely efficients but I didn't go further. Now,
after
seeing your proposal and checking the AC_SEARCH_Q again I thought that
taking small changes it could work even better.

You have suggested to pre-fetch the new state before was used it:

#define AC_SEARCH_Q \
+   acstate_t new_state; \
    for (; T < Tend; T++) \
    { \
        ps = NextState[state]; \
        sindex = xlatcase[T[0]]; \
+       new_state = ps[2 + sindex];  \
        if (ps[1]) \
        { \
            if (MatchList[state]) \
            { \
                if (_add_queue(&acsm->q,MatchList[state])) \
                { \
                    if (_process_queue(&acsm->q, Match,data)) \
                    { \
                        *current_state = state; \
                        return 1; \
                    } \
                } \
            } \
        } \
-       state = ps[2 + sindex]; \
+       state = new_state; \
    }
#endif

But I think the routine could be more efficient too if we just fetch
the new state in the end, and we move down one line:

#define AC_SEARCH_Q \
    for (; T < Tend; T++) \
    { \
        ps = NextState[state]; \
-       sindex = xlatcase[T[0]]; \
        if (ps[1]) \
        { \
            if (MatchList[state]) \
            { \
                if (_add_queue(&acsm->q,MatchList[state])) \
                { \
                    if (_process_queue(&acsm->q, Match,data)) \
                    { \
                        *current_state = state; \
                        return 1; \
                    } \
                } \
            } \
        } \
+       sindex = xlatcase[T[0]]; \
        state = ps[2 + sindex]; \
    }
#endif

In this way, I think the number of instructions could be reduced too.

And these are the results that I have obtained:

612MB file -> 15.66% for MPSE, 22.13% for ac-q
1GB file -> 34.49% for MPSE, 35.13% for ac-q

I will try to repeat these tests using other more powerful computers.
In any case, we will give more information on Monday.



------------------------------------------------------------------------------
Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS,
MVC, Windows 8 Apps, JavaScript and much more. Keep your skills
current
with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
MVPs and experts. ON SALE this month only -- learn more at:
http://p.sf.net/sfu/learnnow-d2d
_______________________________________________
Snort-devel mailing list
Snort-devel () lists sourceforge net
https://lists.sourceforge.net/lists/listinfo/snort-devel
Archive:
http://sourceforge.net/mailarchive/forum.php?forum_name=snort-devel

Please visit http://blog.snort.org for the latest news about Snort!
-------------- next part --------------
An HTML attachment was scrubbed...

------------------------------




------------------------------------------------------------------------------
Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS,
MVC, Windows 8 Apps, JavaScript and much more. Keep your skills current
with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
MVPs and experts. ON SALE this month only -- learn more at:
http://p.sf.net/sfu/learnnow-d2d

------------------------------


_______________________________________________
Snort-devel mailing list
Snort-devel () lists sourceforge net
https://lists.sourceforge.net/lists/listinfo/snort-devel


End of Snort-devel Digest, Vol 78, Issue 10
*******************************************



------------------------------------------------------------------------------
Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS,
MVC, Windows 8 Apps, JavaScript and much more. Keep your skills current
with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
MVPs and experts. ON SALE this month only -- learn more at:
http://p.sf.net/sfu/learnnow-d2d
_______________________________________________
Snort-devel mailing list
Snort-devel () lists sourceforge net
https://lists.sourceforge.net/lists/listinfo/snort-devel
Archive:
http://sourceforge.net/mailarchive/forum.php?forum_name=snort-devel

Please visit http://blog.snort.org for the latest news about Snort!

Current thread: