Nmap Development mailing list archives

Re: TCP_WINDOW and TCP_MSS correlation as feature


From: Alexandru Geana <alex () alegen net>
Date: Thu, 28 May 2015 17:01:03 +0200

Hello Daniel and list,

I found the root of all evil. The differences in the novelty scores were
due to a mismatch between vectorize.py (the vectorization function for
the new feature) and FPEngine.cc (the code that adds the feature value to
features[]). In vectorize.py, if mss is missing, the function returns
MISSING, which impute() turns into -1. In FPEngine.cc, mss is initialized
to -1 and set to the real value only if the option is included in the
header. That is pretty much the same thing, but the versions of the patch
that I sent did not check whether mss is -1 and used it either way. As a
result, the feature values take different ranges during learning and
classification, so the novelty is different. The fix is easy: check
whether mss is either 0 or -1.
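
A minimal sketch of the kind of check described above. The function name
and the window/mss ratio are assumptions made for illustration (the thread
only says the feature correlates TCP_WINDOW and TCP_MSS); this is not the
code from the attached patches.

```python
# Sketch only: the helper name and the ratio formula are assumptions,
# not the contents of vectorize.py or FPEngine.cc.
MISSING = -1  # value a missing mss ends up as in the feature vector


def window_mss_feature(window, mss):
    """Return a hypothetical window/mss feature, or MISSING if mss is unusable."""
    # Treat "option absent" (-1) and the degenerate 0 the same way, so the
    # sentinel never enters the arithmetic and division by zero cannot occur.
    if mss in (0, MISSING):
        return MISSING
    return window / mss


print(window_mss_feature(14400, 1440))
print(window_mss_feature(14400, MISSING))
print(window_mss_feature(14400, 0))
```

The point is that the training side and the classification side must
apply the same guard, otherwise the same fingerprint produces different
feature values in each.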

Attached to this e-mail I am sending new versions with the
aforementioned fix and without float.h.

Best regards,
Alexandru Geana
alegen.net

On 05/22, Alexandru Geana wrote:
Hello Daniel,

While investigating why the novelty has increased, I believe I stumbled
upon a bug. While scanning a Debian 7 VM, I obtained the fingerprint
below. Part of the debug output of nmap gives accuracy 39 with novelty
22 (above the 15.0 threshold) and predict.py gives 32 with novelty 11. I
am not sure exactly what the reason is, but I am looking into it. I just
wanted to share this with you.

Without the patches applied, both outputs have the same numbers. For
this reason, I did not start a new thread.

Which method prints 5.49 in your case?

Fingerprint:
============

OS:SCAN(V=6.47SVN%E=6%D=5/22%OT=22%CT=1%CU=41348%PV=N%DS=1%DC=D%G=Y%M=0800
OS:27%TM=555F3D96%P=x86_64-unknown-linux-gnu)S1(P=6000{4}280640XX{32}0016b
OS:fc39d182745eeaf1384a01237c845f70000020405a00402080a000e7a7dff{4}0103{3}
OS:%ST=0.226554%RT=0.327707)S2(P=6000{4}280640XX{32}0016bfc456a5e89deeaf13
OS:85a01237c8caf60000020405a00402080a000e7a96ff{4}0103{3}%ST=0.327141%RT=0
OS:.528054)S3(P=6000{4}280640XX{32}0016bfc5cf9b6e25eeaf1386a01237c8cf5e000
OS:0020405a00101080a000e7aafff{4}0103{3}%ST=0.426597%RT=0.5281)S4(P=6000{4
OS:}280640XX{32}0016bfc64194a017eeaf1387a01237c828580000020405a00402080a00
OS:0e7ac8ff{4}0103{3}%ST=0.527724%RT=0.727527)S5(P=6000{4}280640XX{32}0016
OS:bfc7884b9dcaeeaf1388a01237c8e3d20000020405a00402080a000e7ae1ff{4}0103{3
OS:}%ST=0.627833%RT=0.727581)S6(P=6000{4}240640XX{32}0016bfc8df292cd9eeaf1
OS:389901237c811d50000020405a00402080a000e7afaff{4}%ST=0.726663%RT=0.94969
OS:8)IE1(P=6000{4}803a40XX{32}8109091cabcd00{122}%ST=0.751024%RT=0.949752)
OS:IE2(P=6000{4}583a40XX{32}04010a7c00{3}38600123450028002dXX{32}3c0001040
OS:0{4}2b00010400{12}3a00010400{4}80000a9cabcd0001%ST=0.800413%RT=0.949794
OS:)NS(P=6000{4}183affXX{32}8800fd834000{3}XX{16}%ST=0.8992%RT=0.949843)U1
OS:(P=6000{3}01643a40XX{32}01049f9f00{4}6001234501341136XX{32}bfd9a1840134
OS:696c43{300}%ST=0.949142%RT=1.14749)TECN(P=6000{4}200640XX{32}0016bfc97a
OS:d0f2aceeaf138a805238403db00000020405a0010104020103{3}%ST=0.999213%RT=1.
OS:14754)T4(P=6000{4}140640XX{32}0016bfccdd10ee6300{4}500400005b370000%ST=
OS:1.80087%RT=1.80133)T5(P=6000{4}140640XX{32}0001bfcd00{4}eeaf138e5014000
OS:024720000%ST=1.19684%RT=1.80138)T6(P=6000{4}140640XX{32}0001bfcee6d14c4
OS:700{4}50040000f3a50000%ST=1.24727%RT=1.80141)T7(P=6000{4}140640XX{32}00
OS:01bfcf00{4}eeaf139050140000246e0000%ST=1.29656%RT=1.80145)EXTRA(FL=1234
OS:5)

Output from nmap:
=================

39.3444 22.8587  45 Linux 2.6.23 - 2.6.32
7.9485 99.8762  89 Linux 3.13 - 3.19
1.4871 20.9707  59 Linux 3.2 - 3.8
1.2185 22.5952  65 OpenWrt (Linux 3.3 - 3.10)
...

Output from predict.py:
=======================

$: ./predict.py -m nmap.model <(./nmap26fp.py scan.fp)

== /proc/self/fd/11 ==
nmapclasses:
predictions
45.  32.65%  11.07 Linux 2.6.23 - 2.6.32
89.   5.15%  97.85 Linux 3.13 - 3.19
59.   1.16%   6.31 Linux 3.2 - 3.8
65.   0.90%  10.51 OpenWrt (Linux 3.3 - 3.10)
...

Best regards,
Alexandru Geana
alegen.net

On 05/21, Daniel Miller wrote:
Alex,

Thanks, this looks good! I think, though, that we can simply use either
MISSING or UNKNOWN (both of which become -1 in the feature vector) for the
(very unlikely) case where MSS is 0. We only have one fingerprint in our
whole IPv4 database that has a MSS of 0, "Fingerprint Dell EqualLogic
PeerStorage PS100E NAS device (NetBSD 1.6.2)". This would eliminate the
need to include numpy in vectorize.py and float.h in FPEngine.cc.

I am not sure what you are seeing to cause such a high novelty with
scanme.nmap.org. My scans are coming back with 5.49. Can you provide the
fingerprint you are getting?

I will commit this with these changes pending our discussion later today.

Dan

On Mon, May 11, 2015 at 12:59 PM, Alexandru Geana <alex () alegen net> wrote:

Hello devs,

During one IRC discussion, an idea was brought up to use the correlation
between TCP_WINDOW and TCP_MSS as a feature for the IPv6 logistic
regression model. Attached to this email I am sending two patches, one
for the nmap codebase and another for the ipv6tests folder which adds
this new feature.

While testing on scanme.nmap.org, I noticed that the novelty threshold
was too low (nmap had the top result with novelty at around 20.8), so
I set the FP_NOVELTY_THRESHOLD to 25.
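
A rough sketch of the threshold check being discussed, assuming a top
match is only accepted when its novelty is below FP_NOVELTY_THRESHOLD.
The helper name is hypothetical, the strict comparison is an assumption,
and the 15.0 default is the value quoted elsewhere in this thread.

```python
# Sketch only: is_confident_match is a hypothetical helper, not an nmap function.
FP_NOVELTY_THRESHOLD = 15.0  # threshold value quoted in this thread


def is_confident_match(novelty, threshold=FP_NOVELTY_THRESHOLD):
    """Accept a match only if its novelty is below the threshold (assumed rule)."""
    return novelty < threshold


# Novelty values taken from this thread.
for novelty in (20.8, 5.49):
    print(novelty, is_confident_match(novelty), is_confident_match(novelty, 25.0))
```

Under this assumption, a novelty of about 20.8 is rejected at 15.0 and
accepted only once the threshold is raised to 25.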

Let me know what you think and if you find any problems with it.

Best regards,
Alexandru Geana
alegen.net

_______________________________________________
Sent through the dev mailing list
https://nmap.org/mailman/listinfo/dev
Archived at http://seclists.org/nmap-dev/



Attachment: nmap.diff
Description:

Attachment: ipv6tests.diff
Description:

Attachment: signature.asc
Description: Digital signature
