Nmap Development mailing list archives

Re: [NSE] generic file parsing for datafiles.lua


From: Kris Katterjohn <katterjohn () gmail com>
Date: Sun, 24 Aug 2008 14:43:46 -0500

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

jah wrote:
Hello,


Hey jah,

Kris once suggested [1] that the file parsing code from whois.nse might
be placed in datafiles.lua and this sparked the idea of making the
datafiles library a generic file parser.  The attached represents my
efforts to this end so far.
I've come up with a basic scheme for parsing any file line-by-line and
presenting a table containing the captured information, much like is
done already.  I'd like to bounce it around before I go much further
with it so I shall make an attempt to describe its usage.


Nice.

First thing to say is that I haven't removed the existing cleverly-named
functions: parse_protocols(), parse_rpc(), and parse_services([proto]). 
These now wrap around the generic parse_file( filename [, ...] ).  They
could be removed in the future.


I'll get back to this later.

The idea is that a filename or path (relative to the directory
containing nmap's data files) is passed to the function along with a
table which describes the table desired in return.  The table passed
should contain patterns (with captures) which will be applied to each
line of a file using string.match().
This is best illustrated:

status, t = datafiles.parse_file( "nmap-services", {"^%s*([^%s#]+)%s+%d+"} )


Cool!
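
For readers without the attachment, here is a minimal standalone sketch of the
array form, assuming each pattern yields an array holding the first capture of
every matching line.  parse_lines() and the sample lines are made up for
illustration; this is not the code from the patch, and it works on an array of
lines rather than on a file.

```lua
-- Hypothetical helper, not the code from the attached patch.
-- Applies each pattern to every line with string.match() and collects the
-- first capture of each matching line into an array, one array per pattern.
local function parse_lines(lines, patterns)
  local result = {}
  for i, pattern in ipairs(patterns) do
    local captures = {}
    for _, line in ipairs(lines) do
      local value = line:match(pattern)
      if value then
        captures[#captures + 1] = value
      end
    end
    result[i] = captures
  end
  return result
end

-- Made-up sample lines in the style of nmap-services (name, port/proto).
local sample = {
  "# comment lines do not match the pattern",
  "ftp\t21/tcp",
  "ssh\t22/tcp",
  "http\t80/tcp",
}

local t = parse_lines(sample, { "^%s*([^%s#]+)%s+%d+" })
for _, name in ipairs(t[1]) do
  print(name)
end
```

Because the pattern is anchored with ^ and the captured name may not contain
"#", comment lines are skipped without any extra check.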

parse_file( "nmap-services", {
  [function(ln) return tonumber( ln:match( "^%s*[^%s#]+%s+(%d+)/tcp" ) ) end]
    = "^%s*([^%s#]+)%s+%d+/tcp"
} )

The key or value may be a function which takes a line as its argument
and returns a captured value (only one value is accepted).
t[80] = http


Very cool!
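
A rough sketch of how the keyed form could work, assuming a pattern and a
function are interchangeable as long as each produces a single value per line.
extractor() and parse_keyed() are hypothetical names, and the sketch takes an
array of lines instead of reading nmap-services.

```lua
-- Hypothetical helper: turn a pattern or a function into a function that
-- extracts a single value from a line.
local function extractor(spec)
  if type(spec) == "function" then
    return spec
  end
  return function(line)
    -- The extra parentheses keep only the first capture.
    return (line:match(spec))
  end
end

-- Hypothetical stand-in for the keyed form of parse_file().
-- For every key/value pair in spec, each line that yields both a key and a
-- value adds one entry to the result table.
local function parse_keyed(lines, spec)
  local result = {}
  for key_spec, value_spec in pairs(spec) do
    local get_key, get_value = extractor(key_spec), extractor(value_spec)
    for _, line in ipairs(lines) do
      local key, value = get_key(line), get_value(line)
      if key ~= nil and value ~= nil then
        result[key] = value
      end
    end
  end
  return result
end

-- Made-up sample lines in the style of nmap-services.
local sample = { "# comment", "ssh\t22/tcp", "http\t80/tcp", "domain\t53/udp" }

local t = parse_keyed(sample, {
  [function(ln) return tonumber(ln:match("^%s*[^%s#]+%s+(%d+)/tcp")) end] =
    "^%s*([^%s#]+)%s+%d+/tcp",
})

print(t[80])
```

Using a function for the key is what makes the port a number here; a plain
pattern would leave it as the string "80".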


This is not polished code and I'm posting it so that folks can have a
fiddle - do expect there to be a bug or three.  I'm interested to know
if anybody has any thoughts on this approach (like, is it a bit
complicated) or ideas for a better one.



I've only tested this using rpcinfo.nse so far, so the following are just
first impressions.

I noticed one thing while glancing through the code: line 137 uses
format(file, raw) when I think you mean format(filepath, lines).

Anyway, I think this is a really cool idea.  However, my one concern is whether
this belongs in datafiles.  Your functions can parse much more than just Nmap's
data files, which is all my original library was meant to handle.

Maybe it should be a separate library ("parsefile" ?) and datafiles can use it
for the parse*() functions.  Or, as you suggested, maybe my parse*() functions
can be removed leaving just your functions.  But if these are removed, I
especially think the library should be renamed for the reasons I mentioned above.
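
To make the first option concrete, here is a small sketch of the split,
assuming a generic module that knows nothing about Nmap and a thin datafiles
layer that supplies the patterns.  The parsefile API shown is invented for
illustration, and parse_services() takes its lines as an argument (with a
required proto) only so the sketch runs standalone.

```lua
-- Generic layer: knows nothing about Nmap's data files.
-- "parsefile" is only the name floated above; this API is made up.
local parsefile = {}

function parsefile.parse_lines(lines, get_key, get_value)
  local result = {}
  for _, line in ipairs(lines) do
    local key, value = get_key(line), get_value(line)
    if key ~= nil and value ~= nil then
      result[key] = value
    end
  end
  return true, result
end

-- Nmap-specific layer: keeps the familiar function name and hides the
-- patterns from callers.
local datafiles = {}

-- Assumption: the real function would read nmap-services itself; here the
-- lines are passed in so no file access is needed.
function datafiles.parse_services(lines, proto)
  local port_pat = "^%s*[^%s#]+%s+(%d+)/" .. proto
  local name_pat = "^%s*([^%s#]+)%s+%d+/" .. proto
  return parsefile.parse_lines(lines,
    function(ln) return tonumber(ln:match(port_pat)) end,
    function(ln) return (ln:match(name_pat)) end)
end

local status, services = datafiles.parse_services(
  { "ssh\t22/tcp", "domain\t53/udp", "http\t80/tcp" }, "tcp")
print(status, services[80])
```

With this layout, scripts that only need Nmap's data files keep calling
datafiles, while anything parsing other files can use the generic layer
directly.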

Either way, I think these additions are neat and hopefully I'll get some time
to test them out some more.  For better or worse, the day the GSoC ended my
school and personal life began again, but I've always tried to squeeze Nmap
time into my schedule so we'll see :)

Cheers,

jah


Thanks,
Kris Katterjohn


-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iQIVAwUBSLG57/9K37xXYl36AQKN1w/+NZvvQM7GLh7ACOYBdVdwnEQNsO4C3Hqq
BTSeXfLCkNN1SYzhUvbNsw4Tja4YLZthAG+r6+QIBdkNwzHDpT8cL8rNT9CzZ/vc
UTfYrv0BhxLIA5Tp00R9AljzSJTQLuIz5q8C71sxjmxS3v/tWjQSfTUUBVHpYd3y
bzZ85h+RbsKsZ6q3F2xyipqWIa0cjtD88H8jx5r8TTq+ndRnfElauFyLmBdI+msi
0jPYmX2MK/SbUiwK+cLsC+8CRrYzzM3pq+73zBfkYfEp2F5A1cVnx7KrQxrCw9m3
tizqudkWbL9G/402CIPvw9VBzDnyrScT7/S2piUw/k2qQ+V3RHQQ7cAcd0RXAZpP
q8mYzHhBgsKWC7fZJVL08s6idiaRGRDK1xGRKkQvGC/gy4uvcdAxEwgQdOTVqc0x
IULmYua52cPvQRCtQxz/9I/onJv1QzVPliS4qBNZ+4vHrUniGtP+/LDGaeK77h8M
MWgOcHm9JvKsyeEh/DEpmTW1POMiMcujSPSx7ISMGL8CkEvynl83hLH6aza3r9k5
20vEcP2EJxkqq/FJ4X4fa7qJus+Rv4uq+h4y20Ec0msrzze8GHDBro9VFdHpLZU0
PgZS8hJyVcgc6SmFJfnRRycTJnfeibP3HBBrGx4U8Onbq8kO4z9JXVw4JOy6eQcp
Cuno4H+8J/4=
=C2+y
-----END PGP SIGNATURE-----

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org

