Nmap Development mailing list archives
Lua bugfixes and a new buffering feature
From: doug () hcsw org
Date: Sat, 23 Jun 2007 05:19:12 -0700
Hi nmap-dev!

I just found 2 showstopper bugs in the PCRE-Lua interface, fixed them, and committed the fixes to SVN. It seems to work fine now, although the documentation is still hopelessly insufficient for anybody who doesn't know how to read the C source code. :)

The REAL required interface is:

    my_regex = pcre.new("my PCRE pattern", 0, "C")
    my_regex:exec(string_to_match_against, 0, 0)

I am caching the compiled PCRE regexps in the NSE registry using a fairly straightforward scheme:

    init = function()
      -- Start of MOTD, we'll take the server name from here
      nmap.registry.ircserverinfo_375 = nmap.registry.ircserverinfo_375
        or pcre.new("^:([\\w-_.]+) 375", 0, "C")

      -- NICK already in use
      nmap.registry.ircserverinfo_433 = nmap.registry.ircserverinfo_433
        or pcre.new("^:[\\w-_.]+ 433", 0, "C")

      ...

Then I'm having the action() function (NOT the portrule function) call init(), so that these regexps are compiled at most once per Nmap invocation, and only if the action() function for the script is actually called.

Perhaps it would be useful for NSE to look for an init function which is called only once per script per Nmap invocation, and only right before action() is called?

Another solution we should consider is passing a table to the action function that scripts can use for cross-invocation persistent data structures. This would avoid any possible registry conflict problems (every script would have its own table if it wanted one). I don't know whether better registry naming is required or not.

IMPORTANT NOTE FOR NSE SCRIPT WRITERS:

Don't use the function receive_lines() unless you plan on doing your own line parsing. This function WILL RETURN MORE THAN JUST THE FIRST LINE OF DATA IF MORE IS AVAILABLE.

This can be a problem in many scenarios. Most often with NSE you will just miss pieces of data that you don't care about anyways.
But sometimes you will miss important lines, or you will actually PROCESS AN INCOMPLETE LINE that just happened to be delivered with another line and/or crossed a read() boundary.

Consider an application that executes this code to send data to your NSE script:

    write(sd, "hello\nworld\n", 12);

Since write() knows nothing about newlines, this will be bundled up in one packet, and both lines will probably be delivered by your OS in the same read() call (which also knows nothing about newlines). This means that if you are looping for the output in, say, a while loop...

    while true do
      status, my_line = sd:receive_lines(1)
      ...

my_line will probably be "hello\nworld\n", NOT "hello\n".

...

If we process just hello we would miss world! Or, even more insidiously, the packet might get split in the middle so that "hello\nwo" is delivered. Unless you store that "wo" for the next call, you will be working with incomplete or wrong data.

The way some NSE scripts deal with this (see showHTTPVersion.nse) is by keeping a string "response", appending all data to the end of it, and then running regexps on the response at every step to see if any match. This method will work fine for some tasks. But if you want to reliably process data line-by-line as it arrives, you need to use something called a "buffer".

The most straightforward way to implement this in modern languages is by using a closure. Although I personally find Lua syntax very cumbersome and verbose, Lua does offer a powerful set of primitives that are, in my opinion, vital to and sufficient for productive programming: lexical closures, tail-call optimisation, and dynamic typing.

If the concept of closures frightens you, you can probably get away with thinking about them like objects: a closure is sort of an object with exactly one method: "apply". ;)

I'm including a fairly general closure-based buffer implementation that I am using in my IRC script to process data on a line-by-line basis.
Assuming you have a socket sd, you use it like so:

    my_buffer = make_buffer(sd, "[\r\n]+")

and then

    status, value = my_buffer()

status and value are the same as for receive_lines(1) (except see the comments). As you can see, it is useful for much more than just lines (anything separated by something you can write a Lua pattern for).

Barring any so-far unnoticed bugs, this should be a very safe, reliable way to parse line-based protocols, and I suggest we put it (or something like it) into the NSE standard library.

Empty lines currently aren't returned, which could be a problem for some protocols (like HTTP), but this is a tiny tweak.

Best,

Doug

PS. It has just come to me that maybe the best pattern to use for regular newlines might be "\r?\n" instead of "[\r\n]+"! Oh well. :)

-- Generic buffer implementation using lexical closures
--
-- Pass make_buffer a socket and a separator Lua pattern [1]
--
-- Returns a function bound to your provided socket with behaviour identical
-- to receive_lines() except it will return AT LEAST ONE [2] and AT MOST ONE "line".
-- The data is returned WITHOUT the pattern/newline on the end.
-- Empty "lines" ARE NOT RETURNED.
--
-- [1] Use the pattern "[\r\n]+" for regular newlines
-- [2] Except where there is trailing "left over" data not terminated by a pattern
--     (in which case you get the data anyways)
--
-- -Doug, June, 2007

make_buffer = function(sd, sep)
  local self, result
  local buf = ""

  self = function()
    local i, j, status, value

    i, j = string.find(buf, sep)

    if i then
      if i == 1 then -- empty line
        buf = string.sub(buf, j+1, -1)
        return self() -- tail
      else
        value = string.sub(buf, 1, i-1)
        buf = string.sub(buf, j+1, -1)
        return true, value
      end
    end

    if result then
      if string.len(buf) > 0 then -- left over data with no terminating pattern
        value = buf
        buf = ""
        return true, value
      end
      return nil, result
    end

    status, value = sd:receive()

    if status then
      buf = buf .. value
    else
      result = value
    end

    return self() -- tail
  end

  return self
end
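The split-packet case described in the message ("hello\nwo" arriving in one read) can be tried outside Nmap with a small stand-in socket. This is only a sketch under stated assumptions: make_mock_socket is a hypothetical helper, not part of NSE, and it only imitates the (status, value) return convention of receive(). make_buffer is copied from the message with the same behaviour, declared local so the block is self-contained.

```lua
-- Copy of make_buffer from the message, same behaviour.
local function make_buffer(sd, sep)
  local self, result
  local buf = ""
  self = function()
    local i, j = string.find(buf, sep)
    if i then
      if i == 1 then            -- separator at the front: skip the empty line
        buf = string.sub(buf, j + 1, -1)
        return self()
      end
      local value = string.sub(buf, 1, i - 1)
      buf = string.sub(buf, j + 1, -1)
      return true, value
    end
    if result then              -- the socket has already reported an error
      if #buf > 0 then          -- hand back unterminated left-over data once
        local value = buf
        buf = ""
        return true, value
      end
      return nil, result
    end
    local status, value = sd:receive()
    if status then buf = buf .. value else result = value end
    return self()
  end
  return self
end

-- HYPOTHETICAL stand-in for an NSE socket (assumption, not an NSE API):
-- receive() hands out the given chunks in order, then fails with "EOF".
local function make_mock_socket(chunks)
  local n = 0
  return {
    receive = function(_)
      n = n + 1
      if chunks[n] then return true, chunks[n] end
      return nil, "EOF"
    end
  }
end

-- "world" is split across two reads; "\n\n" contains an empty line;
-- "last" has no terminating separator.
local sd = make_mock_socket({ "hello\nwo", "rld\n\nlast" })
local next_line = make_buffer(sd, "[\r\n]+")

local lines = {}
while true do
  local status, value = next_line()
  if not status then break end
  lines[#lines + 1] = value
end

print(table.concat(lines, "|"))  -- prints hello|world|last
```

The partial "wo" is held in the buffer until the rest of the line arrives, the empty line between "world" and "last" is dropped, and the unterminated "last" is returned once the mock socket reports its error.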