Nmap Development mailing list archives

[NSE] http.lua and delimiters


From: jah <jah () zadkiel plus com>
Date: Wed, 24 Sep 2008 03:43:21 +0100

Hello all,

I noticed a few issues with showHTMLTitle.nse and whilst I was working
through these I found that http.request() was not always returning an
HTTP response correctly.

Specifically, the call to stdnse.make_buffer() uses "\r\n" as its
pattern to delimit lines in the response.  This pattern was changed from
"\r?\n" when the ability to dechunk chunked encoding was added [1].  At
the same time, the second argument to table.concat(), used when putting
the body of the response back together again, was changed from "\n" to
"\r\n" to avoid modifying the body and messing up the dechunking
process.  A response that ends its lines with a bare "\n" is therefore
not split into lines as expected.

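To illustrate the effect of that pattern change, here is a minimal
sketch in Python (not the Lua used by http.lua), with regular
expressions standing in for Lua patterns and made-up response strings.
It only shows that a strict "\r\n" delimiter leaves a bare-"\n" response
unsplit, and that rejoining lines with a fixed delimiter can alter a
body.

```python
import re

# Hypothetical response from a server that ends every line with a bare "\n".
bare_lf = "HTTP/1.0 200 OK\nServer: demo\n\n<html>\n</html>\n"

# Strict delimiter: nothing matches, so the response stays in one piece.
strict = re.split(r"\r\n", bare_lf)

# Lenient delimiter: status line, header and body lines are separated.
lenient = re.split(r"\r?\n", bare_lf)

# The lenient split forgets which delimiter was used, so rejoining with a
# fixed "\n" silently rewrites a body that originally used "\r\n".
crlf_body = "4\r\nWiki\r\n0\r\n\r\n"
rejoined = "\n".join(re.split(r"\r?\n", crlf_body))

print(len(strict), len(lenient), rejoined == crlf_body)
```
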
I decided to knock up a quick script which sends an HTTP request, uses
socket.receive() in a loop to collect the response as an unmodified
string and then detects both the characters that separate the header
from the body and the characters that delimit lines within each of
them.
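
The detection step might look roughly like the sketch below. It is
written in Python for illustration and is not the script that was
actually run; the function names detect_delimiters and line_ending and
the returned dictionary layout are assumptions made for this example.

```python
def line_ending(text):
    """Return the line delimiter found in text, or None if there is none."""
    if "\r\n" in text:
        return "\r\n"
    if "\n" in text:
        return "\n"
    return None


def detect_delimiters(response):
    """Classify the separators used in a complete raw HTTP response."""
    # Look for either style of blank line and keep whichever comes first.
    found = [(response.find(sep), sep) for sep in ("\r\n\r\n", "\n\n")]
    found = [(pos, sep) for pos, sep in found if pos != -1]
    if not found:
        # No blank line at all: header and body cannot be told apart.
        return {"separator": None, "header_eol": None, "body_eol": None}
    pos, sep = min(found)
    header, body = response[:pos], response[pos + len(sep):]
    return {
        "separator": sep,
        "header_eol": line_ending(header),  # None for a single-line header
        "body_eol": line_ending(body),      # None for an empty or one-line body
    }
```

For example, detect_delimiters("HTTP/1.1 200 OK\r\nServer: x\r\n\r\n<p>\n</p>")
reports "\r\n\r\n" as the separator, "\r\n" inside the header and "\n"
inside the body.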

I then ran this script against a few hundred thousand random hosts and
extracted the following info from the results.

3902 hosts had port 80 open, but only 2770 hosts responded to the GET
request.

2451 (~88.5%) used \r\n\r\n to separate header and body.
Of these, 2374 delimited header values with \r\n, 5 used \n and 72 were
single-value headers containing no delimiters.
Of the same 2451 hosts, 335 sent header-only responses, 937 delimited
lines in the body of the response with \r\n and 1179 with \n.

165 (~6%) used \n\n to separate header and body.
Of these, 7 delimited header values with \r\n, 17 used \n and 141 were
single-value headers containing no delimiters.
Of the same 165 hosts, 3 sent header-only responses, none delimited
lines in the body of the response with \r\n and the remaining 162 used \n.

154 (~5.6%) sent a response containing no double newline to separate a
header from a body.
These were all headerless responses, which were dealt with in a previous
patch [2].

Whilst this is only a small sample I think it demonstrates that the
strings used to separate the header and body, to delimit header values
and to delimit lines in the body of the response may be found in any
combination and that http.lua needs to handle them all - which at
present it does not.

With this in mind I've attached http_nobuf.lua.gz (needs more testing)
which collects the response in its entirety, as a single string, and
then uses patterns to determine the various delimiters.  This allows it
to keep track of the delimiters, to properly construct the returned
header table (without bits of the body attached to some header value)
and to return the body of the response without modification (and without
bits of the header attached).
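
The general shape of that approach can be sketched as follows. This is
Python for illustration only and is not the code in the attached
http_nobuf.lua; the name parse_response, the returned tuple and the
"starts with HTTP/" test for headerless responses are all assumptions
made for this example.

```python
import re


def parse_response(raw):
    """Split a complete raw response into (status_line, headers, body)."""
    # The first blank line, in either delimiter style, ends the header.
    match = re.search(r"\r?\n\r?\n", raw)
    if match is None or not raw.startswith("HTTP/"):
        # Assumed rule: no status line or no blank line means headerless,
        # so everything is returned as the body, untouched.
        return None, {}, raw
    head, body = raw[:match.start()], raw[match.end():]
    # Header lines may end in "\r\n" or "\n", even within one response.
    lines = re.split(r"\r?\n", head)
    headers = {}
    for line in lines[1:]:
        name, colon, value = line.partition(":")
        if colon:
            headers[name.strip().lower()] = value.strip()
    # The body is sliced out of the original string, never rebuilt from
    # lines, so its own delimiters are left exactly as received.
    return lines[0], headers, body
```

Because the body is a slice of the original string, a mix of "\r\n" and
"\n" inside it comes back unchanged, and no header text can leak into
it.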

I've done quite a bit of testing already, but want to let interested
parties look it over and voice any concerns or suggest improvements.  There's
some debugging that will need removing and there's a piece of code which
so far hasn't been used and might be removed (or else refined if I ever
find a response which contains both header and body with no
separation).  Also, the dechunking code seems to work very well, but
there's a vague possibility it could break in certain circumstances.
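
For readers unfamiliar with chunked encoding, the decoding step can be
sketched like this. It is a simplified Python illustration, not the
dechunking code in http.lua or in the attachment; the name dechunk is an
assumption, trailers after the last chunk are ignored, and the body is
taken as bytes because chunk sizes count bytes.

```python
def dechunk(body):
    """Decode a chunked body that is held as a single bytes object."""
    out, pos = [], 0
    while True:
        eol = body.find(b"\n", pos)
        if eol == -1:
            raise ValueError("missing chunk-size line")
        # The size is hexadecimal; drop any ";extension" and a trailing "\r".
        size_field = body[pos:eol].split(b";")[0].strip()
        size = int(size_field.decode("ascii"), 16)
        if size == 0:
            break  # last chunk; any trailer that follows is ignored here
        start = eol + 1
        # Take exactly `size` bytes, even if they contain line endings.
        out.append(body[start:start + size])
        pos = start + size
        # Skip the line ending that follows the chunk data, in either style.
        if body.startswith(b"\r\n", pos):
            pos += 2
        elif body.startswith(b"\n", pos):
            pos += 1
    return b"".join(out)
```

Since chunk data is copied by byte count, this only works when the body
has not been altered beforehand: changing a "\r\n" inside a chunk to
"\n" would make every later offset wrong.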

Regards,

jah



[1] - http://seclists.org/nmap-dev/2008/q3/0454.html
[2] - http://seclists.org/nmap-dev/2008/q2/0892.html

Attachment: http_nobuf.lua.gz


_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org
