Nmap Development mailing list archives
[NSE][patch] More httpspider blacklist extensions, revamp function
From: Daniel Miller <bonsaiviking () gmail com>
Date: Wed, 13 Jun 2012 15:57:03 -0500
Hi list,

I was running into a problem with my XenServer instances, which host an MSI installer for XenCenter on a simple web server. Running any of the scripts that involve spidering resulted in downloading this 43MB file multiple times. I added "msi" to the list of default blacklisted extensions in httpspider.lua, and this solved the problem.
Of course, I couldn't stop there. I added more executable extensions ("msi", "bin"), archive extensions ("tgz", "tar.bz", "tar", "iso"), and a new category, document extensions (pdf, {doc,xls,ppt}{,x,m}, od[fsp], ps, xps).
I also noticed that the blacklist function being created in Crawler:addDefaultBlacklist() was bloated, containing four local table declarations, nested for loops, and string concatenation in the innermost loop. I converted it into a closure over a new table which requires only a single for loop and already contains the properly formatted match patterns. I also moved the url:getPath() call out of the loop, added a string.lower(), and cached the result in a local variable for use with string.match(). Previously, uppercase extensions in a URL would not have been matched.
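For readers without the patch at hand, the closure approach described above can be sketched roughly as follows. This is a minimal stand-alone illustration, not the code from the patch: make_blacklist and is_blacklisted are hypothetical names, the extension list is abbreviated, and the returned function takes a plain path string, whereas the real one would presumably receive a URL object and call url:getPath().

```lua
-- Hypothetical stand-alone sketch; names are illustrative and are not
-- taken from httpspider.lua or the attached patch.
local function make_blacklist(extensions)
  -- Build the anchored Lua patterns once, e.g. "%.msi$", so that no
  -- string concatenation happens while URLs are being checked.
  local patterns = {}
  for _, ext in ipairs(extensions) do
    -- Escape pattern magic characters (such as the "." in "tar.bz").
    local escaped = ext:gsub("%p", "%%%0")
    patterns[#patterns + 1] = "%." .. escaped .. "$"
  end

  -- The returned closure captures `patterns`.
  -- Assumption: it receives the path as a string; the real function
  -- would obtain it from the URL object instead.
  return function(path)
    local p = string.lower(path) -- lower-case once, outside the loop
    for _, pat in ipairs(patterns) do
      if string.match(p, pat) then
        return true
      end
    end
    return false
  end
end

-- Abbreviated extension list, for illustration only.
local is_blacklisted = make_blacklist({ "msi", "bin", "tgz", "tar.bz", "tar", "iso", "pdf" })

print(is_blacklisted("/downloads/XenCenter.MSI")) --> true
print(is_blacklisted("/index.html"))              --> false
```

Because the path is lower-cased before matching, an uppercase extension such as ".MSI" is caught by the same lowercase pattern.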
Patch attached.

Dan
Attachment: httpspider-blacklist.patch