Nmap Development mailing list archives
Re: [RFC] Improve NSE HTTP architecture.
From: Djalal Harouni <tixxdz () opendz org>
Date: Sun, 19 Jun 2011 21:09:02 +0100
On Thu, Jun 16, 2011 at 05:17:50PM -0700, Fyodor wrote:
> On Tue, Jun 14, 2011 at 02:46:55PM +0100, Djalal Harouni wrote:
> > Currently there are more than 20 HTTP scripts; most of them are
> > discovery scripts that perform checks/tests in order to identify
> > HTTP applications. These tests can be incorporated into the
> > http-enum script to reduce the size of the loaded and running
> > code, and to achieve better performance. Of course this will
> > reduce the number of HTTP scripts, but writing an entire NSE
> > script for a simple check that can be done in 5-10 Lua
> > instructions is not the best solution either.
>
> Reducing the total code size and optimizing performance is indeed
> very important. But of course we also have to keep user interface
> factors in mind. Right now, many http discovery scripts such as
> html-title and http-robots.txt run by default with -A or -sC. If we
> moved them into http-enum and users had to know about them and
> specify special arguments, I think that would dramatically reduce
> usage of the functionality.
I agree.
> > This proposal relies on some of the Nmap information that should
> > be exported to NSE scripts:
> >
> > * User specified script categories selection "--script='categories'".
>
> That would be easy to add, but I worry about what scripts would do
> with the information. For example, suppose we have http-enum do
> vuln checks if the 'vuln' category was selected. Well, then what if
> the user just specified script names specifically (which may or may
> not be in the vuln category)? What if the user specified
> --script=all? Maybe rather than try to reimplement the category
> selection functionality, the script(s) could be made to work with
> it. For example, if the shared work is done in a library anyway,
> maybe you could have a small http-enum-vuln script which users
> could enable by name or category or whatever.
Yes, another small script like http-enum-vuln that loads the 'vuln'
or 'exploit' fingerprints or matches is a good solution; this way we
avoid one-script-per-vuln, especially when the check is only 5 Lua
instructions. Loading fingerprints based on their categories should
be done by library code. So I'll say: a script that loads the
'intrusive', 'exploit', 'dos' and 'vuln' fingerprints and matches
could be a popular script. My main point on this is to use the same
NSE categories, and not extra categories like 'attack', etc. The
'app' field in the fingerprint table can be used to identify the
application type.
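To make the category-based loading concrete, here is a minimal sketch of what such a library helper could look like. Everything here is an assumption for illustration: the fingerprint table layout loosely follows the format discussed in this thread, and filter_by_category is a hypothetical name, not an existing NSE API.

```lua
-- Sketch: select fingerprints whose categories intersect a wanted set.
-- A hypothetical http-enum-vuln script would call this with the
-- attack-oriented categories; nothing here is existing NSE library code.

local function filter_by_category(fingerprints, wanted)
  local wanted_set = {}
  for _, c in ipairs(wanted) do wanted_set[c] = true end

  local selected = {}
  for _, fp in ipairs(fingerprints) do
    for _, c in ipairs(fp.categories or {}) do
      if wanted_set[c] then
        selected[#selected + 1] = fp
        break -- one matching category is enough
      end
    end
  end
  return selected
end

-- Example fingerprint table in the style discussed above:
local fingerprints = {
  { app = "phpmyadmin", categories = {"vuln", "intrusive"} },
  { app = "robots",     categories = {"discovery", "safe"} },
}

local vuln_fps = filter_by_category(fingerprints,
                                    {"vuln", "exploit", "dos", "intrusive"})
-- vuln_fps now contains only the phpmyadmin entry.
```

The point of doing the filtering in the library is that http-enum and a small http-enum-vuln wrapper can share one fingerprint file and differ only in the category list they pass in.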
> > 5) Crawler and http-enum: then http-enum with its matching code
> > and other HTTP scripts can be in a situation where they will not
> > yield, since there are no network operations. A solution in the
> > http-enum matching code (this is the big code) would be to use
> > coroutines and make them yield explicitly.
>
> Have you experienced this problem or is it just speculation? It is
> probably worth trying to reproduce it (if you haven't already)
> before spending much time trying to fix it.
It's rather based on speculation.
> > So currently we consider that the crawler, which is a discovery
> > script, and other discovery scripts like http-enum must run in
> > the same dependency level.
>
> For what it is worth, I had been assuming that the crawler would be
> a library. A script which needs spidering services would activate
> the library and tell it what information is needed. The spider
> library would store (probably up to some limit) results so that it
> may not have to make as many (or even any) requests when the next
> script asks for similar information.
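The spider-as-library idea with a bounded result store could be sketched roughly as below. The Spider table, its fetch function, and the injected fetcher callback are all hypothetical names for illustration; a real implementation would sit on top of the NSE http library.

```lua
-- Sketch of a spider library with a bounded response cache, so a
-- second script asking for the same page does not trigger a new
-- request. All names here are assumptions, not an existing NSE API.

local Spider = { cache = {}, cache_size = 0, MAX_CACHE = 1000 }

-- fetcher performs the actual HTTP request; it is injected so this
-- sketch stays self-contained (a real version would call http.get).
function Spider.fetch(host, port, path, fetcher)
  local key = host .. ":" .. port .. path
  local hit = Spider.cache[key]
  if hit then return hit, true end        -- served from cache

  local response = fetcher(host, port, path)
  if Spider.cache_size < Spider.MAX_CACHE then
    Spider.cache[key] = response          -- store up to some limit
    Spider.cache_size = Spider.cache_size + 1
  end
  return response, false
end

-- Usage: two scripts requesting the same path; only one real request.
local requests = 0
local function fake_http_get(host, port, path)
  requests = requests + 1
  return { status = 200, body = "<html></html>" }
end

local r1 = Spider.fetch("scanme.nmap.org", 80, "/", fake_http_get)
local r2, cached = Spider.fetch("scanme.nmap.org", 80, "/", fake_http_get)
```

The MAX_CACHE bound is the "probably up to some limit" part: it keeps memory predictable when spidering large sites while still eliminating duplicate requests for the common case.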
I agree, and perhaps we'll also have a special fully capable crawler
script.
> > 6) Improve HTTP fingerprints and http-enum:
> > -------------------------------------------
>
> This one seems pretty independent from some of your other
> suggestions. So, if this is desired, at least it could be
> implemented at any time. I do agree with you that it is often best
> to combine many similar http tasks in one script and that there is
> room to enhance http-enum to do a lot of that. I do think we should
> try to avoid bloating things such that users need to specify extra
> arguments to effectively use scripts. At least important/common
> scripts like the http-enum stuff. Required options are more
> reasonable for obscure/special-purpose scripts.
>
> > * http-brute: the design of this script can be improved a lot. If
> > the crawler and the http-enum script are running, then a match
> > table dynamically registered by the http-brute script, which
> > checks the returned status code and the 'www-authenticate' header
> > field, will be used by the http-enum script to discover multiple
> > protected paths. These can be saved in the registry by the
> > match's misc handler, and later the http-brute script will try to
> > brute force them. So in this situation http-brute will depend on
> > the http-enum script.
>
> I agree that it would be great for http-brute to be able to use
> information from enumeration/spidering scripts/libraries. Though of
> course the user should be able to use it to brute force a specific
> page instead if desired.
We can make http-brute insert fingerprints or matches dynamically,
which will be processed by http-enum; a match handler will save the
paths in the registry for later use, without changing the current
behaviour when a user specifies the path.
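A minimal sketch of that flow, under stated assumptions: the registry table, add_match, and process_response below are invented names standing in for http-brute's registration call and http-enum's matching loop; none of this is existing library code.

```lua
-- Sketch: http-brute registers a dynamic match so that http-enum's
-- matching loop records every protected path it encounters. The
-- names (registry, add_match, process_response) are hypothetical.

local registry = { http_protected_paths = {} }
local dynamic_matches = {}

local function add_match(match)
  dynamic_matches[#dynamic_matches + 1] = match
end

-- What http-brute would register before http-enum runs:
add_match{
  status_code = 401,
  header = "www-authenticate",
  handler = function(path)
    local t = registry.http_protected_paths
    t[#t + 1] = path                  -- saved for later brute forcing
  end,
}

-- What http-enum's matching code would do for each response:
local function process_response(path, response)
  for _, m in ipairs(dynamic_matches) do
    if response.status == m.status_code and response.header[m.header] then
      m.handler(path)
    end
  end
end

process_response("/admin/", {
  status = 401,
  header = { ["www-authenticate"] = 'Basic realm="admin"' },
})
```

After the enumeration phase, http-brute would read registry.http_protected_paths and attack each entry, which is exactly why it would need to declare a dependency on http-enum.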
> > * http-auth: we have already said that this can be converted into
> > a general match in the http-matchers.lua file. The downside of
> > this is that we will remove this script. If we don't want to
> > remove the script we can modify it to make it register that match
> > dynamically.
>
> Well, a key feature of that script is that it runs by default and
> includes a piece of information which is quickly and easily
> determined (whether authentication is required at the root of the
> given web server). So we wouldn't want to remove this script until
> we have a way to replicate that behavior, I think. So the combined
> script would have to run by default, I guess.
Perhaps we can reproduce the adding-targets feature for this specific
purpose. As I've said before, scripts should be able to register
fingerprints and matches dynamically, so perhaps we can add to the
httpenum.lua library:

  -- A global variable set to true to activate the behavior,
  -- e.g. when httpenum.lua is loaded by one of the http-enum
  -- scripts; this should be automatic for other scripts, without
  -- specifying script arguments.
  httpenum.NEW_FINGERPRINTS

  -- This function will check the NEW_FINGERPRINTS variable
  -- before inserting new fingerprints.
  httpenum.add_fingerprints(my_fingerprint, ...)

Script rules can call this function to insert new fingerprints. With
this solution we do not remove the current http-auth behaviour, and
we make it smarter.
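A runnable sketch of that registration guard, using the names proposed above (httpenum, NEW_FINGERPRINTS, add_fingerprints); the library itself does not exist yet, so this is only an illustration of the proposed semantics.

```lua
-- Sketch of the proposed httpenum.lua registration guard. The
-- NEW_FINGERPRINTS flag and add_fingerprints function are the names
-- proposed in this thread, not an existing library.

local httpenum = {
  NEW_FINGERPRINTS = false,   -- flipped to true when http-enum loads the lib
  fingerprints = {},
}

function httpenum.add_fingerprints(...)
  if not httpenum.NEW_FINGERPRINTS then
    return false              -- registration inactive: nothing stored
  end
  for _, fp in ipairs({...}) do
    httpenum.fingerprints[#httpenum.fingerprints + 1] = fp
  end
  return true
end

-- A script rule trying to register before http-enum activated the lib:
local ok = httpenum.add_fingerprints({ path = "/", auth = true })
-- ok is false and nothing was stored.

-- Once http-enum has loaded the library:
httpenum.NEW_FINGERPRINTS = true
local ok2 = httpenum.add_fingerprints({ path = "/", auth = true })
```

The flag means other scripts never need a script argument to decide whether their dynamic fingerprints will be consumed; the behavior switches on automatically when http-enum is in the run.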
> > * http-date: we can also convert this script to a simple general
> > fingerprint, or make the script register the fingerprint
> > dynamically:
> >
> >   fingerprint {
> >     categories = {'discovery', 'safe'},
> >     probes = { {path='/', method='HEAD'} },
> >     matches = {
> >       {
> >         status_code = 200,
> >         header = { date = "(.+)" },
> >         output_handler = function(#header.date_1#)
> >           -- parse #header.date_1#
> >         end,
> >       },
> >     },
> >   }
>
> Well, besides being default, http-date offers some nice features
> such as telling the user how much the remote time differs from
> local time. And we don't win much from eliminating this script
> since it is only 44 lines long (including documentation and empty
> lines).
Ok.
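For reference, the clock-skew feature mentioned above boils down to parsing the Date header and subtracting a local timestamp. A simplified sketch follows; the pattern-based parsing is an assumption for illustration (the real http-date script uses the NSE http library's proper RFC 1123 date handling), and os.time interprets the table in local time, so an absolute comparison against the local clock would additionally need a UTC correction.

```lua
-- Sketch: compute clock skew from an HTTP Date header value.
-- Simplified parsing; not the actual http-date implementation.

local MONTHS = { Jan=1, Feb=2, Mar=3, Apr=4,  May=5,  Jun=6,
                 Jul=7, Aug=8, Sep=9, Oct=10, Nov=11, Dec=12 }

local function parse_http_date(s)
  -- e.g. "Sun, 19 Jun 2011 20:09:02 GMT"
  local d, mon, y, h, min, sec =
    s:match("(%d+) (%a+) (%d+) (%d+):(%d+):(%d+) GMT")
  if not d then return nil end
  -- Note: os.time treats this table as local time; good enough for
  -- comparing two parsed headers, but absolute skew needs a UTC fix.
  return os.time{ year = tonumber(y), month = MONTHS[mon],
                  day = tonumber(d), hour = tonumber(h),
                  min = tonumber(min), sec = tonumber(sec) }
end

local function clock_skew(date_header, local_now)
  local remote = parse_http_date(date_header)
  if not remote then return nil end
  return os.difftime(remote, local_now)   -- seconds ahead of local_now
end
```

This is the kind of per-script logic that argues for keeping http-date as its own small script: the fingerprint match is trivial, but the output computation is not just pattern matching.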
> I guess deciding when it is better to split or combine scripts is a
> very tough decision. We faced that last week with Gorjan's
> ip-geolocation script. At first he combined several geolocation
> providers into one script, but later split it into five scripts.
> Which is better? I don't know. Each approach has advantages and
> drawbacks. I guess a key is to identify the general factors we
> should use when deciding whether to split or combine scripts.
> Because if we have some folks busily combining scripts while others
> are busy splitting them up, we don't make much progress.
A standard that can help us make the right decision should be added
to the Nmap NSE documentation.

Thanks.

--
tixxdz
http://opendz.org
_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/
Current thread:
- [RFC] Improve NSE HTTP architecture. Djalal Harouni (Jun 14)
- Re: [RFC] Improve NSE HTTP architecture. Patrik Karlsson (Jun 15)
- Re: [RFC] Improve NSE HTTP architecture. Ron (Jun 16)
- Re: [RFC] Improve NSE HTTP architecture. Djalal Harouni (Jun 18)
- Re: [RFC] Improve NSE HTTP architecture. Djalal Harouni (Jun 18)
- Re: [RFC] Improve NSE HTTP architecture. Ron (Jun 16)
- Re: [RFC] Improve NSE HTTP architecture. Fyodor (Jun 16)
- Re: [RFC] Improve NSE HTTP architecture. Djalal Harouni (Jun 19)
- Re: [RFC] Improve NSE HTTP architecture. Patrick Donnelly (Jun 20)
- Re: [RFC] Improve NSE HTTP architecture. Djalal Harouni (Jun 20)
- Re: [RFC] Improve NSE HTTP architecture. Djalal Harouni (Jun 19)
- Re: [RFC] Improve NSE HTTP architecture. Patrik Karlsson (Jun 15)