Nmap Development mailing list archives
Re: [NSE] Robots rethink
From: "Eddie Bell" <ejlbell () gmail com>
Date: Thu, 5 Jun 2008 12:50:10 +0100
2008/6/4 Fyodor <fyodor () insecure org>:
On Wed, Jun 04, 2008 at 09:25:53PM +0100, jah wrote:

On 04/06/2008 19:06, Eddie Bell wrote:

Good idea, the amended version is attached. I've also increased the verbose output line length (from 40 to 50) so that less vertical space is taken up.

Aye, I think it's great to have the option not to print all the disallow entries, I like the amendment. How about a count of disallow entries for non-verbose results? This would give an idea as to how interesting the robots.txt file might be and whether it's worth running the scan with more verbosity.

Yeah, the improvements look great. And too-long output is an important issue. But as a general note, I think the solution of "dump the long stuff in -v and print just a short summary without -v, and maybe print all sorts of crap with -vv" may be overused. The first goal should be to find a way to format the results compactly and readably which works well regardless of verbosity. And if there are a few control variables (such as max number of entries printed), maybe you could just tweak those a bit based on verbosity. There may be some good cases for having completely different output formats based on verbosity level, but they are few and far between.

Also, I think it is important that output size be limited even with -v, because users don't want to be bombarded with 200 lines of robots.txt in their output. So maybe a good way to handle a script like robots.nse is:

o Check for the somewhat common case of an empty robots.txt (for example, that's what you'll find at http://insecure.org/robots.txt) and either print nothing, or print that it is empty in that case.

o Print the summary line (that robots.txt exists and has XX entries) in all cases.

o Maybe print up to 2-4 lines worth of entries in normal mode, and a higher number like up to 10 lines in verbose mode. That way people see a small sampling even in normal mode.

Note that I haven't had time to even look at your changes very closely, so you may be doing some or much of this stuff already. And don't take this as any criticism of robots.nse. I just thought it made a good example to launch into a discussion of how we handle output verbosity.

I believe that one of the reasons Nmap is so successful is that we put a whole lot of work into presenting information to users in a clean, orderly, useful fashion. Certain other port scanners, for example, just print open ports as they are found and leave you with a mess of debug-looking output mixed with open port information which is not even sorted to keep ports from the same host together, much less sorted numerically. Also, many tools simply flood you with data just because they have it available, even when there are few if any practical uses for that data. This causes the important information to get lost in the flood.

Whenever new output is added to Nmap (from an NSE script or whatever), try to think of how that information could actually be useful to someone. If you come up blank, it is generally best to leave it out.

Cheers,
-F
It would be hard to do it with line numbers, as the code stores all the entries in a big table which does not account for lines, but it can be done with number of entries. For example:

No robots file:

Interesting ports on scanme.nmap.org (64.13.134.52):
PORT   STATE SERVICE
80/tcp open  http

Empty robots file:

Interesting ports on insecure.org (64.13.134.49):
PORT   STATE SERVICE
80/tcp open  http
|_ robots.txt: is empty

Normal mode:

Interesting ports on py-in-f99.google.com (64.233.167.99):
PORT   STATE SERVICE
80/tcp open  http
|_ robots.txt: has 136 disallowed entries

Single verbose (15 entries):

Interesting ports on py-in-f99.google.com (64.233.167.99):
PORT   STATE SERVICE
80/tcp open  http
| robots.txt: has 136 disallowed entries (15 shown)
| /news?output=xhtml& /search /groups /images /catalogs
| /catalogues /news /nwshp /? /addurl/image? /pagead/ /relpage/
|_ /relcontent /sorry/

Double verbose or debug (50 entries):

Interesting ports on eh-in-f99.google.com (72.14.207.99):
PORT   STATE SERVICE
80/tcp open  http
| robots.txt: has 136 disallowed entries (50 shown)
| /news?output=xhtml& /search /groups /images /catalogs
| /catalogues /news /nwshp /? /addurl/image? /pagead/ /relpage/
| /relcontent /sorry/ /imgres /keyword/ /u/ /univ/ /cobrand /custom
| /advanced_group_search /advanced_search /googlesite /preferences /setprefs
| /swr /url /default /m? /m/? /m/lcb /m/search? /wml? /wml/?
| /wml/search? /xhtml? /xhtml/? /xhtml/search? /xml? /imode? /imode/?
|_ /imode/search? /jsky? /jsky/? /jsky/search? /pda? /pda/?

Double debug or debug + triple verbose:

Interesting ports on jc-in-f99.google.com (64.233.187.99):
PORT   STATE SERVICE REASON
80/tcp open  http    syn-ack
| robots.txt: has 136 disallowed entries (136 shown)
| /news?output=xhtml& /search /groups /images /catalogs
| /catalogues /news /nwshp /? /addurl/image? /pagead/ /relpage/
| /relcontent /sorry/ /imgres /keyword/ /u/ /univ/ /cobrand /custom
| /advanced_group_search /advanced_search /googlesite /preferences /setprefs
| /swr /url /default /m? /m/? /m/lcb /m/search? /wml? /wml/?
| /wml/search? /xhtml? /xhtml/? /xhtml/search? /xml? /imode? /imode/?
| /imode/search? /jsky? /jsky/? /jsky/search? /pda? /pda/? /pda/search?
| /sprint_xhtml /sprint_wml /pqa /palm /gwt/ /purchases /hws /bsd?
| /linux? /mac? /microsoft? /unclesam? /answers/search?q=
| /local? /local_url /froogle? /products? /froogle_ /product_
| /products_ /print /books /patents? /scholar? /complete
| /sponsoredlinks /videosearch? /videopreview? /videoprograminfo?
| /maps? /mapstt? /mapslt? /maps/stk/ /mapabcpoi? /translate?
| /ie? /sms/demo? /katrina? /blogsearch? /blogsearch/
| /blogsearch_feeds /advanced_blog_search /reader/ /uds/ /chart? /transit?
| /mbd? /extern_js/ /calendar/feeds/ /calendar/ical/
| /cl2/feeds/ /cl2/ical/ /coop/directory /coop/manage /trends?
| /trends/music? /notebook/search? /music /browsersync /call
| /archivesearch? /archivesearch/url /archivesearch/advanced_search
| /base/search? /base/reportbadoffer /base/s2 /urchin_test/ /movies?
| /codesearch? /codesearch/feeds/search? /wapsearch? /safebrowsing
|_ /reviews/search? /orkut/albums /jsapi /views? /c/ /cbk
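
For readers following the scheme above, here is a rough sketch of the entry-count tiers in Python. It is not the attached robots.nse (which is an NSE/Lua script and is not reproduced here). The function names, the use of textwrap, and the exact mapping of -v/-d levels to limits are assumptions read off the sample output, and the real script's line wrapping clearly differs since some sample lines run past 50 characters.

```python
import textwrap


def entry_limit(verbosity, debugging, total):
    """How many disallowed entries to show (hypothetical tiers: 0 / 15 / 50 / all)."""
    if debugging >= 2 or (debugging >= 1 and verbosity >= 3):
        return total                  # double debug, or debug + triple verbose
    if verbosity >= 2 or debugging >= 1:
        return min(50, total)         # double verbose or debug
    if verbosity >= 1:
        return min(15, total)         # single verbose
    return 0                          # normal mode: summary line only


def format_robots(entries, verbosity=0, debugging=0, width=50):
    """Return output lines; entries is None when there is no robots.txt."""
    if entries is None:
        return []                     # no robots file: print nothing
    if not entries:
        return ["robots.txt: is empty"]
    summary = f"robots.txt: has {len(entries)} disallowed entries"
    shown = entry_limit(verbosity, debugging, len(entries))
    if shown == 0:
        return [summary]
    # Wrap on whitespace only, so a long path is never split mid-entry.
    body = textwrap.wrap(" ".join(entries[:shown]), width=width,
                         break_long_words=False, break_on_hyphens=False)
    return [f"{summary} ({shown} shown)"] + body


if __name__ == "__main__":
    sample = ["/search", "/groups", "/images", "/catalogs", "/news"]
    for line in format_robots(sample, verbosity=1):
        print(line)
```

Keeping the limits in one small function means the output format stays the same at every level and only the number of entries changes, which is the "few control variables" approach suggested earlier in the thread.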
Attachment: robots.nse
_______________________________________________ Sent through the nmap-dev mailing list http://cgi.insecure.org/mailman/listinfo/nmap-dev Archived at http://SecLists.Org