Nmap Development mailing list archives
Re: [RFC][patch] XML structured script output (evaluation of nse-structured3 patch)
From: David Fifield <david () bamsoftware com>
Date: Wed, 13 Jun 2012 22:43:59 -0700
On Tue, May 29, 2012 at 03:30:25PM -0500, Daniel Miller wrote:
I'm attaching an update to this patch, since the Lua 5.2 update changed a few things. Also, this update includes modifications to the XSL stylesheet so that the output should look the same as it did before.
I think that using stdnse.format_output as the conduit for XML output is the wrong idea. On the one hand, it's nice because you're already getting a semi-structured table. On the other hand, as you've seen, scripts use it as a text formatting function (which it is), and many scripts are going to be uselessly outputting <elem>unstructured string</elem> until they are rewritten to add even more structure to their structured output. Also, it bothers me somewhat that the machine-readable keys are prose-looking strings like "Not valid before", which are subject to typos, capitalization changes, and localization tweaks. However I think we could live with these problems.

format_output's input isn't particularly rich. It is good at formatting text output, but I don't think we want it to limit what we can do with XML output or have to extend it in weird ways. For example, I want there to be <error> and <vuln> elements outside of the usual output table, and I don't want an API that makes that difficult.

Really, what we want from XML script output is some reversible representation of a Lua table. For example, take quake3-info.
I think that a nice Lua representation of the output would be

    {
      players = {
        { name = "cyberix", frags = "20", ping = "4" },
      },
      options = {
        capturelimit = "8",
        dmflags = "0",
        elimflags = "0",
        fraglimit = "20",
        gamename = "baseoa",
      }
    }

And this in turn might look like this in XML:

    <script id="quake3-info">
      <dict>
        <list key="players">
          <dict>
            <elem key="name">cyberix</elem>
            <elem key="frags">20</elem>
            <elem key="ping">4</elem>
          </dict>
        </list>
        <dict key="options">
          <elem key="capturelimit">8</elem>
          <elem key="dmflags">0</elem>
          <elem key="elimflags">0</elem>
          <elem key="fraglimit">20</elem>
          <elem key="gamename">baseoa</elem>
        </dict>
      </dict>
    </script>

Then, if I wanted to find all the servers on which cyberix is playing, I could use a crazy xmlstarlet command like this:

    xmlstarlet sel \
      -t -m '//port/script[@id="quake3-info"]//list[@key="players"]/dict[elem[@key="name"]="cyberix"]' \
      -v '../../../../../../address[@addrtype="ipv4"]/@addr' -n quake.xml

Notice that the XML output doesn't have to correspond exactly to the text output.

What I'm thinking is that we start allowing scripts to return a table, not just a string. Tables will be pretty-printed and indented to be copied to normal output, and turned into XML as shown above. Scripts that return a string will not have any structured XML output written at all.

But: I think there should be a way to specify a human-readable string and a machine-readable table/XML blob at once. Suppose, for the moment, that we allow a script to return a {string, table} pair. Then we show the string in normal output, and write the table to XML. Scripts that don't care very much can return just a string or just a table--we'll synthesize text output by pretty printing if we get just a table. Maybe that will catch on and people will prefer their normal output to look like that.
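The dict/list/elem mapping described above can be sketched in Lua roughly as follows. This is only an illustration under stated assumptions, not the patch's implementation: the names xml_escape, is_list, emit, and to_xml are hypothetical, a table counts as a list only when its keys are exactly 1..n, and dictionary keys are sorted purely to make the output deterministic.

```lua
-- Sketch only: xml_escape, is_list, emit, and to_xml are hypothetical names,
-- not functions from NSE or from the patch under discussion.

-- Escape characters that are special in XML text and attribute values.
local function xml_escape(s)
  return (tostring(s):gsub('[&<>"]', {
    ["&"] = "&amp;", ["<"] = "&lt;", [">"] = "&gt;", ['"'] = "&quot;",
  }))
end

-- Assumption: a table is a list when its keys are exactly 1..n with n >= 1.
-- An empty table is treated as a dict.
local function is_list(t)
  local n = 0
  for _ in pairs(t) do n = n + 1 end
  if n == 0 then return false end
  for i = 1, n do
    if t[i] == nil then return false end
  end
  return true
end

-- Append the XML lines for one value to the array `out`.
local function emit(value, key, indent, out)
  local attr = key and (' key="' .. xml_escape(key) .. '"') or ""
  if type(value) ~= "table" then
    out[#out + 1] = indent .. "<elem" .. attr .. ">" .. xml_escape(value) .. "</elem>"
  elseif is_list(value) then
    out[#out + 1] = indent .. "<list" .. attr .. ">"
    for _, v in ipairs(value) do
      emit(v, nil, indent .. "  ", out)
    end
    out[#out + 1] = indent .. "</list>"
  else
    -- Dictionary tables have no inherent order; sort keys for stable output.
    local keys = {}
    for k in pairs(value) do keys[#keys + 1] = k end
    table.sort(keys, function(a, b) return tostring(a) < tostring(b) end)
    out[#out + 1] = indent .. "<dict" .. attr .. ">"
    for _, k in ipairs(keys) do
      emit(value[k], k, indent .. "  ", out)
    end
    out[#out + 1] = indent .. "</dict>"
  end
end

-- Serialize a Lua value to the dict/list/elem form as one string.
local function to_xml(value)
  local out = {}
  emit(value, nil, "", out)
  return table.concat(out, "\n")
end

-- Usage: serialize a cut-down version of the quake3-info table.
print(to_xml({
  players = { { name = "cyberix", frags = "20", ping = "4" } },
  options = { gamename = "baseoa" },
}))
```

Because the element names record whether each table was a list or a dict, a reader of the XML can rebuild the same table shape, which is the reversibility asked for above. The enclosing <script> element is left out of the sketch on the assumption that Nmap writes it.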
But cases where we want normal output and XML output to look different include nfs-ls, whose normal output looks like this:

    | NFS Export: /mnt/nfs/files
    | NFS Access: Read Lookup NoModify NoExtend NoDelete NoExecute
    |   PERMISSION  UID   GID   SIZE   MODIFICATION TIME  FILENAME
    |   drwxr-xr-x  1000  100   4096   2010-06-17 12:28   /mnt/nfs/files
    |   drwxr--r--  1000  1002  4096   2010-05-14 12:58   sources
    |   -rw-------  1000  1002  23606  2010-06-17 12:28   notes

As a Lua table it might look like this:

    {
      {
        export = "/mnt/nfs/files",
        access = {"Read", "Lookup", "NoModify", "NoExtend", "NoDelete", "NoExecute"},
        files = {
          {perm = "0755", uid = "1000", gid = "100", size = "4096", mtime = "2010-06-17 12:28", name = "/mnt/nfs/files"},
          {perm = "0744", uid = "1000", gid = "1002", size = "4096", mtime = "2010-05-14 12:58", name = "sources"},
          {perm = "0600", uid = "1000", gid = "1002", size = "23606", mtime = "2010-06-17 12:28", name = "notes"},
        }
      }
    }

Which would lead to XML like this:

    <script id="nfs-ls">
      <list>
        <dict>
          <elem key="export">/mnt/nfs/files</elem>
          <list key="access">
            <elem>Read</elem><elem>Lookup</elem><elem>NoModify</elem>...
          </list>
          <list key="files">
            <dict>
              <elem key="perm">0755</elem><elem key="uid">1000</elem>...
            </dict>
            <dict>
              <elem key="perm">0744</elem><elem key="uid">1000</elem>...
            </dict>
            <dict>
              <elem key="perm">0600</elem><elem key="uid">1000</elem>...
            </dict>
          </list>
        </dict>
      </list>
    </script>

Here I think it is very important both to (1) isolate individual data items like the uid in the XML output, and (2) preserve a compact normal output that looks like the output of ls.

So my idea is basically this: Scripts that don't have complex output can continue to return a string, or else return a table that will be formatted in a reasonable fashion. Scripts with specialized output needs can build up a string and a table output simultaneously, and return them both. In many cases, like in nfs-ls, the string can be derived from the table by the script in one postprocessing step.
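As a rough illustration of that postprocessing step, the following Lua sketch turns a table shaped like the nfs-ls example into ls-style text. The names mode_string and format_nfs_ls are hypothetical, the column widths are arbitrary, and the leading file-type character (d or -) is omitted because the example table does not carry it.

```lua
-- Sketch only: mode_string and format_nfs_ls are hypothetical helpers,
-- not part of nfs-ls.nse or of NSE.

-- Convert an octal permission string such as "0600" into "rw-------".
-- Only the low nine permission bits are rendered.
local function mode_string(perm)
  local bits = tonumber(perm, 8)
  local flags = "rwxrwxrwx"
  local out = {}
  for i = 1, 9 do
    local mask = 2 ^ (9 - i)
    if math.floor(bits / mask) % 2 == 1 then
      out[i] = flags:sub(i, i)
    else
      out[i] = "-"
    end
  end
  return table.concat(out)
end

-- Assumed row layout; the widths are chosen for illustration only.
local ROW = "%-10s  %-5s  %-5s  %-6s  %-17s  %s"

-- Build the human-readable text from the structured table in one pass.
local function format_nfs_ls(exports)
  local lines = {}
  for _, e in ipairs(exports) do
    lines[#lines + 1] = "NFS Export: " .. e.export
    lines[#lines + 1] = "NFS Access: " .. table.concat(e.access, " ")
    lines[#lines + 1] = string.format(ROW,
      "PERMISSION", "UID", "GID", "SIZE", "MODIFICATION TIME", "FILENAME")
    for _, f in ipairs(e.files) do
      lines[#lines + 1] = string.format(ROW,
        mode_string(f.perm), f.uid, f.gid, f.size, f.mtime, f.name)
    end
  end
  return table.concat(lines, "\n")
end

-- Usage: one export with a single file entry taken from the example.
local results = {
  {
    export = "/mnt/nfs/files",
    access = { "Read", "Lookup", "NoModify", "NoExtend", "NoDelete", "NoExecute" },
    files = {
      { perm = "0600", uid = "1000", gid = "1002", size = "23606",
        mtime = "2010-06-17 12:28", name = "notes" },
    },
  },
}
local text = format_nfs_ls(results)
print(text)
-- Under the proposal, a script would hand back both `text` and `results`.
```

The table stays the single source of truth: the text is computed from it at the end, so the two outputs cannot drift apart.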
(Sort of like how ssl-cert.nse builds up a text output from the cert table. In processing XML, I want something closer to the cert table than to the text output.)

One downside is that dictionary tables don't preserve ordering of elements. Scripts that just return a table won't be able to control the ordering of their output. I propose that we ignore this for simplicity. The alternative of making an array containing tiny name-value tables, while reasonable, is so cumbersome that I can't see people actually doing it.

I'm going to call this "proposal beta" on the wiki page.

David Fifield
_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://seclists.org/nmap-dev/