Nmap Development mailing list archives

Re: [NSE] A Lua implementation of NSE--detailed review


From: David Fifield <david () bamsoftware com>
Date: Fri, 16 Jan 2009 22:03:28 -0700

On Tue, Jan 06, 2009 at 11:06:50PM -0700, Patrick Donnelly wrote:
In all, the switch to Lua does not bring any sweeping changes to how
scripts are run, just provides a easily maintained platform where it
is very simple to add functionality and to obtain information.

I've now been through every part of the new code at least once so I'm
ready to make moderately informed comments. I'll first say that my
overall impression is favorable. I admit I was (and remain a little)
skeptical because the NSE rewrite had a "not invented here" feeling.
I've found a lot to like in the new code though, and I'm in favor of
integrating most of it.

I. NSE Initialization, Scanning, and How it Works

When Nmap starts, NSE now immediately begins all initialization
including the loading of scripts chosen by the --script switch. Any
problems in loading NSE will stop Nmap before any host scanning
begins. Besides reducing headache when a script fails to load or a
missing library is not found, this has the advantage of separating the
invariant set of scripts that will run for all host groups.

This is a great improvement.

NSE is started via the open_nse (and closed using close_nse) procedure
in nse_main.cc. This function begins by opening all standard Lua
libraries and adding all Nmap standard C libraries to package.preload
(to be required in nse_main.lua).

I was confused by all the places the standard NSE library is loaded.
(I mean pcre, bit, openssl, etc.) It happens
 * at the top of nse_main.lua,
 * in the C function init_main, and
 * twice in the C function script_updatedb (once through
   set_nmap_libraries and once inside the Lua code embedded in a
   character array.
Is it necessary at each of these places? Will they all need to be
updated when a library is added or removed? What does script_updatedb
need, for example, with the pcre and openssl modules?

nse_main.lua ends by returning a function that scans a host group. NSE
saves this function in the Registry for when NSE actually begins
scanning. This ends the initialization stage.

I found nse_main.lua pretty easy to follow. I would like to see more
function-level comments on functions like check_rules, get_chosen_scripts,
run, and especially the anonymous main function--the fact that it
doesn't have a name means a block comment is especially important.

Next nse_main.lua is loaded and called with the private nse library
(different from the one discussed in Section IV) and the array of
rules (--script) are passed as arguments. The private nse library is
used to access some functionality needed through C such as
keyWasPressed() or nsock_loop().

When NSE (the returned function by nse_main.lua) is used to scan a
host group, a new nse library is created for scripts (replacing the
old one).

This is confusing, to have one "nse" library with the functions
  { fetchfile_absolute, dump_dir, nsock_loop, key_was_pressed, ref,
    unref, updatedb, scan_progress_meter }
and an unrelated "nse" library containing the functions
  { new_thread, push_handler, pop_handler, get_host, get_port }.
The internal library needs a different name.

But apart from that, I think introducing a brand new "nse" library and
API is too big a change. While there are no scripts that use it, can it
be removed? I'm not opposed to adding them after the rest of the
changeover, but it should be after some API discussion and perhaps they
should be in the "nmap" namespace.

Immediately afterwards, NSE begins to test hostrules and portrules
against every Script that was chosen during initialization. If a
thread is created (the rule function existed and returned true), then
the thread is placed in its corresponding runlevel table. Finally for
each runlevel, in sorted order of course, NSE passes the runlevel
group to a run function which handles the actual scanning.

The run function creates running and waiting queues which contain
threads ready to be run and threads waiting to be moved to running.
Again, NSE makes use of Lua's lexical scoping to create functions that
manipulate the values in the running and waiting queues. Particularly,
the function (used by C), WAITING_TO_RUNNING, moves a thread from the
waiting queue to the running queue.

These parts were easy to read, in that respect superior to the current
code.

II. Host Timeout Management

Before, NSE would immediately start the time out clocks for all hosts
in the current runlevel group and stop the clock only when no other
thread exists for that host. A problem arises when a thread is waiting
to acquire sockets or is blocked on a mutex and is not technically
running (indirectly or otherwise) against the host. The target should
not be charged time when none of its threads are working.

I think I understand this. You make it possible for a thread to indicate
when it yields whether its timeout clock should keep running while it is
yielded. The problem is a real one, one that we have lived with until
now. I think the implementation makes the idea of timeout clocks intrude
too much upon the basic management of coroutines though.

There are only two places (out of ten) that I found where nse_prepare_yield
is called with charge set to false. Would it be possible to just stop
the timeout clock in C++ code in those two places? Then always make sure
the clock is restarted when a coroutine is resumed. The management of
script coroutines is justifiably complex; adding timeout management to
those parts of the code makes it harder to understand.

These are the following ways a thread may yield and whether its host
will still be charged time.

o A thread that was unable to acquire a socket connection (because the
quota for threads with open sockets is full) will not be charged time
it is suspended.
o A thread blocked on a mutex or condition variable (see nse.condvar)
will not be charged time.
o A thread blocked on a network operation is charged time.

These look like the right conditions to me.

III. Yields propagated up to parent thread back to NSE

When a script thread yields, NSE resumes operation and checks the
status and return values of the thread. When a thread uses its own
coroutines for collaborative multithreading, a function that yields
the thread expecting NSE to resume operation will fail. Specifically,
the callback mechanisms believe the thread it yielded earlier was
actually one being run by NSE (when in fact it was a child coroutine
of the script thread). When it attempts to move this child coroutine
into the running threads for NSE using process_waiting2running, the
procedure silently fails because it cannot find the thread in the
waiting queue (only threads NSE creates are ever in the waiting and
running queues).

Thanks for explaining this. I think I understand now. You're right it is
subtle. It seems right to allow scripts to run coroutines, and have the
coroutines yield on behalf of their parent.

To solve this problem I have modified the coroutine.resume and
coroutine.wrap functions to propagate the yield up to the parent
thread so NSE may assume control, but only when NSE yields the thread.
In order for NSE to distinguish between normal returned (or yielded)
values from a script's coroutine, a special value, NSE_YIELD (a
table), is yielded whenever NSE yields the thread. This table also
contains (for now) two values at indices 1 and 2 which are useful for
tracking resources for the threads and establishing the base (parent)
thread for callbacks in C.

There are two procedures that are used to facilitate the use of NSE_YIELD:

int nse_prepare_yield (lua_State *L, int charge);
void nse_restore (lua_State *L, int index, int number);

nse_prepare_yield is used to place the value NSE_YIELD on the stack,
to indicate to NSE that the thread's host should be charged time
(towards a timeout) while the thread is suspended, and to return an
integer index that is passed to nse_restore when the thread is ready
to be resumed. The value left on the stack, NSE_YIELD, must be yielded
by the caller.

nse_restore is passed the integer returned by nse_prepare_yield and
the number of arguments on the stack of L that should be passed to the
thread you are resuming. L is allowed to be the thread you intend to
resume.

I have trouble understanding this, even with your explanation. Does it
mean NSE C++ code has to do this
    udata->yield.thread_index = nse_prepare_yield(L, 1);
    udata->yield.thread = L;
    return lua_yield(L, 1);
every time it wants to yield? I see in a few places it's a little more
complicated than that. Is it possible to bundle the similar code into an
nse_yield function? (That name would have a nice congruence with the
magic NSE_YIELD value.) A little extra complexity is tolerable to solve
this problem though.

IV. The NSE API

The NSE API is accessible in the nse namespace. The following
functions are available:

As I said above, I would like this library removed for this part of the
integration, and a separate proposal made for its inclusion.

== nse.hosts ==
== nse.condvar(object) ==
== nse.get_host() ==
== nse.get_port() ==
== nse.new_thread(func, ...) ==
A coroutine run manually by a script thread will propagate the yield
through the parent back to NSE. To avoid this, you may use
nse.new_thread to create a thread which is autonomous from the parent.

The function launches a new thread (coroutine) that is managed by NSE.
It inherits the host and port of the parent thread but these values
are not passed to func. Instead the extra parameters to new_thread are
passed. All errors are ignored and return values discarded by NSE.

Does this mean that if I just do a plain yield in a coroutine from a
script it gets propagated to NSE (with the NSE_YIELD mechanism)? I
thought that only happened with nsock/mutex-type operations. I think I
would really prefer it to be that way. If it's not, Lua programmers will
have to relearn some concepts: an NSE coroutine is not a normal Lua
coroutine; if you want a normal coroutine, you have to use nse.new_thread.

== nse.push_handler(func) ==
This function pushes func on a handler stack. If the thread ends
normally or aborts due to an error, all functions on the handler stack
are called from the top of the stack down.
== nse.pop_handler() ==
This function pops a function from the thread's handler stack.

These function names are too generic. Maybe name them after C's atexit,
or something else to convey that they are exit handlers.

V. Host and Port Userdata

NSE uses host and port userdata to easily, and safely, interface with
Nmap's internal classes that represent these objects. A Script may
access its host or port userdata by using the calls nse.get_host and
nse.get_port, respectively. When a new host group is processed,
new_host (lua_State *L, Target *target) is called for each target in
the group. This host userdata is simply a double pointer (pointer to a
pointer) for the Target. The methods for the userdata provide an easy
way to call the methods of the Target Class (documented later). Once
the userdata is made, every port for the host that is in the state
PORT_OPEN, PORT_OPENFILTERED, PORT_UNFILTERED is created (once) using
new_port(lua_State *L, Target *target, Port *port). These ports are
saved in a table, TARGET_PORTS, in the target userdata's environment
[1]. Host and Port userdata can be used as unique objects representing
the host or port.

I like the general idea of providing safe access to Nmap's data
structures from NSE, but the example of hosts and ports is not
compelling. This kind of information is already made available in the
port and host tables. It creates confusion if there is more than once
source for the same information. For example, should I say
  host.name
or
  host:HostName()
It depends if "host" is a host table or a host object. What's more
confusing is that if "host" is the current host table, and I do
  host.name = "blah"
the host.name and nse.get_host():HostName() aren't even the same
anymore.

I would like this part of the new code to be held back and have some
further design. For one thing, it would be nice to fix the defect of
Nmap's inconsistent method name capitalization in the interface we
present to script writers.

I was thinking, maybe there's a way to preserve compatibility with
presenting hosts and ports as tables, while also providing safe access
to internal data structures and a really slick way to set host and port
data. As it is now, the host and port tables are just dumb copies of
Nmap's data structures; if you modify them ("blah"), nothing else takes
any notice. What if instead of making copies, the table given to scripts
had an __index that hooked into Nmap code to get the data? So you would
say "host.name" and Nmap would internally make the Target->HostName()
call.

This avoids having to create a new parallel API, and could also be a
better way to set host and port information than exists currently.
Whereas now to set the service name of a port you have to do
  nmap.set_port_version(host, { number = 1234, protocol = "tcp", name = "foobar" }, "hardmatched")
it could become as easy as
  port.version.name = "foobar"

If anyone wants to follow up on this we should probably start a separate
thread. But anyway I think this part of the code bears more discussion
so I'd like it help back from the integration. I see that you use some
of these methods in nse_main.lua; I hope it won't be too hard to change
them to the table representation or find another way to replace them.

That's all I have to say, for this post anyway. You don't have to start
tearing apart your code just yet; I want to allow time for more opinions
than just mine. By the way, I know that when I get into code review mode
I sometimes come across as more critical than I intend to be. I do think
this is a fine bit of code you've developed, and from what I've seen
worthy of inclusion.

David Fifield

_______________________________________________
Sent through the nmap-dev mailing list
http://cgi.insecure.org/mailman/listinfo/nmap-dev
Archived at http://SecLists.Org


Current thread: