Nmap Development mailing list archives
[GSoC] Performance / Optimization Project
From: Alexandra Binca <alexandra.binca () gmail com>
Date: Tue, 24 Mar 2015 23:16:31 +0200
Hello, My name is Alexandra Binca, I study for a Master's degree at Politehnica University of Bucharest. I am very good with algorithms and data structures (I like a lot to train on sites with algorithmic challenges). I want to apply this year to GSoC for the nmap organization, more specifically to the optimization task. I did some profile on nmap using valgrind (with callgrind) and kcachegrind and studied the hotspots a bit. I am confident that over the summer I can reduce the execution time of some parts of nmap by implementing better data structures and algorithms. For the application period I focused on two main things in order to demonstrate my ability to work with nmap and C++: 1. Optimization of --exclude lists. The current algorithm implemented in nmap to deal with --exclude lists has a O(N) complexity for decision (is a certain address excluded?) and does not cover ipv6 addresses (nbase/nbase_addrset.c). I implemented a new data structure (a binary decision tree) that reduces the lookup complexity to O(logN), where N is the exclude list size. The algorithm is generic (can work on spaces of any dimension) and uses a fair amount of memory, thus by using it instead of addrset nmap gains speed and the ability to extend it to ipv6 without wasting huge amounts of memory (as the current idea would do). If you want to take a look, the code is here [1] (along with benchmark and test functions). 2. General hotspots of nmap. In one of my tests, most of the time was being spent in reading various configuration files. More specifically, it seems that parsing /etc/services is really slow. At the moment I am trying to reduce the execution time of the function that loads /etc/services by at least 20%. I noticed a lot of calls to sscanf and strtod, maybe it is possible to replace them or rewrite parts of the code to make it faster. Is it possible to setup a wiki account or something like that were I could list the things I tried to do / did over the application period? Also, feedback would be really appreciated :-). TL;DR Benchmark for binary decision tree for IPv4-like ranges (it is not possible to run this test with the current implementation because the complexity is too high): RandomTest (D = 4) begins. add areas speed for 1000 instances: 0ms build tree speed for 1000 instances: 2ms contains speed for 1000000 instances: 215ms RandomTest (D = 4) passed. [1] https://github.com/alexandrabinca/binary_decision_tree Best regards, Alexandra BINCA _______________________________________________ Sent through the dev mailing list https://nmap.org/mailman/listinfo/dev Archived at http://seclists.org/nmap-dev/
Current thread:
- [GSoC] Performance / Optimization Project Alexandra Binca (Mar 24)