IDS mailing list archives
RE: TippingPoint Releases Open Source Code for First Intrusion Prevention Test Tool, Tomahawk
From: "Brian Smith" <bsmith () tippingpoint com>
Date: Tue, 9 Nov 2004 20:59:10 -0600
I'm the author of Tomahawk; I appreciate all the interest in this. I'd like to jump in and clarify what the tool does and what I think it's good for (and not so good for).

As most people have figured out by now, tomahawk is a high performance, inline version of tcpreplay. What it does is pretty simple: it loads one or more tcpdump packet captures (I'll call these pcaps) and replays them through an IPS. To do this, it uses a PC with 3 NICs, one for management and two for data that connect to the IPS, like this:

    +---------+      +-----------+
    |         |      |           |
    |       A <----> eth0        |
    |   IPS   |      |     PC eth2 <---> mgmt
    |       B <----> eth1        |
    |         |      |           |
    +---------+      +-----------+

The replay is "interface aware"; that is, packets are played through the IPS in an order that's consistent with what the IPS would have seen had it been on the network at the time the trace was captured. For example, suppose that eth0 and eth1 are the data interfaces. A TCP three-way handshake will be replayed by sending the SYN out eth0, the SYN-ACK out eth1, and the ACK out eth0. Tomahawk will wait for the SYN to arrive at eth1 before sending the SYN-ACK, just as in a real client/server communication.

If a packet is dropped by the IPS, tomahawk will retransmit the packet, up to a set number of times (timeout and retry count are controlled via the command line). When the replay is finished, either because all the packets made it through or because the number of retransmissions was exceeded, tomahawk reports whether the replay completed or timed out.

If a single trace was replayed using tomahawk, replay performance would be almost completely dominated by latency through the IPS. For example, if the IPS latency were 10 ms, then you would get 100 packets/sec through the IPS. In fact, you'll get somewhat better performance than this, since tomahawk transmits windows of data. For example, if there are 3 packets to send out eth0, tomahawk will send all three at once. Nonetheless, performance will be dominated by the latency.
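
The latency arithmetic above can be sketched in a few lines of Python. This is only an illustration, not code from tomahawk: the function name `latency_bound_pps` and the fixed `window` parameter are assumptions made for the example, and the model ignores transmit time and assumes each window must fully clear the IPS before the next one is sent.

```python
def latency_bound_pps(latency_s: float, window: int = 1) -> float:
    """Rough upper bound on packets/sec for a single replayed trace.

    Assumes (hypothetically) that `window` packets are sent back to back
    and that nothing else is sent until they have crossed the IPS, which
    takes `latency_s` seconds.
    """
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    if window < 1:
        raise ValueError("window must be at least 1")
    return window / latency_s


if __name__ == "__main__":
    # One packet at a time through a 10 ms IPS: about 100 packets/sec.
    print(latency_bound_pps(0.010))
    # Windows of three packets: about three times that, still latency bound.
    print(latency_bound_pps(0.010, window=3))
```

Under this simplified model the rate scales with the window size, but it stays inversely proportional to the latency, which is why a single trace cannot load the IPS very heavily.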
To ramp up the bandwidth, tomahawk can replay multiple copies of the same pcap in parallel. Each copy is given its own block of IP addresses. This allows considerable parallelism, since each copy can do its own windowing. The code is fairly well optimized for this type of operation. In practice, this allows a single PC to replay about 300 Mbps through an IPS and to simulate a good size chunk of a class B network. You can create higher loads by aggregating traffic from several tomahawk servers through a gigabit switch before going through the IPS. This allows you to create a gigabit test bed for a few thousand dollars.

If you model the IPS as a FIFO queueing device, you can see that the network performance reported by tomahawk is roughly the maximum bandwidth the IPS can sustain at zero loss. This is because the bandwidth transmitted by tomahawk is exactly the same as the bandwidth received from the IPS. For example, suppose the device under test has a simple FIFO queue and can process 500 Mbps of traffic. Suppose further that the tomahawk test jig can generate an aggregate of 1000 Mbps of traffic. If tomahawk attempts to replay more than 500 Mbps of traffic through the device, the queue on the device will begin to fill. This will automatically cause tomahawk to back off, almost instantaneously, because it will stop transmitting packets while waiting for the packets in the queue to be received.

So that's what it does. Now, what's it good for? The most obvious use is to play an attack pcap using tomahawk and verify that the IPS will block the attack. If tomahawk reports that the replay completed, the attack made it through (i.e., all the packets made it through), regardless of what the alert log states.

As others have noted, as a pure coverage test, this test is only as good as the attack pcaps used, and these are not easy to find. But one thing it is good for is repeatability testing. Repeatability testing checks if the IPS is deterministic.
If you take a sample set of attacks, an IPS will block some and miss some. For example, given a set of 20 attacks, an IPS may block 18 out of 20. You can't infer much about the attack coverage of the IPS from that, because the sample size is too small (sort of like doing an exit poll of 20 people). But if you replay those attacks a thousand times, you'd better see 18000 blocks and 2000 completes. Otherwise, the IPS is only blocking attacks some of the time. In real deployments, lack of repeatability can show up as leakage. In a worm storm, an IPS may get barraged with hundreds or thousands of attacks per second. If just one of them leaks through, the worm can spread to the network on the far side of the IPS.

Another useful thing you can do with tomahawk is check whether the IPS will block legitimate traffic. In this test, you take a sample of clean traffic off your network and replay it through the IPS using tomahawk. If the IPS is blocking legitimate traffic, the pcap will time out. There are several details behind this procedure -- after all, the trace may contain legitimate attacks, and these must be removed. I'm happy to discuss these details in another thread, but this note is getting pretty long and I want to describe another class of tests that use tomahawk: performance testing.

In order to accurately predict the network performance of an IPS, it is critical that a realistic protocol mix be used. The reasons are simple. When an IPS inspects traffic, different code paths are executed depending on the content. For example, HTTP traffic uses a different code path than DNS. One invokes TCP reassembly and the HTTP decoder, while the other uses the UDP parser and DNS decoder. These different code paths can have very different performance characteristics. For instance, suppose that a hypothetical IPS can process 1 Gbps of HTTP traffic, but only 10 Mbps of DNS traffic. If you do all your lab tests with HTTP traffic, the IPS will look great in the lab.
When you install it in a network with significant DNS traffic, the IPS will crater the network. Most IPSs have hundreds, or even thousands, of such code paths. The performance of an IPS in a given network will depend on the exact protocol mix present in that network. Tomahawk can be used to replicate the exact protocol mix of a given network by replaying a packet trace captured on that network.

Another example: tomahawk can be used to test how many connections per second (CPS) an IPS can do. Here's how: I created a little script to open and close a TCP connection to a server, 1000 times. I captured a packet trace of this traffic. The trace has 6000 packets -- each TCP session has 6 packets, three for setup and three for teardown. If you replay this pcap using tomahawk with 250 copies in parallel, you can time how long it takes to open and close 250K TCP sessions. With 3 generic PCs, I can replay 750,000 connections in about 8 seconds over a crossover cable -- about 93000 connections per second. If you replace the crossover with an IPS, you can test the performance of the IPS.

You can also use tomahawk to create background traffic for other tests. For example, you can replay traffic from the target network at 500 Mbps, then check the latency through the IPS (using a smartbits or equivalent if you have one, ping if you don't). Or you can check the performance of a given workload (e.g., timing how long it takes to copy a large file from an NFS or SMB server). A command line parameter to tomahawk will limit the replay rate of the pcap, allowing you to set the level of background traffic.

Once you have a test jig like this, you can combine tests. For example, you can use one tomahawk server to send attacks at an IPS, and measure what effect blocking has on the network performance. Or you can check repeatability under load. Or manageability while blocking under load. And so on.
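
The connections-per-second figures quoted earlier can be checked with a short Python sketch. It is only arithmetic on the numbers given in this note; the function names are made up for the example and nothing here calls tomahawk itself.

```python
PACKETS_PER_SESSION = 6  # three packets for setup, three for teardown


def total_packets(sessions: int) -> int:
    """Packets in a trace that opens and closes `sessions` TCP connections."""
    return sessions * PACKETS_PER_SESSION


def sessions_replayed(sessions_per_pcap: int, copies: int, pcs: int = 1) -> int:
    """Sessions replayed when each PC plays `copies` parallel copies of the pcap."""
    return sessions_per_pcap * copies * pcs


def connections_per_second(sessions: int, elapsed_s: float) -> float:
    """Average connection rate over a timed replay."""
    if elapsed_s <= 0:
        raise ValueError("elapsed_s must be positive")
    return sessions / elapsed_s


if __name__ == "__main__":
    print(total_packets(1000))                    # packets in the trace
    print(sessions_replayed(1000, 250))           # sessions from one PC
    total = sessions_replayed(1000, 250, pcs=3)   # sessions from three PCs
    print(total)
    print(connections_per_second(total, 8.0))     # rate for an 8 second run
```

With exactly 8 seconds the last figure works out to 93750 connections per second; since the elapsed time is only "about 8 seconds", that is consistent with the roughly 93000 quoted above.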
As a historical note, I developed the code about 2.5 years ago as part of our quality assurance program, to predict the performance of our IPS in real-world environments. We've been using it ever since, so it's pretty stable. We decided to release it because we think that it fills a need.

If you want to learn more about the tool, try it out. Like tcpreplay, or any tool, it's not a panacea for all testing problems. If you find it useful, let me know. If you think it can be improved, please post a legitimate criticism or, better yet, improve on it by improving the code or posting a better tool.

The IPS industry is just starting to see mainstream acceptance. As a community, we need to start defining tools and benchmark tests that we can use to do objective, apples-to-apples comparisons of the pros and cons of different products. Tomahawk is just a start in that direction, an attempt to get the ball rolling. I hope we can have a productive discussion on this topic, not just bashing and suspicion, so that other vendors will be encouraged to publish open source versions of their tools.

Brian