tcpdump mailing list archives
Re: Portable way to "block" on pcap_next_ex()
From: Guy Harris <guy () alum mit edu>
Date: Mon, 16 Jan 2012 12:25:00 -0800
On Jan 16, 2012, at 6:58 AM, Fernando Gont wrote:
> On 01/15/2012 08:56 PM, Guy Harris wrote:
>>> For my current app, it's probably just "annoying" (although no big deal). However, I was mostly concerned about performance problems in other applications. Put another way, if there's nothing that an app can do without a packet being read, there's no reason for the app to be awakened.
>>
>> Well, presumably, yes, although I assume the folks at LBL had some rationale for starting the timer when the read() is done rather than when a packet arrives (starting the timer on a read() also [...]
>
> I guess that's the "logical" place to put a timeout? (Although in the pcap case, the timeout is used for a different purpose (performance) rather than "I don't want this call to block forever if there's nothing to read").
Well, more accurately, the *buffer* is used for performance (so that you don't get one wakeup and one read() call per packet, but per *batch* of packets), and the timeout is used to keep from blocking indefinitely waiting for the buffer to fill (as is the case if the timeout is set to 0 on a BPF device). The issue is that, because the timer is started when a read is done with an empty buffer, if no packets arrive before the timeout expires, the buffer will still be empty and the read will return 0 bytes.
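To make that timing rule concrete, here is a small sketch in C. It is a toy model, not libpcap or kernel code: the names bpf_read_model and struct read_result are made up for illustration, times are non-negative milliseconds, buffer capacity is counted in packets rather than bytes, and the buffer is assumed to be empty when the read starts.

```c
#include <stddef.h>

/* Hypothetical model of a blocking read on a BPF device; NOT libpcap or
 * kernel code. Assumes arrivals[] is sorted, holds non-negative times that
 * are >= read_start_ms, and that the buffer is empty when the read starts. */
struct read_result {
    long when_ms;   /* time the read returns, or -1 if it never returns */
    size_t npkts;   /* packets handed back by the read */
};

static struct read_result
bpf_read_model(const long *arrivals, size_t n, long read_start_ms,
               long timeout_ms, size_t capacity)
{
    struct read_result r;
    /* Time at which the buffer fills, if enough packets ever arrive. */
    long full_at = (capacity > 0 && n >= capacity) ? arrivals[capacity - 1] : -1;
    /* The timer starts when the read is issued, not when a packet arrives.
     * A timeout of 0 means "no timer": wait for a full buffer. */
    long expires_at = (timeout_ms > 0) ? read_start_ms + timeout_ms : -1;

    if (full_at >= 0 && (expires_at < 0 || full_at <= expires_at))
        r.when_ms = full_at;       /* buffer filled first */
    else
        r.when_ms = expires_at;    /* timer expired first, or -1: blocks forever */

    /* Count the packets that had arrived by the time the read returned. */
    r.npkts = 0;
    if (r.when_ms >= 0) {
        for (size_t i = 0; i < n && i < capacity; i++)
            if (arrivals[i] <= r.when_ms)
                r.npkts++;
    }
    return r;
}
```

In this model, a read started at t=0 with a 100 ms timeout and a single packet arriving at t=150 returns at t=100 with 0 packets, which is the empty read described above.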
>> select() and poll() do not work correctly on BPF devices; pcap_get_selectable_fd() will return a file descriptor on most of those versions (the exceptions being FreeBSD 4.3 and 4.4), but a simple select() or poll() will not indicate that the descriptor is readable until a full buffer's worth of packets is received, even if the read timeout expires before then. To work around this, an application that uses select() or poll() to wait for packets to arrive must put the pcap_t in non-blocking mode, and must arrange that the select() or poll() have a timeout less than or equal to the read timeout, and must try to read packets after that timeout expires, regardless of whether select() or poll() indicated that the file descriptor for the pcap_t is ready to be read or not.
>
> Sorry, what's the point of calling select() in this case,
Multiplexing operations such as, say, socket I/O and packet capture - the usual purpose of select().
> and what's the rationale for the timeout value used with select() in this case?
Working around the fact that, on the OSes in question, doing a select()/poll()/etc. on a BPF descriptor doesn't start a timer, so that the select()/etc. will wait for the BPF "store buffer" to fill up before marking the BPF descriptor as readable, even though, with a timeout, a read on the descriptor will block until the "store buffer" fills up *or* the timer expires.
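The shape of that workaround can be sketched with plain POSIX calls. This is only an illustration under stated assumptions: set_nonblocking() and wait_then_try_read() are hypothetical helper names, and an ordinary descriptor stands in for the BPF descriptor. In a real libpcap program the corresponding steps would be pcap_setnonblock(), pcap_get_selectable_fd() and a read via pcap_dispatch().

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put a descriptor in non-blocking mode.
 * Returns 0 on success, -1 on error. */
static int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Hypothetical helper: wait up to timeout_ms on fd, then attempt a read
 * whether or not select() reported the descriptor as readable.
 * Returns the number of bytes read, 0 if nothing was available (or at
 * end-of-file), or -1 on error. fd must already be non-blocking. */
static ssize_t wait_then_try_read(int fd, int timeout_ms, void *buf, size_t len)
{
    fd_set rfds;
    struct timeval tv;

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* The caller should pick timeout_ms <= the capture read timeout. */
    if (select(fd + 1, &rfds, NULL, NULL, &tv) == -1 && errno != EINTR)
        return -1;

    /* Deliberately ignore FD_ISSET(): on the affected systems select()
     * may not report readability even though packets are waiting. */
    ssize_t n = read(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;   /* nothing to read right now */
    return n;
}
```

The select() call here serves only as a bounded sleep that can also watch other descriptors; the non-blocking read afterwards is what actually discovers whether packets are available.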
>> If you have a system where select() works as it should, i.e.:
>>
>> [....]
>>
>> select() will block until either 1) a bufferful of packets arrives or 2) the timer, started when the select() is done, expires, regardless of whether any packets are available to read.
>
> This doesn't seem to agree with my tests. I've just checked this on FreeBSD-8.2-release and on a current Ubuntu system,
Sorry, I didn't make it clear enough that, when I said that, I was speaking only of systems using BPF, so it wouldn't apply to Ubuntu (or any other Linux distribution), for example.
> and in both cases select() returns "readable" only for each packet that is received.
What do you mean by "only for each packet that is received"? Do you mean that it doesn't return "readable" if there are no packets to read?
>>> In Solaris's case, that would depend on whether my app is actually run before "to_ms" have elapsed since the reception of the first packet, right?
>>
>> Your app can't be run before "to_ms" have elapsed since the reception of the first packet, because "the first packet" means "the first packet received after the getmsg() is done on the DLPI descriptor" - i.e., it's not the first packet received ever, it's the first packet received in a packet batch.
>
> Sorry, do you mean:
> 1) I call select(), and it blocks
> 2) select() returns "readable"
> 3) I call pcap_next_ex(), and *this* triggers the "to_ms" timer --> hence this call will probably block for about "to_ms", too.
No, I mean that should *NOT* happen.

For Solaris with DLPI, the bufmod manual says:

"To ensure that messages do not languish forever in an accumulating chunk, bufmod maintains a read timeout. Whenever this timeout expires, the module closes off the current chunk and passes it upward. The module restarts the timeout period when it receives a read side data message and a timeout is not currently active. These two rules insure that bufmod minimizes the number of chunks it produces during periods of intense message activity and that it periodically disposes of all messages during slack intervals, but avoids any timeout overhead when there is no activity."

With DLPI, you're reading from a STREAMS device; if there are any data messages available at the stream head, the read (getmsg()) will not block but will return the contents of the first data message. A select() or poll() will indicate that the descriptor is readable if there are data messages available at the stream head, and will otherwise wait until a message is made available at the stream head, any timeout specified with the select() or poll() expires, or some other descriptor is readable.

The bufmod module receives individual data messages from what's below it on the stream, and accumulates them in a buffer. When the buffer fills up, it's sent upstream as a single data message. If a data message is received from below and there's no timeout in effect, it starts a timeout; when the timeout expires, the buffer is sent upstream with whatever messages it has in it (there will be at least one, as the timeout isn't started until a data message arrives; there might be more).

For Solaris 11 with BPF, it will probably work the same way BPF works on other OSes.
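The bufmod behavior described above can be sketched as a small C model. It is a toy under stated assumptions, not Solaris code: bufmod_chunk_model and struct chunk_result are made-up names, times are non-negative milliseconds, the timeout is greater than 0, buffer capacity is counted in messages rather than bytes, and the chunk starts out empty with no timeout active.

```c
#include <stddef.h>

/* Hypothetical model of when bufmod passes a chunk upstream; NOT Solaris
 * code. Assumes arrivals[] is sorted and non-negative, timeout_ms > 0, and
 * that the chunk is empty with no timeout active at the start. */
struct chunk_result {
    long when_ms;   /* time the chunk is sent upstream, or -1 if never */
    size_t nmsgs;   /* data messages contained in that chunk */
};

static struct chunk_result
bufmod_chunk_model(const long *arrivals, size_t n, long timeout_ms,
                   size_t capacity)
{
    struct chunk_result r = { -1, 0 };

    /* No message means no timer and no chunk: nothing goes upstream. */
    if (n == 0 || capacity == 0)
        return r;

    /* The timer starts when the first message of the chunk arrives. */
    long expires_at = arrivals[0] + timeout_ms;
    /* Time at which the chunk fills, if enough messages ever arrive. */
    long full_at = (n >= capacity) ? arrivals[capacity - 1] : -1;

    r.when_ms = (full_at >= 0 && full_at <= expires_at) ? full_at : expires_at;

    /* Count the messages that had arrived when the chunk was closed off. */
    for (size_t i = 0; i < n && i < capacity; i++)
        if (arrivals[i] <= r.when_ms)
            r.nmsgs++;
    return r;
}
```

With a 100 ms timeout and a single message arriving at t=150, this model delivers a one-message chunk at t=250; with no messages at all it never delivers a chunk, so the chunk that reaches the stream head always holds at least one message.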