Wireshark mailing list archives

Re: Wireshark memory handling


From: didier <dgautheron () magic fr>
Date: Fri, 09 Oct 2009 03:47:16 +0200

Hi,
On Thursday, 08 October 2009 at 22:15 +0200, Erlend Hamberg wrote:
Sorry about the late reply. I am one of the other students in the group. 
Thanks for your answers. I have commented below and would appreciate further 
feedback.

On Monday 5. October 2009 20.23.42 Guy Harris wrote:
The paper says

    Since exhausting the available primary memory is the problem ...

What does "primary memory" refer to here?

That could certainly have been worded more clearly. By primary memory, we mean 
main memory, as your reasoning led you to.

The "problem", as we have understood it, and as we have seen it to be, is that 
Wireshark keeps its internal representation (from reading a capture file) in 
memory. I write "problem" in quotes, because in most use cases I guess that 
this is not a problem at all, and this is also how almost any program 
operates.

We work for an external customer who uses Wireshark and would like to be able 
to analyze more data than is allowed by a machine's virtual memory without 
having to split up the captured data.

To be able to do this we looked at the two solutions mentioned in the PDF 
Håvar sent, namely using a database and using memory-mapped files. Our main 
focus is 64-bit machines due to 64-bit OS-es' liberal limits on a process' 
memory space. Doing memory management ourselves, juggling what is mapped in 
the 2 GiB memory space at any time, is considered out of the scope of this 
project. (We are going to work on this until mid-November.)

[...]

In effect, using memory-mapped files allows the application to extend
the available backing store beyond what's pre-allocated (note that OS
X and Windows NT - "NT" as generic for all NT-based versions of
Windows - both use files, rather than a fixed set of separate
partitions, as backing store, and I think both will grow existing swap
files or add new swap files as necessary; I know OS X does that),
making more virtual memory available.

So, on OS X (and possibly other modern OS-es), as long as you have available 
hard disk space, a process will not run out of memory, ever? (A process can 
have an address space of ~18 exabytes on 64-bit OS X. [1])

This would mean that this problem would only continue to exist on operating 
systems using a fixed swap space, like most (all?) Linux distros still do.

Linux can use swap files too. It doesn't allocate them on demand, that's
all.

I don't see what you would get with mmapped files vs enough swap. But if
you are using Wireshark, i.e. working interactively, it'd be slow, slow as
in unusable.
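
For readers unfamiliar with the technique under discussion, here is a minimal
sketch of reading part of a file through a memory mapping. It is not taken
from Wireshark or from the students' project; the helper name
read_slice_mmap and the temporary stand-in file are assumptions made for
illustration only.

```python
import mmap
import os
import tempfile


def read_slice_mmap(path, offset, length):
    """Read `length` bytes at `offset` from a file via a read-only mapping."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; pages are loaded lazily on access.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            # The file itself backs these pages, so the kernel can evict
            # them without writing anything to swap.
            return m[offset:offset + length]


# Stand-in for a large capture file (hypothetical data, not a real pcap).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 4096 + b"packet-data" + b"\x00" * 4096)
    demo_path = tmp.name

chunk = read_slice_mmap(demo_path, 4096, 11)
os.unlink(demo_path)
print(chunk)
```

The mapping changes where the pages are backed (the file rather than swap),
not how fast the disk is, which is the point made above about interactive
use.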

Using a DB could be a better option, but you need a 'data silo',
something like http://www.monetdb.nl. For it, a sparse matrix of 100
million rows by 200,000 columns should be a trivial data set. It would be
faster than Wireshark for filtering by an order of magnitude or two.
Disclaimer: We're using a proprietary data silo and I've no experience
with MonetDB.

A modified TShark should be able to upload a capture at around 30,000
packets/second.
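
As a rough back-of-the-envelope check of what that rate means in practice,
the sketch below assumes a capture of 100 million packets (borrowed from the
row count mentioned above purely as an illustration) and a constant upload
rate. Both are assumptions, not measurements.

```python
def upload_seconds(num_packets, packets_per_second=30_000):
    """Estimated upload time, assuming a constant rate (an assumption)."""
    return num_packets / packets_per_second


seconds = upload_seconds(100_000_000)  # hypothetical capture size
minutes = seconds / 60
print(f"{seconds:.0f} s, about {minutes:.0f} minutes")
```

Under those assumptions the one-time load takes a little under an hour.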

No idea what would be better for the interactive front-end: a modified
wireshark or a new application.
No idea if you have enough time to do it either.


For example, here we are using a modified Wireshark.
It's able to filter simple expressions at around 5-10 million
packets/second.

It filters complex expressions at 50,000 to 400,000 packets/second.

But we never use Wireshark if it needs to hit the hard disks (for us
roughly 3 times the file size); it's too slow.

If we had to use bigger files I would use MonetDB. I don't know if
using Wireshark on such a big data set would be useful, though; at some
point more data is just noise.

Note:
A simple expression is a filter expression with only protocols or
previous expressions. For example,

    llc && !arp

is a simple expression, and

    tcp.stream == 0

is not, but after that

    afp && !(tcp.stream == 0)

is one.
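
One plausible reading of why simple expressions can be so much faster is
that per-packet results for protocols and earlier filters are cached and
then combined with bitwise operations. The sketch below illustrates that
idea only; it is an assumption, not a description of the modified Wireshark,
and every name and packet index in it is hypothetical.

```python
# Hypothetical sketch: match results cached as bitsets (Python ints).
# Bit i is set when packet i matches. None of these names come from Wireshark.

def bitset(indices):
    """Build a bitset from an iterable of packet indices."""
    bits = 0
    for i in indices:
        bits |= 1 << i
    return bits


def members(bits):
    """Return the sorted packet indices whose bit is set."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


NUM_PACKETS = 8
ALL = (1 << NUM_PACKETS) - 1  # mask covering every packet in the capture

# Protocol membership, assumed to be recorded once during dissection.
cache = {
    "llc": bitset([0, 1, 2, 5]),
    "arp": bitset([1, 6]),
    "afp": bitset([3, 4, 7]),
}

# "llc && !arp" needs only cached bitsets, so it is a simple expression.
llc_not_arp = cache["llc"] & ~cache["arp"] & ALL

# "tcp.stream == 0" needs a field value from each packet, so it would
# require a full pass; here its result is simply assumed and then cached.
cache["tcp.stream == 0"] = bitset([3, 5])

# "afp && !(tcp.stream == 0)" now combines cached results only.
afp_not_stream0 = cache["afp"] & ~cache["tcp.stream == 0"] & ALL

print(members(llc_not_arp), members(afp_not_stream0))
```

With such a cache, a simple expression costs a few word-wide bit operations
per group of packets, while a complex one has to look inside each packet.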

Didier

