Wireshark mailing list archives

Re: Bzip2 support


From: ronnie sahlberg <ronniesahlberg () gmail com>
Date: Thu, 27 Jun 2019 09:36:08 +1000

On Thu, Jun 27, 2019 at 7:17 AM Guy Harris <guy () alum mit edu> wrote:

On Jun 26, 2019, at 2:03 PM, Jaap Keuter <jaap.keuter () xs4all nl> wrote:

On 26 Jun 2019, at 19:41, Guy Harris <guy () alum mit edu> wrote:

It could probably be done (note that for decompressing capture files that
would require the ability to do random access I/O,

It (http://sourceware.org/bzip2/manual/manual.html#limits) now says:
"Further ahead, it would be nice to be able to do random access into
files. This will require some careful design of compressed file formats."

gzip format wasn't carefully designed for that, either, but it can be - and
has been - made to work.  It requires storing dictionary state.

Yep. BGZIP, and the library you can link against, does this. I even built
a FUSE filesystem to transparently "unzip" these kinds of files.

What BGZIP does is start a new dictionary every ~64 KB and store an
index in a separate file.
The bgzip file itself is compatible with gzip, so you can uncompress it
with vanilla gzip, but to do random reads/seeks in the file you need
the index file.
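
To illustrate the idea (this is only a rough sketch in Python, not the
real bgzip on-disk format or its index format): compress every block as
its own gzip member, concatenate them, and remember where each block
starts. The names compress_blocks and read_at and the in-memory index
are made up for this example.

```python
import bisect
import gzip

BLOCK = 64 * 1024  # uncompressed bytes per block (assumed, mirrors the ~64 KB above)


def compress_blocks(data, block=BLOCK):
    """Compress each block as an independent gzip member.

    Returns (blob, index). index is a list of
    (uncompressed_offset, compressed_offset) pairs, one per block,
    standing in for the separate index file.
    """
    out = bytearray()
    index = []
    for off in range(0, len(data), block):
        index.append((off, len(out)))
        out += gzip.compress(data[off:off + block])
    return bytes(out), index


def read_at(blob, index, pos, size):
    """Return up to `size` bytes starting at uncompressed offset `pos`."""
    if not index or pos < 0:
        return b""
    starts = [u for u, _ in index]
    i = bisect.bisect_right(starts, pos) - 1  # block containing pos
    result = bytearray()
    while len(result) < size and i < len(index):
        u_off, c_off = index[i]
        c_end = index[i + 1][1] if i + 1 < len(index) else len(blob)
        chunk = gzip.decompress(blob[c_off:c_end])  # only this block is inflated
        skip = max(pos - u_off, 0)  # non-zero only in the first block touched
        result += chunk[skip:]
        i += 1
    return bytes(result[:size])


if __name__ == "__main__":
    data = bytes(range(256)) * 1024  # 256 KiB of sample data
    blob, index = compress_blocks(data)
    # Concatenated members are still a valid gzip stream.
    assert gzip.decompress(blob) == data
    # A read that crosses a block boundary only inflates two blocks.
    assert read_at(blob, index, 65530, 20) == data[65530:65550]
```

Because concatenated gzip members form a valid gzip stream, a plain
decompressor can still read the whole thing; only the seek path needs
the index.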

It works quite well.
The problem I found is that when you restart with a new dictionary
every ~64 KB, there is not much for the compression engine to work with,
so the compression ratio is usually (in my cases) quite poor compared to
normal gzip.
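
A quick way to see the effect, again only as a rough Python sketch: the
input below is synthetic and deliberately chosen so that the redundancy
has to be rediscovered in every block. How big the gap is on real
capture files depends entirely on the data. The helper names are made up
for this example.

```python
import gzip
import random

BLOCK = 64 * 1024  # uncompressed bytes per block (assumed)


def sample_data():
    """1 MiB made of one 4 KiB pseudo-random pattern repeated 256 times."""
    rng = random.Random(0)  # fixed seed so the run is repeatable
    pattern = bytes(rng.getrandbits(8) for _ in range(4096))
    return pattern * 256


def whole_size(data):
    """Size when the input is compressed as one gzip stream."""
    return len(gzip.compress(data))


def blocked_size(data, block=BLOCK):
    """Total size when every block is compressed independently."""
    return sum(
        len(gzip.compress(data[off:off + block]))
        for off in range(0, len(data), block)
    )


if __name__ == "__main__":
    data = sample_data()
    print("one stream:    ", whole_size(data), "bytes")
    print("64 KiB blocks: ", blocked_size(data), "bytes")
```

Each independent block has to store the incompressible pattern once
before it can refer back to it, and pays for its own gzip header and
trailer, so the blocked total comes out larger than the single stream.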
