Wireshark mailing list archives

Re: filter for ONLY initial get request


From: Sake Blok <sake () euronet nl>
Date: Wed, 11 Aug 2010 12:12:38 +0200

On 10 aug 2010, at 16:48, Jeffs wrote:
I have come up with the following tshark formula, which seems to address my needs.  Since I am not interested in the 
URLs from advertising agencies, videos and other embedded links in web pages, but only the top level domain, I use 
this.  Please let me know if anyone sees any gotchas or potential problems with this formula; I'm very new to regular 
expressions and could use advice.  This formula returns only the top level domains and strips out links such as 
admin.brightcove.com, advertisingserver.amazon.com, tubemogel.videos.com:

tshark -r test.cap -R http.request -T fields -e http.host | sed -e 's/?.*$//' | sed -e 's#^\(.*\)\t\(.*\)$#http://\1\2#' | sort | uniq -c | sort -rn | head -n 300 | sed -n -e '/www/p'

If you're only interested in an overview of visited top-level domains, without caring which specific hosts and/or 
URIs were visited, you could use something like:

tshark -r test.cap -R http.request -T fields -e http.host | sed -e 's/^.*\.\([^.]*\.[^.]*\)$/\1/' | sort | uniq -c | 
sort -rn | head -n 100

for the top-100 top-level domains (based on individual hits, not user sessions).
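
To see what the sed expression does without needing a capture file, here is a minimal sketch that feeds it a few 
made-up host names in place of the tshark output. The function name count_domains and the example.com/example.org 
hosts are only illustrations; they are not part of tshark or of the original command.

```shell
# Hypothetical helper: reduce each host name on stdin to its last two
# dot-separated labels, then count hits per domain, most frequent first.
count_domains() {
    sed -e 's/^.*\.\([^.]*\.[^.]*\)$/\1/' | sort | uniq -c | sort -rn
}

# Made-up host names standing in for the output of
# "tshark ... -T fields -e http.host".
printf '%s\n' www.example.com ads.example.com example.com www.example.org | count_domains
```

This should list example.com with a count of 3 ahead of example.org with a count of 1. Host names that already 
consist of only two labels pass through unchanged. Note that the expression always keeps exactly the last two labels, 
so a host such as www.bbc.co.uk would be counted under co.uk.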

Cheers,


Sake


___________________________________________________________________________
Sent via:    Wireshark-users mailing list <wireshark-users () wireshark org>
Archives:    http://www.wireshark.org/lists/wireshark-users
Unsubscribe: https://wireshark.org/mailman/options/wireshark-users
             mailto:wireshark-users-request () wireshark org?subject=unsubscribe

