Wireshark mailing list archives

Re: filter for ONLY initial get request

From: Jeffs <jeffs () speakeasy net>
Date: Tue, 10 Aug 2010 10:48:43 -0400

On 8/10/2010 1:36 AM, Sake Blok wrote:

On 10 aug 2010, at 04:53, Jeffs wrote:

On 8/9/2010 10:47 PM, Jeffs wrote:

On 8/9/2010 11:25 AM, Sake Blok wrote:

Have a look at the presentation I gave at Sharkfest'10; it shows how you can accomplish something quite
similar with Tshark and some (minor) scripting. You should be able to adapt the commands to your needs.

http://www.cacetech.com/sharkfest.10/A-6_Blok%20HANDS-ON%20LAB%20-%20Using%20Wireshark%20Command%20Line%20Tools%20and%20Scripting.zip

Question: in the wonderful example in that paper for finding the top 10
requested URLs, with this formula:

tshark -r example.cap -R http.request -T fields -e http.host -e http.request.uri |
  sed -e 's/?.*$//' | sed -e 's#^\(.*\)\t\(.*\)$#http://\1\2#' |
  sort | uniq -c | sort -rn | head

Where does one set the top "10"?  How would I change that to, say, top
"20" or whatever?  Or does uniq -c always just produce the top 10?

I can now answer my own question:
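(The answer itself is missing from this archived copy. As a sketch of what it presumably was: the "10" is not set by uniq -c at all. It comes from the final head, which prints the first 10 lines when given no option, and head -n N changes that. The sample data and the function names sample_urls and top_n below are made up for illustration; they stand in for the output of the tshark and sed stages.)

```shell
# Hypothetical sample standing in for the output of the tshark/sed stages:
# one URL per line, with repeats.
sample_urls() {
  printf '%s\n' \
    'http://a.example/' 'http://b.example/' 'http://a.example/' \
    'http://c.example/' 'http://a.example/' 'http://b.example/'
}

# top_n N: count identical lines on stdin and print the N most frequent.
# uniq -c only counts; sort -rn orders by count; head -n picks how many to show.
top_n() {
  sort | uniq -c | sort -rn | head -n "$1"
}

sample_urls | top_n 2
```

With the original command, replacing the trailing "head" by "head -n 20" should give the top 20 in the same way.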


TIP: During my presentation at Sharkfest'10 I built the command sequence step by step. You might want to try that too,
to get a full grasp of what each step does. That way you will be able to craft your own command sequences a little
more easily.
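
(A rough sketch of that step-by-step approach, not taken from the presentation itself. To keep it runnable without a capture file, the hypothetical function sample_fields fakes the two tab-separated fields that tshark would print; strip_query and join_url are likewise names invented here. The join step uses a literal tab instead of "\t", on the assumption that not every sed understands "\t" in a pattern.)

```shell
# Hypothetical sample standing in for the output of:
#   tshark -r example.cap -R http.request -T fields -e http.host -e http.request.uri
# (host and URI separated by a tab).
sample_fields() {
  printf 'www.example.com\t/index.html?id=1\n'
  printf 'www.example.com\t/index.html?id=2\n'
  printf 'www.example.org\t/\n'
}

tab=$(printf '\t')   # literal tab; "\t" inside a sed regex is a GNU extension

# Step 1: drop the query string, i.e. everything from the first "?" on.
strip_query() { sed -e 's/?.*$//'; }

# Step 2: glue host and path together into a URL.
join_url() { sed -e "s#^\(.*\)${tab}\(.*\)\$#http://\1\2#"; }

echo '--- after step 1'; sample_fields | strip_query
echo '--- after step 2'; sample_fields | strip_query | join_url
echo '--- counted';      sample_fields | strip_query | join_url | sort | uniq -c | sort -rn | head
```

Running each stage on its own like this makes it easy to see which stage is responsible when the final list looks wrong.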

Cheers,

Sake

I have come up with the following tshark formula, which seems to address
my needs.  I am not interested in the URLs of advertising agencies,
videos and other links embedded in web pages, only in the main "www"
host of each site, so I use this.  Please let me know if anyone sees any
gotchas or potential problems with this formula; I'm very new to regular
expressions and could use advice.  This formula returns only hosts
containing "www" and strips out hosts such as admin.brightcove.com,
advertisingserver.amazon.com, tubemogel.videos.com:

tshark -r test.cap -R http.request -T fields -e http.host |
  sed -e 's/?.*$//' | sed -e 's#^\(.*\)\t\(.*\)$#http://\1\2#' |
  sort | uniq -c | sort -rn | head -n 300 | sed -n -e '/www/p'
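
(Some possible gotchas, offered as a hedged sketch rather than a definitive fix. With only http.host selected there is no "?" and no tab in the output, so the two sed stages appear to have nothing to do. The pattern /www/ matches "www" anywhere in the line, and it is applied after head -n 300, so fewer than 300 lines may survive. Sites that do not use a "www" prefix are dropped entirely. The variant below filters on a leading "www." before counting; sample_hosts and top_www_hosts are made-up names, and the sample hosts stand in for real tshark output. Newer tshark releases use -Y rather than -R for a display filter like this.)

```shell
# Hypothetical sample standing in for the output of:
#   tshark -r test.cap -R http.request -T fields -e http.host
sample_hosts() {
  printf '%s\n' \
    'www.example.com' 'admin.brightcove.com' 'www.example.com' \
    'www.example.org' 'awww.example.net' 'www.example.com'
}

# top_www_hosts N: keep only hosts that start with "www.", then count them
# and print the N most frequent. Filtering first means N applies to the
# hosts of interest, not to the unfiltered list.
top_www_hosts() {
  grep '^www\.' | sort | uniq -c | sort -rn | head -n "$1"
}

sample_hosts | top_www_hosts 300
```

Anchoring the pattern with ^ and escaping the dot keeps a host such as awww.example.net out of the result, which the unanchored /www/ would let through.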
