nanog mailing list archives

Re: yahoo crawlers hammering us


From: Harry Strongburg <harry.nanog () harry lu>
Date: Wed, 8 Sep 2010 05:54:55 +0000

On Tue, Sep 07, 2010 at 04:19:58PM -0400, Ken Chase wrote:
This makes it look like Yahoo is actually trafficking in pirated software, but
that's kinda too funny to expect to be true, unless some yahoo tech decided to
use that IP/server @yahoo for his nefarious activity, but there are better sites
than my customer's box to get his 'juarez'.

It's not uncommon at all for a web spider to find large files and 
download them. I don't think there's some conspiracy at Yahoo to find 
warez; they are just operating as a normal spider, indexing the 
Internet.

~500K/s (4Mbps+) for a 3 gig file is kinda... a bit harsh.

What speed would you like a spider to download at? If you care enough, 
you could rate-limit Yahoo's address blocks server-side. Ideally, ask 
your customer not to put large programs on there if you're concerned 
about bandwidth. 4 Mb/s isn't abnormal at all for a spider, especially 
on a larger file.
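
For what it's worth, here is a rough Python sketch of what "rate-limit 
by source block" could look like, plus the arithmetic on how long that 
3 gig file ties up a connection at ~500K/s. Everything named here is an 
assumption for illustration: CRAWLER_NETS uses a documentation-only 
range (not Yahoo's real blocks), and is_crawler, transfer_seconds and 
TokenBucket are hypothetical helpers, not part of any server.

```python
import ipaddress
import time

# Hypothetical crawler ranges. 203.0.113.0/24 is a documentation-only
# block (TEST-NET-3); substitute the real source blocks you observe.
CRAWLER_NETS = [ipaddress.ip_network("203.0.113.0/24")]


def is_crawler(addr: str) -> bool:
    """Return True if addr falls inside one of the configured blocks."""
    ip = ipaddress.ip_address(addr)
    return any(ip in net for net in CRAWLER_NETS)


def transfer_seconds(size_bytes: float, rate_bytes_per_s: float) -> float:
    """Time needed to move size_bytes at a steady rate."""
    return size_bytes / rate_bytes_per_s


class TokenBucket:
    """Token bucket: refills at `rate` bytes/s, up to `capacity` bytes."""

    def __init__(self, rate_bytes_per_s, burst_bytes, now=None):
        self.rate = rate_bytes_per_s
        self.capacity = burst_bytes
        self.tokens = burst_bytes
        self.last = time.monotonic() if now is None else now

    def delay_for(self, nbytes, now=None):
        """Charge nbytes to the bucket and return how many seconds the
        caller should wait before sending them (0.0 if none)."""
        now = time.monotonic() if now is None else now
        # Refill for the time that has passed, capped at the burst size.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        self.tokens -= nbytes
        if self.tokens >= 0:
            return 0.0
        # Bucket is in debt: wait until the refill pays it off.
        return -self.tokens / self.rate
```

A sender loop would call delay_for(len(chunk)) and sleep for the 
returned time before writing each chunk, but only for clients where 
is_crawler(addr) is true. Using decimal units, transfer_seconds(3e9, 
500e3) comes to 6000 seconds, so roughly 100 minutes per fetch of that 
file. In practice the web server's own limiter (nginx's limit_rate, for 
example) is probably the easier route than rolling your own.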

Is this expected/my own fault or what?

A little bit of both :)

