WebApp Sec mailing list archives

Re: Hit Throttling - Content Theft Prevention


From: "Kurt Seifried" <bt () seifried org>
Date: Wed, 19 Oct 2005 00:41:18 -0600

One effective strategy is to plant hidden links (e.g. white text on a white background, or a 1x1 pixel image stashed somewhere) that people using regular browsers won't see at all. Have them go to a page with more links that specifically say "do not click this, you will be blocked," etc. Those links go to a CGI, and the CGI blocks that IP/etc. (firewall rules, Apache config, whatever). Make sure you stick these links in various alphabetical positions and at the top and bottom of your pages, since many scrapers start at the top of a page or go in alphabetical order.
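
For illustration, here is a minimal sketch of the trap handler in Python, written as a WSGI app rather than a CGI script. The trap path /do-not-follow/ and the in-memory `blocked` set are assumptions made up for this example; a real setup would push the address into firewall rules or the Apache config instead.

```python
import ipaddress

# Hypothetical trap URL prefix; the hidden links would point somewhere under it.
TRAP_PATH = "/do-not-follow/"

# Assumption: an in-memory blocklist stands in for firewall rules or Apache config.
blocked = set()


def trap_app(environ, start_response):
    """WSGI app that blocks any client once it requests a trap URL."""
    addr = environ.get("REMOTE_ADDR", "")
    path = environ.get("PATH_INFO", "")

    # Normalise the address; anything unparsable is treated as unknown.
    try:
        ip = str(ipaddress.ip_address(addr))
    except ValueError:
        ip = None

    # Following a hidden link puts the client on the blocklist.
    if ip is not None and path.startswith(TRAP_PATH):
        blocked.add(ip)

    # Refuse blocked clients and clients without a usable address.
    if ip is None or ip in blocked:
        start_response("403 Forbidden", [("Content-Type", "text/plain")])
        return [b"Blocked\n"]

    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"OK\n"]
```

An in-memory set is lost on restart and is not shared between worker processes, so treat this as a sketch of the logic only.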

Alternatively you can monitor web logs and block anyone that requests more than N files in Y seconds. Also, since many web scrapers initiate a new TCP connection for each request, rate limiting SYN packets is a quick and dirty way to deal with it.
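
A rough sketch of the log-monitoring idea in Python, assuming the log has already been parsed into (timestamp, ip) pairs in time order. The function name and its parameters are made up for illustration; N and Y map to max_requests and window_seconds.

```python
from collections import defaultdict, deque


def find_offenders(requests, max_requests, window_seconds):
    """Return the IPs that made more than max_requests within window_seconds.

    requests: iterable of (timestamp_in_seconds, ip) pairs, sorted by time.
    """
    recent = defaultdict(deque)  # ip -> timestamps still inside the window
    offenders = set()

    for ts, ip in requests:
        window = recent[ip]
        window.append(ts)

        # Drop timestamps that are window_seconds or more older than this one.
        while ts - window[0] >= window_seconds:
            window.popleft()

        if len(window) > max_requests:
            offenders.add(ip)

    return offenders
```

For example, with max_requests=3 and window_seconds=10, a client that makes four requests at seconds 0, 1, 2 and 3 is flagged, while one that makes three is not.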

The trick is to have thresholds high enough for legit web crawlers but low enough to catch the annoying people quickly.

You can also use Apache to redirect or serve content based on the User-Agent header. Many people don't bother to change the default agent strings, so you can serve them a null site, etc. Lots of tricks.
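
As a sketch of the agent-header check, here it is in Python rather than Apache config. The marker list is an illustrative assumption based on the default User-Agent strings that Wget, curl, libwww-perl and Python's urllib send; the function names are hypothetical.

```python
# Assumption: substrings of a few well-known default agent strings.
DEFAULT_AGENT_MARKERS = ("wget", "curl", "libwww-perl", "python-urllib")


def is_default_tool_agent(user_agent):
    """True if the User-Agent looks like an unmodified tool default."""
    ua = (user_agent or "").lower()
    return any(marker in ua for marker in DEFAULT_AGENT_MARKERS)


def choose_body(user_agent, real_body):
    """Serve an empty ("null") body to default tool agents, the real one otherwise."""
    return b"" if is_default_tool_agent(user_agent) else real_body
```

This only catches people who left the defaults alone; anyone who sets a browser-like agent string walks straight past it.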

-Kurt
