WebApp Sec mailing list archives
Re: Combatting automated download of dynamic websites?
From: "Matthijs R. Koot" <matthijs () koot biz>
Date: Mon, 29 Aug 2005 18:18:00 +0200
Which preventive or repressive measures could one apply to protect larger dynamic websites against automated downloading by tools such as WebCopier and Teleport Pro (or curl, for that matter)? For a website like Amazon's, I reckon some technical measures would be in place to protect against 'leakage' of all product information by such tools (assuming such measures are justified by the calculated risk). The data we publish online are important company gems which we want to be accessible to any visitor, yet protected against systematic download, whether unintentional (like Internet Explorer's built-in MSIECrawler) or intentional (WebCopier, Teleport, ...). Consider this:

detailpage.html?bid=0000001
detailpage.html?bid=0000002
detailpage.html?bid=0000003
(...)

Or with multiple levels:

detailpage.html?bid=0000001&t=1
detailpage.html?bid=0000001&t=2
detailpage.html?bid=0000002&t=1
detailpage.html?bid=0000002&t=2
(...)

Specifically, I was wondering whether it is possible and sensible to limit the allowed number of requests for certain pages per minute/hour. At the same time, the data displayed by detailpage.html should remain indexable by Google, so the data can't be hidden behind a user login, and client-side scripting is not an option since Google doesn't interpret it. I'm using Apache 2 on Red Hat Enterprise Linux 4 and know about mod_throttle (which doesn't work with Apache 2) and mod_security (which also offers some 'throttling' functionality, but can only act on individual requests and can't remember request sequences).
I'd also suppose that dealing with the proxy servers of large ISPs, like AOL, is a big caveat, since many distinct users can appear to come from a single IP address. Any ideas?
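For illustration, here is a minimal sketch of such a per-IP, per-minute limit done at the application level rather than in an Apache module. The names and thresholds are hypothetical; a real deployment would need shared storage across Apache processes (a database or similar) instead of an in-process dict, and would exempt known search-engine crawlers so Google indexing is unaffected.

import time
from collections import defaultdict, deque

# Hypothetical limits: at most MAX_REQUESTS detail-page hits per WINDOW seconds per client IP.
WINDOW = 60
MAX_REQUESTS = 30

_history = defaultdict(deque)   # client IP -> timestamps of recent detail-page requests

def allow_request(client_ip, now=None):
    """Return True if this request stays within the limit; False means refuse it (e.g. with HTTP 503)."""
    now = time.time() if now is None else now
    hits = _history[client_ip]
    # Forget requests that have fallen out of the sliding window.
    while hits and now - hits[0] > WINDOW:
        hits.popleft()
    if len(hits) >= MAX_REQUESTS:
        return False
    hits.append(now)
    return True

The handler for detailpage.html would call allow_request(ip) before rendering and serve a 'slow down' page (or a delayed response) when it returns False.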
bugtraq () cgisecurity net wrote:
http://www.google.com/search?hl=en&q=apache+prevent+image+hotlinking&btnG=Google+Search
http://www.alistapart.com/articles/hotlinking/

Also check out mod_throttle.
http://www.snert.com/Software/mod_throttle/

- zeno
http://www.cgisecurity.com
Thanks for your reply, zeno! Unfortunately, referer-based anti-leeching won't do it for me, and mod_throttle isn't available for Apache 2. I need throttling based on something more advanced, like a 'request history stack' that checks the order in which pages were requested within a certain time period, et cetera. Maybe it would be better to move such security measures into the web application itself, but I'm still hoping someone knows of a service-based solution (i.e. something like the aforementioned Apache module).

Matthijs
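As a rough illustration of such a 'request history stack' at the application level, the sketch below flags clients that walk detailpage.html?bid= values in strictly consecutive order within a short window. The names and thresholds are hypothetical, not an existing Apache module; a real deployment would keep the history in shared storage and whitelist known crawlers such as Googlebot.

import time
from collections import defaultdict, deque

# Hypothetical values: history window and how many strictly consecutive bid values count as crawling.
WINDOW = 120
MIN_SEQ_RUN = 10

_recent_bids = defaultdict(deque)   # client IP -> (timestamp, bid) of recent detail-page requests

def looks_like_crawler(client_ip, bid, now=None):
    """Record this request; return True if the client's recent bids form a long consecutive ascending run."""
    now = time.time() if now is None else now
    history = _recent_bids[client_ip]
    history.append((now, bid))
    # Keep only requests inside the time window.
    while history and now - history[0][0] > WINDOW:
        history.popleft()
    bids = [b for _, b in history]
    if len(bids) < MIN_SEQ_RUN:
        return False
    tail = bids[-MIN_SEQ_RUN:]
    # True when the tail looks like bid=0000001, 0000002, 0000003, ...
    return all(tail[i + 1] == tail[i] + 1 for i in range(len(tail) - 1))

The detailpage.html handler would call looks_like_crawler(ip, int(bid)) and, when it returns True, could start delaying responses or requiring a CAPTCHA for that IP.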
Attachment:
smime.p7s
Description: S/MIME Cryptographic Signature
Current thread:
- Combatting automated download of dynamic websites? Matthijs R. Koot (Aug 29)
- Re: Combatting automated download of dynamic websites? Jayson Anderson (Aug 29)
- Re: Combatting automated download of dynamic websites? Serg Belokamen (Aug 29)
- Re: Combatting automated download of dynamic websites? bugtraq (Aug 29)
- Re: Combatting automated download of dynamic websites? Matthijs R. Koot (Aug 29)
- Re: Combatting automated download of dynamic websites? Javier Fernandez-Sanguino (Aug 30)
- Re: Combatting automated download of dynamic websites? Eoin Keary (Aug 31)
- Re: Combatting automated download of dynamic websites? Javier Fernandez-Sanguino (Sep 05)
- Re: Combatting automated download of dynamic websites? Matthijs R. Koot (Aug 29)
- Re: Combatting automated download of dynamic websites? Michael Boman (Aug 30)
- Re: Combatting automated download of dynamic websites? Paul M. (Sep 05)
- Re: Combatting automated download of dynamic websites? Eoin Keary (Sep 07)
- Re: Combatting automated download of dynamic websites? Jayson Anderson (Aug 29)
- <Possible follow-ups>
- Re: Combatting automated download of dynamic websites? Tony Stahler (Aug 30)
- Message not available
- Fwd: Combatting automated download of dynamic websites? Mark Quinn (Aug 31)
- Message not available