WebApp Sec mailing list archives

Reverse Proxy and Link Encoding


From: Michael Naef <michael.naef () inf ethz ch>
Date: Sat, 31 May 2003 23:56:20 +0200 (MEST)


Hi all

I have a follow-up question to Dean's inquiry on reverse proxies...

I am looking for a reverse proxy that does not let _any_ client-provided
data through. It would achieve this by parsing every web page it serves to
identify the hyperlinks it contains. The proxy would then replace each
hyperlink with a link to its own address plus an opaque encoding, and keep
a table that maps every encoding to the original link. When the client
requests an encoded link, the proxy would look it up in the table to
recover the original link.

Example:

1) Proxy retrieves some web page that contains the link
   http://www.foo.com/
2) Proxy replaces this link in the web page with something like
   http://proxy/77352102 and sends the resulting page to the client.
3) Client hits the link. Proxy analyzes the encoding and does the lookup
   in the table to find the original link. It retrieves the page, parses
   the content, replaces links, and sends the result to the client again.

(Startup: The proxy would have a well-defined collection of possible links
that are already encoded and serve as a starting point.)
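
A minimal sketch of the scheme described above, in Python. Everything in
it is an assumption made for illustration: the names LinkTable and
rewrite_links are hypothetical, the tokens are random hex strings rather
than the numeric code in the example, and the regular expression only
handles double-quoted href attributes. A real proxy would need a proper
HTML parser and would also have to deal with relative links, forms and
scripts.

```python
import re
import secrets


class LinkTable:
    """Hypothetical table mapping opaque tokens to original links."""

    def __init__(self, proxy_base, seed_links=()):
        self.proxy_base = proxy_base.rstrip("/")
        self._by_token = {}
        self._by_url = {}
        # Startup: pre-encode a well-defined set of entry-point links.
        for url in seed_links:
            self.encode(url)

    def encode(self, url):
        """Return the proxy link for url, creating a table entry if needed."""
        token = self._by_url.get(url)
        if token is None:
            token = secrets.token_hex(8)  # unguessable, carries no client data
            self._by_token[token] = url
            self._by_url[url] = token
        return f"{self.proxy_base}/{token}"

    def decode(self, token):
        """Return the original link, or None if the token is unknown."""
        return self._by_token.get(token)


# Assumption: only double-quoted href attributes are rewritten.
HREF_RE = re.compile(r'href="([^"]*)"')


def rewrite_links(html, table):
    """Replace every matched link in html with its encoded proxy link."""
    return HREF_RE.sub(lambda m: f'href="{table.encode(m.group(1))}"', html)


table = LinkTable("http://proxy", seed_links=["http://www.foo.com/"])
page = '<a href="http://www.foo.com/">foo</a>'
rewritten = rewrite_links(page, table)
```

With this layout the proxy would reject any request whose token makes
decode return None, so the only URLs that ever reach the outside are ones
the proxy itself placed in the table.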

I am aware that such a proxy is quite restrictive with regard to browsing
the web. However, it can be useful in environments that must prevent
potentially hostile traffic (e.g. "hacked" URLs, malformed POST data, etc.)
from leaving for the Internet while still allowing basic browsing.

Does anybody know of a proxy that does this (or something similar)? (My 
research has not been successful so far.)


Thanks
myke.




