Vulnerability Development mailing list archives

Re: CSS implication


From: "Sverre H. Huseby" <shh () thathost com>
Date: Sat, 23 Mar 2002 10:30:33 +0100

[Jeremiah Grossman]

|   Once XSS code is injected into a persistent environment (HTML
|   Chat/Auction/Mail,etc.)  it will stay there (for a length of
|   time), even if the input filtering problem is fixed at some point.

That's another very good example of why one should not filter HTML on
input, but rather on output.  HTML filtering may be seen as just
another kind of meta character washing, like the handling of single
quotes in SQL strings.  Meta character filtering should, in my
opinion, be done right before passing the data to the system that
needs the filtering.  In the HTML case this system is the visitor's
browser, so filtering should be done in the output process.
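
To make the point concrete, here is a minimal sketch in Python.  The
names stored_messages, save_message and render_messages are made up
for illustration, and the list stands in for whatever persistent
store a chat or auction site would use.  The text is stored exactly
as received and only escaped when the page is built.

```python
import html

# Hypothetical in-memory store standing in for a chat log or message table.
stored_messages = []

def save_message(text):
    # Store the text exactly as received; no HTML filtering at input time.
    stored_messages.append(text)

def render_messages():
    # Escape HTML meta characters right before the data goes to the browser.
    items = "".join(
        "<li>" + html.escape(message, quote=True) + "</li>"
        for message in stored_messages
    )
    return "<ul>" + items + "</ul>"

save_message("<script>alert('xss')</script>")
page = render_messages()
```

With this arrangement a fix to the escaping code takes effect for
everything already stored, because the stored data was never altered.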

I like to separate sanitization of input into two parts: validation
and meta character handling.  Validation is defined by the application
domain, and may be done immediately when receiving the input.  (Is
this an integer?  A valid mail address?  A family name?)  Sometimes
validation includes the stripping of HTML meta characters.
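
Validation of this kind might look roughly like the following Python
sketch.  The function names, the range limits and the mail address
pattern are assumptions chosen for illustration; the pattern is
deliberately simple and is not a complete check of what a mail
address may legally contain.

```python
import re

# Deliberately simple pattern; an assumption, not a full address grammar.
_MAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+")

def validate_integer(raw, low, high):
    # Accept only an optional minus sign followed by one to nine ASCII digits.
    if not re.fullmatch(r"-?[0-9]{1,9}", raw):
        raise ValueError("not an integer")
    value = int(raw)
    if not low <= value <= high:
        raise ValueError("integer out of range")
    return value

def validate_mail_address(raw):
    # Reject anything too long or not matching the simple pattern above.
    if len(raw) > 254 or not _MAIL_RE.fullmatch(raw):
        raise ValueError("not a valid mail address")
    return raw
```

Both functions answer a question about the application domain only.
Neither of them knows or cares where the value will be sent later.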

Washing of meta characters (or character sequences) depends on the
system your application passes the data to.  Different systems need
different handling of meta characters.  (To have security in depth,
one should treat validation and meta character handling as
independent operations.  Even if a validation rule forbids HTML
markup, the meta character handler should still escape any
occurrences of such characters.)
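
As an illustration of "different systems, different handling", the
Python sketch below passes the same value to three systems: an SQL
database, a browser and a shell.  The table name and the sample value
are invented for the example.  For SQL it uses a bound parameter
instead of hand-written quote escaping, which is my substitution for
the single quote washing mentioned above and achieves the same goal.

```python
import html
import shlex
import sqlite3

name = "O'Brien <b>"

# SQL: let the driver deal with the quote by binding the value as a parameter.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (name TEXT)")
conn.execute("INSERT INTO person (name) VALUES (?)", (name,))
stored = conn.execute("SELECT name FROM person").fetchone()[0]

# HTML: escape <, >, & and quotes right before sending to the browser.
as_html = html.escape(stored, quote=True)

# Shell: quote the value so a POSIX shell sees it as one literal argument.
as_shell_arg = shlex.quote(stored)
```

The value comes back from the database unchanged, and each receiving
system gets its own treatment at the point where the data is handed
over.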

I know people have next to religious opinions on this matter, so we
will probably never all agree on the "correct" way to sanitize HTML
output.  Those who say that HTML filtering should be done at input
time often say that it is hard to remember to always escape the HTML
meta characters for all outputs, so it's better to do it once and for
all.  I agree that it may be hard to remember, but I nevertheless
think that handling meta characters on input is the wrong solution.
One should rather build a framework that encapsulates the output
object/stream and takes responsibility for the washing.  The same
framework could hide the request object/stream, and force validation
by e.g. hiding the raw getPostData("FOO") and instead providing
e.g. getIntegerPostData("FOO") and similar for other domain types.
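
A rough Python sketch of such a framework follows.  SafeRequest,
HtmlWriter and their methods are hypothetical names invented for this
example (get_integer_post_data mirrors the getIntegerPostData idea
above); no particular web framework is implied.  The request wrapper
offers no way to read a raw field, and the output wrapper escapes
everything unless the programmer explicitly asks to write markup.

```python
import html
import io
import re

class SafeRequest:
    """Wraps the POST fields and exposes only validated accessors."""

    def __init__(self, post_data):
        self.__post_data = dict(post_data)  # name-mangled to discourage raw access

    def get_integer_post_data(self, name, low=-2**31, high=2**31 - 1):
        raw = self.__post_data.get(name, "")
        if not re.fullmatch(r"-?[0-9]{1,10}", raw):
            raise ValueError(name + " is not an integer")
        value = int(raw)
        if not low <= value <= high:
            raise ValueError(name + " is out of range")
        return value

class HtmlWriter:
    """Wraps the output stream and takes responsibility for escaping."""

    def __init__(self, stream):
        self._stream = stream

    def write_markup(self, trusted_markup):
        # Only for markup written by the programmer, never for user data.
        self._stream.write(trusted_markup)

    def write_text(self, untrusted):
        # Everything else is escaped on its way to the browser.
        self._stream.write(html.escape(str(untrusted), quote=True))

request = SafeRequest({"FOO": "42"})
out = io.StringIO()
writer = HtmlWriter(out)
writer.write_markup("<p>FOO is ")
writer.write_text(request.get_integer_post_data("FOO"))
writer.write_markup("</p>")
```

The programmer using these two classes has to go out of his way to
read unvalidated input or to write unescaped output, which is the
whole point.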

The more security stuff we can hide from the average programmer, the
better.


Sverre.

-- 
shh () thathost com                     Computer Geek?  Try my Nerd Quiz
http://shh.thathost.com/                http://nerdquiz.thathost.com/

