WebApp Sec mailing list archives

Re: SQL Injection


From: Alex Russell <alex () netWindows org>
Date: Wed, 16 Jun 2004 10:37:12 -0700

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tuesday 15 June 2004 1:09 pm, Frank Knobbe wrote:

[ snip ]

Data should be properly validated and/or converted for use in a
safe format. This depends heavily on how you use the data. For this
example, let's just use a scenario where you want to avoid
single-quotes, but could allow double-quotes. Your input validation
routines check and convert it (either encoded, escaped, or
substituted). Now the data is safe for handling in database
routines.
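
As a rough illustration of that input checkpoint, here is a minimal Python sketch. It assumes the only rule is the one from the scenario: single-quotes are converted and double-quotes pass through. The helper name `convert_for_db` and the choice of doubling the quote are assumptions made for illustration, not something taken from the thread.

```python
def convert_for_db(value: str) -> str:
    """Double every single-quote so it cannot close a SQL string literal.

    Hypothetical helper: double-quotes are deliberately left untouched,
    matching the scenario described above.
    """
    return value.replace("'", "''")


converted = convert_for_db("O'Brien said \"hi\"")
print(converted)
```

Where the database driver supports bound parameters, those are generally preferable to hand-rolled conversion; the sketch only shows where the checkpoint sits.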

This is exactly where the concept of a bi-directional boundary comes 
in handy. At each point, if you know what the system on the other 
side is, you can take pains to interpose yourself as a hedge against 
problems there, but only once you've already de-mangled whatever 
you did internally to maintain subsystem integrity.

However, that same data is not safe for output to a web browser
since it allows double-quotes (nice XSS if that data is printed in
a form field for example). So here we have the scenario where the
data needs to be validated/converted before being sent to the
browser.

But since this is at output time, why not call it output
validation?

The easiest way would be to just HTMLEncode the data before
printing to the browser. That ensures that all characters, including
potentially dangerous ones like angle brackets and double-quotes,
are properly encoded and do not cause issues in
the browser (prematurely terminating tags or form fields, or
including tags to allow scripting). If of course you require
certain HTML behavior (such as text styles -- bold etc), then your
output validation/conversion routines have to be customized.
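
A small sketch of that output step, assuming Python's standard `html.escape` as a stand-in for HTMLEncode. The stored value and the form field are made up for illustration; the value contains no single-quotes, so it would have passed the input check from the scenario.

```python
import html

# Made-up stored value: no single-quotes, so the input check let it through.
stored = 'x" onmouseover="alert(1)'

# Without encoding, the double-quote closes the value attribute early.
unsafe_field = '<input name="q" value="' + stored + '">'

# html.escape converts &, < and >, and with its default quote=True
# also converts double- and single-quotes.
safe_field = '<input name="q" value="' + html.escape(stored) + '">'

print(unsafe_field)
print(safe_field)
```

In the encoded version the attribute value contains `&quot;` instead of a raw double-quote, so the browser treats the whole string as data.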

But the basic fact is that you have TWO checkpoints. Checking data
on input and checking data on output.

Right, but the fact is that the term "output" assumes that it is the 
system itself that is doing this. Very often that's not an option 
(like, say, intercepting all SQL operations within the DB itself), so 
we introduce the concept of the boundary as an entity to watch, but 
we don't use terminology that denotes direction until we start to 
model each boundary. You're entirely right that this is an important 
concept, but I think it's only a building block to a real solution.

Here's a made-up example that puts an XML processing system directly 
up against a database (not unusual these days): some source system 
sends a "<" char to my db layer, which decides that this should be 
escaped, so it subsequently becomes "\<". On the way out, the XML 
processing system will surely choke on this, so the db layer "undoes" 
its previous escaping to get the "<" back. At this point (and not 
before) the XML system's filter can use its own escaping rules and go 
from "<" to "&lt;" or even "<![CDATA[<]]>". So there are three filter 
points in this example (in order):

        1.) db layer inbound
        2.) db layer outbound
        3.) XML layer inbound

The order here is critical, but the boundaries for each can be managed 
either by a wrapper system (scripting environment, etc.) or 
internally to the systems themselves. In many cases, you can't do it 
in the system to be filtered for or from. The concept of output 
filtering is only half of the game here, and I don't know that 
disambiguating it from "input filtering" is really as useful as 
thinking about the path to and from each subsystem (hence the term 
"boundary filtering", which tries to encapsulate both).
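
The three filter points above can be sketched in Python roughly as follows. The function names `db_inbound` and `db_outbound` are hypothetical, and escaping the backslash itself is an added assumption so that the db layer's escaping can be undone unambiguously. The XML step uses the standard library's `xml.sax.saxutils.escape`.

```python
from xml.sax.saxutils import escape as xml_inbound


def db_inbound(value: str) -> str:
    """Hypothetical db-layer escaping: "<" becomes "\\<".

    The backslash is escaped too (an assumption beyond the example)
    so that db_outbound can reverse the transformation exactly.
    """
    return value.replace("\\", "\\\\").replace("<", "\\<")


def db_outbound(value: str) -> str:
    """Undo db_inbound: drop each escaping backslash, keep what follows."""
    out = []
    chars = iter(value)
    for ch in chars:
        if ch == "\\":
            ch = next(chars, "")  # take the escaped character literally
        out.append(ch)
    return "".join(out)


raw = "a < b"
stored = db_inbound(raw)                      # 1.) db layer inbound
restored = db_outbound(stored)                # 2.) db layer outbound
in_order = xml_inbound(restored)              # 3.) XML layer inbound

# Skipping step 2 leaves the db layer's backslash in the XML data.
out_of_order = xml_inbound(stored)

print(stored)
print(in_order)
print(out_of_order)
```

Run in order, the value arrives as `a &lt; b`; with the db layer's outbound step skipped, a stray backslash survives as `a \&lt; b`, which is the ordering problem described above.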

Regards

- -- 
Alex Russell
alex () burstlib org   BD10 7AFC 87F6 63F9 1691 83FA 9884 3A15 AFC9 61B7
alex () netWindows org F687 1964 1EF6 453E 9BD0 5148 A15D 1D43 AB92 9A46
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (Darwin)

iD8DBQFA0IVIoV0dQ6uSmkYRAkhcAKCTGOh4f/Dgd2OsJyfPNYfgMiLfIwCeLjPq
eazBXgR/skfcG/n65LT50Po=
=q7Tm
-----END PGP SIGNATURE-----

