WebApp Sec mailing list archives

Re: Input validation


From: Alla Bezroutchko <alla () scanit be>
Date: Sat, 21 Jun 2003 17:37:25 +0200



Kooper, Larry wrote:

When securing a web site against attacks such as SQL injection and XSS, what
approach do you recommend following to validate user input?
1) Attempt to massage data so that it becomes valid
2) Reject input that is known to be bad
3) Accept only input that is known to be good

The problem with solutions 1 and 2 is that you may miss some forms of bad
input.  Another subtle problem with solution 1 and 2 is that sometimes bad
input can be embedded in good input.  For example, if someone searches for
"director's selections" the string "select" would be rejected (as a SQL
command), resulting in "director's ions."

I suggest the following approach: when you accept user data, validate it according to business logic. Then, when you output that data, validate it or escape it according to the rules of the system you are outputting to.

In your example, when you receive the search string, you check that it is not empty. Restricting it further than that makes no sense. Why not let users search for things like "director@company.com" (special character used), "2001/12/03" (no alphabetic characters at all) and so on? Then, when you use the string to query the SQL database, you either escape it or use prepared statements. If you use escaping, your search string will become either "director''s selections" or "director\'s selections", depending on the SQL server you use.
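
To illustrate the prepared-statement option, here is a minimal sketch using Python's standard sqlite3 module. The table name "items", the column "title" and the helper search_titles are made up for the example; any driver that supports placeholders should work along the same lines.

```python
import sqlite3

def search_titles(conn, term):
    """Hypothetical helper: look up titles containing the search string."""
    # The ? placeholder makes the driver pass `term` as data, so quotes in it
    # are never parsed as SQL. Note that % and _ inside `term` would still act
    # as LIKE wildcards; that is a matching question, not an injection one.
    cur = conn.execute(
        "SELECT title FROM items WHERE title LIKE ?",
        ("%" + term + "%",),
    )
    return [row[0] for row in cur.fetchall()]

# Assumed demo data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (title TEXT)")
conn.executemany(
    "INSERT INTO items (title) VALUES (?)",
    [("director's selections",), ("2001/12/03",)],
)

print(search_titles(conn, "director's selections"))
```

The search string goes in unmodified, apostrophe and "select" included, and nothing had to be stripped from it on input.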

Here, when I validate input I use the "accept only known good data" approach, and when I perform output I "massage data to make it valid".

This approach has a benefit. When you accept data you might not know what will happen to it in the future. Will it be sent to the SQL database? Will it be wrapped in XML and used by a web service? Will it be used in HTML output? Perhaps someone will later add a Perl script that uses this data to build shell commands? The developer can't predict that. On the other hand, when you are about to output data, you know exactly where it goes and what rules will be applied to it. So you can format the data accordingly.
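
As a rough sketch of "format the data according to where it goes", the same stored string can be escaped differently for each destination. The function names below are made up for illustration; the calls they wrap (html.escape, xml.sax.saxutils.escape, shlex.quote) are from the Python standard library.

```python
import html
import shlex
from xml.sax.saxutils import escape as xml_escape

def for_html(value):
    # Escapes &, <, > and both quote characters for HTML text or attributes.
    return html.escape(value, quote=True)

def for_xml_text(value):
    # Escapes &, < and > for use inside XML character data.
    return xml_escape(value)

def for_shell_arg(value):
    # Quotes the value so a POSIX shell treats it as one literal argument.
    return shlex.quote(value)

# The value is stored exactly as the user typed it.
stored = "director's <selections>"

print(for_html(stored))
print(for_xml_text(stored))
print(for_shell_arg(stored))
```

Each function is called at the point of output, where the destination is known, rather than once at input time.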

Input validation is still important, because it allows you to reject known bad data right at the start. If your business logic says that user input should be a positive integer, check that right after you receive it from the user. Input validation can protect against many application logic problems.
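
For the positive integer case, a business-logic check could look something like the sketch below. The function name and the nine-digit limit are assumptions made for the example, not a rule from any particular application.

```python
import re

def parse_positive_int(raw):
    """Hypothetical validator: accept only a positive decimal integer."""
    # Accept only known good input: one to nine ASCII digits, nothing else.
    # The length cap is an assumed business limit for this example.
    if not re.fullmatch(r"[0-9]{1,9}", raw):
        raise ValueError("expected a positive integer")
    value = int(raw)
    if value == 0:
        raise ValueError("expected a positive integer")
    return value

print(parse_positive_int("42"))
```

Anything that is not a plain run of digits, such as "-5" or "1; DROP TABLE items", is rejected before it reaches the rest of the application.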

So, my strategy is, first do input validation (according to business logic), then do output validation (according to the rules of the system you are outputting to).

Alla.

