Secure Coding mailing list archives

Top 10 Ajax Security Holes and Driving Factors


From: leichter_jerrold at emc.com (Leichter, Jerry)
Date: Fri, 10 Nov 2006 18:12:54 -0500 (EST)

| FYI, a friend forwarded me a link to this interesting article by
| Shreeraj Shah on Ajax holes,
|  http://www.net-security.org/article.php?id=956
|
| Since much has been written here on SC-L about relatively safe
| programming languages recently, I thought it might be interesting to
| look at the other end of the spectrum.  ;-) Yes, I know Ajax is wildly
| popular these days.  10,000 lemmings can't be wrong, certainly!

The problem isn't "safe" vs. "unsafe" languages, as such.  That confuses
levels of abstraction.

Not so many years ago, we defined "safe" as "can't affect what other
people are doing on the machine".  Hardware isolation of process
memory spaces solved that - at least once OS's began using it.  (There
are still Windows 98 boxes out there...)  That was safety at the
machine code level:  You could assume that your piece of machine code
would correctly implement the semantics given in the manuals, even
though others were running their own (possibly malicious) machine code
on the same machine.

Safe languages take the same notion up to the next higher level of
abstraction:  You can assume that your piece of source language text
will, when compiled and run, implement the semantics given in the
manuals, even in the presence of malicious code within your own
program.

The problem is that these are no longer very good safety properties.
The first is simply not strong enough:  It's fine for an isolated
program, but simply irrelevant when multiple programs share access
to files, network connections, and so on.  Other programs may not
be able to stomp on your memory, but when you can no longer trust
your own code as it comes off the disk because some virus got into
it, you're out of luck.  The second of these closes many loopholes,
but the world is no longer bounded by your machine:  It's the whole
Internet.

A look at bug lists shows that three kinds of problems represent
most of the reported issues (though they may not be the most
important by measures other than simple issue counts):  Ability
to access files outside of a supposedly bounded area; SQL injection;
cross-site scripting.  What do these have in common?  In all cases,
the cause of the bug is that the programmer is dealing with strings
as the fundamental datatype.  He misses ways to encode "../" because
of various games, or he misses ways to "escape" from the quoting
that's supposed to isolate code from data.  The accesses to the
strings themselves are completely safe.  The problem is, these
are not just strings:  These are complex objects with very complex
implicit semantics.  They are being programmed in machine language -
not even assembler.  If you move up a level of abstraction and
realize that today, the issue isn't how to write code to
manipulate strings in the memory of one machine, it's how to
write a distributed program that runs across a bunch of machines
loosely coupled to each other on the Internet, you'll see that
you need an entirely different notion of a "safety property".

In some cases, the right primitives are completely clear.  The
"incorrect file path sanitization issue" was solved 30 years ago
on a variety of DEC OS's, which provided functions to pick apart
and reconstruct file specifications.  You almost never dealt with
a file spec as a string - oh, you kept it around as a string, but
you never did string operations on it.  To do this, file specs on
DEC OS's had a specified syntax, from which the full semantics
could be read.  We lost this with the Unix vision that file specs
should be uniform - just a list of elements with no internal
structure, separated by slashes.  Because of this simplification,
there was no need to provide library routines to manipulate the
things - the basic string routines would do just fine.  This
became such an article of faith among programmers that newer
languages - which often have much more powerful string manipulation
primitives - saw no reason not to let programmers keep treating
file specs as strings.  And so the holes continue to be programmed
in.  (BTW, in the product I work on, we defined a single function
that ... picks apart and glues back together file specifications.
Programmers are specifically told *not* to try to manipulate specs
as strings - they should use the provided function.  Now, we only
have to get it right once.  Of course, if we get it wrong, *all*
of our code is exposed.  But that's a tradeoff - I'd much rather
audit one reasonably small function than who knows how many lines
of code scattered here and there.)
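
To make the "one small function" idea concrete, here is a rough
Python sketch.  It is not the function from the product mentioned
above; the name `resolve_inside` and the `/srv/data` directory in the
usage note are hypothetical, and it assumes any URL-style decoding of
the incoming spec has already been done.  It leans on the standard
`pathlib` module to pick the spec apart and put it back together,
with no string operations on the path itself.

```python
from pathlib import Path


def resolve_inside(base: Path, user_spec: str) -> Path:
    """Hypothetical helper: return the absolute path named by user_spec,
    but only if it stays inside base; otherwise raise ValueError."""
    base = base.resolve()
    # Joining and resolving collapses "..", "." and symlinks before we
    # check anything. An absolute user_spec replaces base entirely, which
    # the containment check below also rejects.
    candidate = (base / user_spec).resolve()
    try:
        # relative_to() compares path components, not raw characters.
        candidate.relative_to(base)
    except ValueError:
        raise ValueError(f"{user_spec!r} escapes {base}") from None
    return candidate
```

With this, `resolve_inside(Path("/srv/data"), "reports/q3.txt")` gives
back a path under `/srv/data`, while `"../etc/passwd"` or
`"/etc/passwd"` raises `ValueError`.  The check lives in one place, so
that is the one place to audit.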

In other cases, the general form of a solution is clear, but no
one has gotten the details right.  SQL injection is a non-issue if
you build statements with SQL parameters.  But that's a pain to
write, because the abstractions are so poor.  It's so much easier
to just generate the SQL query by pasting strings together - and
in scripting languages, so proud of their ability to express
string substitutions elegantly, it's often the only tool you
have.
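
As a small illustration of the SQL-parameter approach, here is a
sketch using Python's standard `sqlite3` module.  The `users` table
and the helpers `make_demo_db` and `find_user` are made up for the
example; the point is only that the query text and the user-supplied
value travel separately, so no quoting inside the value can turn data
into code.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database (illustrative schema only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)",
                     [("alice",), ("bob",)])
    return conn


def find_user(conn: sqlite3.Connection, name: str) -> list:
    # The "?" placeholder is filled in by the database driver; the value
    # of name is never pasted into the SQL text.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()
```

A lookup for `"alice"` finds her row, while the classic injection
string `"' OR '1'='1"` is simply treated as an odd name that matches
nothing.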

If we're going to program the meta-computer that is the Web, we
need appropriate abstractions, and appropriate safe ways to
express them.  But we're so far behind the curve that we are
*still* arguing about yesterday's solved problems, safety down
at the individual program level.
                                                        -- Jerry

