oss-sec mailing list archives

Separating code and data

From: "Mehaffey, John" <John_Mehaffey () mentor com>
Date: Tue, 7 Oct 2014 16:40:22 +0000

From: Tim [tim-security () sentinelchicken org]
Sent: Tuesday, October 07, 2014 8:23 AM
To: oss-security () lists openwall com
Cc: Hanno Böck
Subject: Re: [oss-security] Thoughts on Shellshock and beyond

What class of bug is Shellshock? "Weird feature invented in
pre-Internet era"? How do you conquer this class of bugs?

I am still struggling with this one.  I am trying to create that list here:
http://www.dwheeler.com/essays/shellshock.html#detect-or-prevent

But to be honest, that list is pretty pathetic. This is a challenging class of vulnerability to detect or prevent 
ahead of time. Ideas would be very welcome.



I wouldn't go so far as to say shellshock has a well-defined "class"
of vulnerability or bucket that we can stick it in, but it does
violate one of my own personal (and I think, the most important)
_principles_ of secure software design:  don't mix code and data.

What do I mean by that?  Concrete examples of failures:
  * Word docs with macros
  * document markup with embedded script (yes: HTML/JS)
  * OGNL expressions in Struts URL parameters

Any time you design a system to accept executable code as well as data
in the same format/context/whatever, you invite a huge number of
possible attacks.  These attacks may not manifest themselves
immediately or obviously.  It may require a change in the way the
software is used, or implementation bugs to expose the risk, but it
is a highly risky design approach.


People expect office documents to be data, but in fact they can
include a limited form of code as well.  In the case of Word docs and
macros, the risk was exposed by implementation bugs and the difficulty
of keeping the language sandboxed.

In the case of HTML/JS, the risk came from the way JS is embedded
inline in so many locations people can't safely allow HTML (a data
markup format) without allowing JS as well.  (If JS were only allowed
as external resources and not as, say, events embedded in attributes,
it would be less mixed and easier to make safe).
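
As a rough illustration of the HTML/JS point (a sketch, not something from
the original thread), the Python below splices the same untrusted string
into markup twice. The render_comment_unsafe and render_comment helpers are
hypothetical names made up for this example; html.escape is from the Python
standard library. The escaped version keeps the input in the data channel,
so an inline event handler cannot become code.

```python
import html


def render_comment_unsafe(comment: str) -> str:
    # Untrusted text is concatenated straight into markup, so any tags or
    # event-handler attributes it contains become part of the page's code.
    return "<p>" + comment + "</p>"


def render_comment(comment: str) -> str:
    # html.escape converts &, <, > and quotes to entities, so the browser
    # renders the whole value as text instead of interpreting it.
    return "<p>" + html.escape(comment, quote=True) + "</p>"


payload = '<img src=x onerror="alert(1)">'
print(render_comment_unsafe(payload))
print(render_comment(payload))
```

Escaping only covers this one context (element text).  The broader point
above still holds: because JS can appear in so many places, every context
needs its own handling.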

In Apache Struts, OGNL is used to parse the entire POST body,
variable names and values.  However, OGNL expressions are executable
code, which breaks the whole assumption that POST variables are data.
So the Struts team is now playing whack-a-mole with blacklist blocking
of specific attack vectors.

In the case of shellshock, the "mixing" of code and data came about
because environment variables, normally used to carry data, were
overloaded and used to carry code.  This is very similar to the Struts
case.
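
A toy model of that overloading, in Python, may make the contrast clearer.
This is only a sketch under stated assumptions: it is not bash's actual
import logic, the function names are hypothetical, and nothing is really
executed (the "mixed" importer just records which values it would have
handed to an interpreter).  The one detail taken from the real bug is that
bash recognised exported functions by a value starting with "() {".

```python
FUNC_PREFIX = "() {"  # marker bash used to spot a function definition


def import_env_mixed(env):
    """Toy importer: values that look like function definitions are code."""
    variables, would_execute = {}, []
    for name, value in env.items():
        if value.startswith(FUNC_PREFIX):
            # The whole value goes to the interpreter, including anything
            # an attacker appended after the function body.
            would_execute.append(value)
        else:
            variables[name] = value
    return variables, would_execute


def import_env_separated(env):
    """Toy importer: every value is opaque data, whatever it looks like."""
    return dict(env), []


env = {"HOME": "/root", "HTTP_USER_AGENT": "() { :; }; echo pwned"}
print(import_env_mixed(env))
print(import_env_separated(env))
```

In the separated version, the attacker-controlled header is just an odd
string; no choice of value moves it into the code channel.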


David: your item "Create namespaces where practicable" is effectively
an implementation of what I'm talking about here.  By creating
namespaces, you're creating a partition between code and data.  But
the underlying principle is just to keep these two things separate and
*well defined* as separate via whatever mechanism makes the most sense.


Cheers,
tim


I think that separating code and data belongs on David's list of "Most Important
Software Innovations" (www.dwheeler.com/innovation/innovation.html), although
arguably the "Separating Text Content from Format" innovation is an example 
of the class.

From allowing better cache locality (modern architectures now have both an
i-cache and a d-cache) to the security improvements mentioned above, it is a
software concept that has paid many dividends over the years.

Sincerely,
John Mehaffey
Linux System Architect
Mentor Graphics
