Full Disclosure mailing list archives

Re: [WEB SECURITY] noise about full-width encoding bypass?


From: "Arian J. Evans" <arian.evans () anachronic com>
Date: Mon, 21 May 2007 09:04:27 -0700

1. You are missing what I consider to be the major point.

2. I don't know the context of the CERT advisory; there are more encoding
types than just full-width that IDSs today don't decode (and that are of
interest to us as well), but...

3. The question we need to ask ourselves is one of canonicalization. In
monolithic J2EE projects and modern cobbled-together web code (PHP is
notoriously dirty for this), there are *multiple* layers of canonicalization,
often specific to particular untrusted entry points. This stuff is really
hard to find (initially) in source code.

You will find that sometimes you can even double-encode your attacks, and
they get decoded/canonicalized to their common ASCII or UTF-8 (or whatever
format) before they reach the parser (query engine, browser, shell script,
SMTP relay, whatever parser you are targeting).
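
A minimal sketch of the layering described above, in Python. The two-layer pipeline, the function names, and the naive filter are hypothetical and are not taken from any particular product; the sketch only shows how an input that contains no ASCII single quote when it is inspected can acquire one after later decoding steps.

```python
import unicodedata
from urllib.parse import unquote


def naive_filter_allows(raw: str) -> bool:
    """Hypothetical IDS-style check: reject only a literal ASCII single quote."""
    return "'" not in raw


def layer_one(raw: str) -> str:
    """Hypothetical first layer: the web server URL-decodes the request once."""
    return unquote(raw)


def layer_two(value: str) -> str:
    """Hypothetical second layer: application code URL-decodes again, then
    applies NFKC normalization, which folds full-width forms to ASCII."""
    return unicodedata.normalize("NFKC", unquote(value))


# Double-encoded quote: %25 decodes to '%', leaving %27 for the second pass.
double_encoded = "id=1%2527 OR 1=1"
# Full-width apostrophe U+FF07, percent-encoded as its UTF-8 bytes EF BC 87.
full_width = "id=1%EF%BC%87 OR 1=1"

for payload in (double_encoded, full_width):
    assert naive_filter_allows(payload)          # the filter sees no quote
    print(repr(layer_two(layer_one(payload))))   # the parser would see one
```

Whether a real application has layers like these depends entirely on its stack; the point is only that the filter and the parser may not be looking at the same string.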

It's fair to be skeptical about this though, Brian. It's not common to find
places where these attacks work, and I find that few people go beyond
buzzwords and encoding-attack-technobafflegab when discussing this subject
in the security "consultant" space.

Guess it's finally time for a paper on this,

--
Arian Evans
solipsistic software security sophist

"I love deadlines. I like the whooshing sound they make as they fly by." -
Douglas Adams

On 5/21/07, Brian Eaton <eaton.lists () gmail com> wrote:

Has anyone had a look at the full-width Unicode encoding trick discussed
here?

http://www.kb.cert.org/vuls/id/739224

AFAICT, this technique could be useful for a homograph attack.  I
don't think it's useful for much else.  However, a few vendors have
reacted already, so I may be missing something important.

Here's why I think the attack is mostly harmless:

Let's say an attacker wants to use this technique to hide a SQL
injection attack.  They decide to use the full-width form of the single
quote, U+FF07 (bytes 0xff 0x07 in UTF-16).  They successfully bypass the
IDS, because the IDS is only scanning for normal single quotes.  (You can
see the encodings and their graphical representation here:
http://www.unicode.org/charts/PDF/UFF00.pdf)

If the SQL engine is processing queries in Unicode, then 0xff 0x07
will be treated as a normal Unicode character, not a single quote.
The sequence 0xff 0x07 is not equivalent to 0x27, the real single
quote value.  No SQL injection occurs.

If the SQL engine is processing queries in UTF-8, then 0xff 0x07 will
be converted from Unicode to UTF-8: 0xef 0xbc 0x87.  Again, the engine
does not recognize 0xef 0xbc 0x87 as equivalent to 0x27.

If the SQL engine is processing queries in ASCII or ISO-8859-1, the
conversion from Unicode to the code page used by the engine will fail.
Either the engine will give up on the query, or it might substitute a
question mark (?) for the unconvertible character.
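
The three cases above can be checked with a short Python sketch. It uses Python's own codecs as a stand-in; the helper name to_legacy is hypothetical, and a particular SQL engine or driver may convert characters differently.

```python
FULLWIDTH_APOSTROPHE = "\uff07"  # U+FF07 FULLWIDTH APOSTROPHE

# Case 1 and 2: the character encoded as UTF-16 (big-endian) and as UTF-8.
utf16_be = FULLWIDTH_APOSTROPHE.encode("utf-16-be")
utf8 = FULLWIDTH_APOSTROPHE.encode("utf-8")
print(utf16_be.hex(), utf8.hex())  # ff07 efbc87

# Neither byte sequence contains 0x27, the ASCII single quote.
assert b"\x27" not in utf16_be
assert b"\x27" not in utf8


def to_legacy(text: str, codec: str) -> bytes:
    """Hypothetical conversion step: try a strict encode first, and fall
    back to substituting '?' for characters the code page cannot hold."""
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return text.encode(codec, errors="replace")


# Case 3: ASCII and ISO-8859-1 cannot represent U+FF07.
print(to_legacy(FULLWIDTH_APOSTROPHE, "ascii"))       # b'?'
print(to_legacy(FULLWIDTH_APOSTROPHE, "iso-8859-1"))  # b'?'
```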

To summarize: I think half-width and full-width Unicode characters are
characters that happen to have the same graphical representation as
other characters, but don't carry any special significance outside of
that graphical representation.  The graphical representation can be
important in homograph attacks, but otherwise I don't see this
technique as particularly useful to an attacker.

Any comments on what I may have missed?

Regards,
Brian


