Dailydave mailing list archives
[Fwd: Re: Optimisation Considered Harmful]
From: "I)ruid" <druid () caughq org>
Date: Thu, 23 Jun 2005 22:52:10 -0500
Upon reading this post from another list I'm on, the first couple of paragraphs immediately returned my thoughts to the current discussion here on shellcode. Some of Jerrold's insights on semantics preservation seemed to translate well into what Dave was suggesting could be done with a shellcode mangler/rewriter.

-------- Forwarded Message --------
From: Jerrold Leichter <jerrold.leichter () smarts com>
To: Ben Laurie <ben () algroup co uk>
Cc: Cryptography <cryptography () metzdowd com>
Subject: Re: Optimisation Considered Harmful
Date: Thu, 23 Jun 2005 07:36:38 -0400 (EDT)

| A brief altercation this evening with CERT over the recent hyperthread caching
| issues has brought something that's been simmering at the back of my brain to
| the forefront.
|
| The recent hyperthread/cache key recovery trick, followed by DJB's related
| (IMO) symmetric key recovery, and preceded by the RSA timing attacks (Boneh et
| al?) are all examples of attacks on the same thing: optimisation.
|
| The problem is that if different paths through your code expose different
| availability of optimisation, then there's a timing attack available. I
| predict, for example, that someone will find a way to attack something using
| pipeline breaks on Pentium processors[1].
|
| How do we fix this? Well, it's easy to say: we make everything equally crap -
| we make sure that all operations are as slow as the slowest possible variant
| can be. However, on a modern processor, this is _very_ hard to do.
|
| Could it be that for safe crypto we should be using Z80s?

This is an excellent point, as it reveals a deep and significant parallel between cryptography and compiler/hardware optimization.

Consider what it means to create an optimizing compiler (or some kind of optimization in hardware - the issues are the same, but I'll talk in terms of a compiler for definiteness). The input is source code; the output is a bunch of instructions. A compiler's transformation is correct if it's semantics-preserving: the source has some meaning, and the object code correctly represents that meaning. There are an infinite number of possible object codes that preserve the input semantics. Some are "better" than others with respect to some objective function, say size or speed.
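The timing attacks quoted above all exploit data-dependent execution paths. As an illustrative sketch (mine, not from either post), compare an early-exit byte comparison with a constant-time one in C - the function names are my own:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Early-exit comparison: the loop runs longer the more leading bytes
 * match, so an attacker who can time many guesses can recover a secret
 * one byte at a time. */
static int leaky_equal(const uint8_t *a, const uint8_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (a[i] != b[i])
            return 0;
    return 1;
}

/* Constant-time variant: always touches every byte and folds all the
 * differences into one accumulator, so the running time does not depend
 * on where (or whether) the inputs differ. */
static int ct_equal(const uint8_t *a, const uint8_t *b, size_t n)
{
    uint8_t diff = 0;
    for (size_t i = 0; i < n; i++)
        diff |= (uint8_t)(a[i] ^ b[i]);
    return diff == 0;
}
```

Note that even the second version only stays constant-time if the compiler doesn't "optimize" the loop back into an early exit - which is exactly the under-specified-semantics problem discussed below.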
An optimizing compiler simply chooses a "better" object code than some reference choice.

The big issue in compiler optimization - and even more so in some hardware optimization - is defining exactly what semantics has to be preserved. For example, must computations be done even if the compiler can determine that they cannot affect the output? Can the rules of algebra be used to rearrange expressions (possibly breaking carefully crafted error-management strategies, since floating-point arithmetic doesn't actually obey the rules of algebra)? Do writes to two variables written in order in the source code have to occur in that order in the object code? If two writes are issued in order to the hardware, do they have to hit memory in that order?

An understanding of what semantics has to be preserved, and what semantics is "side effect" and can be tossed to gain performance, has gradually emerged in the hardware and software communities. There have been, and continue to be, missteps along the way. Some of the weaker memory-consistency models might have helped the hardware guys, but were just too hard for the software guys to deal with. Newbies to multi-threaded programming are caught by the not-so-obvious memory-access semantics present even at the language level in all common programming languages. (Java tried to pin this down exactly and got it completely wrong for several tries.)

Enter cryptographic algorithms. On their face, these are simple mathematical transformations. You have to really abuse the implementations (e.g., having multiple side-effect-producing operations in one statement) to get into the area where a programmer's or hardware developer's warning bell would sound "watch the semantics". And, in fact, from the point of view of input/output transformations, there really are no semantic issues. The problem is that these side-channel attacks broaden "the meaning of the program" to something that has never been considered in previous work that I know of.
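The question of whether computations that cannot affect the output must still be performed is not hypothetical for crypto code. A common casualty is zeroing key material after use: the buffer is never read again, so the stores are dead under input/output semantics and a semantics-preserving optimizer may legally delete them. A minimal sketch of the usual workaround - writing through a volatile-qualified pointer; the function name is my own:

```c
#include <assert.h>
#include <stddef.h>

/* A plain memset() on a buffer that is never read afterward is a dead
 * store, so an optimizer may remove it and leave the key in memory.
 * Accesses through a volatile-qualified lvalue are defined to be
 * observable behavior, so the compiler must emit every store. */
static void secure_zero(void *p, size_t n)
{
    volatile unsigned char *vp = (volatile unsigned char *)p;
    while (n--)
        *vp++ = 0;
}
```

This is a case where the "meaning of the program" the cryptographer cares about (the key is gone from memory) is invisible to the semantics the optimizer is required to preserve.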
(The closest you are likely to find is in complaints by real-time programmers that modern machines give you no way to determine how long an instruction sequence will really take: you might take a page fault, or a cache miss, or who knows what along the way, and in some real-time code, you have to *know* that that won't happen. Such code really is sometimes run only on machines like Z80s!)

What can be done? Well, the first thing that's clearly needed is a more complete specification of the semantics of cryptographic algorithms. Just the input/output transformation - which is all we write down in most analyses - is insufficient. We sometimes state things informally, almost in passing - as in the comments on AES that "table accesses take constant time". We certainly assume things like "the time to add two numbers is independent of the number of carries" - which is probably true on machines today, but may actually have been false at one time. Without a way to write down what matters, we have no way to judge whether a particular compilation/hardware approach is safe. There's some work on abstract program interpretation that might help here (though it's mainly aimed in other directions).

Ultimately, performance is likely to suffer. Software and hardware optimizations are essential to get good performance today, and they are all done under the assumption that it's the *average* case that matters: increasing the variance (up to a certain - fairly generous - point) is fine as long as you drop the mean. All performance benchmarking is on loops that are repeated thousands of times. I can't recall ever seeing a benchmark that reported the variance of timing across instances of the loop.

There are only a couple of roads forward:

- Develop algorithms that offer reasonable performance even if implemented in "unoptimized" ways. This will be difficult to maintain in the face of ever-increasing hardware optimizations that you can't just turn off by "not using -O".
- Live with less performance and hope that raw hardware speeds will catch up.

- Use specialized hardware, designed not to leak side-channel information.

- ?

                                                        -- Jerry

---------------------------------------------------------------------
The Cryptography Mailing List
Unsubscribe by sending "unsubscribe cryptography" to majordomo () metzdowd com
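The "table accesses take constant time" assumption Jerrold mentions is exactly what the cache-timing attacks break: an indexed load `tbl[secret]` is fast or slow depending on which cache lines the attacker has evicted. As an illustrative sketch (mine, not from the forwarded post), one "equally slow" countermeasure is to read the entire table and select the wanted entry with a mask:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scans the whole table on every lookup and selects the wanted entry
 * with a mask, so the memory-access pattern - and hence the cache
 * footprint and timing - is the same for every index. */
static uint8_t ct_lookup(const uint8_t *tbl, size_t len, size_t idx)
{
    uint8_t result = 0;
    for (size_t i = 0; i < len; i++) {
        /* mask is 0xFF when i == idx and 0x00 otherwise; common
         * compilers turn the comparison into a flag-setting
         * instruction rather than a branch, though a conservative
         * implementation would build the mask from arithmetic alone */
        uint8_t mask = (uint8_t)-(uint8_t)(i == idx);
        result |= (uint8_t)(tbl[i] & mask);
    }
    return result;
}
```

The cost is a full table scan per lookup - a concrete instance of "make everything equally crap" from the quoted post, and of why such semantics need to be written down before anyone can judge whether a given compilation of them is safe.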
--
I)ruid, C²ISSP
druid () caughq org
http://druid.caughq.org
_______________________________________________
Dailydave mailing list
Dailydave () lists immunitysec com
https://lists.immunitysec.com/mailman/listinfo/dailydave