Secure Coding mailing list archives

GCC and pointer overflows


From: karger at watson.ibm.com (karger at watson.ibm.com)
Date: Tue, 06 May 2008 15:49:27 -0400


It's taken me some time to draft a reply, for which I must apologize,
but since Jeremy Epstein mentioned me by name, I must respond.  This is
actually responding to four messages from Jeremy Epstein, Larry Kilgallen,
and Jerry Leichter.


From: "Epstein, Jeremy" <Jeremy.Epstein at softwareag.com>
Subject: Re: [SC-L] GCC and pointer overflows [LWN.net]

Ken, a good example.  For those of you who want to reach much further
back, Paul Karger told me of a similar problem in the compiler (I don't
remember the language) used for compiling the A1 VAX VMM kernel, that
optimized out a check in the Mandatory Access Control enforcement, which
separates information of different classifications (*).  [For those not
familiar with it, this was a provably secure kernel on which you could
host multiple untrusted operating systems.  Despite what some young-uns
seem to think, VMs are not a new concept - they go back at least three
decades that I know of, and possibly longer.  The A1 VAX effort ended
roughly 20-25 years ago, to give a timeframe for this particular
compiler issue.]

I've discussed this anecdote with Jeremy, and unfortunately, I don't
remember that particular optimization problem having happened on the A1
VAX VMM.  It might have happened on another system and been confused with the VAX VMM.

We did have a compiler optimization problem in zeroing shadow page tables
when switching VMs.  One line of PL/I code compiled into a horrendously
slow loop.  I had to rewrite that one line into about 10 lines of
very cryptic code (plus a full page of comments explaining it) to trick
the compiler into generating the right block move instructions to do the job.
I was trying to avoid an assembler subroutine, because the call/return
overhead would also have been a problem here.
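
For readers who want a concrete picture, here is a rough modern analogue
in C (not the original PL/I; the type and constant names are invented for
this sketch).  The point is the same: whether a one-line "clear this
table" becomes a slow per-element loop or a single block operation (the
VAX had MOVC5 for exactly this kind of fill) depends on how the source is
phrased and how much the compiler can see.

    /* Rough C analogue (not the original PL/I; names invented) of the
     * shadow page table zeroing problem.  The intent is one line of
     * "clear this table"; what the compiler emits for it is the issue. */
    #include <string.h>
    #include <stdint.h>

    #define PTE_COUNT 1024                 /* hypothetical table size */

    typedef struct {
        uint32_t pte[PTE_COUNT];
    } shadow_table_t;

    /* Naive form: a weak optimizer turns this into a word-at-a-time loop. */
    void clear_slow(shadow_table_t *t)
    {
        for (int i = 0; i < PTE_COUNT; i++)
            t->pte[i] = 0;
    }

    /* Restructured form: handing the whole block to memset lets the
     * compiler or run-time library use the machine's block instructions
     * (on the VAX, a MOVC5 fill) instead of an explicit loop. */
    void clear_fast(shadow_table_t *t)
    {
        memset(t->pte, 0, sizeof t->pte);
    }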

Virtual machines do indeed go back more than 30 years.

The earliest papers on virtual machines are these.  The first paper is
about CP/40, the direct ancestor of CP/67, which was the first
commercially available VMM.  CP/40 was done at the IBM Cambridge
Scientific Center on a 360/40 with a custom virtual memory box.  The
second paper is about an even earlier virtual machine monitor done on
a heavily modified IBM 7044X here at Watson Research.  The 7044X was a
7044 with an added custom virtual memory box.  The term "virtual
machine" was coined by the 7044X people.

1.  Lindquist, A.B., R.R. Seeber, and L.W. Comeau, A Time-Sharing
    System Using an Associative Memory. Proceedings of the IEEE,
    December 1966. 54(12): p. 1774-1779.
  
2.  O'Neill, R.W. Experience using a time-shared multi-programming
    system with dynamic address relocation hardware. in Proceedings of
    the 1967 Spring Joint Computer Conference. 18-20 April 1967,
    Atlantic City, NJ: Vol. 30. Thompson Books. p. 611-621.

From: ljknews <ljknews at mac.com>

At 1:00 PM -0400 5/1/08, Epstein, Jeremy wrote:

Ken, a good example.  For those of you who want to reach much further
back, Paul Karger told me of a similar problem in the compiler (I don't
remember the language)

VAX Pascal, before VMS was on Alpha (and long before Itanium).

-- 
Larry Kilgallen


Actually, the language choice for the VAX VMM was more complex than that.
When the project started, the VAX Pascal version 1 compiler was not suitable
for systems programming use.  As a result, we wrote the VMM prototype
entirely in VAX PL/I.  About the time that the project moved from a research
prototype to a full product development, the VAX Pascal version 2 compiler
became available; that compiler produced much better code and had the
extensions needed for systems programming (such as conformant arrays).
Since Pascal had stronger typing than PL/I, we switched to Pascal for
all new code, planning to rewrite the PL/I into Pascal eventually.
In retrospect, we should have just stayed with PL/I, because the rewrite
never happened.  We briefly considered using C, but rejected it for two
reasons.  The first was C's well-known lack of safety; the second was that
the compiler's lack of knowledge of data lengths led not only to security
issues but ALSO to performance problems.  Although the VAX PL/I and C
compilers used the same common code generator, the PL/I compiler generated
better code, because it knew more about the sizes of data structures and
could do better optimization.
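
To make that point concrete, here is an illustrative sketch in modern C
(not VMM code; the struct and function names are invented).  With a
declared, compiler-known length, a copy can be emitted as one bounds-safe
block move; with a NUL-terminated C string, the length has to be
discovered by scanning at run time, and the destination size has to be
carried separately or the copy can overflow.

    #include <string.h>

    #define NAME_LEN 32                     /* length known at compile time */

    struct fixed_rec {
        char name[NAME_LEN];
    };

    /* Declared length: one block move, and the destination is known
     * to be exactly large enough. */
    void copy_fixed(struct fixed_rec *dst, const struct fixed_rec *src)
    {
        memcpy(dst->name, src->name, NAME_LEN);
    }

    /* NUL-terminated string: the length must be found by scanning, and
     * the caller must pass the destination size or risk an overflow. */
    void copy_cstring(char *dst, size_t dst_len, const char *src)
    {
        if (dst_len == 0)
            return;
        size_t n = strlen(src);
        if (n >= dst_len)
            n = dst_len - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }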



From: "Leichter, Jerry" <leichter_jerrold at emc.com>

The VAX VMM effort died with the announcement of the Alpha, in late 1992
- though obviously the death was decided internally once the move to
Alpha was decided, which would have been somewhat earlier.  The origins
of the VAX VMM effort date back to the early 80's.

These dates are not correct.  The A1 VAX VMM effort started just after
the 1981 Oakland conference, as a back-of-the-napkin design in a
Mexican restaurant in Palo Alto.  The prototype was running on a
VAX-11/730 in 1984.  (That included changing the VAX instruction set
to be virtualizable.)  The product version went into a highly
successful external field test in 1989.  The product announcement was
supposed to happen at the 1990 Oakland conference.  The paper at the
conference was disguised as a report of research results to avoid
prematurely revealing the product.  The product was cancelled,
primarily due to internal corporate politics, on Feb 14th, 1990, which
the development team promptly labeled the 2nd St. Valentine's Day
massacre.  As a result, the Oakland paper remained just a report of
research results.

The Alpha was not the cause of cancellation.  Quite the contrary: we
had specifically designed the Alpha to support secure virtual machine
monitors better than was possible on the VAX.  Details are
finally published here:

3.  Karger, P.A., Performance and Security Lessons Learned from
    Virtualizing the Alpha Processor, in The 34th Annual International
    Symposium on Computer Architecture, 9-13 June 2007, Association for
    Computing Machinery: San Diego, CA. p. 392-401.
  


As best I can recall, the VAX VMM kernel was written almost entirely in
PL/I.  (Why?  Because the main alternatives available at the time were
assembler and C - way too open-ended for you to be able to make useful
correctness assertions - and Pascal, which even in VMS's much extended
version was too inflexible.  There were almost certainly a few small
assembler helper programs.  Before you defend C, keep in mind that this
is well pre-Standard, when the semantics was more or less defined by "do
what pcc does" and type punning and various such tricks were accepted
practice.)


As I discussed above, the VMM kernel was about 1/2 in PL/I and 1/2 in
Pascal V2.  C was explicitly rejected for both security AND performance
reasons.

I know from other discussions with Paul that it was understood in the
high assurance community at the time that, no matter what you knew about
the compiler and no matter what proofs you had about the source code,
you still needed to manually check the generated machine code.
Expensive, but there was no safe alternative.  So any such optimizations
would have been caught.

                                                      -- Jerry


From: ljknews <ljknews at mac.com>

My understanding is that DEC pulled the plug on the VMM project
(called SVS) during a successful field test when they discovered
that while the NSA division that handled trusted computing was
really gung ho about the project, none of the government units
which might actually make purchases were interested in multilevel
secure machines.  Remember that the MicroVAX II was available at
the time and from many perspectives (including that of taxpayers)
it was a lot nicer to use separate machines for various security
classifications.

This is a secure coding mailing list, and not the place to discuss how
to market security.  I will just say that this was the excuse used for
cancelling a product that actually had a very profitable market niche.
The root causes of the cancellation were neither technical nor a
matter of profitability.  Lots of DoD organizations had (and still have) needs
for sharing multiple levels of classified information, and separate
MicroVAXes would not solve those problems.  For that matter, neither
would pure isolation kernels, such as MILS.  See this paper for details:

4.  Karger, P.A. Multi-Level Security Requirements for Hypervisors. in
    21st Annual Computer Security Applications Conference. 2005,
    Tucson, AZ: IEEE Computer Society. p. 240-248. URL:
    http://www.acsa-admin.org/2005/papers/154.pdf
  
Getting back to the original question of compilers missing security
checks in the code, there was a case on Multics where the argument
validation code didn't anticipate the possibility of the argument list
containing pointers with the increment-and-tally option turned on.  The
effect was that every time such a pointer was used, it pointed to a
different place.  Knowing that the argument validator touched the
pointer N times, I carefully set up the increment and tally so that the
first N references went to locations the user had proper write
permission to (and therefore passed validation), while the N+1st
reference, made by the actual code of the program rather than the
validator, pointed to something the user did NOT have write permission
to.  The operating system thus implemented a function to patch any
location of memory - more than enough to take over the whole system.
This was on the GE-645 processor, where argument validation was done by
software.  The follow-on Honeywell 6180 processor had hardware argument
validation that would defeat this trick.
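
For readers who want the shape of that attack without the Multics
details, here is a conceptual sketch in C.  Everything in it is invented
for illustration: the increment-and-tally pointer is simulated by a
descriptor that yields a different address on each dereference, and
user_writable() stands in for whatever check the software validator
performed.  The essential point is the gap between check and use: the
validator and the real code never see the same address.

    #include <stdbool.h>
    #include <stddef.h>

    /* Simulated increment-and-tally pointer: every dereference steps to
     * the next address in a sequence chosen by the (malicious) caller. */
    struct it_pointer {
        char **targets;
        size_t next;
    };

    static char *deref(struct it_pointer *p)
    {
        return p->targets[p->next++];
    }

    /* Stand-in for the real permission check done by the software
     * validator on the GE-645. */
    static bool user_writable(const char *addr)
    {
        (void)addr;
        return true;                        /* placeholder check */
    }

    #define VALIDATION_TOUCHES 3            /* the "N" in the story */

    static bool validate_arg(struct it_pointer *arg)
    {
        /* The validator touches the pointer N times; the attacker makes
         * sure those N addresses are all legitimately writable. */
        for (int i = 0; i < VALIDATION_TOUCHES; i++)
            if (!user_writable(deref(arg)))
                return false;
        return true;
    }

    void system_call(struct it_pointer *arg, char value)
    {
        if (!validate_arg(arg))
            return;
        /* The N+1st dereference is made by the real code, after the
         * pointer has stepped past everything the validator checked,
         * so it can land on memory the user may not write - a
         * patch-anything primitive. */
        *deref(arg) = value;
    }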


         - Paul

