Vulnerability Development mailing list archives

Re: Plain text files in internet explorer


From: "Daniel Newby" <dnewby () nomadics com>
Date: Wed, 04 Sep 2002 18:35:03 -0500

At 08:42 AM 9/3/2002 -0700, Marc Slemko wrote:
On Mon, 2 Sep 2002, Dan Kaminsky wrote:

> I'm serious; we have an extension <-> filetype LUT in the web server,
> the one component that cares least about the content, and it's breaking
> at precisely this point.  Extensions are file types.  Period.
[snip]

There is no such thing as a filename in a URL, and no such thing as a
filename extension in a URL.  Heck, let's look at the most common
case: a path ending in a trailing "/", such as http://www.example.com/foo/
How do you know if that is plain text, HTML, XML, etc.?

Hear, hear! The misconception that URLs are names in a filesystem is far too common. Print this out and hang it on your wall: a URL is an abstract resource locator.
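To make the point concrete, here is a minimal sketch in Python. The helper name url_extension is made up for illustration; it simply treats the URL path as if it were a filename and asks the standard library for its "extension". For the two example URLs in this thread there is nothing to find.

```python
import posixpath
from urllib.parse import urlsplit


def url_extension(url):
    """Return whatever looks like a filename extension in the URL path.

    Hypothetical helper for illustration only: it pretends the path is a
    filename, which is exactly the assumption being criticized here.
    """
    path = urlsplit(url).path          # query string and fragment are dropped
    return posixpath.splitext(path)[1]  # '' when the last segment has no dot


# Neither URL carries anything that could serve as a file type hint.
for url in (
    "http://www.example.com/foo/",
    "http://www.example.com/stock-quote?symbol=IBM&symbol=INTC",
):
    print(url, "->", repr(url_extension(url)))
```

Both calls return an empty string, so a client that keys on extensions has nothing to go on and must fall back to the Content-Type the server sends.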

And this isn't a matter of URLs or extensions being abused by web designers. Web servers can and do serve different content to different clients, based on the headers sent by the client. Consider a URL for a popular stock quote service that looks like this: "http://www.example.com/stock-quote?symbol=IBM&symbol=INTC". The site designer is going to optimize that page so that everybody gets the most pleasing experience possible. Netscape 4.x might get text/html with HTML that works around its bugs, WebTV might get text/html with a simplified page layout, IE might get text/xhtml, a cellphone might get text/vnd.wap.wml, and an automated data extractor might get text/xml.

Trying to handle this sort of variability using extensions would destroy the WWW. Many links would be to files of the wrong type. Users would be forced to manually deduce where the extension was in the URL, manually substitute the one they think their web browser wants, and pray that the web server could actually handle that extension.
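A rough sketch of what such a server might do, in Python. Everything here is an assumption for illustration: the function name choose_content_type, the plain dict of request headers with exact-case keys, and the crude substring checks. Real content negotiation also weighs q-values in the Accept header and often looks at User-Agent too. The point is only that one URL can legitimately map to several media types.

```python
def choose_content_type(headers):
    """Pick a response media type for one URL from the client's headers.

    Hypothetical, deliberately naive: substring checks on the Accept
    header stand in for real content negotiation.
    """
    accept = headers.get("Accept", "")

    # A WAP phone that advertises WML gets WML.
    if "text/vnd.wap.wml" in accept:
        return "text/vnd.wap.wml"

    # An automated extractor that asks for XML and not HTML gets XML.
    if "text/xml" in accept and "text/html" not in accept:
        return "text/xml"

    # Everyone else gets ordinary HTML.
    return "text/html"


# Same URL, three clients, three different content types.
print(choose_content_type({"Accept": "text/vnd.wap.wml"}))
print(choose_content_type({"Accept": "text/xml"}))
print(choose_content_type({"Accept": "text/html, */*"}))
```

No single extension glued onto the URL could describe all three responses, which is why the type has to travel in the response's Content-Type header.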


From a security perspective, the more complex the behaviour the
harder it is to analyze, predict, and filter.  Using the MIME type
properly is about the simplest behaviour possible.  Having some
complex system of sometimes looking at the MIME type, sometimes
looking at what you think is a filename, and sometimes looking at
the content to guess what to do with it is a nightmare, and has
already been the direct cause of a number of security holes in IE.
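For comparison, the "simplest behaviour possible" might look something like the following Python sketch. The HANDLERS table, the action strings, and the function name action_for are all invented for illustration and do not describe how any real browser is written. The client looks at the Content-Type header and nothing else: not the URL, not the body.

```python
# Hypothetical table of what the client does with each media type.
HANDLERS = {
    "text/plain": "show as plain text",
    "text/html": "render as HTML",
    "text/xml": "hand to XML parser",
}


def action_for(content_type_header):
    """Decide what to do with a response from its Content-Type alone."""
    # Drop parameters such as "; charset=us-ascii" and normalize case,
    # since media type names are case-insensitive.
    media_type = content_type_header.split(";", 1)[0].strip().lower()
    # Unknown types are never guessed at; the user is offered a download.
    return HANDLERS.get(media_type, "offer to save")


print(action_for("text/plain; charset=us-ascii"))
print(action_for("application/x-unknown"))
```

There is exactly one input and one lookup, so a filter or an auditor can predict the outcome from the header alone. Each extra heuristic (extension, content sniffing) adds another input an attacker can play against the others.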

It also has implications for personal trust and reputation. Depending on what URL the backend database comes up with, or what keyword data I innocently put in a file, IE might make it silently vanish for some users, or appear to be corrupt. An unsophisticated user would tend to view the problem as incompetence on my part.

Outlook has similar problems. Placing "begin" at the beginning of a line can cause part of the message to vanish, and certain valid Content-type headers will cause attachments to disappear. The latter makes the sender look like a fool who doesn't know how to attach files, or couldn't pay enough attention to. (I read one anecdote from a Un*x user who'd been sending out resumes for a year and couldn't figure out why he was getting an *abysmal* response rate. It turns out that HR people tend to use Outlook, and they didn't see his attached resume. "Please see attached file? Hmm...no paperclip. Ha! This idiot couldn't even remember to attach his *resume*. [delete]")

    -- Daniel Newby, speaking for myself


