Bugtraq mailing list archives

Re: HTML email "bug", of sorts.

From: PSE-L () mail professional org (Sean Straw / PSE)
Date: Mon, 20 Aug 2001 21:41:24 -0700

At 15:33 2001-08-20 -0600, Bear Giles wrote:

1) run them through a simple filter for image tags.  With regex,
the pattern could be as simple as "<img ([^>]+)>", case insensitive.
You might need to include some backslash quotes.

.. which immediately screws up _CODE_ embedded into messages. "Here, joe, the solution to the niggling problem is to replace the code in somefunction with <img src..."

KLUNK. This method would have broken valid code - code which may be expected to be copied and pasted as-is.
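
To make the objection concrete, here is a minimal Python sketch of the quoted pattern applied to a plain-text message that carries code. The message text and the file name in it are made up for illustration; only the pattern comes from the suggestion above.

```python
import re

# The pattern quoted above, compiled case-insensitively.
IMG_TAG = re.compile(r"<img ([^>]+)>", re.IGNORECASE)

# Hypothetical plain-text message containing HTML meant to be pasted as-is.
message = (
    "Here, joe, replace the code in somefunction with\n"
    '<img src="logo.gif" height="40" width="120">\n'
    "and the page will render properly.\n"
)

# A filter that deletes every match also deletes the code joe needed.
filtered = IMG_TAG.sub("", message)
print(filtered)
```

The filter has no way of telling that this tag was content rather than markup, so the line joe was supposed to copy is gone.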

For everything that matches, look for any height and width attributes
for the image.  If it's 1, you have a web bug.  Even if it's 2-8 or so,
it's probably still a web bug.

And for code embedded in valid pages, it may not be. How about images without explicit height and width attributes? Many clients don't show a preview, or at least show an outline (even on single pixel images), so the size wouldn't matter in email. In fact, the 'web bug' could just as easily be a *REGULAR GRAPHIC* (such as a horizontal rule), since you're viewing HTML email, and by the time you realize an image is being loaded - whether it is visible or not - the request has already been made.

Either comment it out or delete it.  The latter may be preferable
if you don't want to break scripts.

Now you're stuck needing to match brackets, which very likely will not work properly the instant you receive a quoted message:


> the tag <img src="some tag"
> height="1" width="1">

Where does the IMG SRC closing bracket appear when you're using a simple regexp? What if the second line doesn't appear?
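
As a rough illustration in Python, assuming the same pattern as suggested earlier and a reply that uses "> " as its quote marker, the quote marker on the second line gets taken for the closing bracket:

```python
import re

IMG_TAG = re.compile(r"<img ([^>]+)>", re.IGNORECASE)

# The quoted fragment from above, with "> " quote markers on each line.
quoted = '> the tag <img src="some tag"\n> height="1" width="1">\n'

match = IMG_TAG.search(quoted)
attrs = match.group(1)
# The match stops at the quote marker that starts the second line,
# so the height and width attributes never reach the size check.
print(repr(match.group(0)))
print("height" in attrs)

# If the second line is missing, nothing closes the tag and nothing matches.
truncated = '> the tag <img src="some tag"\n'
print(IMG_TAG.search(truncated))
```

In the first case the filter sees a tag with no size attributes at all; in the second it sees no tag.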

Arguably, if the message body is HTML, the MIME type should indicate as much, there should be an opening HTML tag (but there might not be, and email HTML renderers are pretty lax with this), and any gt and lt characters that aren't part of the HTML coding of the page would be properly escaped. Then again, what stops the spammer from obfuscating their code in the same way? Try embedding ORDINALS (numeric character references such as &#64; for "@") in your page, and a good HTML renderer will render it fine, but most regexps will fail to find a match (I use ordinals to "mailfuscate" mailto urls and even non-URL plaintext email addresses on all of my webpages - it significantly reduces spam which arrives from web-spidering spambots).
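
A small Python sketch of the ordinal trick, using only the standard library. The address and the helper name to_ordinals are hypothetical, and html.unescape stands in here for what an HTML renderer does with character references:

```python
import html
import re

def to_ordinals(text: str) -> str:
    """Encode every character as a decimal numeric character reference."""
    return "".join(f"&#{ord(ch)};" for ch in text)

address = "mailto:user@example.com"  # made-up address for illustration
encoded = to_ordinals(address)

# A naive pattern looking for the literal text finds nothing in the markup.
naive = re.compile(r"mailto:", re.IGNORECASE)
print(naive.search(encoded))

# Decoding the references, as a renderer would, recovers the original.
print(html.unescape(encoded) == address)
```

The same encoding works just as well for a spammer hiding a tracking URL as it does for hiding an address from spambots.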

Besides BGSOUND, page backgrounds and even TABLE backgrounds could utilize an embedded image, in which case, you won't even see it as an IMG SRC tag. Suddenly, your filter needs to fully parse HTML in order to have a prayer of stripping these tags.
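
For a sense of what "fully parse" means, here is a minimal sketch using Python's standard html.parser module. The class name, the attribute list, and the tracker.example URLs are assumptions made for illustration, and the attribute list is certainly not exhaustive:

```python
from html.parser import HTMLParser

# Attributes that can make a renderer fetch something; deliberately incomplete.
FETCHING_ATTRS = {"src", "background"}

class RemoteRefCollector(HTMLParser):
    """Collect (tag, attribute, value) for attributes that may trigger a fetch."""

    def __init__(self):
        super().__init__()
        self.refs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        for name, value in attrs:
            if name in FETCHING_ATTRS and value:
                self.refs.append((tag, name, value))

# Hypothetical message body with no IMG tag anywhere in it.
page = (
    '<BODY BACKGROUND="http://tracker.example/bg.gif?id=1">'
    '<BGSOUND SRC="http://tracker.example/s.wav?id=1">'
    '<TABLE BACKGROUND="http://tracker.example/t.gif?id=1">'
    "<TR><TD>hello</TD></TR></TABLE></BODY>"
)

collector = RemoteRefCollector()
collector.feed(page)
collector.close()
for ref in collector.refs:
    print(ref)
```

Three requests would go out for this body, and a filter looking only for IMG tags would not flag any of them.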

Which makes blocking (via RBL, etc) and effectively filtering spam a pretty darn good solution.

Someone mentioned having a port-80 filter on your firewall -- what of dot trackers which reference a specific port number?


        <img src="http://www.somesite.com:110/dot_tracker.file?uniqueid">

Anyone running a firewall would probably block certain services -- but all the spammer has to do is run their tracking system on a port for a standard service which a mail client would be expected to access, and that firewalling isn't going to do you much good (unless your firewall only allows access for POP3 (110) out to one specific server - joe user is unlikely to configure their machine this way, joe poweruser probably won't because they have multiple accounts, and joe corporateadmin won't because too many users check their various mail accounts from the office, and limiting them in this fashion would be too grievous).
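
A short Python sketch of why a rule keyed on port 80 misses the example URL. The is_blocked helper and the blocked-port set are hypothetical stand-ins for a firewall rule, not any real firewall's configuration:

```python
from urllib.parse import urlsplit

def is_blocked(url: str, blocked_ports: set) -> bool:
    """Model an outbound rule that looks only at the destination port."""
    parts = urlsplit(url)
    # With no explicit port, assume the HTTP default (this sketch ignores https).
    port = parts.port or 80
    return port in blocked_ports

tracker = "http://www.somesite.com:110/dot_tracker.file?uniqueid"

# A rule that stops the mail client from making web requests on port 80...
print(is_blocked("http://www.somesite.com/dot_tracker.file?uniqueid", {80}))
# ...does nothing for the same request sent to the POP3 port.
print(is_blocked(tracker, {80}))
```

Adding 110 to the blocked set would stop the tracker, but it would also stop the mail client from collecting mail.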

Sorry if I've pointed out another exploit that the spammers could use to circumvent such firewall rules.


---
 Please DO NOT carbon me on list replies.  I'll get my copy from the list.

 Sean B. Straw / Professional Software Engineering
 Post Box 2395 / San Rafael, CA  94912-2395
