Full Disclosure mailing list archives
Re: Spam with PGP
From: "Jonathan A. Zdziarski" <jonathan () nuclearelephant com>
Date: Wed, 08 Oct 2003 00:18:24 -0400
Bayesian filters have had some amazing successes. The problem we (the company I work for) continue to have, and the reason we continue to choose SA, is that training a thousand users on how to use a Bayes system is pretty much impossible (and we're small compared to many!). Assuming that I give you that Bayes is the best theoretical solution (I do not believe it, but will grant it for the sake of argument), the Bayes folks have a problem in implementation. Training users is not easy; think about training your mother or grandmother, but multiply by 1000.
This is why two features exist, both of which I think are components of any good Bayesian solution:

1. User groups. The ability to clump together a large group of users who share similar email behavior, so that they share one dictionary and one spam alias. This is ideal for departments within corporations where email is expected to be primarily for company use.

2. A merge tool. A tool that will allow an administrator (or script) to merge the dictionaries from N users over a large span of diversity to create a single seeded dictionary for a new user.

This should solve a majority of your problem. Granted, seeded dictionaries still take a "little" bit of learning, but it's a lot easier for granny to get going with one of them.
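To make the merge idea concrete, here is a minimal sketch in Python. It is not the actual merge tool: the `Dictionary` layout (per-token spam and innocent hit counts plus message totals), the `merge_dictionaries` name, and the optional `min_users` filter are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Dictionary:
    """Hypothetical per-user token dictionary (layout assumed for illustration)."""
    spam_hits: Counter = field(default_factory=Counter)      # token -> hits in spam
    innocent_hits: Counter = field(default_factory=Counter)  # token -> hits in good mail
    spam_messages: int = 0       # total spam messages learned
    innocent_messages: int = 0   # total innocent messages learned


def merge_dictionaries(dictionaries, min_users=1):
    """Sum the counts from several users into one seeded dictionary.

    min_users is an assumed knob: tokens seen by fewer users than this are
    dropped, so one person's quirks do not leak into the seed.
    """
    dictionaries = list(dictionaries)  # we iterate twice
    seeded = Dictionary()
    seen_by = Counter()  # token -> number of users whose dictionary contains it

    for d in dictionaries:
        seeded.spam_messages += d.spam_messages
        seeded.innocent_messages += d.innocent_messages
        for token in set(d.spam_hits) | set(d.innocent_hits):
            seen_by[token] += 1

    for d in dictionaries:
        for token, hits in d.spam_hits.items():
            if seen_by[token] >= min_users:
                seeded.spam_hits[token] += hits
        for token, hits in d.innocent_hits.items():
            if seen_by[token] >= min_users:
                seeded.innocent_hits[token] += hits

    return seeded
```

A new user would start from the returned dictionary and then diverge as their own corrections are learned. Whether plain summation is the right weighting across very different users is a design choice this sketch does not settle.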
The point is not that you are wrong; indeed, I'll accept that a perfectly trained Bayes DB may produce better results than any other technology right now, and that a tech-savvy user may generate such a perfect Bayes DB. The point is that spam is a global problem: unless your solution can be extended to all users, there is no point IMHO.
Global tools are also an invaluable asset in fighting spam. We're working on a magical blacklisting tool that will capture source IPs from incoming spam. When a threshold is exceeded, all incoming messages from that source IP are marked/learned as spam for all users (system wide) for whatever time period we specify. Mechanisms like this, along with some newer ideas for networking dictionaries, will, I'm confident, help remove much of the learning curve from Bayesian filters.

Note, however, that the learning process does not require a tech-savvy user. For example, we specifically sculpted our tool to be brain-dead easy for grandma. You get your mail like normal, and if you get a spam you forward it to grandma-spam () yourdomain com. There are even tools such as SpamSource (for Outlook) that can make this process a simple click of a button. The signature mechanism we use stores the original token set in binary format in a temporary database on the server (or in the form of message attachments), which our tool will then use to relearn the message as spam.

We're working now to try to find a way to eliminate the need for checking a quarantine. Checking it is unnecessary anywhere from 99.9% (worst) to 99.99% (best) of the time, and even though there's a button you just click called 'THIS IS NOT SPAM', it would be nice if we could eliminate the need to check the quarantine unless alerted to do so under certain statistical conditions.

Anyhow, my point is that we're trying to improve the ease-of-use factor, which is a big reason tools like SA are still useful: out-of-the-box functionality. That doesn't necessarily mean, however, that heuristics are not obsolete from a scientific perspective. I think we're getting to a point where enough tools exist to make a deployment just as easy, and hopefully, if things continue at the rate they're going, companies like yours that require this level of ease will be able to use Bayesian solutions.
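The source-IP blacklisting idea described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the tool itself: the `SourceBlacklist` class, its method names, the default threshold, and the default block period are all hypothetical, and the injectable `clock` exists only to make the behavior easy to test.

```python
import time


class SourceBlacklist:
    """Hypothetical system-wide blacklist keyed by source IP."""

    def __init__(self, threshold=5, block_seconds=3600.0, clock=time.monotonic):
        self.threshold = threshold          # spam count that must be exceeded
        self.block_seconds = block_seconds  # how long a source stays blocked
        self.clock = clock                  # injectable time source
        self.spam_counts = {}               # source IP -> spam seen so far
        self.blocked_until = {}             # source IP -> expiry timestamp

    def record_spam(self, source_ip):
        """Count one spam from source_ip; block it once the threshold is exceeded."""
        count = self.spam_counts.get(source_ip, 0) + 1
        self.spam_counts[source_ip] = count
        if count > self.threshold:
            self.blocked_until[source_ip] = self.clock() + self.block_seconds
            self.spam_counts[source_ip] = 0  # start counting afresh after blocking

    def is_blocked(self, source_ip):
        """Return True while source_ip is inside its block period."""
        expiry = self.blocked_until.get(source_ip)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self.blocked_until[source_ip]  # block period has run out
            return False
        return True
```

In a delivery pipeline, a message whose source passes `is_blocked` would be marked and learned as spam for every user, and any message a user reports as spam would feed `record_spam`. The sketch keeps counts forever; a real deployment would presumably also age out old counts.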