nanog mailing list archives
RE: Spam with no purpose?
From: Paul Jakma <paul () clubi ie>
Date: Sat, 3 Apr 2004 01:00:58 +0100 (IST)
On Wed, 31 Mar 2004, Michel Py wrote:
1. Reduce the efficiency of Bayesian-like filters: Trouble with this kind of email is that they are a) of sufficient length b) contain only "real" words c) contain none of the words regularly used by spammers such as the v. word.
Good Bayesian filters do not score on single words alone; they also score on "phrases" (i.e. multiple words). Random strings of words will result in neutral scores (presuming those words are also used in non-spam), while the phrases will be slightly higher. Re-used gibberish (i.e. apparently random) strings of words will result in "phrases" from that gibberish having high scores.

Also, a good Bayesian filter should regularly prune its database of phrases (including one-word phrases) that have not had their score updated recently, further reducing "pollution" by random words and phrases.

Noise is just noise. The spam-specific stuff will still be statistically significant, hopefully.

regards,
--
Paul Jakma paul () clubi ie paul () jakma org Key ID: 64A2FF6A
warning: do not ever send email to spam () dishone st
Fortune:
It's currently a problem of access to gigabits through punybaud.
-- J. C. R. Licklider
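
To make the phrase-scoring and pruning ideas in the message above concrete, here is a minimal sketch. It is not taken from any particular filter: the class name `PhraseFilter`, the use of two-word phrases, the add-one smoothing, the log-odds combination, and the "training step" notion of age are all assumptions chosen for illustration.

```python
import math
from collections import defaultdict


class PhraseFilter:
    """Toy Bayesian-style filter over single words and two-word phrases.

    Hypothetical illustration only; real filters differ in tokenising,
    smoothing, and how they combine per-token scores.
    """

    def __init__(self):
        self.counts = defaultdict(lambda: [0, 0])  # token -> [ham, spam] message counts
        self.last_seen = {}                        # token -> step of last update
        self.totals = [0, 0]                       # messages trained: [ham, spam]
        self.step = 0                              # one step per trained message

    @staticmethod
    def tokens(text):
        """Return the set of words and adjacent word pairs in the text."""
        words = text.lower().split()
        phrases = [" ".join(pair) for pair in zip(words, words[1:])]
        return set(words + phrases)

    def train(self, text, is_spam):
        """Count each token once per message and record when it was updated."""
        label = int(is_spam)
        self.step += 1
        self.totals[label] += 1
        for tok in self.tokens(text):
            self.counts[tok][label] += 1
            self.last_seen[tok] = self.step

    def token_score(self, tok):
        """Smoothed spam probability for one token; 0.5 is neutral."""
        ham, spam = self.counts.get(tok, (0, 0))
        ham_rate = (ham + 1) / (self.totals[0] + 2)
        spam_rate = (spam + 1) / (self.totals[1] + 2)
        return spam_rate / (spam_rate + ham_rate)

    def score(self, text):
        """Combine known tokens in log-odds space; unknown tokens are skipped."""
        log_odds = 0.0
        for tok in self.tokens(text):
            if tok in self.counts:
                p = self.token_score(tok)
                log_odds += math.log(p / (1 - p))
        log_odds = max(-700.0, min(700.0, log_odds))  # avoid overflow in exp()
        return 1 / (1 + math.exp(-log_odds))

    def prune(self, max_age):
        """Drop tokens not updated within the last max_age training steps."""
        stale = [t for t, s in self.last_seen.items() if self.step - s > max_age]
        for tok in stale:
            del self.counts[tok]
            del self.last_seen[tok]
        return len(stale)


f = PhraseFilter()
f.train("the router table looks fine", is_spam=False)
f.train("please review the router config", is_spam=False)
f.train("purple walrus cabinet buy now", is_spam=True)
f.train("purple walrus cabinet the best offer", is_spam=True)

print(f.token_score("the"))            # word seen in both ham and spam
print(f.token_score("purple walrus"))  # phrase from re-used gibberish
print(f.prune(max_age=2))              # number of stale tokens removed
```

In this toy corpus a word that appears in both ham and spam stays close to neutral, while a phrase from gibberish that is re-used across spam messages scores well above 0.5. Tokens that appeared once and were never updated again are the ones `prune` removes.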