Bayesian Email Filtering

Ever since I read Jim Daniel’s article on SitePoint about Bayesian spam filtering, I’ve been wanting to try it out for myself. The article concerns a product called Spamnix, which is currently only available for Qualcomm Eudora. He listed a few suggestions for Outlook (and/or Outlook Express) users, but nothing that looked too promising.

Well, yesterday I took another look at Mozilla’s Thunderbird mail client. Thunderbird is meant as a running mate for the Firebird browser. Turns out, it has built-in Bayesian spam filtering. These Mozilla people think of everything, don’t they?

Bayesian filtering is an adaptive form of spam filtering: it learns from the messages you mark as junk. This means there’s a training period of about two weeks before it starts to get really accurate. Still, since everyone’s email habits are different (spam is in the eye of the beholder), a filter that adapts to you is the best solution I’ve seen so far.

Rev. Thomas Bayes pioneered the math involved in Bayesian filters way back in the 1700s. It’s all about conditional probabilities: how likely a message is to be spam, given the words it contains. Who’d have thought it would be applied to technologies this guy probably never dreamed about? You can learn more about the technique behind Bayesian filtering in an article called “A Plan for Spam,” by Paul Graham.
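For the curious, here’s a rough Python sketch of the general idea: count how often each word turns up in mail you’ve marked as spam versus mail you’ve kept, then combine the per-word odds for a new message. To be clear, this is my own toy version. The class name, the method names, the add-one smoothing, and the cutoff of 15 words are all assumptions made for illustration; it isn’t Thunderbird’s code, and it isn’t the exact formula from Graham’s article.

```python
import re
from collections import Counter


class TinyBayesFilter:
    """Toy word-probability spam filter (illustrative only)."""

    def __init__(self):
        self.spam_words = Counter()  # word -> number of spam messages containing it
        self.ham_words = Counter()   # word -> number of good messages containing it
        self.spam_count = 0
        self.ham_count = 0

    @staticmethod
    def tokenize(text):
        # Lowercase the text and keep each distinct word once per message.
        return set(re.findall(r"[a-z0-9$'-]+", text.lower()))

    def train(self, text, is_spam):
        # This is the "training period": every message you mark updates the counts.
        words = self.tokenize(text)
        if is_spam:
            self.spam_words.update(words)
            self.spam_count += 1
        else:
            self.ham_words.update(words)
            self.ham_count += 1

    def word_spamminess(self, word):
        # Add-one smoothing keeps every probability strictly between 0 and 1,
        # so unseen words come out neutral instead of breaking the math.
        p_word_in_spam = (self.spam_words[word] + 1) / (self.spam_count + 2)
        p_word_in_ham = (self.ham_words[word] + 1) / (self.ham_count + 2)
        return p_word_in_spam / (p_word_in_spam + p_word_in_ham)

    def spam_probability(self, text, max_words=15):
        # Use only the words whose score is furthest from a neutral 0.5.
        scores = sorted(
            (self.word_spamminess(w) for w in self.tokenize(text)),
            key=lambda p: abs(p - 0.5),
            reverse=True,
        )[:max_words]
        spam_score = 1.0
        ham_score = 1.0
        for p in scores:
            spam_score *= p
            ham_score *= 1.0 - p
        return spam_score / (spam_score + ham_score)


if __name__ == "__main__":
    mail_filter = TinyBayesFilter()
    mail_filter.train("cheap pills buy now", is_spam=True)
    mail_filter.train("meeting notes attached", is_spam=False)
    print(mail_filter.spam_probability("buy cheap pills"))
```

A brand-new filter scores everything at a neutral 0.5, and the numbers only drift toward 0 or 1 as you feed it more marked messages. That’s why these filters need a training period before they’re any good.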

Anyway, I’ve been playing with Thunderbird now for a couple of days. Just downloaded it at work. It’s an alright little mail program. Every bit as good as Outlook Express, but a bit featureless when it comes to saving attachments. Specifically, it seems to lack drag-and-drop, and for some odd reason you can’t reply to attached emails. So, it looks like I’m not gonna be able to use it for my work email (since my manager sends me thousands of attached support requests every day). The junk mail filter is already catching some spam after only a few hours of training. It’s pretty dumb though, which is to be expected. I’ll let you know if it gets smarter.