A Plan for Spam

Writes Paul Graham:

To the recipient, spam is easily recognizable. If you hired someone to read your mail and discard the spam, they would have little trouble doing it. How much do we have to do, short of AI, to automate this process?

I think we will be able to solve the problem with fairly simple algorithms. In fact, I’ve found that you can filter present-day spam acceptably well using nothing more than a Bayesian combination of the spam probabilities of individual words. Using a slightly tweaked Bayesian filter, we now miss less than 5 per 1000 spams, with 0 false positives.
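To make "a Bayesian combination of the spam probabilities of individual words" concrete, here is a minimal Python sketch. It is an illustration, not the filter described in the essay: the names combine, spam_score and WORD_PROBS are hypothetical, the per-word probabilities are made-up numbers rather than values learned from real mail, and the default of 0.4 for unseen words and the cutoff of 15 words are assumed parameters. The combining rule itself treats the words as independent evidence.

```python
from math import prod  # requires Python 3.8+

# Hypothetical per-word spam probabilities; a real filter would
# estimate these from counts in spam and non-spam corpora.
WORD_PROBS = {
    "viagra": 0.99, "free": 0.90, "offer": 0.85,
    "meeting": 0.05, "notes": 0.10, "attached": 0.15,
}

def combine(probs):
    """Combine per-word spam probabilities, assuming independence.

    Returns prod(p) / (prod(p) + prod(1 - p)).
    """
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)

def spam_score(message, word_probs, unknown=0.4, top_n=15):
    """Score a message using its most 'interesting' distinct words.

    unknown and top_n are assumed parameters for this sketch.
    """
    words = set(message.lower().split())
    probs = [word_probs.get(w, unknown) for w in words]
    # Keep the words whose probability is farthest from a neutral 0.5.
    probs.sort(key=lambda p: abs(p - 0.5), reverse=True)
    return combine(probs[:top_n])

if __name__ == "__main__":
    for text in ("free viagra offer", "meeting notes attached"):
        print(f"{text!r}: {spam_score(text, WORD_PROBS):.4f}")
```

With these made-up numbers the first message scores close to 1 and the second close to 0. A message is then classed as spam when its score exceeds some chosen threshold, and raising that threshold trades missed spam for fewer false positives.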

Slashdot thread
