POPfile: sorting the mail

Posted on Sunday, November 16, 2003 @ 10:16 am
Filed Under Computing, POPfile, Security

When Sobig's author unleashed his spam (and bounced email) plague on us last August it became clear I needed to automate my mail sorting process; I was spending far too many hours writing rules.  After checking out the sites for a couple of filtering products I'd heard of, I decided to see if POPfile met my needs.  I loaded it on my machine, spent a couple of hours making setup decisions, and did the necessary configuration of both POPfile and Eudora.

An essential fact:  While POPfile usually functions as a spam filter, its design supports sophisticated sorting of email into a large number of categories.  I'm using it as a mail sorter; the spam filter is important, but the software's smart about all of my mail, and in a real sense the spam folder's just another target for the sorter.

Basic Information

I receive between 50 and 100 e-mails each day, and read about 60% of those (the unread ones are either duplicates or spam). I used to read about 85% of my mail; the change in percentage is largely because of the increasing spam load. (Eudora has a reporting function; these numbers have some relation to reality.) Perhaps 65% of the real mail has baseball content of some sort or other; the rest is on a wide range of topics.

These get sorted into a couple dozen categories; I tinker with these a bit, but they are essentially the same categories I used for sorting e-mail in 1995.  A large percentage of my mail originates from the Society for American Baseball Research list called SABR-L, which has its own folder; the remaining folders group mail in ways which largely reflect my mental prioritizations.  One folder, called "Lists," is the target for mailing lists on miscellaneous topics.  I sometimes ignore SABR-L for months; I check my eBay mail daily.

After reading the POPfile documentation, I decided to see how well it sorted the total daily package.  I set up "buckets" to match the folders, replaced several hundred Eudora rules with twenty-five, and set about teaching POPfile how to sort things. This story begins on August 18.

Here's my report....

First Thousand

Since you train POPfile by correcting its errors, the first few dozen messages are basically all errors and the first few hundred are unreliable.  I took an accounting after message 1,049, which arrived on September 30.
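
For readers who want to see what "training by correcting errors" looks like in practice, here is a minimal sketch. It assumes a naive Bayes word-count model, which is how POPfile is generally described, but nothing below is POPfile's actual code. The class name BucketSorter, its methods, and the sample messages are all hypothetical, made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


class BucketSorter:
    """Toy naive Bayes mail sorter that learns only from corrections.

    Hypothetical illustration; not POPfile's real implementation.
    """

    def __init__(self, buckets):
        self.buckets = list(buckets)
        self.word_counts = defaultdict(Counter)  # bucket -> word -> count
        self.message_counts = Counter()          # bucket -> messages trained

    @staticmethod
    def tokenize(text):
        # Lowercase the text and split it into simple word tokens.
        return re.findall(r"[a-z0-9']+", text.lower())

    def classify(self, text):
        words = self.tokenize(text)
        vocab = {w for counts in self.word_counts.values() for w in counts}
        total_messages = sum(self.message_counts.values())
        best_bucket, best_score = self.buckets[0], -math.inf
        for bucket in self.buckets:
            # Add-one smoothed prior and word likelihoods, summed in log space.
            prior = (self.message_counts[bucket] + 1) / (
                total_messages + len(self.buckets)
            )
            score = math.log(prior)
            bucket_total = sum(self.word_counts[bucket].values())
            for w in words:
                likelihood = (self.word_counts[bucket][w] + 1) / (
                    bucket_total + len(vocab) + 1
                )
                score += math.log(likelihood)
            if score > best_score:
                best_bucket, best_score = bucket, score
        return best_bucket

    def correct(self, text, right_bucket):
        # The user's correction: file the message's words under the right bucket.
        self.word_counts[right_bucket].update(self.tokenize(text))
        self.message_counts[right_bucket] += 1

    def process(self, text, right_bucket):
        # Guess first; train only when the guess was wrong.
        guess = self.classify(text)
        if guess != right_bucket:
            self.correct(text, right_bucket)
        return guess
```

With no training at all, every bucket scores the same and the sorter simply falls back to the first bucket, which is why the earliest messages are nearly all misfiled. Each correction adds word counts to the right bucket, so later messages with similar vocabulary start landing where they belong.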

Second Thousand

POPfile weathered its adolescence in the first half of October, and reached message 999 on October 18.

Third Thousand

This test span ended November 4 at 1,008 messages.

Since November 4

I've received 712 messages; 97.6% are being sorted correctly. Not bad, if you ask me. I'll not give you a further breakdown 'til I reach 1,000.
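
For the curious, the percentage can be turned back into message counts. This is only a back-of-the-envelope sketch; the function name implied_counts is hypothetical, and it assumes the 97.6% figure is rounded to one decimal place.

```python
def implied_counts(total, pct_correct):
    """Return (correctly sorted, misfiled) counts implied by a rounded percentage."""
    correct = round(total * pct_correct / 100)
    return correct, total - correct


correct, misfiled = implied_counts(712, 97.6)
print(f"{correct} sorted correctly, {misfiled} misfiled")
```

Under that assumption, 97.6% of 712 works out to 695 messages sorted correctly and 17 that needed correcting.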

POPfile's principal author, John Graham-Cumming, announced a new version a couple weeks ago, which I've not yet installed. I'll do that in a day or two.


Last changed 11/21/07 @ 10:41 pm