PopFile occasional report

Readers will recall that I’ve been using John Graham-Cumming’s PopFile as a spam filter/mail sorter for over a year.  Time, methinks, for another update; I last mentioned the program in July.  I covered the important background information about a year ago, and shan’t repeat it today; you may also want to check the PopFile cross-references below to see what I’ve said before.

I’m still using version 0.20.1, which puts me a couple of editions behind.  Since I’m satisfied with my version’s performance and don’t want to fight my way through the Mac upgrade process, I’ll likely stay here for a while; John and his team will need to add something compelling for me to change.  (A proper Macintosh install would be helpful….)

This version’s a little slow on my machine but not to the point it bothers me; your mileage may well vary.

On to specific results, again organized as I’ve done in past entries:

The test period ended November 11, 2004, at 7,952 messages.

  • 93 (1.2%) were sent to the wrong bucket.
    • (Therefore) 98.8% were sent to the right bucket.
    • This is my first report that didn’t include significant training, so roughly 99% looks like the “norm” for my system.  One way to read this stat is that I decide to reclassify about one message per day.  Better than writing rules….
  • 3,219 (40.5%) were spam.  (This is a decrease from the previous 46.6%, which would seem to merit comment.  Not sure what that comment should be, though.)
  • There are areas where the app has, well, issues:
    • 431 messages were auction-related, with 10 false positives and 3 false negatives.  (As you might surmise, I’m again active on eBay.)  There’s enough noise in auction e-mail that some errors are inevitable.  PopFile is very good, though, at spotting eBay and PayPal phishing messages.
    • The sorter has significant problems getting my mailing lists right (407 messages, 14 false positives, 8 false negatives), mostly because they cover a wide range of territory.
      • On the other hand, last time there were 48 false positives; it’s learning….
    • Vendor mail (118 messages, 8 false positives, 9 false negatives) is another bucket with some problems.  Again that’s likely because I catch a number of types of messages there.
    • I’ve pretty much abandoned the effort to get the Change Detection mail into the right boxes, and am effectively treating the whole set as one mailbox.  It’s more trouble than it’s worth, I’ve apparently decided; the app’s just refusing to notice how those emails differ.  Since these are mainly baseball-related sites, the issue’s not currently important.  Next spring I may try something.
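
For anyone who wants to check the arithmetic, here is a minimal Python sketch that recomputes the percentages from the counts listed above.  The names (`pct`, `bucket_error_rates`, `BUCKETS`) are mine, invented for illustration; nothing here comes from PopFile itself.  Treating (false positives + false negatives) / messages as a per-bucket "error rate" is also my own assumption about how to read those counts, not something the program reports.

```python
# Counts copied from the report above; all names are illustrative only.
TOTAL_MESSAGES = 7952
MISCLASSIFIED = 93
SPAM = 3219

# (bucket, messages, false positives, false negatives)
BUCKETS = [
    ("auctions", 431, 10, 3),
    ("mailing lists", 407, 14, 8),
    ("vendor mail", 118, 8, 9),
]


def pct(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


def bucket_error_rates(buckets):
    """Rough per-bucket error rate: (false + and false -) over messages.

    Assumption: both kinds of error are measured against the same
    message count, which is only an approximation.
    """
    return {name: pct(fp + fn, n) for name, n, fp, fn in buckets}


overall_error = pct(MISCLASSIFIED, TOTAL_MESSAGES)
overall_accuracy = round(100.0 - overall_error, 1)
spam_share = pct(SPAM, TOTAL_MESSAGES)
rates = bucket_error_rates(BUCKETS)

print(f"wrong bucket: {overall_error}%")
print(f"right bucket: {overall_accuracy}%")
print(f"spam share:   {spam_share}%")
for name, rate in rates.items():
    print(f"{name}: {rate}% misfiled")
```

The overall figures come out to the 1.2%, 98.8%, and 40.5% quoted above.  By this rough measure the problem buckets run at about 3.0% (auctions), 5.4% (mailing lists), and 14.4% (vendor mail), which squares with vendor mail being the weakest of the three.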

On the whole, this is excellent performance, with some minor (and predictable) blind spots due to peculiarities that are as much mine as the program’s.  Except for the lack of a good loader for Apple systems, I can heartily recommend the program; the installation issues appear to be unique to the Mac platform, and shouldn’t trouble Windows or Linux users.  Prospective users shouldn’t expect perfection, and some effort is required to train PopFile about your mail system.  But it’s automatic, reliable, and quite impressive.
