POPFile 0.21.1 got released a little while ago, and I noticed it supports storing its word corpus in an SQL database. So I decided to try it out and upgraded a few days ago. It was a relatively painless procedure, documented here.
I just had to install Perl on my Win2K box (ActivePerl from ActiveState) along with a couple of modules.
POPFile with the MySQL backend runs a bit slower than it did using the flatfile BerkeleyDB. My machine running MySQL isn’t the fastest in the world, so I blame that. I imagine it would be significantly faster with the DB server running locally.
Having the corpus and word matrix stored in an SQL database makes it easy to see what’s going on. You can see which words have been classified, which buckets they belong to and how many times they occur. It also makes it easier to dig out some statistics (if you want to do such things) about POPFile’s word classification.
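To give an idea of the kind of digging I mean, here is a rough sketch in Python. It does not touch a real POPFile database: it builds a tiny in-memory SQLite stand-in, and the table and column names (`buckets`, `words`, `matrix`, `wordid`, `bucketid`, `times`) are my assumptions about the layout, not something copied from the actual schema, so check your own database before borrowing the queries. The sample rows are made up too.

```python
import sqlite3


def demo_db():
    """Build a small in-memory stand-in for the corpus (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE buckets (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE words   (id INTEGER PRIMARY KEY, word TEXT);
        CREATE TABLE matrix  (wordid INTEGER, bucketid INTEGER, times INTEGER);

        -- Made-up sample data, for illustration only.
        INSERT INTO buckets VALUES (1, 'spam'), (2, 'inbox');
        INSERT INTO words   VALUES (1, 'viagra'), (2, 'meeting'), (3, 'free');
        INSERT INTO matrix  VALUES (1, 1, 40), (3, 1, 25), (3, 2, 2), (2, 2, 12);
        """
    )
    return conn


def top_words(conn, bucket, limit=5):
    """Return (word, times) pairs for one bucket, most frequent first."""
    return conn.execute(
        """
        SELECT w.word, m.times
        FROM matrix AS m
        JOIN words   AS w ON w.id = m.wordid
        JOIN buckets AS b ON b.id = m.bucketid
        WHERE b.name = ?
        ORDER BY m.times DESC, w.word ASC
        LIMIT ?
        """,
        (bucket, limit),
    ).fetchall()


def bucket_totals(conn):
    """Return (bucket, distinct words, total occurrences), sorted by bucket name."""
    return conn.execute(
        """
        SELECT b.name, COUNT(*), SUM(m.times)
        FROM matrix AS m
        JOIN buckets AS b ON b.id = m.bucketid
        GROUP BY b.name
        ORDER BY b.name
        """
    ).fetchall()


if __name__ == "__main__":
    conn = demo_db()
    print(top_words(conn, "spam"))
    print(bucket_totals(conn))
```

Against a real MySQL backend the same two `SELECT` statements should work from the `mysql` command line once the names are adjusted to whatever your tables are actually called.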
There are a few other changes to POPFile that make the 0.21.1 upgrade worthwhile. Switching to a MySQL backend from the default involves a little more work, and if you don’t care about it, stick with the default.
If spam is a problem for you, and you want something simple to help manage it, I definitely recommend POPFile. Since the upgrade, POPFile’s gone through 3000+ messages with only 24 misclassifications so far, which works out to an error rate under 1%.
Now if only I could get my GroupWise mail to go through POPFile as well…