The other day I linked to this Akismet criticism. If you follow the comment thread, it seems to me the original poster’s case gets weaker and weaker as the argument continues. I know of three false positives among the 37,000+ comments that Akismet has flagged for me. That’s better than I do by hand, because when you sift through a few hundred or a few thousand of these, it gets more and more likely that you will miss some. Comment spam that is trying to get through moderation works the same way ack-ack did in World War I air battles. Throw enough of it up there and you are bound to hit something.
There’s another phenomenon that I’m experiencing. I just had to deal with a few of these this morning. These are comments where I genuinely cannot tell whether they are spam or not. They tend to fit a pattern: they come from Europe, their website link points to a commercial site, but they contain a few sentences or paragraphs that are actually applicable to the post they are replying to. My guess is that the senders write these comments up in advance, search Google or Technorati for blogs that match, and submit them to all of them.
In general, these never get approved. The judgment call that comes back to me is whether I simply delete them or mark them as spam so the Akismet engine learns from them. I’m not always sure what to do, because in some cases I’m not 100% sure they are spam. Ultimately my heuristic comes down to this: the benefit of the doubt is gone with comments. If I can’t tell that a comment is legitimate, it doesn’t go on the site. If I can’t tell that it is spam, I don’t report it back to Akismet. In those cases I simply delete it.
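Written out as code, that rule might look roughly like the sketch below. It is only an illustration of the decision, not anything Akismet or my blog software actually runs: the names `Verdict`, `Action` and `triage` are made up for the example, and nothing here calls the Akismet service.

```python
from enum import Enum


class Verdict(Enum):
    """My own read of a comment in the moderation queue (hypothetical names)."""
    LEGITIMATE = "legitimate"
    SPAM = "spam"
    UNSURE = "unsure"


class Action(Enum):
    """What I do with the comment (hypothetical names)."""
    APPROVE = "approve"          # goes on the site
    REPORT_SPAM = "report_spam"  # marked as spam so the filter can learn from it
    DELETE = "delete"            # silently dropped, nothing reported


def triage(verdict: Verdict) -> Action:
    """Map a moderator's verdict to an action, with no benefit of the doubt."""
    if verdict is Verdict.LEGITIMATE:
        return Action.APPROVE
    if verdict is Verdict.SPAM:
        return Action.REPORT_SPAM
    # Can't tell either way: it does not get approved, but it is not
    # reported as spam either. It is simply deleted.
    return Action.DELETE
```

The point of the third branch is that a doubtful comment never reaches the site, but it also never gets fed back to the filter as a confirmed spam example.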
Once again, let me highlight that when you post comments on blogs, it is up to you to be distinguishable from spam. Be distinctive and have a human voice. If your comment looks a lot like the 100 spam comments around it in the moderation queue, it is going into the bit bucket.