The Big Void In WordPress Comment Spam Protection
By Angsuman Chakraborty, Gaea News Network
Tuesday, January 16, 2007
I have experimented with all the available WordPress plugins for comment spam protection, including but not limited to Bad Behaviour, Spam Karma 2 and Akismet, as well as built-in WordPress features like blacklists and the moderation queue. We have to deal with tens of thousands of spam comments every day. The key problem with all of these plugins is a high rate of false positives, along with some false negatives. In plugins where false negatives are low (like Spam Karma or Bad Behaviour), false positives are unacceptably high. A false positive is a legitimate comment identified as spam; a false negative is a spam comment that is not identified as spam. While false negatives are an annoyance, false positives are a much bigger problem: they can cost you valuable comments, feedback and even business opportunities. I speak from first-hand experience.
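To make the two error types concrete, here is a minimal Python sketch that computes both rates from a hand-labelled sample of comments. The function name and the sample counts are made up for illustration; they are not measurements from any of the plugins discussed.

```python
def error_rates(labelled):
    """labelled: iterable of (is_spam, flagged_as_spam) boolean pairs.

    Returns (false_positive_rate, false_negative_rate).
    """
    labelled = list(labelled)
    ham_flags = [flagged for is_spam, flagged in labelled if not is_spam]
    spam_flags = [flagged for is_spam, flagged in labelled if is_spam]
    # False positive: a legitimate comment that was flagged as spam.
    fp_rate = sum(ham_flags) / len(ham_flags) if ham_flags else 0.0
    # False negative: a spam comment that was not flagged.
    missed = sum(not flagged for flagged in spam_flags)
    fn_rate = missed / len(spam_flags) if spam_flags else 0.0
    return fp_rate, fn_rate


# Hypothetical sample: 100 legitimate comments with 5 wrongly flagged,
# and 1000 spam comments with 20 missed.
sample = (
    [(False, True)] * 5 + [(False, False)] * 95
    + [(True, True)] * 980 + [(True, False)] * 20
)
fp, fn = error_rates(sample)
print(f"false positives: {fp:.1%}, false negatives: {fn:.1%}")
```

Measured this way, the two rates have different denominators: false positives are counted against legitimate comments only, which is why a plugin can look excellent on overall accuracy and still lose a worrying share of real feedback.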
In brief, my experiences with these plugins are:
Bad Behaviour has in the past prevented legitimate comments from appearing on my blog. Posting a comment would silently fail, displaying a blank page. The error was sporadic, which made it harder to debug. After several months of occasional complaints from my users I finally realized Bad Behaviour was to blame. There have been a few releases since, but I haven’t looked at it again. I have also heard complaints from other users about its lack of support.
Spam Karma 2 used to be a venomous plugin. It has been known to insult legitimate commenters on a blog after misjudging them as spammers. I have been told it has cleaned up its potty mouth, but the underlying problem remains: it uses over-aggressive techniques, which lead to a high incidence of false positives. Installation used to be a problem, so much so that its author used to ship a version of WordPress with the plugin installed! I have had bad experiences with it in the past and strongly advise my friends and clients against using it.
I personally looked through the code of Bad Behaviour and found several over-aggressive, hard-to-justify pieces of logic which lead to its high false positive rate.
I too provide an anti-spam plugin: Referrer Bouncer. Unlike its counterparts, Referrer Bouncer doesn’t normally give false positives. However, it requires active management of the list for best performance, which may not be possible for average Joe bloggers. Also, Referrer Bouncer tackles only one class of spam: referrer spam, or spam with a referrer payload. While that is an important category, a lot of spam these days doesn’t come with a referrer payload.
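The general idea of a managed referrer list can be sketched in a few lines of Python. This is only an illustration of the technique, not the actual Referrer Bouncer code; the function name and the example domains are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical, hand-maintained list of referrer domains known to send spam.
BLOCKED_REFERRER_DOMAINS = {"spam-example.test", "pills-example.test"}


def is_blocked_referrer(referer_header, blocked=BLOCKED_REFERRER_DOMAINS):
    """Return True only when the Referer host is a listed domain or one of its subdomains."""
    if not referer_header:
        return False  # no referrer payload: this check has nothing to say
    host = (urlsplit(referer_header).hostname or "").lower()
    return any(host == domain or host.endswith("." + domain) for domain in blocked)


print(is_blocked_referrer("http://www.spam-example.test/page"))
print(is_blocked_referrer("http://example.org/a-real-blog"))
```

Because a request is rejected only on an exact match against the list, a check like this errs toward letting things through, which keeps false positives rare. The price is the one mentioned above: the list is only as good as its upkeep, and spam without a referrer payload passes untouched.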
Let’s talk about Akismet, a popular anti-spam plugin from the creators of WordPress. Akismet is a blunder in terms of vision and, to some extent, architecture. Akismet works by relying on individual bloggers to train it to identify spam. While that looks good in theory, in practice there are two types of bloggers: bloggers and sploggers. Spam bloggers, or sploggers, have made a game of gaming Akismet, as it is very easy to do so. You can, for example, write a simple script to tell Akismet that a certain legitimate blogger is a spammer, and in future all his comments will be marked as spam. The reverse is also true. Today I get several hundred spam comments a day which have passed through Akismet. I also get some legitimate comments marked as spam and held in the moderation queue by Akismet. Unfortunately I am unable to even look at my Akismet queue, as there are several thousand entries in it. My pet blog has over 5000 entries in the manual moderation queue which have passed through Akismet. My browser fails to even load that page!
The other problem with Akismet is the size of the Akismet queue, which holds comments for manual review and training. Unfortunately any popular blogger is likely to get several thousand spam comments in the Akismet queue, making it virtually impossible to manually separate spam from ham. Akismet doesn’t even provide paging of that screen, a minor technological glitch compared to the humongous mistake in vision of relying on any blogger to help it.
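Paging is the easy part. As a rough Python sketch (the function name and the page size of 50 are assumptions, and a real plugin would more likely page in the database query than in memory), a queue of 5000 entries becomes 100 screens that a browser can actually load:

```python
def paginate(items, page, per_page=50):
    """Return (page_items, total_pages) for a 1-based page number."""
    if page < 1 or per_page < 1:
        raise ValueError("page and per_page must be positive")
    total_pages = max(1, -(-len(items) // per_page))  # ceiling division
    start = (page - 1) * per_page
    return items[start:start + per_page], total_pages


# Hypothetical moderation queue of 5000 held comments.
queue = [f"comment {n}" for n in range(1, 5001)]
first_page, pages = paginate(queue, page=1)
print(len(first_page), pages)
```

Paging makes the screen usable, but it does nothing for the larger problem: nobody is going to review a hundred pages of spam by hand.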
Many bloggers prefer stacking anti-spam plugins, such as Akismet with Spam Karma 2 or Akismet with Bad Behaviour. Unfortunately the effects are even worse, and in many cases unpredictable. These plugins haven’t been designed or tested to play well with each other, and it takes a lot of testing to ensure that you aren’t breaking something. Also, combining two plugins, each of which gives false positives, is only going to compound the problem. You will often find someone in the wild praising plugin x or y. In reality most of them don’t understand how these plugins work, and they don’t know or don’t care how many legitimate and valuable comments they are missing.
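A back-of-the-envelope Python sketch shows how stacking compounds false positives. It assumes a comment is rejected if any one plugin flags it and that the plugins err independently; both are simplifying assumptions, and the rates used are hypothetical, not measurements of any real plugin.

```python
def combined_false_positive_rate(rates):
    """Chance that a legitimate comment is flagged by at least one stacked filter.

    Assumes the filters make their mistakes independently of each other.
    """
    pass_all = 1.0
    for rate in rates:
        if not 0.0 <= rate <= 1.0:
            raise ValueError("each rate must be between 0 and 1")
        pass_all *= 1.0 - rate  # comment survives this filter too
    return 1.0 - pass_all


# Two hypothetical plugins with 2% and 3% false positive rates.
print(f"{combined_false_positive_rate([0.02, 0.03]):.2%}")
```

Under these assumptions the stacked rate is always at least as high as the worst single plugin, and close to the sum of the individual rates when they are small.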
There are two other aspects of spam blocking which you should be aware of. Anti-spam plugins like Spam Karma or Akismet rely on MySQL database queries to help them identify and/or store spam, which increases your database load. Akismet also relies on communicating with its server to identify spam. So not only are you receiving this spam, but your server and your database are being loaded as well, and you are wasting bandwidth communicating with external servers. It is no coincidence that many WordPress bloggers are being booted out of their shared hosting environments and forced to move to VPS or dedicated hosting. I moved to dedicated hosting a year ago. After extensive tests I clearly identified that the majority of the load on my server is due to spam comment processing.
The key to comment spam prevention is understanding the psyche of a spammer (more on that later). A good spam prevention plugin should at the very least ensure zero (or extremely close to zero) false positives. If that means a few false negatives, that is acceptable. Anti-spam plugins should also be stackable, or at least have their own plugin architecture. Any takers?