The Big Void In WordPress Comment Spam Protection

By Angsuman Chakraborty, Gaea News Network
Tuesday, January 16, 2007

I have experimented with all the available WordPress plugins for comment spam protection, including but not limited to Bad Behavior, Spam Karma 2 and Akismet, as well as built-in WordPress features like blacklists and the moderation queue. We have to deal with tens of thousands of spam comments every day. The key problem with all of these plugins is a high rate of false positives, along with some false negatives. In plugins where false negatives are low (like Spam Karma or Bad Behavior), false positives are unacceptably high. A false positive is a legitimate comment identified as spam; a false negative is a spam comment that is not caught. While false negatives are an annoyance, false positives are a much bigger problem: they cause you to lose valuable comments, feedback and even business opportunities. I speak from first-hand experience.
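As a rough illustration of the two error types described above, both can be measured as rates over a hand-labelled batch of comments. The function name and the sample data below are made up for the example and are not taken from any of the plugins discussed.

```python
def error_rates(results):
    """Compute (false_positive_rate, false_negative_rate).

    results is a list of (is_spam, flagged_as_spam) pairs, where is_spam is
    the true label and flagged_as_spam is what the filter decided.
    """
    ham_flags = [flagged for is_spam, flagged in results if not is_spam]
    spam_flags = [flagged for is_spam, flagged in results if is_spam]
    # False positive: a legitimate comment that was flagged as spam.
    fp_rate = sum(ham_flags) / len(ham_flags) if ham_flags else 0.0
    # False negative: a spam comment that was not flagged.
    fn_rate = sum(not f for f in spam_flags) / len(spam_flags) if spam_flags else 0.0
    return fp_rate, fn_rate


# Hypothetical sample: 4 legitimate comments (1 wrongly flagged)
# and 6 spam comments (1 missed).
sample = [
    (False, False), (False, False), (False, True), (False, False),
    (True, True), (True, True), (True, True),
    (True, True), (True, True), (True, False),
]

fp_rate, fn_rate = error_rates(sample)
print(f"false positive rate: {fp_rate:.2f}")
print(f"false negative rate: {fn_rate:.2f}")
```

In this made-up sample one in four legitimate comments is lost, which is the figure a blogger should care about most.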

In brief, my experiences with these plugins are:

Bad Behavior had in the past prevented legitimate comments from appearing on my blog. Posting a comment used to fail silently, displaying only a blank page. The error was sporadic, which made it harder to debug. After several months of sporadic complaints from my users I finally realized Bad Behavior was to blame. There have been a few releases since, but I haven’t looked at it again. I have also had complaints from other users about its lack of support.

Spam Karma 2 used to be a venomous plugin. It has been known to insult legitimate commenters of a blog after misjudging them as spammers. I have been told its potty mouth has improved, but the underlying problem remains: it uses over-aggressive techniques which lead to a high incidence of false positives. Installation used to be a problem, so much so that its author used to ship a version of WordPress with the plugin installed! I have had bad experiences with it in the past and strongly advise my friends and clients against using it.

I personally looked at the code for Bad Behavior and found several pieces of over-aggressive, hard-to-justify logic which lead to its high false positive rate.

I too provide an anti-spam plugin - Referrer Bouncer. Unlike its counterparts, Referrer Bouncer doesn’t normally give false positives. However, it requires active management of its list for best performance, which may not be possible for average Joe bloggers. Also, Referrer Bouncer tackles only one class of spam - referrer spam, or spam with a referrer payload. While it is an important category, a lot of spam these days doesn’t come with a referrer payload.

Let’s talk about Akismet, a popular anti-spam plugin from the creators of WordPress. Akismet is a blunder in terms of vision and, to some extent, architecture. It works by relying on individual bloggers to train it to identify spam. While that looks good in theory, in practice there are two types of bloggers - bloggers and sploggers. Spam bloggers, or sploggers, have made a game of gaming Akismet, as it is very easy to do so. You can, for example, write a simple script to tell Akismet that a certain legitimate blogger is a spammer, and in future all his comments will be marked as spam. The reverse is also true. Today I get several hundred spam comments a day which have passed through Akismet. I also get some legitimate comments marked as spam and held in the moderation queue by Akismet. Unfortunately I am unable to even look at my Akismet queue, as there are several thousand entries in it. My pet blog has over 5000 entries in the manual moderation queue which have passed through Akismet. My browser fails to even load that page!

The other problem with Akismet is the size of the Akismet queue, which holds the comments for manual review and training. Unfortunately any popular blogger is likely to get several thousand spam comments in the Akismet queue, making it virtually impossible to manually tell spam from ham. Akismet doesn’t even provide paging of that screen, a minor technological glitch compared to the humongous mistake in vision of relying on any blogger to help it.

Many bloggers prefer stacking anti-spam plugins, like Akismet with Spam Karma 2 or Akismet with Bad Behavior. Unfortunately the effects are even worse and, in many cases, unpredictable. These plugins haven’t been designed or tested to play well with each other, and it requires lots of testing to ensure that you aren’t breaking something. Also, adding two plugins, each of which gives false positives, is only going to compound the problem. You will often find someone in the wild praising plugin x or y. In reality most of them don’t understand how these plugins work, and they don’t know or don’t care how many legitimate and valuable comments they are missing.
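To see why stacking compounds the problem, here is a small back-of-the-envelope sketch. It assumes, purely for illustration, that each plugin misjudges legitimate comments independently of the others and that a comment is blocked if any one plugin flags it. The rates of 2% and 3% are made-up numbers, not measurements of any real plugin.

```python
def stacked_false_positive_rate(rates):
    """Chance that a legitimate comment is flagged by at least one filter.

    Assumes the filters err independently (a simplifying assumption).
    """
    passes_all = 1.0
    for rate in rates:
        # Probability the legitimate comment survives this filter too.
        passes_all *= 1.0 - rate
    return 1.0 - passes_all


# Hypothetical false positive rates for two stacked plugins.
combined = stacked_false_positive_rate([0.02, 0.03])
print(f"combined false positive rate: {combined:.4f}")
```

With those assumed rates the combined false positive rate comes out at roughly 4.9%, worse than either plugin on its own.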

There are two other aspects of spam blocking which you should be aware of. Anti-spam plugins like Spam Karma or Akismet rely on MySQL database queries to help them identify and/or store spam, which increases your database load. Akismet also relies on communicating with its server to identify spam. So not only are you getting this spam, your server as well as your database is being loaded, and you are wasting bandwidth communicating with external servers. It is not a coincidence that many WordPress bloggers are being booted out of their shared hosting environments and forced to go for VPS or dedicated hosting. I moved to dedicated hosting a year ago. After extensive tests I clearly identified that the majority of the load on my server is due to spam comment processing.

The key to comment spam prevention is understanding the psyche of a spammer (more on that later). A good spam prevention plugin should at least ensure zero (or extremely close to zero) false positives. If that means a few false negatives, that is acceptable. Anti-spam plugins should be stackable or at least have their own plugin architecture. Any takers?
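What a stackable design might look like is sketched below. Everything here is hypothetical (the Verdict type, the two example filters, the trusted address and the banned phrases); it is not the API of WordPress or of any existing plugin. Each filter either gives a definite verdict or says it is unsure, and when no filter is sure the comment is published, accepting a few false negatives in exchange for fewer false positives.

```python
from enum import Enum


class Verdict(Enum):
    HAM = "ham"
    SPAM = "spam"
    UNSURE = "unsure"


def known_commenter_filter(comment):
    """Hypothetical filter: trust commenters on a hand-kept whitelist."""
    trusted = {"alice@example.com"}
    return Verdict.HAM if comment.get("email") in trusted else Verdict.UNSURE


def blacklist_filter(comment):
    """Hypothetical filter: flag comments containing a banned phrase."""
    banned = ("cheap pills", "casino bonus")
    text = comment.get("text", "").lower()
    return Verdict.SPAM if any(phrase in text for phrase in banned) else Verdict.UNSURE


def classify(comment, filters):
    """Run filters in order; the first definite verdict wins."""
    for spam_filter in filters:
        verdict = spam_filter(comment)
        if verdict is not Verdict.UNSURE:
            return verdict
    # No filter was sure: publish rather than risk a false positive.
    return Verdict.HAM


# Order matters: the whitelist is consulted before the blacklist.
chain = [known_commenter_filter, blacklist_filter]
print(classify({"email": "bob@example.com", "text": "Casino bonus here"}, chain))
```

Because every filter shares the same small interface, adding or reordering filters does not require them to know about each other.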

Discussion

Heather
January 20, 2008: 3:55 am

I agree about Bad Behavior… I’ve had problems with it too, and I’ve read it sometimes blocks Google. One time it locked me and everyone else out of my blog. My rankings on one blog went up as soon as I disabled it.

October 16, 2007: 5:02 am

Hi Angsuman

Just wondering if you know how to force WordPress blacklist deletion to take place *before* Akismet kicks in. That would seem to be the best thing for it to do.

January 16, 2007: 2:33 pm

Sorry to keep this so brief, but I’m in a bit of a hurry and can elaborate on this later if you’d like.

I’m currently using the combined efforts of Bad Behavior and Akismet and experience, on average, one false negative per month. I also block the IPs of repeat offenders (more than five spam comments submitted per day) for a maximum of seven days.

Judging by your experience with Bad Behavior, I strongly recommend that you try the most recent version (currently v2.0.9). Most of the recent fixes have been geared toward dramatically lowering the amount of false positives. More information is available on Michael Hampton’s Bad Behavior blog, Lunacy Unleashed.

I also recommend that you try the most recent version of Akismet (currently v1.2.1). The current version introduced a paginated view of spam comments, which dramatically shortens load times, but I’m not sure if it paginates the moderation queue, as I have never had a moderation queue as large as yours.

January 16, 2007: 9:20 am

@ Matt
An even simpler configuration is to prevent commenting at all :)
Seriously though, what I am trying to highlight is the problem of genuine comments being misidentified as spam. If you care about your readers (and, for some, potential clients) then you should look closely into what these plugins are doing, and not simply at the fact that they are preventing spam (along with ham).

Quix0r> Even when I use anonymous surfing I’m able to comment on my blog.

That is not an indicator of success. Check the BB code to see more about what it is doing.

> But you need to know that we plug-in coders (like mine) are doing our job (coding cool plug-ins) for free - so in our free-time. So we don’t earn money from it.

I understand the pain and limitations of free plugin authors. I too offer several popular WordPress plugins for free.

The underlying fact is that, as the product is free, it also comes with limited support (time permitting) and zero liability. While that is fully understandable from the plugin authors’ point of view, it may not be acceptable from the point of view of many bloggers, who would be willing to pay for high quality plugins and software to maintain high standards for their blogs.

> Well, I don’t want money for my plug-in (see my blog for instance, no adverts here!) because it’s not a commercial one. But I want you to know that I have already spent lots of time on my plug-in.

I understand your sentiments fully, as I explained above. However, the key point is that as the platform matures, so does the need for high quality products which are well supported. It is with this view that I released my first paid plugin - Translator Plugin Pro, which provides translation of WordPress blogs into 14 languages. But I digress. I am not criticising the free plugin authors per se. I am in the same seat as they are. I am simply pointing out some limitations of current anti-spam products and solutions. There is a poem by the poet and Nobel laureate Rabindranath Tagore which, roughly translated into English, says:
“I close the doors to prevent lies from entering my mind, I then also close the door for truth”. Replace lies with spam and truth with ham and you can see what I am trying to convey :)

Thanks for all of your insightful comments.

January 16, 2007: 8:40 am

Really? Is BB2 doing such bad things to your blog? I haven’t noticed any since I installed it. Even when I use anonymous surfing I’m able to comment on my blog. :)

Well, yes. BB2 lacks support and config options. But you need to know that we plug-in coders (like mine) are doing our job (coding cool plug-ins) for free - so in our free-time. So we don’t earn money from it. Well, I don’t want money for my plug-in (see my blog for instance, no adverts here!) because it’s not a commercial one. But I want you to know that I have already spent lots of time on my plug-in.

I guess Michael Hampton can say the same on this point. He has a real life and a real job (I hope so?) and BB2 is being developed in his free-time, too. And the same for SK2…

But I will keep BB2, SK2 and Akismet (even given the security/privacy concerns you discovered, because I have the knowledge to hack the plug-in a little) because I want to help Michael and Dr. Dave by testing their software on my blog.

January 16, 2007: 8:17 am

I use Akismet and Bad Behavior: BB blocks all unusual attempts at the door of the blog and Akismet takes care of any spam that *could* have gotten through. I have to say those two stack pretty well, as I used to have thousands of spam comments every day and it’s been months since I last got one.

No more checking the Akismet queue, yay !

January 16, 2007: 5:07 am

Thank you for your post, it was an interesting read, although I do not agree with your assessment of Akismet. I have been very happy with it since my blog took off and started receiving 200+ visitors per day. I have not so far received a false positive, but one or two might have slipped past my otherwise keen eyes :-)

I am curious to see how Bad Behavior will turn out. I actually happened to install it yesterday, and one of the things I was immediately annoyed with was the lack of information/configuration. It has, however, removed several hundred attempted referrer spam visits that I usually suffer from.

I will give it a try for a week or two, and see what happens.
