WordPress and Comment Spam

I would like to take an aside and delve into a related topic that has been on my mind for the past few months: Comment spam.

Though it doesn’t have much to do with content theft, I have several reasons for wanting to cover this. First, many of the RSS scrapers and spam bloggers also use this technique to supplement their work. Second, in some cases, the spammed comment contains scraped content, either from your site or others, making it an infringement. Finally, it is an issue that is dear to the target readers of this site: bloggers and Webmasters.

Though WordPress’ reputation against spam blogs has been almost impeccable, it has proved to be very vulnerable to comment spam. This has given rise to an entire cottage industry of anti-spam plugins and most of them, in my experience at least, have been ineffective.

This led me, about a month ago, to disable comments on all old posts. However, I have since backed away from that position because, among other reasons, it simply was not working.

The most effective comment spam plugin I know of, Akismet, is made by Automattic, the operators of WordPress.com. It is a generous gift to the community and it comes at what must be great expense to Automattic since it works by letting their servers filter the millions of comments that get submitted. However, it is not perfect, by Automattic’s own admission, and it does not stop the comment spam from going through, just from appearing on the site.

Unfortunately, WordPress’ problem with comment spam runs much deeper and it renders nearly 99% of all anti-spam methods useless. However, a change on the backend could, potentially, fix that and change the comment spam game forever.

How a Comment Gets Posted (and The Problem With It)

A comment in WordPress works just like any other form.

You most likely have a comments.php file in your template that represents the actual comment form. That is embedded into your post pages via a template call. The comments.php form, upon submission, sends the comment to another file, wp-comments-post.php, which sends the comment up the chain of command and, eventually, places it in the database.

It is a simple form that works like any other. It is also the same process as when you send an email via a Web form or post to a forum. However, the problem is that, with WordPress and other spam-prone applications, the backend does not know what the frontend is doing.

Basically, with the default install, wp-comments-post.php has no way to confirm that the comment has come from comments.php or anywhere else on the domain.

Spammers, being the clever lot that they are, simply started calling the wp-comments-post.php without ever visiting the site itself. They simply call the file with a specially-formatted address and, magically, a comment is submitted though the bot never set foot on the actual post page.
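
To make the gap concrete, here is a minimal Python sketch of a backend that accepts whatever it is handed. It is a toy model, not WordPress's actual PHP code; the function name `handle_comment_post`, the field names and the in-memory list standing in for the database are all assumptions made for illustration.

```python
# Toy model of a comment backend that trusts every request it receives.
# All names here are hypothetical; this is not WordPress's real code.

comments_db = []  # stands in for the comments table


def handle_comment_post(form_fields):
    """Store a comment if the required fields are present.

    Note what is missing: nothing checks whether the sender ever
    loaded the page that displays the comment form.
    """
    required = ("author", "comment", "post_id")
    if not all(form_fields.get(name) for name in required):
        return "error: missing field"
    comments_db.append(dict(form_fields))
    return "ok"


# A visitor who filled in the real form...
print(handle_comment_post({"author": "Reader", "comment": "Nice post", "post_id": "42"}))
# ...and a bot that never loaded the page look identical to the backend.
print(handle_comment_post({"author": "Bot", "comment": "Buy pills", "post_id": "42"}))
```

Both calls are stored, because the handler only sees the submitted fields and has no record of how they were produced.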

This is bad news for WordPress users as nearly all spam countermeasures rely on modifications to the comments.php file to work. This includes most captchas, spam questions and even some comment disabling plugins. The spammer simply bypasses those measures, leaving only post-submission filtering to weed out the junk from the real comments.

Though, on most sites, that is a fairly effective approach, sites with large volumes of spam, such as this one, might find it unacceptable. Not only does it mean that some spam is destined to escape the filters and go live, but it can put a strain on the server, even if, as with Akismet, most of the filtering is done elsewhere.

Furthermore, if email spam has taught us anything, filtering systems are prone to the “better mouse” problem. If one clever spammer finds a way to game the system, the hull will have been breached and all could be flooded.

How Bad Is It?

The problem is rampant. Consider this screenshot taken from my own site stats yesterday.

comments.png

You can see that the wp-comments-post.php file is the fourth most called file on my server. (Note: Both the share-this.php and ajax-edit-comments files are often called multiple times in a single page, which is why they rank so high.) A quick check of the comment count shows that there is no reason for that to happen.

There are hundreds of hits per day on that file, most of which never access the site itself.
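
One rough way to measure this on your own site is to look for addresses that post to the backend without ever requesting a page. The Python sketch below assumes the access log has already been reduced to simple (IP, method, path) tuples; that format, the function name `direct_posters` and the sample addresses are hypothetical.

```python
def direct_posters(log_entries, backend_path="/wp-comments-post.php"):
    """Return IPs that POSTed to the backend but never fetched any other page."""
    visited = set()  # IPs that requested at least one ordinary page
    posted = set()   # IPs that submitted to the comment backend
    for ip, method, path in log_entries:
        if method == "GET" and path != backend_path:
            visited.add(ip)
        elif method == "POST" and path == backend_path:
            posted.add(ip)
    return sorted(posted - visited)


# Sample data using reserved documentation addresses.
sample_log = [
    ("192.0.2.10", "GET", "/2007/05/some-post/"),
    ("192.0.2.10", "POST", "/wp-comments-post.php"),
    ("198.51.100.7", "POST", "/wp-comments-post.php"),
    ("203.0.113.5", "POST", "/wp-comments-post.php"),
    ("203.0.113.5", "POST", "/wp-comments-post.php"),
]
print(direct_posters(sample_log))
```

This ignores request order and shared IP addresses, so it gives only an estimate of how much traffic skips the form.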

This has led to a whole slew of solutions to the problem. The first is to simply rename the file. However, spammers have grown wise to that method, detecting the new name in as little as ten hours.

Another is to edit your .htaccess file to block visitors from accessing the wp-comments-post.php file without first visiting your domain. I implemented this myself on Plagiarism Today but, while my comment spam volume decreased some, it did not stop. Spoofing a referrer is pretty trivial and it seems that most comment spammers are already doing that.
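
The weakness is easy to see if the check is written out. The following Python sketch models the logic of a referrer rule rather than the .htaccess syntax itself; the function name `referer_allowed` and the host `example.com` are placeholders.

```python
from urllib.parse import urlparse


def referer_allowed(headers, site_host="example.com"):
    """Accept the request only if the Referer header names our own host."""
    referer = headers.get("Referer", "")
    return urlparse(referer).hostname == site_host


# A request with no referrer is rejected.
print(referer_allowed({}))
# But the header is supplied by the client, so a bot can simply
# claim to have come from the site without ever visiting it.
print(referer_allowed({"Referer": "http://example.com/some-post/"}))
```

Because the value being checked is written by the sender, the rule only stops bots that do not bother to fill it in.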

Yet another hack involves increasing the time between comment submissions, a method that works to stop spammers that “flood” your comments, but does nothing to stop spammers who post once and then come back at a random time later to post again.
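
A minimal Python sketch of such a throttle shows both what it catches and what it misses. The class name `CommentThrottle`, the 60-second gap and the choice to key on IP address are assumptions for illustration.

```python
class CommentThrottle:
    """Reject a comment if the same IP posted less than min_gap seconds ago."""

    def __init__(self, min_gap_seconds=60):
        self.min_gap = min_gap_seconds
        self.last_seen = {}  # ip -> time of the last accepted comment

    def allow(self, ip, now):
        last = self.last_seen.get(ip)
        if last is not None and now - last < self.min_gap:
            return False  # too soon after the previous accepted comment
        self.last_seen[ip] = now
        return True


throttle = CommentThrottle(min_gap_seconds=60)
print(throttle.allow("203.0.113.5", now=0))     # first comment is accepted
print(throttle.allow("203.0.113.5", now=10))    # a flood is blocked
print(throttle.allow("203.0.113.5", now=3600))  # a patient bot gets through
```

A spammer who waits out the gap, or rotates addresses, is treated exactly like a legitimate commenter.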

One final method, which I ran across some time ago but have been unable to locate again for this article, involved inputting code into the comments.php file that would then be verified by the wp-comments-post.php file. Though it was a messy edit that involved hacking both files, it would have been, theoretically, effective. Once I locate the hack again, I will try it and see if it does indeed work.

In the end though, short of hacks and server alterations, there is no way to prevent this kind of injection. Since almost all plugins deal only with the comments.php file, there is no simple way to effectively block this kind of abuse.

Fixing the Problem

This problem is not unique to WordPress by any stretch of the imagination. None of this should be taken as a criticism of WordPress or its developers. This problem is present on other blogging platforms, message board applications and nearly anything that accepts input from the outside world and posts it to the Web. WordPress merely happens to be what I use and what I am most familiar with.

That being said, there needs to be a fix for this problem. There needs to be some way for the backend, wp-comments-post.php, to ensure that the comment actually came from the frontend, comments.php.

One solution involves using a generic anti-spam question in the comments file but then hacking the wp-comments-post.php file to die if the answers do not match. Thus, anyone calling the backend directly without knowledge of the question would get an error.

However, a static method, like the one described in the post, could be easily beaten by a spammer just adding the variable to their software. A more random implementation, such as the one described in the comments, would provide more protection but could still be figured out if needed since computers are very good at math.
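
The weakness of a fixed question can be sketched in a few lines of Python. The field name `spam_answer`, the example question and the function name are assumptions for illustration and are not taken from the hack referenced above.

```python
# Hypothetical fixed answer to a question such as "What color is the sky?"
EXPECTED_ANSWER = "blue"


def check_static_answer(form_fields):
    """Backend-side check: refuse the comment unless the answer matches."""
    answer = form_fields.get("spam_answer", "")
    return answer.strip().lower() == EXPECTED_ANSWER


# A bot calling the backend blindly is turned away...
print(check_static_answer({"comment": "Buy pills"}))
# ...until its author reads the form once and hard-codes the extra field.
print(check_static_answer({"comment": "Buy pills", "spam_answer": "blue"}))
```

Since the answer never changes, learning it once is enough to defeat the check on every later submission.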

I am not a programmer, but what seems to be needed is a means for the two files to handshake with one another in a way that a spammer can not crack. One example might be to create a hash of the comment using a key that exists only on the server. Another would be to use a pseudorandom variable such as a random number generator, the time on the CPU clock or anything else the two files could share. Another idea would be to have the backend check the WP log and ensure that, at the very least, the IP address involved visited the post page in question before commenting.

(Note: The above suggestions are offered “off the cuff” and probably would not work. Please post suggestions and ideas in the comments.)
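
For readers who want something to experiment with, the server-side key idea might look roughly like the Python sketch below. It is one possible interpretation, not a tested WordPress fix. The comment text is not known when the form is rendered, so this version signs the post ID and a timestamp instead of hashing the comment itself. The names `issue_token` and `verify_token`, the secret value and the one-hour lifetime are all assumptions.

```python
import hashlib
import hmac
import time

# Assumption: a secret that lives only on the server and is never sent out.
SECRET_KEY = b"replace-with-a-long-random-server-side-secret"
MAX_AGE_SECONDS = 3600  # hypothetical lifetime of a rendered form


def _sign(post_id, timestamp):
    message = f"{post_id}:{timestamp}".encode()
    return hmac.new(SECRET_KEY, message, hashlib.sha256).hexdigest()


def issue_token(post_id, now=None):
    """Frontend side: called when the form is rendered, placed in a hidden field."""
    timestamp = str(int(time.time() if now is None else now))
    return f"{timestamp}:{_sign(post_id, timestamp)}"


def verify_token(post_id, token, now=None):
    """Backend side: accept only a fresh token that was signed for this post."""
    now = int(time.time() if now is None else now)
    try:
        timestamp, signature = token.split(":", 1)
        age = now - int(timestamp)
    except (AttributeError, ValueError):
        return False  # missing or malformed token
    expected = _sign(post_id, timestamp)
    fresh = 0 <= age <= MAX_AGE_SECONDS
    # compare_digest avoids leaking information through comparison timing
    return fresh and hmac.compare_digest(signature.encode(), expected.encode())


token = issue_token(post_id=42, now=1000)
print(verify_token(42, token, now=1010))      # token from the form for this post
print(verify_token(42, "made-up", now=1010))  # direct call with a guessed token
```

A bot can still load the post page and copy the token from the form, so this does not stop spam by itself. What it does is force every submission to start from the form page, where captchas and similar tests can be applied.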

This would not be easy. It might require rethinking the entire comment posting process, but certainly there has to be a way to at least improve the situation so spammers cannot, with ease, abuse the system.

I am open to any and all suggestions on the process. Please comment below if you have any thoughts.

Some Brief Good News

I did, recently, run across some good news in this fight. I installed reCAPTCHA on my blog a few days ago as an experiment. Though it didn’t stop the flow of spam comments, it did improve Akismet’s accuracy greatly.

It appears that, for whatever reason, Akismet has an easier time dealing with comment spam when it comes almost solely from the backend. Since I installed reCAPTCHA, I have not had any spam comments go live or enter the moderation queue.

I plan to continue the experiment for at least a few more days to see if that trend continues.

(UPDATE: I just received an email from Ben Maurer, the tech lead on the reCAPTCHA project. He said that reCAPTCHA counts the spam it eats as spam in WordPress, which could explain why Akismet seems to be so accurate. Still, what intrigues me most is that no spam has gotten all of the way through. It seems logical that reCAPTCHA is blocking the spam that actually uses the form, which was the spam getting through from time to time, while Akismet easily handles the spam directly injected through the backend.)

(UPDATE 2: As my education on reCAPTCHA continues, it appears that the plugin DOES validate against comments injected into the backend. That officially makes this my favorite anti-spam plugin.)

Conclusions

Closing this backdoor will not be easy, nor will it obliterate comment spam. However, channeling it through the traditional forms makes it possible to apply various Turing tests to weed out the bots.

In short, it won’t put an end to comment spam or replace filtering, but at least it will add an extra line of defense.

Right now, WordPress users are just one clever spammer away from a tidal wave of spam. If someone can find a way to beat Akismet and other spam filtering plugins, there is no backup plan.

Perhaps now, while the situation is somewhat in hand, it is time we started working on one.

30 Responses to WordPress and Comment Spam

  1. Preston says:

    Go to the WordPress plugin site and search for “Bad Behavior”. It is an excellent first line of defense, and when used in conjunction with Akismet it makes a huge cut into spam. It says it prevented 810 attempts on my site in the last 7 days.

  2. JB says:

    Preston,

    I tried Bad Behavior out about a year ago and it was a mixed bag for me. It did reduce the spam level some but it created a decent amount of false positives as well. It also put a drag on my server.

    Granted, the plugin has probably been upgraded and I have a new host now so it would probably move faster, but I am uneasy about going back to it. Besides, it seems to only target bots and bots are getting better at not looking like bots.

    Still, it is a worthwhile suggestion. If anyone reading this has great success with BB, please let me know!

  3. Maria says:

    SpamKarma is EXTREMELY effective. I use it in conjunction with Bad Behavior, which reduces the number of spam attempts. Of the attempts that get onto my site, 95% of them are caught and killed by SpamKarma.

  4. Maria says:

    But I think you’re making a HUGE mistake by including your e-mail address in plain text on your site. All the spambots in the world will be grabbing that address. Your e-mail box will soon be overflowing with junkmail. Use a contact form!

  5. JB says:

    Maria,

    SK2 is a great plugin. Especially when you use it with the Akismet addon. However, with it running at full power, I noticed a bad slowdown when posting comments. I got complaints about it previously.

    I switched to Akismet to offload all of that filtering and it worked well.

    Now I’m pretty much sold on reCAPTCHA though. I’ll be writing more about it soon.

    Oh, and regarding my email address, it’s not in plain text. The address on the right is an image with a link to the contact form. Unless I missed something, it shouldn’t be in text anywhere…

  6. Webd360 says:

    All I use right now is Akismet, but I might try out spam karma too, some people say it’s good and others say it isn’t. Guess I’ll have to try for myself to find out…

  7. JB says:

    Webd360,

    SK2 works good, especially with the Akismet plugin, but bear in mind that it will cause a load on your server. With most installs of WP, there is a noticeable slowdown when posting a comment and during spam attacks.

    If you’ve got a great server, go for it, if not, I’d be more wary. I have a decent set up but I feel better with Akismet and reCAPTCHA.

    Thanks for the input!

  8. John Bennett says:

    I’m going to have to look into adding this. I’ve had to moderate too much comment spam in the last few weeks and I’m getting tired of it. Akismet might be a good solution for my site for now. Thanks.

  9. JB says:

    John Bennett,

    Definitely give Akismet a try. But definitely also consider reCAPTCHA if you can. I’ve fallen head over heels for that solution these past few weeks.

    Let me know how Akismet works for you!

  10. Preston says:

    Now that’s irony in action.

  11. JB says:

    Preston: it is strange. I’m banning the IP behind those two spams and then forwarding the messages on to reCAPTCHA for analysis.

  12. ardamis says:

    I’ve written a quick example of how to use a handshake between comments.php and wp-comments-post.php to deter spammers.

    http://www.ardamis.com/2007/12/15/using-timestamps-to-reduce-wordpress-comment-spam/

    I should also mention that renaming wp-comments-post.php is only effective in the long-term if the spammers are prevented from discovering the new file. You can do this by hiding the path in an external javascript or with various other methods.

  13. JB says:

    Ardamis: That’s a very interesting hack you have there and it is indeed what I was talking about. Might I ask how it has worked for you?

    I know that reCAPTCHA, which is what I use right now, uses a form of a handshake but one based upon the CAPTCHA. I’ve seen others that worked on hidden fields but I could see how the timestamp method would be much more reliable and difficult for a bot to duplicate.

    I have to say I like what I’ve read.

    The only thing I don’t like is that every time I update WP I’d have to reapply the hack. Is there any way to make this a plugin?

    I’d like to hear more about this…

  14. Jeff says:

    @JB – I use a combo of WP-SpamFree and Akismet. It’s a perfect combo. WP-SpamFree absolutely kills anything from spambots, and Akismet takes care of the junk pingbacks and trackbacks. WP-SpamFree has only been around for a couple months, but it’s awesome.

  15. Jeff: Thanks for the tip. I’m downloading it now and will give it a try!

  16. A good plugin is tantan for WordPress. It blocks the spam before it even goes to Akismet based on keywords. You can add the keywords to a list manually or by clicking on ones they found in your Akismet.
