Akismet and Django to stop comment spam

Fighting comment spam - without punishing your users


As some of you may know, I built this blog from scratch in Django. Having a blog whose code I've written myself has been simple and clean so far, but a few things were lacking.

Comment spam: my new nemesis..

The most obvious was comment spam. Initially I (stupidly) thought there was little chance of comment spam because my form doesn't fit the structure that WordPress and similar platforms use; I was hoping for security through obscurity. This worked for the first few weeks after I relaunched the blog, but then I started copping a fairly steady stream of spam. It was enough that for a few days I had to turn off the email notifications for comments, to stop the few readers who commented properly from being spammed by my own blog.

Most people's response seems to be to serve up a captcha-style image that forces the user to type out the horribly distorted letters of some arbitrary word in order to post their comment. Or maybe they provide a simple sum that the user must answer in order to successfully post.

I HATE these solutions. I have no idea why the real user is being punished because of spammers; this is far from ideal.

The beauty is, there are solutions out there that can help without making our valued users suffer as a result. I personally hate typing captcha text just to share my opinions.

Akismet, my saviour..

Akismet provides free API keys for personal websites, and their comment spam API really does work. What follows is a very short tutorial on how I implemented simple comment spam filtering on my blog: possible spam comments are placed into a moderation state, which lets me approve them manually.

Tutorial: Using Akismet in Django

Step 1: prepare our models.

Step one: we need to add a BooleanField to our comment model to store the moderation state. Below is my comment model; as you can see, I've added the could_be_spam boolean field to store my spam state. I've chosen to keep the default as False, relying on Akismet to identify spam and mark it accordingly.

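    from django.db import models

    class Comment(models.Model):
        name = models.CharField(max_length=32)
        email = models.CharField(max_length=128)
        url = models.CharField(max_length=128, blank=True)

        # True while the comment is held for moderation
        could_be_spam = models.BooleanField(default=False)

        body = models.TextField()
        pub_date = models.DateTimeField('date published', auto_now_add=True)

        # IPAddressField was removed in Django 1.9; use it on older versions
        ip_address = models.GenericIPAddressField()
        # on_delete is required from Django 2.0 onwards
        article = models.ForeignKey(Article, on_delete=models.CASCADE)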

Step 2: Support the Akismet API

You'll need to download and install the Akismet Python API implementation somewhere on your Python path (or in the same directory as the code that uses it). You can get Akismet for Python here. You'll also need an Akismet API key, which you can get by signing up at wordpress.com.

Now that our model can store the spam state, all we need to do is call the Akismet API to check new comments at post time. I've added this code in my article posting view. The following snippet shows the Akismet Python code that I've used:

    from akismet import Akismet

    a = Akismet('<API KEY HERE>', blog_url='http://soyrex.com/')

    # Details about the commenter that Akismet uses to judge the comment
    akismet_data = {}
    akismet_data['user_ip'] = comment.ip_address
    # .get() avoids a KeyError when the client sends no User-Agent header
    akismet_data['user_agent'] = request.META.get('HTTP_USER_AGENT', '')
    akismet_data['comment_author'] = comment.name
    akismet_data['comment_author_email'] = comment.email
    akismet_data['comment_author_url'] = comment.url
    akismet_data['comment_type'] = 'comment'

    # True if Akismet thinks the comment is spam
    is_spam = a.comment_check(comment.body, akismet_data)

As you can see, I've provided Akismet with as much information as I can about my potential spammer: IP address, user agent, name, email, URL and the comment body itself. We now have a variable named is_spam which tells us whether or not Akismet thinks this comment is spam.
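
If you'd rather keep the view tidy, the dictionary can be built in a small helper. This is only a sketch: build_akismet_data is a hypothetical name, not part of the Akismet library, and the stand-in comment object exists purely so the example runs without Django.

```python
from types import SimpleNamespace


def build_akismet_data(comment, meta):
    """Collect the fields passed to Akismet into one dictionary.

    `comment` is any object with ip_address, name, email and url attributes;
    `meta` is a mapping such as Django's request.META.
    """
    return {
        'user_ip': comment.ip_address,
        # .get() avoids a KeyError when a client sends no User-Agent header
        'user_agent': meta.get('HTTP_USER_AGENT', ''),
        'comment_author': comment.name,
        'comment_author_email': comment.email,
        'comment_author_url': comment.url,
        'comment_type': 'comment',
    }


# Stand-in comment (hypothetical values) so the sketch runs without Django
fake_comment = SimpleNamespace(
    ip_address='203.0.113.7',
    name='Alice',
    email='alice@example.com',
    url='',
)
data = build_akismet_data(fake_comment, {'HTTP_USER_AGENT': 'ExampleBrowser/1.0'})
```

In the view this would be called as build_akismet_data(comment, request.META) and the result passed to comment_check as before.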

Step 3: We've identified our spam, now handle it..

Following this, we just use a simple if statement to check our spam status and fire off our code:

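    from django.core.mail import send_mail
    from django.http import HttpResponseRedirect

    if is_spam:
        comment.could_be_spam = True
        # send_mail(subject, message, from_email, recipient_list)
        send_mail('<subject>', '<msg>', '<from email>', ['<email>'], fail_silently=False)
        comment.save()

        return HttpResponseRedirect('<return address>')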

In this case, we set the could_be_spam field we created on our model to True and then use Django's send_mail function to send myself an email saying that the comment needs moderating.
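
One case the snippet above doesn't cover is the Akismet call itself failing (network trouble, a bad key and so on). A possible way to handle that, sketched here under my own assumptions, is to hold the comment for moderation whenever the check can't be completed. check_comment and the two stub classes are hypothetical names used only for illustration; the only thing assumed about the checker is that it has a comment_check(body, data) method like the Akismet instance `a` above.

```python
def check_comment(checker, body, data):
    """Return True when the comment should be held for moderation.

    `checker` is any object with a comment_check(body, data) method.
    If the call fails for any reason, the comment is held rather than
    published unchecked.
    """
    try:
        return bool(checker.comment_check(body, data))
    except Exception:
        # Could not get an answer: err on the side of manual review
        return True


class UnreachableChecker:
    """Stub that simulates the spam service being down."""
    def comment_check(self, body, data):
        raise IOError('service unreachable')


class HamChecker:
    """Stub that always reports the comment as legitimate."""
    def comment_check(self, body, data):
        return False


held_on_failure = check_comment(UnreachableChecker(), 'hello', {})
held_when_ham = check_comment(HamChecker(), 'hello', {})
```

Holding on failure means a flaky connection costs you some extra moderation emails rather than letting spam through; whether that trade-off suits your blog is up to you.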

Step 4: Modify our lists etc to support comments..

The final task is to make our comment listing pages, and anything else that displays comments, filter on the could_be_spam boolean so that potential spam comments are not shown. The comment listing on my article model now looks like this:

    comment_list = self.comment_set.filter(could_be_spam=False)
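
The other half of moderation is releasing a comment that was held by mistake. A minimal sketch, assuming a hypothetical approve_comment helper (not something from my blog's code) and a stand-in class so the example runs without Django:

```python
def approve_comment(comment):
    """Release a held comment so it shows up in public listings again."""
    comment.could_be_spam = False
    comment.save()
    return comment


class FakeComment:
    """Stand-in for the Comment model; it only records that save() ran."""
    def __init__(self):
        self.could_be_spam = True
        self.saved = False

    def save(self):
        self.saved = True


held = FakeComment()
approve_comment(held)
```

With a real Comment instance, the same function clears the flag and saves the row, after which the could_be_spam=False filter picks it up.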

Anything else?

So what more? I'll leave it to you how you deal with comments that have been marked as spam. So far my Akismet spam filtering has caught 100% of my spam without ANY false positives (obviously your joy may vary). Best of all, I feel warm and fuzzy, because I'm not punishing my users for trying to share their opinions.
