22.7.14

Bloggers: How to Defend Against Content Scrapers


example of content theft from sugarpuffish blog





The problem with blogging is no one gives you a handbook when you create your blog. Even after three years I am still learning the ropes. I wanted to share a problem I discovered and had not seen discussed in the community. I figured if I don't know then there is a good chance others are in the same boat.

Every now and again, I run a quick Google search for my blog and glance over how my pages are ranking. Last week I noticed two URLs that were cause for concern (pictured above). My posts are appearing on another site, which in turn will draw traffic away from my blog. This makes for one very grumpy Sarah! I spent over an hour researching how I could get the site reported and in doing so uncovered the term "scraped content". People who scrape content use automated software that takes blog content from your RSS feed and posts it to their site as if it were a new post.

There is not much you can do to prevent scraping but there are steps which could limit the damage it causes. These are tips I have collected from reading other articles.

Contact
I found this option tricky because the website that has taken my content provides no contact information. Instead, you can use a Whois lookup to find the domain registrar or web host and get in contact with them.
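If you end up with a wall of raw Whois text (from a lookup site or the `whois` command), the two lines you usually want are the registrar and its abuse email. Here is a rough Python sketch for picking them out. It assumes the common "Key: value" layout; the field names vary between registrars, and the sample text and the function name `find_whois_contacts` are made up for illustration.

```python
def find_whois_contacts(whois_text):
    """Pick the registrar name and abuse email out of raw Whois text."""
    # Field names as they commonly appear; other registrars may label them differently.
    wanted = {
        "registrar": "registrar",
        "registrar abuse contact email": "abuse_email",
    }
    found = {}
    for line in whois_text.splitlines():
        if ":" not in line:
            continue
        # Split on the first colon only, so values containing ":" stay whole.
        key, _, value = line.partition(":")
        key = key.strip().lower()
        value = value.strip()
        if key in wanted and value and wanted[key] not in found:
            found[wanted[key]] = value
    return found


# Made-up sample, not a real lookup result.
SAMPLE = """\
Domain Name: EXAMPLE-SCRAPER.COM
Registrar URL: http://www.registrar.example
Registrar: Example Registrar, Inc.
Registrar Abuse Contact Email: abuse@registrar.example
Name Server: NS1.HOST.EXAMPLE
"""

contacts = find_whois_contacts(SAMPLE)
print(contacts)
```

If the abuse email is missing from the result, the registrar's own website is the next place to look.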

Report to Google
Firstly, submit a DMCA (Digital Millennium Copyright Act) notice to Google (click here). By submitting this form you are letting Google know about the breach of copyright and asking for the copied content to be removed from its results. A DMCA notice is the official route but I also uncovered this form (click here) allowing you to report the scraper site. I understand this will not get content removed but it allows Google to gather information, and we can only hope that it aids prevention of scraping in the future.

RSS Feed
There are two options available when it comes to your RSS feed: either shorten the post or add a footer (you can’t do both). In your Dashboard (pictured below) change "Allow Blog Feed" from FULL to SHORT so that only the first 400 characters of your blog post are visible, making it less attractive to a content thief. It shouldn't put off regular readers because they can still click through to the post and read it in full. The alternative is to add a footer which includes a link to your blog and/or social media. Use this code, swapping in your own address and blog name: <hr /> <a href="http://www.myblog.com">My Blog Name</a>







You can see from my screenshot that I added a short copyright message. Remember your message will be visible to genuine readers of your RSS feed, so keep it polite. The idea behind adding the RSS footer is that it will be seen on the scraper site, so readers are aware the content has been stolen.
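If you switch your feed from FULL to SHORT, it is worth checking that the change has taken effect. Here is a rough Python sketch that counts how many characters of each post a feed gives away. It assumes the feed is in Atom format and that you have saved a copy of it as text; the sample feed and the function name `entry_lengths` are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Standard Atom namespace, written in ElementTree's {namespace}tag form.
ATOM = "{http://www.w3.org/2005/Atom}"


def entry_lengths(feed_xml):
    """Return a (title, body length in characters) pair for each feed entry."""
    root = ET.fromstring(feed_xml)
    results = []
    for entry in root.iter(ATOM + "entry"):
        title = entry.findtext(ATOM + "title", default="(untitled)")
        # A feed may carry the post text in <content> or in <summary>.
        body = entry.findtext(ATOM + "content") or entry.findtext(ATOM + "summary") or ""
        results.append((title, len(body)))
    return results


# Made-up sample feed with a single shortened entry.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>My first post</title>
    <summary>Only the opening lines of the post appear here...</summary>
  </entry>
</feed>"""

for title, length in entry_lengths(SAMPLE_FEED):
    print(title, length)
```

Any entry that comes back much longer than 400 characters suggests the feed is still publishing posts in full.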

Submit URL to Google
I stumbled across an article about submitting your post URL to Google as soon as you publish. This may help your post get indexed sooner than the copy on the scraper website. It's all to do with the way Googlebot crawls the internet.

Set up Google Authorship
I still haven't got my head around this one because it involves Google+ and linking it to your blog. There are plenty of tutorials out there. I understand that Authorship doesn't prevent theft but it helps build credibility and allows Google to identify your content as original.

Google Alerts
A number of sites suggest using Google Alerts (click here) to find copied content. The suggestion is to set up an alert using keywords or your blog title or a paragraph from your post. You then decide what types of website to search and how often you want a report. For those that publish to their blog regularly, I envisage that your inbox would soon be overflowing so you may not like this method in the long run.
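Once an alert (or your own search) turns up a suspicious page, you can get a rough idea of how much of your post it copies. Here is a minimal Python sketch that compares runs of five consecutive words. It assumes you have pasted both texts in yourself; the function names and sample texts are made up for illustration, and this is not how Google Alerts itself works.

```python
import re


def normalise(text):
    """Lowercase the text and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", text.lower())


def shingles(words, size=5):
    """Return the set of every run of `size` consecutive words."""
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def copied_fraction(my_post, other_page, size=5):
    """Fraction of my post's word runs that also appear on the other page."""
    mine = shingles(normalise(my_post), size)
    if not mine:
        return 0.0  # post is shorter than one run of words
    theirs = shingles(normalise(other_page), size)
    return len(mine & theirs) / len(mine)


MY_POST = "The problem with blogging is no one gives you a handbook when you create your blog."
SCRAPED = "Posted by admin. " + MY_POST + " Read more"
UNRELATED = "Today I baked a lemon cake and it was lovely."

print(copied_fraction(MY_POST, SCRAPED))
print(copied_fraction(MY_POST, UNRELATED))
```

A result close to 1 means nearly the whole post appears on the other page, while a result close to 0 means the alert was probably a false alarm.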

Watermark Images
It is debatable whether this is worth the effort. I'm not a fan of watermarking because photographs can easily be cropped and marks removed. However, I imagine it could be worthwhile against scraping, where content is stolen automatically and not edited.

I hope this blog post has been helpful. We may not be able to stamp out scraping but we can at least limit the damage. If you have a blog on WordPress it looks like you have some plugin options; I suggest researching them as I'm only familiar with Blogger. Have you experienced content theft? How did you deal with it?



10 comments:

  1. I was so sorry to hear that someone stole your content. I am glad you turned it around into this helpful post. I heard of it happening to someone else a while ago, but could not remember what they did to deal with it. Apologies that I wasn't much help xx

    1. No need to apologise, I managed to get there in the end with the power of Google :D

  2. This is a great post Sarah :) I have blogging friends in the parenting blog community who have experienced this and had content reproduced. Some of them use Digimarc (http://www.digimarc.com/technology/about-digital-watermarking) or similar but it can cost you money. You have to be so on the ball these days to protect your content. It's complicated and frustrating. I'll be checking out some of what you've shared above. Glad you got it sorted xx

    1. Thanks for the tip Charlie. My content is still being stolen, these steps may deter but if someone is determined they will continue. Google are not swift with DMCA notices, I'm still waiting for an outcome.

  3. Oh I'm so sorry this happened to you, I didn't even know this was happening :( Great post, thanks for all the tips! I started adding watermarks to my photos when I discovered a girl who posted one of my photos as her own, rude! xx

    1. It's really annoying for everyone who has work stolen as we put so much love and attention into our blog. That is so frustrating that someone took your photos, I don't blame you for adding watermarks

  4. Very great tips Sarah, I'm sure we will all need this help eventually. I'm sorry that this happened to you though!

    1. Glad you found this helpful. Sadly it is an ongoing battle, but I am sure I am not alone on this one.

  5. We never seem to be safe from stealing no matter the forum which is such a shame. Too many lazy and mean-spirited people out there if you ask me! Really sorry to hear your content is still being stolen and hope you get full resolution soon. xx

    1. Thank you. It's an ongoing battle, I don't think you can ever completely stop it. I keep issuing DMCA notices and Google has been removing the copied pages, but I keep finding more theft by the same site as they creep up the page rankings.

