Duplicate content has been a topic of interest in the SEO world for a number of years now, and webmasters need to know how much duplicate content is allowed and how it will affect their rankings in search engines.  Since the question of permissible duplicate content has come up many times, Matt Cutts, head of Google’s web spam team, answered it in one of his regular video blogs.

The exact question that Matt was asked:

“How does Google handle duplicate content and what negative effects can it have on rankings from an SEO perspective?”

Cutts’ first response was to explain that 25-30 percent of the content on the web is duplicate content, and it is not always spam.  Just as industry reports or college papers borrow content from each other, so do websites.  Quoting another blog on your site and then providing a link to the full article is a natural way to duplicate content without appearing as spam.  If Google treated every instance of duplicate content as spam, Cutts notes, it would “probably end up hurting our search quality rather than helping it.”

Crowding Out Duplicates

So how does Google respond to duplicate content?  Rather than foolishly flagging every instance of duplicate content as spam, Google tries to group sets of duplicate content together and ascertain the original source.  When a term or phrase is searched, the algorithm displays the original source in the results page and pushes back any duplicate content.
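As a rough illustration of that grouping idea (a simplified sketch, not Google’s actual algorithm), the snippet below clusters pages whose text is effectively identical and lists one presumed original from each cluster ahead of the copies.  The `Page` structure and the “earliest publish date wins” tiebreak are assumptions made purely for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from hashlib import sha256


@dataclass
class Page:
    url: str
    published: date
    text: str


def fingerprint(text: str) -> str:
    """Collapse whitespace and case before hashing so trivially different copies group together."""
    normalized = " ".join(text.lower().split())
    return sha256(normalized.encode("utf-8")).hexdigest()


def rank_with_duplicates_pushed_back(pages: list[Page]) -> list[Page]:
    """Cluster duplicate pages, then list one presumed original per cluster ahead of the copies."""
    clusters: dict[str, list[Page]] = defaultdict(list)
    for page in pages:
        clusters[fingerprint(page.text)].append(page)

    originals, copies = [], []
    for group in clusters.values():
        group.sort(key=lambda p: p.published)  # assumption: the earliest-published copy is the source
        originals.append(group[0])
        copies.extend(group[1:])
    return originals + copies  # duplicates still appear, just further down the list
```

The point mirrors Cutts’ explanation: the copies are not deleted or flagged as spam, they simply end up behind the source.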

For example, if you were to quote a paragraph from this article, post it on your website and then search for ‘Matt Cutts, duplicate content’, my article would show up before your website.  In fact, there would be dozens of links discussing duplicate content after my link and before yours, pushing your website further back in the results page.  Your use of duplicate content from my article is not spam and is not penalized, but it is pushed back in the search rankings so a searcher does not keep finding the same content with each link they open from a SERP.

What Is Considered Spam?

Now, Google does not flag everything as spam, but it still reserves the right to treat obvious and abusive duplicate content as such.  If you are duplicating the majority of external articles on your own site, you are not providing any value to searchers.  Providing value to searchers is what Google is all about!

You may be doing this intentionally by scraping content from other sites, or you may be doing it innocently.  Cutts gave the example of an uninformed webmaster who wanted to auto-publish an RSS feed to a blog site.  Auto-duplicating content that already exists elsewhere provides nothing useful or valuable to the user.

If there is a large percentage of duplicate content on your site, Google will penalize you, whether you are being malicious or simply ignorant.

Don’t Sweat the Small Stuff

When it comes to duplicate content, it is OK to leave a few stones unturned. If there are small variations between your website versions (.com vs .net), or if you have both an old and a new set of terms and conditions on your help page, don’t worry about it.

But this small stuff can build up and become a problem, especially when you have dozens of subdomains that have the same ‘about us’ page.  One common mistake that webmasters make, especially on large corporate sites, is to rehash the same content with small adjustments for each subdomain.  Never do this, as Google will pick it up quickly.
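If you want to catch identical subdomain pages yourself before Google does, a simple audit is to fetch each page and compare content hashes.  This is a minimal sketch, not a full crawler: the subdomain list and URL pattern are placeholders for your own site, and it assumes the third-party `requests` library is installed.

```python
import hashlib
from collections import defaultdict

import requests  # third-party HTTP client: pip install requests

# Placeholder subdomains and URL pattern; replace with your own site's.
SUBDOMAINS = ["www", "uk", "de", "shop"]
URL_TEMPLATE = "https://{sub}.example.com/about-us"


def content_hash(html: str) -> str:
    """Hash the page after collapsing whitespace so trivial formatting differences are ignored."""
    normalized = " ".join(html.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


pages_by_hash = defaultdict(list)
for sub in SUBDOMAINS:
    url = URL_TEMPLATE.format(sub=sub)
    response = requests.get(url, timeout=10)
    pages_by_hash[content_hash(response.text)].append(url)

for urls in pages_by_hash.values():
    if len(urls) > 1:
        print("Same content served on:", ", ".join(urls))
```

Anything this flags is exactly the kind of rehashed boilerplate the section above warns about.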

Key Takeaway

There really is no hard answer for what percentage of duplicate content on your site is permissible.  It is a matter of common sense.  Ask yourself two questions:

  1. When you are duplicating content… does this content provide unique value to the user?
  2. When others are duplicating you… is it clear that you are the originator of the content?

If either of your answers is no, then you should change the content on the page and make it your own.

Photo Credit: David Hegarty

Jamie Bates