Duplicate Content Penalties – Fact or Fiction?
For years, the existence of the so-called ‘duplicate content penalty’ has been a hot topic of debate and has caused much anxiety. The short answer is: no, there are typically no ‘penalties’ for duplicate content. Duplicate content can still hurt your SEO, though, so it pays to follow best practices for dealing with it. By putting in the extra effort to avoid duplicate content, you will end up creating a better experience for your users (and search engines) in the process.
What is duplicate content?
Google defines duplicate content as blocks of text either within your own domain or across other domains that are identical or only have minor differences.
To be clear, translations of the same article or page and quote-sized snippets of texts from other sources are not considered duplicate content.
For the most part, duplicate content happens accidentally, either because of technical issues on the site or because of manual duplication. External duplication can also occur when your content is published on a different website.
Does Google penalize you for duplicate content?
Strictly speaking, yes: a penalty for duplicate content does exist. The good news is that it rarely affects regular sites. Copied or scraped content on a page might trigger a review by one of Google’s human reviewers. If the reviewer determines that the page or site violates Google’s Webmaster Guidelines and that the purpose of the content is to manipulate search engines, a manual action will be issued (the dreaded penalty), and the page or entire site will be ranked lower or removed completely from search results. The takeaway is: don’t steal content. Instead, create great content that offers value to your audience.
How duplicate content hurts your SEO
While manual actions due to duplicate content are rare, duplicate content can confuse your users and search engines, which results in poorer search performance for your site.
For example, if you have two or more pages with the same content, the search engine will struggle to figure out which page is the best result, as multiple pages from the same site with the same content will not be shown in the search results.
The result of this search engine confusion is that all pages with the same content lose visibility. The same issue happens with backlinks: other websites might link to different pages on your site that carry the same content, so the links are split between them. Since backlinks are another important factor in ranking well, this adds to the problem.
Find duplicate content
If you’re wondering right now whether you have duplicate content issues on your site, there are a number of ways to check.
You can do a manual check for duplicate content by taking a block of text from a couple of your primary pages, putting it in quotes, and doing a Google search. If more than one version shows up in the search results, you will have to take a closer look at what causes the duplication. You can also check in Google Search Console whether the number of pages indexed lines up with the number of pages on your site.
There are also a number of tools available, such as Siteliner, which offers a free scan of your site once per month and gives a thorough overview of any duplicate content issues on your site.
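If you already have the text of your pages on hand (from a crawl or a CMS export), you can also run a rough check yourself. The sketch below is only an illustration of the idea, not how Google or Siteliner measure duplication: it compares every pair of pages with Python's standard `difflib` and flags pairs that are identical or differ only slightly. The function names, the `pages` dictionary and the 0.9 threshold are all assumptions made up for this example.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text):
    """Lowercase and collapse whitespace so formatting differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def find_near_duplicates(pages, threshold=0.9):
    """Return (url_a, url_b, similarity) for every pair of pages at or above threshold.

    `pages` maps a URL to that page's visible text. The 0.9 threshold is an
    arbitrary starting point for this sketch, not a value taken from Google.
    """
    matches = []
    for (url_a, text_a), (url_b, text_b) in combinations(pages.items(), 2):
        similarity = SequenceMatcher(None, normalize(text_a), normalize(text_b)).ratio()
        if similarity >= threshold:
            matches.append((url_a, url_b, similarity))
    return matches


if __name__ == "__main__":
    # Hypothetical page texts, for illustration only.
    sample_pages = {
        "/widgets": "Blue widgets are great for the garden.",
        "/widgets-copy": "Blue  widgets are great for the garden.",
        "/contact": "Contact our team by phone or email today.",
    }
    for url_a, url_b, similarity in find_near_duplicates(sample_pages):
        print(url_a, url_b, round(similarity, 2))
```

Comparing every pair of pages gets slow on large sites, so treat this as a spot check for a handful of key pages rather than a full audit.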
Internal duplicate content issues
Common sources of duplicate content within your site can often be fixed by going the extra mile with the technical SEO. Here are a few common examples.
For ecommerce companies, duplicate content can be a big headache. As a general rule, avoid creating separate pages for the same product and create your own, awesome product descriptions that speak to your particular target customer.
URL parameters, such as those used for tracking or sorting, are another common source of duplicates. They are often found on ecommerce sites or generated by search functions. It’s best practice to ensure your site doesn’t create a new URL for every product variation.
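To see why parameters cause trouble, it helps to look at how many different URLs can point at the same page. Here is a minimal sketch, using only Python's standard library, that strips parameters which don't change the content so that all variants collapse into one URL. The parameter names in `IGNORED_PARAMS` and the `example.com` addresses are assumptions for illustration; the right list depends on how your own site uses its parameters.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of parameters that do not change what the page shows.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sort", "sessionid"}


def strip_ignored_params(url):
    """Return the URL without tracking/sorting parameters and without a fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    # The fragment (#...) is dropped because it never reaches the server.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


if __name__ == "__main__":
    print(strip_ignored_params("https://example.com/shoes?utm_source=news&color=red&sort=price"))
    # prints: https://example.com/shoes?color=red
```

A helper like this is handy when auditing a crawl export: if many crawled URLs reduce to the same stripped URL, those are the pages to consolidate.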
Different URL versions of your site
You may have several versions of your site live and visible to search engines. Search engines treat each version as a separate site with the same content, which causes duplicate content issues. Common examples:
- HTTP vs HTTPS
- WWW vs non-WWW
- Trailing slash after the domain extension (.com/ and .com)
Depending on the specific internal duplicate content issue, you can address it by deleting or consolidating pages with duplicate content, by setting up a 301 redirect to the preferred version of your site, or by using a canonical tag to indicate to search engines which page is the main page.
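The redirect and canonical tag options can be sketched in a few lines. The example below is a simplified illustration, not a drop-in implementation: it assumes the preferred version of the site is `https://example.com` (HTTPS, non-WWW, with a trailing slash on the bare domain) and that every URL passed in belongs to that site. `PREFERRED_HOST`, `preferred_url`, `canonical_tag` and `redirect_for` are names invented for this sketch. In practice, redirects are usually configured in your web server or CMS rather than in hand-written code.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit

# Assumed preferred version of the site (hypothetical domain).
PREFERRED_SCHEME = "https"
PREFERRED_HOST = "example.com"


def preferred_url(url):
    """Map any HTTP/HTTPS, WWW/non-WWW variant of a URL to the preferred version."""
    parts = urlsplit(url)
    path = parts.path or "/"  # treat ".com" and ".com/" as the same page
    return urlunsplit((PREFERRED_SCHEME, PREFERRED_HOST, path, parts.query, ""))


def canonical_tag(url):
    """Build the <link rel="canonical"> tag that goes in the page's <head>."""
    return '<link rel="canonical" href="{}">'.format(escape(preferred_url(url), quote=True))


def redirect_for(url):
    """Return (301, target) if the URL is not the preferred version, otherwise None."""
    target = preferred_url(url)
    return None if url == target else (301, target)


if __name__ == "__main__":
    print(redirect_for("http://www.example.com"))
    print(canonical_tag("http://www.example.com/blog?page=2"))
```

The design choice here is to have one function decide what the preferred URL is and let both the redirect and the canonical tag reuse it, so the two signals never point at different versions of the same page.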
External duplicate content issues
Sometimes your original content is so amazing that it gets duplicated on other sites!
Scraped content is when other websites steal your content and repost it on their site in the hopes of increasing their website traffic. Copyscape is a great tool you can use to check if your content has been scraped. Although scraped content rarely gets better search rankings than original content, it is a good idea to file a complaint under the Digital Millennium Copyright Act.
Syndicated content is different. Here, your original content gets published on another website with your permission. Medium and HuffPost are two examples of sites that syndicate content. Although the syndicated content might get higher search rankings than your original page because of the higher authority of the syndication site, there are benefits to having your content posted elsewhere: more powerful backlinks to your site, more traffic to your site as users follow those backlinks, and increased exposure for your brand.
Dealing with duplicate content like a pro
When it comes to duplicate content, it’s really quite simple: create great, original content, and make sure the technical aspects of your site are set up correctly. Even though you shouldn’t panic about duplicate content, it is a good reminder to streamline the content on your site so you don’t create confusion for users and search engines. If you’re wondering how you can better deal with duplicate content issues and SEO, we can help! Konstruct has lots of experience optimizing sites, so why not let us worry about duplicate content issues?
Updated: April 9, 2020