If you’ve looked into search engine optimization (SEO) at all, you’ve probably heard the term “duplicate content.” It also might have been mentioned during a website update. But do you know what it really is? Let’s dig into the topic and why it matters.
What Is Duplicate Content?
Duplicate content is exactly what it sounds like: written material that appears at more than one URL. We’re not talking about content that is merely similar in theme. Instead, duplicate content is entirely or substantially the same, word for word.
Sometimes duplicate content happens because of plagiarism and sometimes it’s due to technical SEO or website issues. Either way, it’s an issue that needs to be corrected.
Does Duplicate Content Matter?
Yes, duplicate content can have a significant adverse effect on your SEO rankings. Because Google and other search engines can’t necessarily tell which page is the original, a copy may outrank the original, or every version may rank lower than it otherwise would.
Fortunately, duplicate content doesn’t cause a manual penalty to your website – unless Google determines that the duplicate content is intended to manipulate or deceive visitors. If content was scraped from another website and paired with a similar URL name, similar graphics and images, etc. to trick people into thinking that website B is for brand A… that could lead to a manual penalty.
Otherwise, Google penalties tend to focus on spam tactics, deception, and manipulation, while ordinary duplicate content just hurts your search rankings.
Is Duplicate Content Bad?
Yes, duplicate content is a problem and can adversely affect your website’s search rankings. If it’s the result of a technical issue, it should be fixed. If you’re using someone else’s content (such as promotional material for a product line you carry or something else), it needs to be rewritten.
If someone else scraped your content, that’s incredibly frustrating. If you can’t get them to take it down (or if it’s not worth the legal fight to do it), then you’re better off rewriting it so their IP theft doesn’t hurt your rankings. Fortunately, this reason for duplicate content is the least likely to occur.
How Do Duplicate Content Issues Happen?
Obviously, plagiarism is one reason for duplicate content issues. Cheap website creators and website content providers whose price is too good to be true have been known to scrape content from other websites in the same industry. Worse, they tend to get away with it – at least long enough to get paid.
The other two reasons for website pages to be flagged by duplicate content checkers are technical.
URL variations are notorious for creating duplicate content alerts. Session IDs can cause the issue, as can some types of analytics code and click-tracking services. For example, “www.website.com/product_page” could show the same content as “www.website.com/product_page?cat=2&color=blue”.
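To make that concrete, here is a minimal Python sketch of how an audit script might treat those URL variations as a single page. The IGNORED_PARAMS set and the strip_ignored_params helper are hypothetical names used only for illustration, and which parameters are safe to ignore depends entirely on how your own website is built.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: these parameters never change what the page shows.
# Adjust the set for your own website before relying on it.
IGNORED_PARAMS = {"sessionid", "utm_source", "utm_medium", "utm_campaign", "cat", "color"}


def strip_ignored_params(url, ignored=IGNORED_PARAMS):
    """Return the URL without the query parameters listed in `ignored`."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored
    ]
    # Rebuild the URL with only the remaining parameters and no fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))
```

With these assumptions, strip_ignored_params("https://www.website.com/product_page?cat=2&color=blue") returns "https://www.website.com/product_page", so both addresses can be counted as one page when you review a crawl report.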
The other type of technical duplicate content issue involves website variations. That includes websites with and without “www” in the URL as well as “http” and “https” versions.
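A short Python sketch shows how the four possible combinations collapse into one preferred address. The to_preferred_form helper is a hypothetical name, the choice of “https” with “www” is only an assumption, and the sketch expects each URL to include its “http” or “https” prefix.

```python
from urllib.parse import urlsplit, urlunsplit


def to_preferred_form(url, scheme="https", use_www=True):
    """Rewrite a URL to one preferred scheme and host style (both are assumptions)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Remove a leading "www." so both host styles start from the same bare name.
    bare = host[4:] if host.startswith("www.") else host
    host = "www." + bare if use_www else bare
    return urlunsplit((scheme, host, parts.path, parts.query, parts.fragment))


variants = [
    "http://website.com/product_page",
    "http://www.website.com/product_page",
    "https://website.com/product_page",
    "https://www.website.com/product_page",
]
# All four variants map to a single preferred URL.
preferred = {to_preferred_form(v) for v in variants}
```

Here preferred ends up holding one entry, "https://www.website.com/product_page", which is the version the other three should point to.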
How Much Duplicate Content Is Acceptable?
In 2013, Google’s Matt Cutts commented that “…something like 25% or 30% of all web’s content is duplicate content.” Many people took that to mean that it was OK if 25-30% of a web page was duplicate content, even though the former does not equal the latter.
In more recent years (including as recently as 2022), Gary Illyes and John Mueller have stated that there isn’t a flat percentage that defines duplicate content. In fact, Illyes went on to explain that percentages don’t factor into the determination of duplicate content but rather that checksums are key to the methodology.
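A checksum is a short fingerprint calculated from a piece of content, and identical content always produces an identical fingerprint. The Python sketch below only illustrates that general idea. It is not Google’s method, and the content_checksum helper and its whitespace and capitalization cleanup are assumptions made for this example.

```python
import hashlib
import re


def content_checksum(text):
    """Return a SHA-256 fingerprint of the text after light cleanup."""
    # Assumption: spacing and capitalization differences should not matter.
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


page_a = "Duplicate content appears on more than one URL."
page_b = "duplicate   content appears on more than one URL.\n"
page_c = "Original copy written for this page only."

same = content_checksum(page_a) == content_checksum(page_b)
different = content_checksum(page_a) != content_checksum(page_c)
```

In this example same and different are both True: the two copies share a fingerprint despite their spacing and capitalization, while the original copy gets a fingerprint of its own. A simple fingerprint like this only catches text that is identical after cleanup, not text that has been lightly reworded.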
However, for average people, a duplicate content checker can be very useful for monitoring whether someone else has scraped your content or if the content on your website was plagiarized.
Here at Efferent Media, checking for duplicate content is one of the things we examine when a new client hires us to evaluate their website or to redesign a website… and the amount of blatant plagiarism we’ve found is discouraging. That’s rarely the client’s fault. It tends to happen when they hire “a friend of a friend” or a cheap service to create their website.
Ideally, when using a duplicate content checker, the result should be 100% new content/0% duplicate content. In the real world, however, a small amount of “duplicate content” can be difficult to avoid.
For example, there are only so many ways to phrase facts like phone numbers and related contact information in the call to action (CTA) at the end of a blog. Similarly, lists of ingredients for products a website sells are an unavoidable form of duplicate content because you can’t change the name of the ingredients or the sequence.
Some duplicate content checkers are also absurdly sensitive. One flagged a paragraph about the world’s largest and smallest rodents even though it and the supposed match had completely different phrasing. It wasn’t plagiarism, but both paragraphs mentioned the names of the rodents and their lengths to establish their rankings as largest and smallest, which was enough to trigger that particular content checker.
How Do I Fix Duplicate Content?
Obviously, writing fresh copy is the solution for any plagiarized content. If the duplicate content is because of technical issues instead, two solutions exist, depending upon the situation.
A 301 redirect permanently moves people from one URL to another. It’s typically used when a page is removed from a website to direct people to a replacement. But it can also be used when there are variants, such as versions of a URL with and without “www.” A 301 redirect also transfers link equity from the original page to the destination page.
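Redirects are normally set up in your web server or content management system, so the Python sketch below is only meant to show what a 301 redirect does behind the scenes. It is a small WSGI application, and the redirect_app name and the CANONICAL_HOST value are hypothetical.

```python
CANONICAL_HOST = "www.website.com"  # hypothetical preferred host name


def redirect_app(environ, start_response):
    """Send a 301 to the preferred https + www address, otherwise serve the page."""
    host = environ.get("HTTP_HOST", "")
    scheme = environ.get("wsgi.url_scheme", "http")
    path = environ.get("PATH_INFO", "/")
    query = environ.get("QUERY_STRING", "")

    if host != CANONICAL_HOST or scheme != "https":
        # Keep the path and query string so visitors land on the matching page.
        location = "https://" + CANONICAL_HOST + path
        if query:
            location += "?" + query
        start_response("301 Moved Permanently", [("Location", location)])
        return [b""]

    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"Page content goes here."]
```

A request for “http://website.com/product_page” would receive the status “301 Moved Permanently” with a Location header pointing to “https://www.website.com/product_page”, while a request that already uses the preferred address is served normally. For local experiments, the app can be run with Python’s built-in wsgiref.simple_server module.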
The other option is to add a canonical tag to the duplicate pages. It signals to Google that you know you have duplicate content, that it’s there for a reason, which page you want to appear in the search results, and that you want the link equity from all of the duplicates consolidated into the primary/original page.
A canonical can also be a good solution when you give permission for another website to reprint an article you wrote. The other website would add a canonical tag to send the link equity back to your original page, making it clear that the other website is using the content correctly.
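The canonical tag itself is a single line placed in the head section of each duplicate page. As a minimal sketch, the hypothetical Python helper below builds that line for a given primary URL. How the tag actually gets into your pages depends on your website platform, which is not covered here.

```python
from html import escape


def canonical_tag(primary_url):
    """Build the canonical link element that points to the primary page."""
    # Escape characters such as & so the URL is valid inside an HTML attribute.
    return '<link rel="canonical" href="{}">'.format(escape(primary_url, quote=True))


tag = canonical_tag("https://www.website.com/product_page")
```

Here tag holds the line <link rel="canonical" href="https://www.website.com/product_page">, which would go on every variant of the product page, or on the other website’s reprint of your article, with the URL of your original page filled in.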
Talk to the SEO Experts at Efferent Media
Building your brand and acquiring customers requires high-quality, SEO-optimized websites. Our team of Google-certified SEO experts has years of experience in technical, on-page, and off-page SEO to improve your search engine rankings. Call Efferent Media today at (631) 867-0900 to learn more.