Duplicate content is one of the most common SEO pitfalls for e-commerce sites. With the growth of a business riding partially on organic success, both regular assessment and proactive SEO work are imperative to give the business its best chance to succeed.
Having spent a few years working for an online-only retailer, I noticed patterns in where duplicate content issues consistently popped up on our site and those of competitors. Knowing what to keep an eye on can save a lot of time and troubleshooting down the road, so here are some of the repeat offenders to look out for and how to fix them.
Problem Area: Across Categories
The most blatant form of duplicate content on an e-commerce site is at the category level. This happens if multiple categories target the exact same type of product (for example, “Men’s Boots” and “Boots for Men”). This can happen for a variety of well-intended reasons such as wanting to present the category within two different parent categories, trying to appeal to different markets, or even in the name of targeting keywords for SEO.
However, regardless of intent, if the pages are targeting the same topic, Google will not have a clear picture of which page takes priority and will likely not serve either of these pages to users above sites that offer clear structure.
There are a couple of courses of action that can be taken on this issue depending on whether you are working in more of a preventative or reactive capacity.
If you are lucky enough to be involved in a process before duplicate categories are added, education is the best strategy. Many merchandisers or product teams don’t realize that there is any issue with having multiple landing pages for a product type. Helping them understand what to avoid and why can help reduce the number of duplicates preemptively.
The reactive solution to this issue is to clean up the existing pages to more clearly indicate importance. To locate the most obviously problematic categories, start with a crawl and look for identical or highly similar H1s. Then, if possible, follow that up with a manual check through a list of all live categories, as some duplicate intent (such as with synonyms) would not be as straightforward to find via crawl data.
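As a rough illustration of the crawl check described above, the sketch below compares H1s pairwise and flags the ones that look alike. It assumes you have already exported a URL-to-H1 mapping from your crawler; the function names, the sample URLs, and the 0.75 threshold are all hypothetical choices for illustration, not settings from any particular tool.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def normalize_h1(h1: str) -> str:
    """Lowercase, drop apostrophes and punctuation, and sort the words."""
    cleaned = h1.lower().replace("’", "").replace("'", "")
    words = re.findall(r"[a-z0-9]+", cleaned)
    return " ".join(sorted(words))


def find_similar_h1s(pages: dict[str, str], threshold: float = 0.75):
    """Return (url_a, url_b, score) for every pair of pages with similar H1s.

    `pages` maps URL -> H1 text. The threshold is an arbitrary starting
    point and will likely need tuning for a real catalog.
    """
    flagged = []
    for (url_a, h1_a), (url_b, h1_b) in combinations(pages.items(), 2):
        score = SequenceMatcher(
            None, normalize_h1(h1_a), normalize_h1(h1_b)
        ).ratio()
        if score >= threshold:
            flagged.append((url_a, url_b, score))
    return flagged


# Hypothetical crawl export
crawl = {
    "/mens-boots": "Men’s Boots",
    "/boots-for-men": "Boots for Men",
    "/rain-jackets": "Rain Jackets",
}
for url_a, url_b, score in find_similar_h1s(crawl):
    print(url_a, url_b, round(score, 2))
```

A string comparison like this will not catch synonyms, which is why the manual pass through the category list is still worth doing.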
When searching for duplicate categories on a site, remember that Google also uses on-page content to interpret page intent. So product selection within categories also needs to be considered, as the bulk of the content on most category pages comes from the product titles. Some e-commerce CMSs have a view that allows you to see which products are dual categorized and where; this can be very helpful for locating this particular type of duplicate content.
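If your CMS can export which products sit in which category, the overlap can also be measured directly. The sketch below is one possible approach, assuming a hypothetical mapping of category name to a set of product IDs; it scores each pair of categories with the Jaccard index (shared products divided by total distinct products) and flags pairs above an arbitrary 0.7 cutoff.

```python
from itertools import combinations


def category_overlap(categories: dict[str, set[str]], threshold: float = 0.7):
    """Return (category_a, category_b, jaccard) for heavily overlapping pairs.

    `categories` maps a category name to the set of product IDs in it.
    """
    flagged = []
    for (name_a, ids_a), (name_b, ids_b) in combinations(categories.items(), 2):
        union = ids_a | ids_b
        if not union:
            continue  # two empty categories: nothing to compare
        jaccard = len(ids_a & ids_b) / len(union)
        if jaccard >= threshold:
            flagged.append((name_a, name_b, jaccard))
    return flagged
```

Pairs that score high here but have different H1s are often the synonym cases that a crawl alone would miss.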
Once you’ve found the categories, the best course of action is to 301 redirect all but one of the overlapping categories to the one that you have determined to be most valuable (via rankings/traffic/sales/user experience). Alternatively, if removing pages isn’t possible (for example, if they are being used for marketing purposes), you could add noindex tags to the duplicate pages or point their canonical tags at the primary page. I would recommend putting a noindex, nofollow tag on a page that is not linked to from the site, such as a stand-alone landing page for an email, and using canonical tags for pages that live within the site’s navigable structure.
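The decision rule in the paragraph above can be written down as a small function, which may be handy when working through a long list of duplicates. This is only a sketch of that rule; the function and parameter names are hypothetical, and real pages will sometimes need a judgment call that three yes/no questions cannot capture.

```python
def choose_action(
    is_primary: bool, must_stay_live: bool, linked_in_navigation: bool
) -> str:
    """Suggest how to handle one page in a group of duplicate categories."""
    if is_primary:
        return "keep"
    if not must_stay_live:
        # Nothing depends on the page, so consolidate it into the primary.
        return "301 redirect to primary"
    if linked_in_navigation:
        # Page lives within the site's navigable structure.
        return "canonical to primary"
    # Stand-alone page, such as an email landing page.
    return "noindex, nofollow"
```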
For example, if I found these categories and determined that they all contain mostly the same products:
I would then identify what each page is for, where it is linked from, and determine the action to take based on those factors:
Problem Area: Filters or Filter Combinations
Even if categories are targeting different product types you can still end up with identical targeting on certain pages when filters are applied. Of course, this issue will only be a significant problem if the filtered pages have unique URLs and are indexable.
To give an example, the site below sells many types of throw pillows, some of which are specific to indoor or outdoor use, and some of which are multi-use. They have separate categories for “Decorative + Throw Pillows” and “Outdoor Pillows,” which are located in the /decor-pillows/ and /outdoor/ sections of their site, respectively.
At this point these categories are fine. While there may be a bit of overlap in intent, users and Google can both understand how they are differentiated.
However, duplicate content becomes an issue once filters are applied to these categories. Because “Outdoor Pillows” is such a broad term, this category includes a type filter for “Throw Pillow”:
And since some of the “Decorative + Throw Pillows” are multi-use, that category features one for “Outdoor”:
So now we have two self-canonicalized unique URLs targeting the exact same type of item (and even returning the same specific products).
This example is relatively straightforward, but given the limitless possibilities that some sites have for filter combinations, duplicate content from filters can quickly get out of hand if unaddressed.
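When there are too many filter combinations to review by eye, one option is to group filtered URLs by the exact set of products they return. The sketch below assumes you can export, for each indexable filtered URL, the list of product IDs it displays; the function name and the URLs in the example are hypothetical.

```python
from collections import defaultdict


def group_by_product_set(filtered_pages: dict[str, list[str]]) -> list[list[str]]:
    """Return groups of URLs that display exactly the same products.

    `filtered_pages` maps a filtered URL to the product IDs it returns.
    Product order is ignored; only groups with 2+ URLs are returned.
    """
    groups = defaultdict(list)
    for url, product_ids in filtered_pages.items():
        groups[frozenset(product_ids)].append(url)
    return [urls for urls in groups.values() if len(urls) > 1]


# Hypothetical filtered URLs modeled on the pillow example
pages = {
    "/outdoor/pillows?type=throw-pillow": ["p1", "p2", "p3"],
    "/decor-pillows/throw-pillows?use=outdoor": ["p3", "p1", "p2"],
    "/decor-pillows/throw-pillows?color=blue": ["p2", "p9"],
}
print(group_by_product_set(pages))
```

Exact matching is deliberately strict. It will miss near-duplicates that differ by a product or two, so it works best as a first pass.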
Much like the cross-category duplicate content issue, the preventative solution to this one lies largely with education and process. If merchandising teams know:
That duplicate content is a thing
That it’s a problem
What to look at before creating a filter
Then they can keep these issues from existing in the first place. It may also be helpful to keep lists of similar categories or commonly overlapping filters to check before adding new filters. On sites I’ve worked on in the past, we consistently saw overlap between filters in indoor and outdoor furnishings, as well as between furnishing categories for general consumers and commercial clients.
If these filtered duplicates already exist, the solution is to remove them and 301 redirect the URLs to the page that takes precedence. Depending on what the filter was, it may be better to redirect them to one or the other of the parent category pages. However, before you remove anything, be sure to take a peek at any existing rankings or organic traffic that a page may be getting so that you consider any existing value before making a final decision on which to axe.
Problem Area: Product Descriptions
Every product page needs text on it to explain features and product details and add an element of branding. To meet this need, many e-commerce sites will use an unedited description directly from the manufacturer.
While there are obvious reasons behind this practice, including the implied accuracy of the initial description and efficiency of onboarding processes, this is a big problem when it comes to differentiating your site in the SERPs.
Because the manufacturer sends the same information to all brands that it sells through, this leads to many sites having identical text on their product pages. Even big brands are guilty of this. For example, the product description for a specific bookcase is identical to:
If you are using identical content to compete with a more prominent brand in the SERPs, both Google and users are likely going to prioritize the site with more brand authority.
Another instance where duplicate product descriptions become an issue is when a brand expands the number of platforms they sell through. For example, an independent company might initially sell through its own site, but then decide to list its products on Amazon as well. If this company uses the same descriptions on both sites, it can result in the brand losing the top place in the SERP for their own product, as Amazon has such a high domain authority.
Unique content is the solution to this problem. High effort though it may be, there is no substitute. This can be from internal teams that have writing skills, or it can be outsourced – different solutions work better for different organizations. The important part is ensuring that the content you are creating is both high quality (proper grammar, no misspellings, etc.) and unique to your site.
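Before commissioning new copy, it helps to know which products are still using the manufacturer’s text word for word. The sketch below is one way to find them, assuming you have both your own descriptions and the manufacturer’s feed keyed by a shared SKU; the function names and data shapes are hypothetical. It only catches copies that are identical after whitespace and case are normalized, not lightly edited ones.

```python
import hashlib
import re


def fingerprint(text: str) -> str:
    """Hash a description after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def unedited_descriptions(
    site: dict[str, str], manufacturer: dict[str, str]
) -> list[str]:
    """Return SKUs whose site description matches the manufacturer's copy."""
    return [
        sku
        for sku, text in site.items()
        if sku in manufacturer
        and fingerprint(text) == fingerprint(manufacturer[sku])
    ]
```

The resulting list can then be prioritized by traffic or revenue so that the rewriting effort goes to the products where it matters most.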
Problem Area: Title Tags
Duplicate title tags are an extremely widespread problem across the web in general. When SEO teams say “we need title tags on every page,” often developer teams will find the quickest and easiest solution to this problem: rolling out a standard tag across all pages.
However, title tags are supposed to help both users and Google to understand what a page is about at a high level. And clearly, identical tags on every page help no one identify topics.
The solution that many come to on this issue is algorithmically generating title tags based on the page’s H1. If this is feasible, it can be a great way to achieve SEO goals efficiently. However, in most organizations this will require developer resources to accomplish, which can be a significant constraint.
If your CMS accepts a bulk upload of metadata values, though, this can even be accomplished without dev dependencies by using Excel. If you use Screaming Frog to pull a list of URLs and their corresponding H1s, you can create a template that integrates the H1 text.
For example, if I wanted to make title tags that read “Shop *Product Type* | Example Site” I could create the following layout in Excel:
Then, I would use the CONCATENATE function to automatically generate text for all the title tags by inserting the H1 after “Shop” and putting “| Example Site” at the end as shown below.
Apply the formula to the column, and you have a list of unique title tags and their associated URLs:
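If a script is easier to work with than a spreadsheet, the same template can be sketched in Python. This is only an illustration of the “Shop *Product Type* | Example Site” pattern above; the function names and input shape are hypothetical, and the extra duplicate check is an addition that catches pages sharing an H1.

```python
from collections import Counter


def build_title(h1: str, site_name: str = "Example Site") -> str:
    """Insert the H1 between "Shop" and the site name."""
    return f"Shop {h1.strip()} | {site_name}"


def build_titles(h1_by_url: dict[str, str]):
    """Return (titles keyed by URL, sorted list of titles used more than once)."""
    titles = {url: build_title(h1) for url, h1 in h1_by_url.items()}
    counts = Counter(titles.values())
    duplicates = sorted(title for title, n in counts.items() if n > 1)
    return titles, duplicates
```

Any title that shows up in the duplicates list points to two pages with the same H1, which is worth investigating as a duplicate category in its own right.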
Problem Area: Blog Posts or Resource Section
Most e-commerce sites these days feature some inspirational or informational content in the form of a blog or resource section. While this can be a great asset, it can also be a bit of a minefield for duplicate content.
Poor quality outsourcing, internal content producers who don’t understand the significance of creative integrity, or even just individuals who don’t know how to cite or refer to a source appropriately can end up producing content that is duplicative of another website’s.
Even if you invest in quality unique content for your site, sometimes it can be “borrowed,” overly quoted, or just straight copied by other sites. And if the site that does this has higher domain authority, your original content can end up being outranked by the copy in the SERPs.
Depending on when you are coming into the content creation process, the first step could be an initial audit of existing content to see if any is duplicate. While much of this must be done more or less manually by doing a quick search for exact matches to sections of your content, there are some tools available that can speed the process up a bit (I like Copyscape). If you find content that is duplicate, assess the value of the topic to your site and the extent of duplication, then either remove or refresh the content piece.
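The manual search step can be made a little faster by generating the exact-match queries ahead of time. The sketch below pulls the longest sentences from a piece of content and wraps each in quotation marks, ready to paste into a search engine. The function name, the sentence-splitting rule, and the defaults are all hypothetical choices, and the splitting is deliberately naive.

```python
import re


def exact_match_queries(
    text: str, count: int = 3, min_words: int = 8
) -> list[str]:
    """Return up to `count` quoted sentences to use as exact-match searches."""
    # Naive split on sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    candidates = [
        s.strip().replace('"', "")
        for s in sentences
        if len(s.split()) >= min_words
    ]
    # Longer sentences are less likely to match other sites by coincidence.
    candidates.sort(key=len, reverse=True)
    return [f'"{s}"' for s in candidates[:count]]
```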
If you are lucky enough to be in on the content creation process from the beginning, ensure that the writers know what duplicate content is, and both how and why it can be a detriment to the site.
While many of these recommendations sound relatively simple, I understand they can be much harder to execute in practice. But if you keep tabs on these specific areas, and work to educate teams and integrate SEO considerations into their processes, your site will be one step closer to organic success.