Since the Panda update and its refreshes, content consolidation projects have been widely undertaken both to reverse the effects of Panda and to prevent them. This article covers three risky but popular content strategies, how to diagnose them, and tips for approaching consolidation.
Pages Targeting Keywords with The Same User Intent
Imagine a site architecture that is roughly the following
When I see a structure like this, it usually signals a very aggressive strategy, because there is no difference in user intent between
A visitor who lands on /cheap-dental-floss or /cheap-dental-floss is looking for exactly the same thing. In fact, in most spaces one of these will have far more search volume than the other. At some level this is even keyword cannibalization: a case where pages overlap so much that they compete with each other for rankings within your own site. Remember, Google has become very good at understanding synonymous phrases, and I predict that in the future aggressive strategies like this will hurt more than they help.
How to diagnose
- Run a crawler on your site (the ninja crawler makes the results easy to review)
- Check whether there is any synonymous overlap between URLs
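As a rough illustration of the overlap check, the sketch below groups crawled URLs whose slugs reduce to the same set of terms once synonyms are collapsed. The synonym map and the /inexpensive-dental-floss URL are made up for the example; a real audit would need a synonym list built for your own niche, and every flagged group still needs a human look.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical synonym map: each word maps to one representative term.
SYNONYMS = {
    "inexpensive": "cheap",
    "affordable": "cheap",
    "discount": "cheap",
}

def intent_key(url):
    """Reduce the last path segment of a URL to a sorted tuple of normalized terms."""
    slug = urlparse(url).path.strip("/").split("/")[-1]
    words = [w for w in slug.lower().split("-") if w]
    return tuple(sorted({SYNONYMS.get(w, w) for w in words}))

def find_overlaps(urls):
    """Return groups of URLs whose slugs share the same normalized terms."""
    groups = defaultdict(list)
    for url in urls:
        groups[intent_key(url)].append(url)
    return [group for group in groups.values() if len(group) > 1]

if __name__ == "__main__":
    crawled = [
        "/cheap-dental-floss",
        "/inexpensive-dental-floss",  # hypothetical synonym page
        "/mint-dental-floss",
    ]
    for group in find_overlaps(crawled):
        print("Possible same-intent pages:", group)
```

Pages like /mint-dental-floss and /peppermint-dental-floss stay in separate groups here because "mint" and "peppermint" are not in the synonym map, which matches the idea that they can serve different intents.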
How to fix it
- Consolidate content by grouping pages by user intent.
- In my example case above it would be
Sidebar: you may notice that I left /spearmint-dental-floss, /peppermint-dental-floss, and /mint-dental-floss. The reasoning is simply that potential visitors may be looking for something different on each. Someone looking for mint dental floss may be looking for spearmint, peppermint, or wintergreen. Conversely, at the more precise level, you can think about how to tailor an on-page experience for someone looking for peppermint dental floss on /mint-dental-floss.
URL Parameters and Session IDs
These content issues are old ones, but I still see them all the time. When I look at an eCommerce site, it is usually not a question of whether it has duplicate content caused by URL parameters, but where.
How to diagnose
A very simple check exists for this: take a snippet of text, usually about 7 words, from a page at every level of your site architecture, search for it in quotes, and see what is returned. A good percentage of webmasters who run this check are surprised by odd URL parameters generating duplicate content.
Another great tool for this is a crawler, such as the ninja crawler mentioned earlier. A crawler is great for identifying odd CMS errors that may be generating URLs, as well as internal site search pages that may be getting indexed.
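Once you have a crawl export, a quick way to spot parameter-based duplicates is to group URLs that differ only by query string. The sketch below assumes you have the crawled URLs as a plain list; the example.com URLs and parameter names are invented for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit

def parameter_duplicates(urls):
    """Group URLs that differ only by query string or fragment."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url)
        # Key on host and path only; trailing slashes are treated as equal.
        key = (parts.netloc.lower(), parts.path.rstrip("/") or "/")
        groups[key].append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}

if __name__ == "__main__":
    crawled = [
        "https://example.com/mint-dental-floss",
        "https://example.com/mint-dental-floss?sessionid=abc123",
        "https://example.com/mint-dental-floss?sort=price",
        "https://example.com/peppermint-dental-floss",
    ]
    for (host, path), found in parameter_duplicates(crawled).items():
        print(host + path, "has", len(found), "variants")
```

Treat the output as a list of candidates rather than confirmed duplicates: some parameters, such as pagination, do change what is on the page.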
How To Fix It
Fixing these types of issues varies site by site, but it usually involves some combination of robots.txt, rel=next/rel=prev, rel=canonical, and noindex.
There are only so many ways you can slice and dice content on a web site. Recycling content wholesale, or with find and replace, can go too far very quickly. The difficulty with this type of duplicate content is that whole content strategies are often built upon it, so fixing it usually means aggressive changes to site architecture. When undertaking changes like this, make sure to carefully re-plan your internal linking strategy after you clean up pages, and run a crawl to confirm that redirects and removals were properly implemented.
How To Diagnose It
These are usually very large sites with very little unique content on each page. Although the content tends not to be strictly duplicate, it gets thinner the deeper you go into the site. Generally speaking, many of these sites also have low domain authority and low engagement.
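To make the rel=canonical option concrete, here is a minimal sketch that strips parameters which do not change page content and builds the matching link tag. The parameter names in IGNORED_PARAMS are assumptions for the example; which parameters are safe to ignore depends entirely on your own site and CMS.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameters assumed not to change page content.
IGNORED_PARAMS = {"sessionid", "sid", "sort", "utm_source", "utm_medium", "utm_campaign"}

def canonical_url(url, ignored=IGNORED_PARAMS):
    """Drop ignored query parameters and any fragment from a URL."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in ignored
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

def canonical_link_tag(url):
    """Build the <link rel="canonical"> tag for the page's <head>."""
    return '<link rel="canonical" href="{}">'.format(escape(canonical_url(url), quote=True))

if __name__ == "__main__":
    print(canonical_link_tag("https://example.com/mint-dental-floss?sessionid=abc123&sort=price"))
```

Parameters outside the ignore list, such as a page number, are kept, so paginated URLs are not collapsed onto the first page by accident.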
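One way to put a number on recycled or find-and-replace content is to compare pages by overlapping word sequences (shingles). The sketch below uses Jaccard similarity on three-word shingles. This is a generic near-duplicate technique, not a description of how Google evaluates pages, and the shingle size is an arbitrary choice for the example.

```python
import re

def shingles(text, size=3):
    """Return the set of consecutive word sequences of the given size."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def similarity(text_a, text_b, size=3):
    """Jaccard similarity of two texts' shingle sets, from 0.0 to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

if __name__ == "__main__":
    # Made-up page copy where only the city name was swapped.
    page_a = "Buy cheap dental floss in Boston with free shipping today"
    page_b = "Buy cheap dental floss in Denver with free shipping today"
    print(round(similarity(page_a, page_b), 2))
```

On real pages you would run this on the main body text only, with navigation and footers stripped, and review the pairs that score highest.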
How To Fix It
Many of these large sites were hit by Panda. The particular actions that need to be taken vary by site. A fix usually involves fairly extensive content creation combined with a content consolidation strategy: identifying low-performing, low-quality sections of the site and then redirecting or removing them, depending on your philosophy and the situation. These kinds of radical changes also involve heavy revisions to your internal linking strategy and to the site architecture as a whole.
Some questions to consider:
Auditing content strategy and deciding whether content consolidation needs to be undertaken is like peeling an onion.
Technical Duplicate Content:
- What do the URLs on your site generally look like?
- How is the site handling sorting and pagination?
- How is internal site search handled?
Low Quality Content:
- Where is your best content? Is it all balled up on a couple key landing pages or does it extend to deeper pages as well? Does the content have a keyword density that is too high for a particular keyword and is this a persistent pattern across pages?
- Were you using automation as part of your content strategy in the past?
- Look at the pages in Google Analytics that get the fewest organic search visitors. Is this content worth keeping?
- Look at the pages with the highest bounce rate.
- Look at your pages by section in analytics. What are the worst-performing sections on your site?
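The section-level questions above can be roughed out from an analytics export. The sketch below assumes each row is (path, organic visits, bounces) and treats the first path segment as the section; both the row layout and the sample numbers are invented for illustration and do not come from any particular analytics tool.

```python
from collections import defaultdict

def section_of(path):
    """Use the first path segment as the section; top-level pages fall under '/'."""
    parts = [p for p in path.split("/") if p]
    return "/" + parts[0] if len(parts) > 1 else "/"

def summarize_sections(rows):
    """Aggregate (path, organic_visits, bounces) rows by section.

    Returns (section, visits, bounce_rate) tuples, fewest visits first.
    """
    totals = defaultdict(lambda: [0, 0])
    for path, visits, bounces in rows:
        section = section_of(path)
        totals[section][0] += visits
        totals[section][1] += bounces
    report = []
    for section, (visits, bounces) in totals.items():
        rate = bounces / visits if visits else 0.0
        report.append((section, visits, rate))
    return sorted(report, key=lambda row: row[1])

if __name__ == "__main__":
    # Made-up sample rows: (path, organic visits, bounces).
    rows = [
        ("/blog/a", 100, 50),
        ("/blog/b", 300, 150),
        ("/tags/x", 5, 5),
        ("/tags/y", 15, 10),
    ]
    for section, visits, rate in summarize_sections(rows):
        print(section, visits, "{:.0%}".format(rate))
```

Sections that land at the top of the list with both low traffic and a high bounce rate are the first candidates to review for consolidation.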
Other Great Resources:
Tips for Consolidating Duplicate Content, Andrew Kaufman
This article covers some good tips for consolidating duplicate content.
Handling Duplicate Content, Ivan Strouchliak, SEOChat
This article covers a lot of interesting concepts, including block level analysis.