28 Jul 2011

When Is Fixing Duplicate Content Issues Really Worth My Time?

Duplicate content is a loaded topic because it comes in so many different styles and flavors. One thing is true though: more often than not, fixing these issues sucks! (Very rarely is it as easy as tucking something behind a robots.txt exclusion.) It takes up your time analyzing, it takes up your developers' time, sometimes lots of resources need to be spent creating new unique content, and it is just plain frustrating to deal with. Are there ever circumstances where you can look the other way and not have to worry about the duplicate content monster sucking you away from other pressing activities? When is a duplicate content issue a fire, and when is it something you can throw on the backburner?

Duplicate Content of Your Landing Pages:

I have a special definition of landing page. Traditionally,
‘landing page’ is defined as a page that is getting traffic
from search. Although I do love and agree with this definition,
I also have another one: a landing page is a page that gets traffic
from search and ranks for keywords. You may think it means the
same thing, and you’re right, BUT this is a more complete definition,
because the idea that every landing page ranks for the keywords queried
is a subtle thing that people don’t think about as often as they should.
…But I digress!

One thing to note about these special pages is that your site
does not have a ton of them. Just because you have 10,000
pages on your site doesn’t mean that Google will love them
all and rank them for anything. So the few special ones
that it does rank, these pages, my friend, are precious.

EVEN MORE precious are those with backlinks, ESPECIALLY if you’re
trying to be hot in some kind of ecommerce space.

So,

Duplicate content on these special pages will make you VERY unhappy
because it will split your link equity, meaning that only some of
the links you want pointing to your landing pages actually
are, and the rest are probably going to that sneaky dup page. This
is huge, so light the ultimate fire under your tech guy and get this
fixed!

The types of duplicate content issues that will mess your stuff
up on your site’s landing pages are:

  • Session IDs – this is an old problem, which was a huge deal a couple of years back, but it still happens. Three options here are robots.txt, mod_rewrite, and storing the information in cookies. Also, here is an informative post by Stoney DeGeyter on session IDs.
  • CMS/ECommerce Platform Churning Out URLs – This comes in a variety of different flavors and is beyond the scope of my discussion here.
  • For Sale Items on ECommerce Websites – Ideally, have unique content for these pages.
  • Mirrored Sites – I’ve never run into this issue myself but Ali Husayni recommends a domain level redirect from one version of the site to the other.
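
To make the session ID problem concrete, here is a minimal Python sketch of how you might spot it in a list of crawled URLs. It is an illustration only: the parameter names in SESSION_PARAMS and the helper names are my assumptions, not something any particular platform guarantees, so swap in whatever your own site actually appends.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical session parameter names; check what your own platform emits.
SESSION_PARAMS = {"sid", "sessionid", "phpsessid", "jsessionid"}


def strip_session_params(url):
    """Return the URL with any session-ID query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_PARAMS
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


def group_duplicates(urls):
    """Map each cleaned URL to the crawled URLs that collapse onto it,
    keeping only the cleaned URLs that have more than one variant."""
    groups = {}
    for url in urls:
        groups.setdefault(strip_session_params(url), []).append(url)
    return {clean: found for clean, found in groups.items() if len(found) > 1}
```

Feed group_duplicates a list of URLs from your crawl or server logs, and any key that comes back is a page being served under several session-flavored addresses.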

Duplicate Content Of Deep Pages Google Indexed:

An easy way to check this is to go to Google and type in
site:yoursite.com or, if you have Aaron Wall’s toolbar, click
on the little ‘I’ icon and find it there. You can also diagnose
whether this is an issue using Google Webmaster Tools.

If Google is indexing fewer pages than you expect, be very concerned (you can dig into this using Webmaster Tools). This is a sign of crawl fatigue.

What if everything looks O.K. and you just have some ‘extra baggage’
getting indexed?

In the past I would have said, ‘If it is not drastic, and it is not affecting pages indexed, rankings or traffic, then you’re O.K.’ Well, that changed
with the Panda Update, didn’t it? Panda-proof your site, and don’t let
this fly anymore! Google raised the bar in terms of standards
for webmasters. Duplicate content that was harmless in the past is not anymore.

  • Printer Friendly Pages – The typical case of this type of duplicate content is printer-friendly pages. Get to know rel=canonical and fix these.
  • Product Descriptions From Manufacturer – Another classic case here is using product descriptions from the manufacturer. In a Panda world, I don’t recommend it. Pay a little extra to have unique content for your product pages and reduce your risk.
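
For the printer-friendly case, the rel=canonical fix boils down to putting one link tag in the head of the printer version that points back at the main page. Here is a rough Python sketch of generating that tag. I am assuming the printer version is marked by a query parameter called "print", which is purely hypothetical; your CMS may use a different parameter or a separate path, in which case the cleanup step needs to change.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical: the query parameter that switches on the printer-friendly view.
PRINT_PARAMS = {"print"}


def canonical_url(url):
    """Return the main-page URL for a printer-friendly URL by dropping
    the print parameter (and any fragment)."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in PRINT_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the <link rel="canonical"> tag to place in the page's <head>."""
    return '<link rel="canonical" href="%s">' % escape(canonical_url(url), quote=True)


print(canonical_link_tag("http://example.com/article?id=7&print=1"))
```

The tag goes on the printer-friendly page itself, so that the duplicate points at the version you want ranked.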

…OKAY, So When Doesn’t Duplicate Content Matter?

In a nutshell, duplicate content pages don’t matter when Google
is not indexing them. Does a tree that falls in the woods
really make a noise if Google doesn’t index it?

Keep in mind though, this is the exception rather than the rule with dup content. As
webmasters and SEO professionals, we’re living in VERY interesting
times. We can’t afford to be sloppy and ignore obvious problems like we did in the past.
If you don’t fix these issues right now, it’s not gonna happen,
and what are you gonna do when the Panda takes a bite of your
traffic?
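
If you want a quick first pass at which of your duplicate URLs are at least blocked in robots.txt, Python's standard library can read the file for you. This is only a sketch: the function name and the default "Googlebot" user agent are my own choices for illustration. Also keep in mind that a robots.txt block stops crawling, which is related to but not the same as staying out of the index, so confirm with a site: search or Webmaster Tools.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt text disallows
    for the given user agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Example robots.txt contents and URLs, made up for illustration.
robots = "User-agent: *\nDisallow: /print/\n"
candidates = [
    "http://example.com/print/article-7",
    "http://example.com/article-7",
]
print(blocked_urls(robots, candidates))
```

Any duplicate URL that does not show up in the result is one Google is free to crawl, so it belongs on your fix list.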

More Goodies On The Topic

Duplicate Content – By Google Webmaster Tools Help
Here is Google’s official page on duplicate content.

What is Duplicate Content? – By Unknown Author At SEOMoz
Neat, illustrated guide to managing duplicate content using
rel=canonical and robots.txt exclusion.

Google Lets You Tell Them Which Parameters To Ignore – By Vanessa Fox
A lovely post on Search Engine Land discussing how to handle parameters using Webmaster Tools.

Duplicate Content: Block, Redirect or Canonical – By Benj Arriola
Benj does a nice job of discussing the various options shown above,
and the comments on this post are also very good.

Duplicate Content Issues and Search Engines – By Bill Slawski
Bill shares with us an exhaustive list of different kinds of
duplicate content, including both internal duplicate and external duplicate content issues.

Duplicate Content and SEO – By Anonymous Writer
This writer discusses how you can make your content, tags, and
links (for the purpose of block level analysis) more unique. Pretty
good post for those seeking to take preventative measures to fight the dup content monster :)

Gotta ask though, am I being too stringent in my assessment? Are there other cases where you can put off dealing with duplicate content issues?

Keep it real and happy Thursday night  :D :D – ninja bonnie