Although rel="canonical" has been around for a long time, I think it is one of the more mysterious tools to implement, and there are a lot of questions around how and when to use it. Here are answers covering some of the simpler and some of the more complex facets of rel="canonical".
Q: When is it appropriate to use rel="canonical"?
Rel="canonical" is a value of the rel attribute on a <link> element that goes in the <head> section of a web page. It helps deal with duplicate or very similar content by telling search engines which URL is the master version of the page, so that the correct page is fetched and ranked for phrases and, more recently, so that the site avoids sending thin or duplicate content signals. Rel="canonical" is a tool for when you are unable to implement the alternatives.
According to Matt Cutts of the Google Search Quality Team, it is "Far better to avoid dupes and normalize URLs in the first place" and, "if you're a power user, exhaust alternatives first." Both statements indicate that rel="canonical" should come into play only after other methods of reconciling duplicate content have been exhausted. It is a tool of last resort.
Some options to try before turning to rel="canonical":
- Server-side redirect
- 301 redirect
Some options that are about on par with rel="canonical":
- robots.txt (this is definitely more of a sword than a scalpel)
Q: What are some cases where a rel=canonical tag may be in order?
Cases where rel=canonical is an appropriate tool for the duplicate content job include the following:
- Sorting functionality
- Tracking codes
- Landing pages
- URLs with session IDs
- Mixed case paths
- Links to pages that are not landing pages
- Cases where you are unable to generate 301 permanent redirects
I'll be honest: the reason I am including this question is that I see so many sitewide implementations of rel="canonical" on new sites that never properly considered the alternatives. The key thing to keep in mind is that rel="canonical" should sit toward the end of the list of options as you plan your duplicate content action plan.
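The tracking-code, session-ID, sorting, and mixed-case scenarios listed above all come down to many URLs resolving to one page. As a rough sketch of how the single URL for the canonical tag could be derived, the function below strips ignorable query parameters and lowercases the host and path. The parameter names in IGNORED_PARAMS and the example URL are hypothetical, and lowercasing the path assumes your server treats paths case-insensitively.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical parameter names: replace with the tracking, session,
# and sorting parameters your own site actually uses.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "sort"}


def canonical_url(url):
    """Return one normalized URL for all variants of the same page."""
    parts = urlsplit(url)
    # Keep only query parameters that change the content of the page.
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    # Assumption: paths are case-insensitive on this site (mixed case paths).
    path = parts.path.lower()
    # Drop the fragment; it is never sent to the server.
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, urlencode(kept), ""))


if __name__ == "__main__":
    print(canonical_url(
        "http://Example.com/Shoes/Red?utm_source=news&sessionid=abc123&sort=price&color=red"
    ))
```

The value returned is what you would place in the href of the canonical tag on every variant of the page. If you can issue 301 redirects to that URL instead, that remains the preferred route.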
Q: What happens if I mess up my rel="canonical" implementation?
There are several ways that rel=canonical can go wrong. In the interest of space I will discuss two of the main ways:
Search engines not honoring a rel="canonical" that has been implemented: Google has reserved the right to treat rel="canonical" as a suggestion rather than a directive. Part of the reasoning for this decision, as explained by Matt Cutts, was user error and malicious usage. There are several cases where Google may decline your rel="canonical" suggestion:
- You point a rel=canonical at a 404 page.
- You install the code incorrectly, such as putting it in the <body> instead of the <head>, or you fail to close <head>.
- Google runs a similarity check on pages with rel="canonical" code. If the pages are totally different, Google may not honor it.
- Internal linking and inbound linking overwhelmingly suggest that another page is the actual canonical page.
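The placement mistake in the list above (a canonical in the <body> rather than the <head>) is easy to catch before launch. Here is a minimal sketch using Python's standard html.parser module. The function name audit_canonical and the message text are my own, and the check covers only placement, not 404 targets or page similarity.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record each <link rel="canonical"> and whether it sits inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.found = []  # list of (href, inside_head) tuples

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "body":
            # Treat <body> as ending the head, even if </head> is missing.
            self.in_head = False
        elif tag == "link":
            attributes = dict(attrs)
            rel_values = (attributes.get("rel") or "").lower().split()
            if "canonical" in rel_values:
                self.found.append((attributes.get("href"), self.in_head))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def audit_canonical(html):
    """Return a list of placement problems found in the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    finder.close()
    return [
        f"canonical outside <head>: {href}"
        for href, inside_head in finder.found
        if not inside_head
    ]


if __name__ == "__main__":
    page = '<html><head></head><body><link rel="canonical" href="http://example.com/a"></body></html>'
    print(audit_canonical(page))
```

An empty list means every canonical tag the parser saw was inside the <head>. Running this over rendered templates before a release is one way to catch the error before a search engine does.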
You didn't check the code before pushing live, and search engines are de-indexing your pages: I've heard of cases where rel="canonical" was implemented on a site in a way that declared the homepage to be the only non-duplicate page on the site. Be careful, and take a close look at the code before pushing it live. Search engines also try to guard against this kind of mistake by running checks when you implement the code.
There are a couple more ways, but I think those are good ones to consider.
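A pre-launch check for the "everything points at the homepage" mistake can be sketched as follows. It assumes you have already collected a mapping of each page URL to the canonical href declared on it (for example from a crawl of your staging site). The function name and the 50% threshold are hypothetical choices for illustration, not a rule from any search engine.

```python
from collections import Counter


def suspicious_targets(canonicals, max_share=0.5):
    """Return canonical targets that an unusually large share of pages point to.

    canonicals maps each page URL to the canonical href declared on that page.
    max_share is an arbitrary, hypothetical threshold: tune it for your site.
    """
    total = len(canonicals)
    if total == 0:
        return []
    # Count only pages that point somewhere other than themselves.
    counts = Counter(
        target
        for page, target in canonicals.items()
        if target and target != page
    )
    return [target for target, n in counts.items() if n / total > max_share]


if __name__ == "__main__":
    crawl = {
        "/": "/",
        "/products": "/",
        "/about": "/",
        "/contact": "/",
    }
    print(suspicious_targets(crawl))
```

Any URL this returns deserves a manual look before the code goes live, since a large share of distinct pages claiming to be duplicates of one page is rarely intended.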
Q: Do Google and Bing disagree on rel="canonical"? How does this affect my implementation?
Back in 2011, news broke that Google and Bing treat rel=canonical differently. In his article, Managing redirects – 301s, 302s and canonicals, Duane Forrester discusses implementation issues for Bing. The interesting thing is that, unlike Google, which gave the green light to implementing rel=canonical sitewide, Bing seems to have a stricter interpretation. According to Forrester:
“Something else you need to keep in mind when using the rel=canonical is that it was never intended to appear across large numbers of pages. We’re already seeing a lot of implementations where the command is being used incorrectly. To be clear, using the rel=canonical doesn’t really hurt you. But, it doesn’t help us trust the signal when you use it incorrectly across thousands of pages, yet correctly across a few others on your website.
A lot of websites have rel=”canonical”s in place as placeholders within their page code. Its best to leave them blank rather than point them at themselves. Pointing a rel=canonical at the page it is installed in essentially tells us “this page is a copy of itself. Please pass any value from itself to itself.” No need for that.”
Based on this snippet, and the rest of the article:
- Bing gives less trust to canonicals applied sitewide. Later in the article, Forrester recommends leaving the canonical blank, without specifying a URL, unless you need one.
- A self-referential rel=canonical signals to Bing that the page is a copy of itself, whereas this is not the case for Google.
Your implementation for your website or network of websites will depend on a number of factors. First, how big a traffic source is Bing for you, and is Bing organic a traffic source that you are looking to grow in the future? Part of what you do will depend on that. Personally, I tend not to assume much about the future. I have taken a fairly keen interest in Bing because its overall search share is growing and, to be honest, it is painful to make changes that later need to be reverted and changed again. If you are trying to manage duplicate content issues for both search engines, consider using blank canonical tags to make the project programmatically easier, in some cases at least, and to satisfy the requirements of both search engines.
Q: Can you use rel="canonical" and rel=prev/rel=next together?
According to Maile Ohye, Google Product Engineer, you can use rel=canonical and rel=next/rel=prev together. Bing also supports rel=prev/rel=next, but given prior statements by the Bing team about self-referential canonical tags, I don't think using them together is a good idea from the Bing point of view: using them together presupposes that the canonical tag will be self-referential, which is a no-no for Bing.
Q: Can you use rel="canonical" on subdomains and across domains?
The cross-domain canonical tag trailed the initial release of the canonical. It is worth noting that cross-domain canonicalization is a tool for very specific jobs. In Google's announcement of cross-domain rel=canonical, it was discussed in terms of its use for site migrations. Site migration use of the cross-domain canonical is more of a corner case, at least in my experience. The main way I have seen it used is for networkwide duplicate content, in some cases for content affected by the Panda update.
- Is there an Advantage of using Rel="canonical" over 301 redirect? By Matt Cutts: a good video by Matt Cutts handling this inquiry.
- A rel="canonical" corner case, By Matt Cutts: Matt discusses a corner case in rel=canonical handling.
- Moving Content? Think 301 not Rel=canonical, By Duane Forrester: I'll be a spoiler and say the answer is basically no, rel=canonical should really not be used in place of a 301 redirect. The article goes into some of the reasoning behind that.
- Supporting Rel=Canonical in HTTP Headers, By Pierre Far: Google's announcement about supporting rel=canonical in HTTP headers.
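For the HTTP header form mentioned in the last resource, which is useful for files such as PDFs that have no <head>, the canonical is expressed as a Link response header. A minimal sketch of building that header follows. The helper name and the example URL are hypothetical, and how you attach the header depends on your server or framework.

```python
def canonical_link_header(url):
    """Build the (name, value) pair for an HTTP Link header declaring a canonical URL."""
    # Standard Link header syntax: the target in angle brackets, then the relation.
    return ("Link", f'<{url}>; rel="canonical"')


if __name__ == "__main__":
    name, value = canonical_link_header("http://www.example.com/downloads/white-paper.pdf")
    print(f"{name}: {value}")
```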