11 Jul 2013

4 Protips: Optimizing Faceted Navigation For SEO

Clear, well-thought-out navigation is a very important part of successful web design. However, creating easy-to-use navigation can get rather complicated when there are many different options for the user to choose from.

Take, for example, an ecommerce site that sells hats. It might have a handful of main categories that organize its products, but each of those categories may also contain more specific options that need to be organized as well. Let’s say I want to look at baseball hats, but I only want to look at ones licensed by the MLB, I want adjustable sizing, and I want to only see hats for the Atlanta Braves (don’t judge me). The category “baseball hats” would be the best place to start my search, but developing subcategories for each of my other requirements would be too exhausting and would make the site’s information architecture way too complicated.

This is where faceted navigation comes into play. With faceted navigation, sites with lots of individual pieces of content, such as ecommerce sites, can create easy-to-select options within their navigation to help users drill down to the content they are searching for. Probably one of the best examples of faceted navigation that I know of is Amazon.com.

Even though faceted navigation is a great feature for UX, it can be a giant headache for SEO. Here’s why: Let’s assume that a faceted navigation menu has 16 options. If a user selects just 3 of those options, the selections can be made in 16 × 15 × 14 = 3,360 different orders, and each order can produce its own URL. That means up to 3,360 different URLs for only 560 distinct sets of results, so the same exact content shows up again and again! And this is just a small example; most sites with faceted navigation have the potential to create millions of URLs with the exact same content on them.
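The arithmetic behind the 3,360 figure can be checked with a short Python sketch. The facet names and the `/baseball-caps/` path are placeholders invented for illustration, and the sketch assumes each selected option is appended to the query string in the order the user clicks it.

```python
from itertools import permutations
from math import perm

# Hypothetical facet names; the article only says the menu has 16 options.
facets = [f"facet{i}" for i in range(1, 17)]

# Assumption: each ordered pick of 3 facets becomes its own query string.
urls = {
    "/baseball-caps/?" + "&".join(f"{name}=1" for name in picked)
    for picked in permutations(facets, 3)
}

# Ignoring the order of the parameters leaves far fewer distinct result sets.
result_sets = {frozenset(picked) for picked in permutations(facets, 3)}

print(len(urls))         # distinct URLs
print(perm(16, 3))       # 16 * 15 * 14
print(len(result_sets))  # distinct sets of results
```

Under these assumptions every set of results is reachable through six different URLs, which is where the duplication comes from.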

This type of duplicate content can wreak havoc on a site’s rankings and could potentially trip the Panda algorithm filter. Therefore, it is extremely important to optimize these types of navigation so that these issues don’t arise.

Here are a few tips to help optimize faceted navigation.


Use AJAX

What?! AJAX?! It’s not often that an SEO will recommend using AJAX. Typically AJAX isn’t preferred by most SEOs because, if done correctly, it hides content from search engines and does not create new URLs. But that is exactly what we want for faceted navigation. When AJAX is used for faceted navigation, it limits new URLs from being generated and hides duplicate content. One downside, though, is that it often makes it difficult to return later to the same page of results. For example, if there are no new URLs, then there is nothing to bookmark or save for later viewing.


Use Canonical Tags

Many times the URLs that are generated from faceted navigation are just extensions of existing URLs. Using the example above, I might have been in the “Baseball Caps” category, which might have a URL like example.com/baseball-caps/, but as soon as I start to drill down to other options, a new URL is created with parameters attached, such as example.com/baseball-caps/?size=fitted&team=braves&mlb=1. These types of URLs are exactly what canonical tags were created for. Adding a canonical tag that points back to the category URL on pages with faceted navigation should cut down on duplicate content issues.
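As a minimal sketch of that idea in Python, the helper below builds a canonical tag by dropping the query string from a faceted URL. The function name `canonical_link_tag` is hypothetical, the `https://` scheme is added for illustration, and the sketch assumes every parameter on the URL is a facet that can safely be stripped.

```python
from html import escape
from urllib.parse import urlsplit, urlunsplit


def canonical_link_tag(url: str) -> str:
    """Return a canonical <link> tag that points at the URL without its query string."""
    parts = urlsplit(url)
    # Assumption: all query parameters are facets, so drop the query and fragment.
    clean = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    return f'<link rel="canonical" href="{escape(clean, quote=True)}">'


print(canonical_link_tag(
    "https://example.com/baseball-caps/?size=fitted&team=braves&mlb=1"
))
```

If some parameters change the content in a way that deserves its own indexed page, they would need to be kept rather than stripped.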


Block At robots.txt

This is one of my favorite methods for handling faceted navigation because it seems to be the most dependable. By blocking URLs that are created from faceted navigation, we can avoid even the hint of indexing duplicate content. However, in order for this option to work, we need a unique parameter to call out in the robots.txt file. Many times this parameter is something such as “sort-by”. This will allow us to quickly call out these URLs with the use of wildcard directives like so:

User-agent: *
Disallow: /*sort-by*

When the above directives are used, the following URL will be ignored by the search engines: example.com/baseball-caps/?sort-by=1&size=fitted&team=braves&mlb=1
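To sanity-check a rule like this before deploying it, the wildcard matching can be approximated in a few lines of Python. This is a hand-rolled sketch, not any search engine's actual parser: it assumes `*` matches any run of characters, a trailing `$` anchors the end of the URL, and rules are matched from the start of the path. The function names are hypothetical.

```python
import re
from urllib.parse import urlsplit


def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a robots.txt path rule with * and $ wildcards into a regex."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then let every * match any run of characters.
    body = ".*".join(re.escape(piece) for piece in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_blocked(url: str, disallow_rules: list[str]) -> bool:
    """Return True if the URL's path and query match any Disallow rule."""
    parts = urlsplit(url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    # Rules apply from the start of the path, so use match() rather than search().
    return any(rule_to_regex(rule).match(target) for rule in disallow_rules)


rules = ["/*sort-by*"]
print(is_blocked(
    "https://example.com/baseball-caps/?sort-by=1&size=fitted&team=braves&mlb=1",
    rules,
))
print(is_blocked("https://example.com/baseball-caps/", rules))
```

Running a list of real URLs from your own site through a check like this is a quick way to confirm the rule catches the faceted URLs without blocking the category pages themselves.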


Use Meta Robots Tag

If you are using any type of server-side processing such as PHP or Python, you can easily write a small script that identifies when a faceted URL is loaded and then applies a meta noindex tag in the markup’s head section. In PHP this script would look like:




In this example, we are assuming that the URL contains the ‘sort-by’ URL variable. However, the same script can be used with any variable that is unique to your faceted URLs.
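For sites running Python on the server, the same check could be sketched roughly as follows. The function name `robots_meta_tag` and the `FACET_PARAM` constant are hypothetical; how the returned tag gets into the page’s head section depends on your framework and templates.

```python
from urllib.parse import parse_qs, urlsplit

# Assumed marker parameter, matching the 'sort-by' variable used in the article.
FACET_PARAM = "sort-by"


def robots_meta_tag(url: str, facet_param: str = FACET_PARAM) -> str:
    """Return a noindex meta tag for faceted URLs, or an empty string otherwise."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    if facet_param in query:
        return '<meta name="robots" content="noindex">'
    return ""


print(robots_meta_tag(
    "https://example.com/baseball-caps/?sort-by=1&size=fitted&team=braves&mlb=1"
))
print(repr(robots_meta_tag("https://example.com/baseball-caps/")))
```

Passing a different `facet_param` lets the same function key off whatever parameter your faceted URLs share.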

What are your favorite ways to handle faceted navigation? Let us know in the comments!

Comments

  1. twitter_seriocomic July 11, 2013 at 6:54 PM

    Great post Joe!

    A couple of points I picked up on when reading this:

    1. AJAX – done right, the dynamic content loading should be triggering a new URL using the History API via pushState, so that could negate the upside you highlighted.

    2. We need to remind ourselves that blocking at robots.txt doesn’t necessarily prevent indexation, only crawl/discovery. But it’s still better than most other approaches, even if it’s a ‘blunt force’ one.

    3. If you’re using PHP or another server-side language to parse the URI, then rather than modify the HTML, consider sending the directive straight to the browser via HTTP Headers:

    header("X-Robots-Tag: noindex", true);

    1. Martin September 9, 2013 at 5:36 PM

      Hi,

      If you want to hide both your filtered content from bots _and_ make bots ignore the link itself, would you place the filter params in the hash, pushState a clean URL (without the hash), and then AJAX-load that URL? I suppose this suggests the server is able to tell apart AJAX requests from regular, full-page requests (if you do not want to AJAX-load the whole page including global navigation, that is)?

    2. Traian November 18, 2013 at 8:16 PM

      #3 seems intriguing. Can you please expand a bit on it?

  2. Carlos Estevez July 18, 2013 at 11:55 AM

    Great post, only one thing about using AJAX and bookmarks: you can use page-relative URLs with #, so you’d still be able to bookmark the results.

  3. Jared September 26, 2013 at 9:56 AM

    What if crawlers are indexing Ajax requests?

  4. Jaimie Sirovich February 15, 2014 at 12:24 AM

    @Joe As you probably noticed, Google (finally) published some guidelines (http://googlewebmastercentral.blogspot.com/2014/02/faceted-navigation-best-and-5-of-worst.html) on this topic. Lucky me, the stuff I’ve published and we’ve personally discussed in the past is mostly right :)) A lot of people mention rel=canonical for this, and I’ve always disagreed with its use in this way, but it’s in the Google post. The problem with it is crawl priority, but she also recommends using nofollow in concert with it. In some ways this post raises more questions. Oh well. Regardless, big thanks to Maile for breaking the silence.
