<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Internet Marketing Ninjas Blog</title>
	<atom:link href="http://www.internetmarketingninjas.com/blog/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.internetmarketingninjas.com/blog</link>
	<description>Internet marketing blog</description>
	<lastBuildDate>Wed, 22 Feb 2012 20:26:01 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>SEO Worst Practices</title>
		<link>http://www.internetmarketingninjas.com/blog/search-engine-optimization/seo-worst-practices/</link>
		<comments>http://www.internetmarketingninjas.com/blog/search-engine-optimization/seo-worst-practices/#comments</comments>
		<pubDate>Wed, 22 Feb 2012 15:08:49 +0000</pubDate>
		<dc:creator>Rick DeJarnette</dc:creator>
				<category><![CDATA[Search Engine Optimization]]></category>

		<guid isPermaLink="false">http://www.internetmarketingninjas.com/blog/?p=2748</guid>
		<description><![CDATA[SEOs love to talk and write about industry best practices, as well they should. To best serve their clients, following... <a href="http://www.internetmarketingninjas.com/blog/search-engine-optimization/seo-worst-practices/" class="read-more">read on&#160;&#187;</a>]]></description>
			<content:encoded><![CDATA[<p>SEOs love to talk and write about industry best practices, as well they should. To best serve their clients, following industry-established SEO best practices should always be the goal.</p>
<p>But what about those shady types on the fringes of the SEO community who advocate, hmmm, let’s call it a less reputable route? Not surprisingly, these SEOs are also typically the same ones who claim they can absolutely get you the #1 spot in the search engine results pages (SERPs). Guaranteed, no less! There’s a reason the old adage, “If it sounds too good to be true, it probably is,” remains so relevant in our lives.</p>
<p>If you run a business and are shopping for SEO services and consulting, ask about the following techniques (and please do ask!). If a consultant mentions any of them, you can be confident that the consultant in question is following well-known (in professional circles) SEO Worst Practices. Such efforts used on your site will likely backfire, causing your website to be penalized (lowered in rank) or, if the abuse is egregious, perhaps even purged from the index. This applies to both Google and Bing.</p>
<p>Let’s take a look at a few top choices in SEO worst practices:</p>
<hr />
<h2>Keyword stuffing</h2>
<p>If the grand plan for getting your site to #1 includes adding ~150 highly searched-for (but largely irrelevant) terms and phrases into the &lt;meta&gt; keywords tag, not only is this poor form in terms of webpage spam, but it’s also a hopelessly obsolete technique. Stuffing the &lt;meta&gt; keywords tag is a tactic that was old 10 years ago. So many sites used this tag in an attempt to raise their ranking for targeted keywords by repeating those words countless times (or, alternatively, to expand the relevance of the page to keywords not otherwise used on the page) that the search engines long ago abandoned using the &lt;meta&gt; keywords tag for keyword relevance. Google has come right out and stated it does not use the tag for that purpose. Bing has taken a more nuanced position: the tag is not really used today, but Bing states there are hundreds of ranking factors that are considered, and one day, if the intentional spamming of this tag finally dies off due to neglect, it might eventually become useful again. Maybe. But not today, not soon, and no promises beyond that.</p>
<p>The act of keyword stuffing not only occurs in the &lt;meta&gt; keywords tag, but can also occur in &lt;title&gt; tags, &lt;img&gt; alt text, heading tags, anchor text, even sometimes boldly in plain body text!</p>
<p>The search engines crawl web pages and see what is in the code. They see the text within the &lt;body&gt; tag, as well as the page metadata. They see when a word is repeatedly used to excess, and they can mitigate any attempted beneficial manipulation in their page ranking assessment. Keyword stuffing is dumb, clumsy, ineffective, and amateurish SEO.</p>
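<p>As a rough illustration, a stuffed &lt;meta&gt; keywords tag might look like the sketch below. The terms are invented for this example and are not taken from any real site:</p>
<pre><code>&lt;!-- Hypothetical example of what NOT to do: repeated, unrelated terms --&gt;
&lt;meta name=&quot;keywords&quot; content=&quot;cheap flights, free music, celebrity news, cheap flights, weight loss, free music downloads, cheap flights online&quot; /&gt;</code></pre>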
<h3>Keyword stuffing caveat</h3>
<p>If the use of repeated terms is legitimate to the business of the page, the search engines will understand that and accommodate that in their search for what otherwise is web spam. For example, if you are an attorney who can help clients with income taxes, tax deductions, tax penalties from the IRS, interest on back taxes, business taxes, rules for excise taxes, estate tax planning (you see the point), the repetition of the word “tax” is not spam because the phrases in question are normal word usage, not artificially done for web spam.</p>
<p>The difference between this sample usage and keyword stuffing is intent. The search engines spend a huge amount of time and resources trying to parse legitimate from illegitimate intent. If you are incorrectly identified as a spammer and your site suddenly tanks in the rankings, and it’s not due to larger algorithm changes like Google’s series of <a href="../../../../../pandaupdate/">Panda updates</a>, then you can appeal a penalty in <a href="http://support.google.com/webmasters/bin/answer.py?hl=en&amp;answer=35843">Google</a> and/or <a href="https://support.discoverbing.com/default.aspx?st=1&amp;website=bing&amp;tenant=oss&amp;brand=bing&amp;as=1&amp;timestmp=634648324963712225&amp;acty=ProductList&amp;ctl=oss%2fcontent%2fbing_support_home&amp;wf=OSS&amp;trl=OSS%7EProductList&amp;c=oss_bing&amp;ln=en-us&amp;productKey=bing&amp;sub=free">Bing</a>. Just note that if you were penalized for violations of the official webmaster guidelines of <a href="http://www.google.com/support/webmasters/bin/answer.py?hl=en&amp;answer=35769">Google</a> or <a href="http://onlinehelp.microsoft.com/en-us/bing/hh204434.aspx">Bing</a>, your entire site must be cleaned up of all web spam techniques and republished before you apply for reconsideration.</p>
<hr />
<h2>Hidden content</h2>
<p>Search engines don’t like seeing pages that have hidden content (content that’s crawlable in the code, but doesn’t show in the browser window). That’s considered to be the equivalent of telling the search engines, “Here, this extra bit of content is just for you.” They consider that to be a maliciously manipulative attempt to earn page ranking credit for material not actually shown in the page. As Martha Stewart might say, that’s not a good thing.</p>
<p>Webmasters use a multitude of easily detected techniques in the effort to hide content from display. They use code like <em>&lt;div style=&quot;display: none;&quot;&gt;</em> to hide entire passages of body text. They use style attributes to make text the same color as the background, rendering it invisible, or configure the font size to be so small that it’s unreadable. There are many such “tricks” used to stuff extra text in a page. Of course, the search engines see all of this coding and can interpret that it’s intended to be hidden from view. If the intent of the usage is malicious, penalties can ensue.</p>
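<p>To make those techniques concrete, here is a minimal sketch of the kinds of markup described above. The colors, sizes, and keyword text are invented for illustration:</p>
<pre><code>&lt;!-- Hypothetical examples of what NOT to do --&gt;

&lt;!-- 1. A whole passage removed from display --&gt;
&lt;div style=&quot;display: none;&quot;&gt;discount widgets, cheap widgets, best widgets&lt;/div&gt;

&lt;!-- 2. Text the same color as the page background (assumed white here) --&gt;
&lt;p style=&quot;color: #ffffff; background-color: #ffffff;&quot;&gt;discount widgets, cheap widgets&lt;/p&gt;

&lt;!-- 3. Text too small to read --&gt;
&lt;p style=&quot;font-size: 1px;&quot;&gt;discount widgets, cheap widgets&lt;/p&gt;</code></pre>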
<h3>Hidden content caveat</h3>
<p>The use of <em>&lt;input type=&quot;hidden&quot;&gt;</em> controls is not by itself suspicious, as some controls are not revealed in the default view of a page. The issue is always intent. If passages of text are hidden, especially if they contain keyword stuffing as mentioned earlier, this is what raises the red flags for search engines.</p>
<hr />
<h2>Cloaking</h2>
<p>Cloaking is where the web server uses the identity of the user agent making the request for the page to determine which version of the content is returned. For example, if an IE 6 user agent requests a page and then the Googlebot user agent requests the same page, but the content is different, that indicates there is user agent filtering (cloaking) occurring for the purposes of manipulating what content search engines see. The goal of malicious cloaking is always to artificially inflate the rank of the page the user sees. Cloaking can show users normal pages and serve search engine crawlers keyword-stuffed pages. Alternatively, cloaking can serve search engines nicely optimized, relevant pages but serve users junk sales pitches for illicit pharmaceuticals, porn, or other such content that would not rank otherwise for that query.</p>
<h3>Cloaking caveat</h3>
<p>Pages that filter for mobile browsers to show abbreviated or custom-formatted versions of the desktop page are not considered to be malicious cloaking. Again, it comes down to intent. Is the web server attempting to maliciously manipulate the search rankings for a given page? The use of cloaking on search engine user agents is not a good idea. The search engines will detect it and penalize a site employing that technique accordingly. Filtering user agents for mobile devices, as long as the content stays similar, is relevant to the search query, and useful to users, is not a problem.</p>
<hr />
<h2>Intent is key</h2>
<p>All of these SEO worst practices are based on the intention to deceive, be it the search engine crawler or the human end user. As long as businesses create great sites that are of value to human visitors, have compelling content, and are easy to navigate (meaning they are also easy to crawl), they will be assessed accordingly. Those are the sites that earn backlinks and citations from other sites, and that is what is needed for white hat SEO.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.internetmarketingninjas.com/blog/search-engine-optimization/seo-worst-practices/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using and Optimizing Images: Search and Social Optimization Cheat Sheets</title>
		<link>http://www.internetmarketingninjas.com/blog/search-engine-optimization/image-seo/</link>
		<comments>http://www.internetmarketingninjas.com/blog/search-engine-optimization/image-seo/#comments</comments>
		<pubDate>Mon, 20 Feb 2012 15:07:01 +0000</pubDate>
		<dc:creator>Ann Smarty</dc:creator>
				<category><![CDATA[Search Engine Optimization]]></category>

		<guid isPermaLink="false">http://www.internetmarketingninjas.com/blog/?p=2727</guid>
		<description><![CDATA[If you are running a website, there are most likely going to be plenty of images there. While image-specific SEO... <a href="http://www.internetmarketingninjas.com/blog/search-engine-optimization/image-seo/" class="read-more">read on&#160;&#187;</a>]]></description>
			<content:encoded><![CDATA[<p>If you are running a website, there are most likely going to be plenty of images there.</p>
<p>While image-specific SEO is very well explained <a href="http://www.seosmarty.com/image-seo/">in</a> <a href="http://www.seomoz.org/blog/image-seo-basics-whiteboard-friday">a</a> <a href="http://www.toprankblog.com/2010/06/6-tips-image-seo/">few</a> <a href="http://support.google.com/webmasters/bin/answer.py?hl=en&amp;answer=114016&amp;ctx=sibling">detailed</a> <a href="http://www.zdnet.com/blog/seo/the-definitive-guide-to-seo-for-images-6-steps-to-image-ranking-success/1241">guides</a>, let&#8217;s try to create a very simple and easily organized guide to using images properly:</p>
<hr />
<h2>1. Free Images You *Can* Use</h2>
<p>First things first: let&#8217;s see where you can find free-to-use images online. It&#8217;s a popular misconception that you can use <em>any</em> image you find online as long as you credit the source.</p>
<p>Mind that you can only re-use images with a certain license; <a href="http://www.makeuseof.com/tag/search-credit-properlylicensed-photos-flickr-firefox/">here&#8217;s a quick guide</a> (and here&#8217;s a slightly more advanced <a href="http://arstechnica.com/tech-policy/news/2011/08/creative-commons-images-and-you.ars">one</a>) to the three types of Creative Commons licenses that allow you to re-publish images on your site under certain conditions:</p>
<table width="600" border="2" cellspacing="0" cellpadding="10" align="center">
<tbody>
<tr>
<td align="center" valign="middle"><strong>License</strong></td>
<td align="center" valign="middle"><strong>Icon</strong></td>
<td align="center" valign="middle"><strong>You Can Re-use</strong></td>
<td align="center" valign="middle"><strong>For <a href="http://boingboing.net/2008/12/03/what-is-noncommercia.html">Commercial</a> Use?</strong></td>
<td align="center" valign="middle"><strong>You Can Modify</strong></td>
<td align="center" valign="middle"><strong>Credit</strong></td>
</tr>
<tr>
<td align="center" valign="middle"><em><strong>Attribution-NoDerivs License</strong></em></td>
<td align="center" valign="middle"><img src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/02/image-seo-03.gif" alt="Attribution License" width="32" height="32" border="0" /> <img src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/02/image-seo-04.gif" alt="Attribution-NoDerivs License" width="32" height="32" /></td>
<td rowspan="3" align="center" valign="middle">Yes</td>
<td align="center" valign="middle">Yes</td>
<td align="center" valign="middle">No</td>
<td rowspan="3" align="center" valign="middle">Yes (Required)</td>
</tr>
<tr>
<td align="center" valign="middle"><em><strong>Attribution-NonCommercial-NoDerivs License</strong></em></td>
<td align="center" valign="middle"><img src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/02/image-seo-03.gif" alt="Attribution License" width="32" height="32" border="0" /> <img src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/02/image-seo-05.gif" alt="Attribution-NonCommercial-NoDerivs License" width="32" height="32" /></td>
<td align="center" valign="middle">No</td>
<td align="center" valign="middle">No</td>
</tr>
<tr>
<td align="center" valign="middle" bgcolor="#66ffcc"><strong><em>Attribution-ShareAlike License</em></strong></td>
<td align="center" valign="middle" bgcolor="#66ffcc"><img src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/02/image-seo-03.gif" alt="Attribution License" width="32" height="32" border="0" /> <img src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/02/image-seo-06.gif" alt="Attribution-ShareAlike License" width="32" height="32" /></td>
<td align="center" valign="middle" bgcolor="#66ffcc"><strong>Yes</strong></td>
<td align="center" valign="middle" bgcolor="#66ffcc"><strong>Yes</strong></td>
</tr>
</tbody>
</table>
<p>The two popular (and most effective) sources of Creative Commons images you can re-publish are:</p>
<ul>
<li><a href="http://www.flickr.com/creativecommons/">Flickr Creative Commons</a> (here&#8217;s the most <a href="http://compfight.com/">usable search tool</a> to easily search through Flickr photos by license)</li>
<li><a href="http://www.google.com/advanced_image_search">Google *Advanced* Image Search</a> (Oddly, there&#8217;s no &#8220;Usage Rights&#8221; option in the sidebar of general image search interface but luckily you can access it using the &#8220;Advanced&#8221; search option)</li>
</ul>
<hr />
<h2>2. Image File Name</h2>
<p><strong>An image file name</strong> is crucial when it comes to ranking an image in Google Image search results (<em>though it&#8217;s not the only factor. <a href="http://www.seomoz.org/blog/is-optimizing-photos-more-important-than-you-think">See this case study on photo optimization</a></em>). I&#8217;ve seen a huge boost in image search traffic each time I pick a good name for my image. Here&#8217;s what has always worked like a charm:</p>
<blockquote><p>key-phrase.jpg</p></blockquote>
<p><em>Mind that traditionally, search engines read a hyphen in URLs and file names as a &#8220;space&#8221; &#8211; that doesn&#8217;t mean Google won&#8217;t understand an underscore, an actual space or other characters there, but a hyphen is the most natural and straightforward way to go.</em></p>
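<p>As a hedged illustration (the key phrase and path below are made up), compare a descriptive, hyphenated file name with a default camera-style one:</p>
<pre><code>&lt;!-- Descriptive: words separated by hyphens --&gt;
&lt;img src=&quot;/images/red-leather-sofa.jpg&quot; alt=&quot;Red leather sofa&quot; /&gt;

&lt;!-- Uninformative: says nothing about the image --&gt;
&lt;img src=&quot;/images/IMG_0042.jpg&quot; alt=&quot;Red leather sofa&quot; /&gt;</code></pre>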
<hr />
<h2>3. Image ALT Text and Title</h2>
<p>That&#8217;s another very popular question: what&#8217;s the difference between the ALT and the TITLE attributes when it comes to describing an image?</p>
<p><strong>Most basically, here&#8217;s the difference:</strong></p>
<table width="600" border="2" cellpadding="10" align="center">
<tbody>
<tr>
<td colspan="2" bgcolor="#99FFFF"></td>
<td bgcolor="#99FFFF"><strong>Image &#8220;Alt&#8221; Attribute</strong></td>
<td bgcolor="#99FFFF"><strong>Image &#8220;Title&#8221; Attribute</strong></td>
</tr>
<tr>
<td colspan="2"><strong>Official rule of use</strong></td>
<td>Describes an image for search agents</td>
<td>Gives *additional* information on what an image is about <em>(when it&#8217;s required)</em></td>
</tr>
<tr>
<td colspan="2" bgcolor="#00CCCC"><strong>Screen readers (like JAWS or Orca)</strong></td>
<td bgcolor="#00CCCC">&#8220;Read&#8221; it</td>
<td bgcolor="#00CCCC">Ignore it by default (it is mostly considered <a href="http://accessibilitytips.com/2008/04/14/avoiding-redundant-title-attributes/">redundant</a>*)</td>
</tr>
<tr>
<td rowspan="5"><strong>Browsers</strong></td>
<td><strong>Google Chrome</strong></td>
<td>Is displayed when images are disabled</td>
<td rowspan="5">Pops up when you hover over an image</td>
</tr>
<tr>
<td><strong>Firefox</strong></td>
<td>Is displayed when images are disabled</td>
</tr>
<tr>
<td><strong>Safari</strong></td>
<td>Is ignored</td>
</tr>
<tr>
<td><strong>Opera</strong></td>
<td>Is displayed when images are disabled</td>
</tr>
<tr>
<td><strong>IE</strong></td>
<td>Pops up when you hover over an image if no title attribute is present</td>
</tr>
</tbody>
</table>
<p><strong>Conclusions</strong>:</p>
<ul>
<li>(Very important!) Use ALT text to describe the image you are using;</li>
<li>Use title if you need to give additional information: do <strong>NOT</strong> duplicate the alt text in it! (*accessibility rules only advise using TITLE attributes for abbreviations, forms, etc., i.e. where an explanation is really necessary);</li>
<li>(If there are many images on one page) Use <strong>different</strong> alt text throughout the page as it will be displayed as &#8220;text&#8221; in most browsers (when images are disabled) and in email newsletters (when remote content is only loaded on demand):</li>
</ul>
<p><img class="aligncenter" src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/02/image-seo-02.jpg" alt="Image alt text displayed in an email" width="500" height="368" /></p>
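<p>Putting those conclusions together, a sketch of the markup could look like this (file names and text are hypothetical):</p>
<pre><code>&lt;!-- Alt text describes the image; no title needed --&gt;
&lt;img src=&quot;/images/office-team-photo.jpg&quot; alt=&quot;Five team members standing in front of the office&quot; /&gt;

&lt;!-- Title adds extra information instead of repeating the alt text --&gt;
&lt;img src=&quot;/images/traffic-chart.jpg&quot; alt=&quot;Line chart of monthly image search traffic&quot; title=&quot;Data covers January to December&quot; /&gt;</code></pre>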
<hr />
<h2>4. Image Size and Type</h2>
<p>The good old rule has always been to keep your images under 100K. My own rule of thumb: as long as I don&#8217;t sacrifice image quality, <em>I make it the minimum size I can</em>.</p>
<p>Google also <a href="http://code.google.com/speed/page-speed/docs/payload.html#CompressImages">recommends</a>: <strong>&#8220;the less, the better&#8221;</strong>. Here are Google&#8217;s recommendations as to file types and compressors:</p>
<table width="600" border="1" cellpadding="10" align="center">
<tbody>
<tr>
<td bgcolor="#99FFFF"></td>
<td bgcolor="#99FFFF"><strong>Best used for</strong></td>
<td bgcolor="#99FFFF"><strong>Recommended compressor</strong></td>
</tr>
<tr>
<td><strong>JPGs</strong></td>
<td>All photographic-style images</td>
<td><a href="http://jpegclub.org/">jpegtran</a> or <a href="http://freshmeat.net/projects/jpegoptim/">jpegoptim</a></td>
</tr>
<tr>
<td bgcolor="#99FFFF"><strong>PNGs</strong></td>
<td bgcolor="#99FFFF">Logos, banners, etc (where you need transparent background)</td>
<td bgcolor="#99FFFF"><a href="http://optipng.sourceforge.net/">OptiPNG</a> or <a href="http://www.advsys.net/ken/util/pngout.htm">PNGOUT</a></td>
</tr>
<tr>
<td><strong>GIFs</strong></td>
<td>For very small / simple graphics (e.g. less than 10&#215;10 pixels, or a color palette of fewer than 3 colors) &amp; for animated images</td>
<td>N/A</td>
</tr>
<tr>
<td bgcolor="#99FFFF"><strong>BMPs or TIFFs</strong></td>
<td colspan="2" bgcolor="#99FFFF">Don&#8217;t use</td>
</tr>
</tbody>
</table>
<p><strong>More great tools to try for any image file type you are using</strong>:</p>
<ul>
<li><a href="http://wordpress.org/extend/plugins/wp-smushit/">WP Smush.it</a> &#8211; a WordPress plugin that uses the Smush.it API to perform image optimization automatically. It handles the essential image optimization tasks (except for stripping JPEG meta data): optimizing JPEG compression, converting certain GIFs to indexed PNGs and stripping the un-used colors from indexed images.</li>
<li><a href="http://www.irfanview.com/">IrfanView</a> (desktop) is an awesome free tool that helps you crop, (bulk-)optimize and (bulk-)rename images.</li>
</ul>
<hr />
<h2>5. Image Thumbnails in Social Media</h2>
<p>An image thumbnail generated with the snippet when someone shares your post on a Facebook or Google Plus wall is crucial when it comes to click-through and further shares.</p>
<p>While Google Plus is generally very smart at reading your page and finding the best thumbnail to go with the update, Facebook seems to rely only on what you &#8220;point&#8221; it to. Besides, when using Facebook&#8217;s &#8220;Like&#8221; button, your readers have almost no control over the shared snippet, and often the image that gets to your reader&#8217;s Facebook wall is absolutely random.</p>
<p>To ensure your beautiful, relevant and eye-catching images make it to your fans&#8217; Facebook streams and get lots of attention, you have to use the <a href="http://developers.facebook.com/docs/opengraph/">Open Graph Protocol</a> to point Facebook to what needs to be grabbed from your page:</p>
<p><img class="aligncenter" src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/02/image-seo-03.jpg" alt="Open Graph" width="542" height="111" /></p>
<p><a href="http://wordpress.org/extend/plugins/wp-facebook-open-graph-protocol/">This</a> WordPress Plugin makes integrating Open Graph very easy for WordPress bloggers. Also, <a href="http://developers.facebook.com/tools/debug">this tool</a> will help you identify how Facebook &#8220;sees&#8221; your page as well as refresh its cache.</p>
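<p>For reference, a minimal sketch of Open Graph tags placed in the page &lt;head&gt; might look like the following. The property names come from the Open Graph Protocol; the title and URLs are placeholders, not values from this site:</p>
<pre><code>&lt;!-- Placeholder values: replace with your own page data --&gt;
&lt;meta property=&quot;og:title&quot; content=&quot;Your Post Title&quot; /&gt;
&lt;meta property=&quot;og:type&quot; content=&quot;article&quot; /&gt;
&lt;meta property=&quot;og:url&quot; content=&quot;http://www.example.com/your-post/&quot; /&gt;
&lt;!-- The image Facebook should use as the thumbnail --&gt;
&lt;meta property=&quot;og:image&quot; content=&quot;http://www.example.com/images/your-thumbnail.jpg&quot; /&gt;</code></pre>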
<hr />
<h2>Other &#8220;Obvious&#8221; Factors</h2>
<p>The tips and tables above mostly list image-specific factors of making your images search- and social-friendlier. That doesn&#8217;t mean other commonsense practices don&#8217;t matter here:</p>
<ul>
<li>Your images should be surrounded with relevant &#8220;text-based&#8221; content to rank well in image search results;</li>
<li>Your images should be located at powerful pages (in terms of link juice and on-page optimization).</li>
</ul>
<p><em>Have I missed anything? Let&#8217;s help make this an actual &#8220;all-in-one&#8221; guide: add your image SEO tips in the comments!</em></p>
<p><strong>For more useful SEO- and social-media-related content, don&#8217;t forget to <a href="http://twitter.com/NinjasMarketing">follow us on Twitter</a> and <a href="http://www.facebook.com/IMNinjas">join us on Facebook</a>!</strong></p>
]]></content:encoded>
			<wfw:commentRss>http://www.internetmarketingninjas.com/blog/search-engine-optimization/image-seo/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>HOW TO: Make the Most of Your TWEET Button + Must-Have Checklist (WordPress)</title>
		<link>http://www.internetmarketingninjas.com/blog/social-media/how-to-make-the-most-of-your-tweet-button-must-have-checklist-wordpress/</link>
		<comments>http://www.internetmarketingninjas.com/blog/social-media/how-to-make-the-most-of-your-tweet-button-must-have-checklist-wordpress/#comments</comments>
		<pubDate>Thu, 16 Feb 2012 17:46:55 +0000</pubDate>
		<dc:creator>Ann Smarty</dc:creator>
				<category><![CDATA[Social Media]]></category>

		<guid isPermaLink="false">http://www.internetmarketingninjas.com/blog/?p=2479</guid>
		<description><![CDATA[Back in 2010 Twitter partnered with Tweetmeme to create an official Tweet button. It resulted in lots of criticism as... <a href="http://www.internetmarketingninjas.com/blog/social-media/how-to-make-the-most-of-your-tweet-button-must-have-checklist-wordpress/" class="read-more">read on&#160;&#187;</a>]]></description>
			<content:encoded><![CDATA[<p>Back in 2010 Twitter <a href="http://techcrunch.com/2010/08/12/twitter-tweet-button/">partnered</a> with Tweetmeme to create an official Tweet button. It resulted in lots of criticism as the official button lacked some obvious customization options (primarily, it forces the user to use Twitter&#8217;s official shortener. Besides, it has no color customization and is somewhat slow).</p>
<p>However the fact that Twitter partnered with the major player in that field to launch the official product has left us with almost no choice. Besides, the Tweet button does have quite a few great features (my favorite one: the ability to recommend your personal as well as your business accounts for people to follow).</p>
<p>Besides, what many people are likely to be unaware of: the Tweet button can be customized with the help of some of the hacks (listed below)</p>
<hr />
<h2>1. Use Your Own URL Shortener</h2>
<p><a href="http://wordpress.org/extend/plugins/twitter-friendly-links/">Twitter Friendly Links</a> is a great and absolutely easy-to-use WordPress plugin that lets you quickly create your own domain shortener. Just have it installed and activated, go to the plugin settings and configure the following:</p>
<ul>
<li><strong>Set the base URL</strong>: this option is *gold* if you have www in your default URLs. Making the base URL www-free will make your shortened URLs even shorter!</li>
<li><strong>Set the Redirection type</strong>. As a link builder at heart, I use a 301 redirect: someone may quote my tweet, or copy-paste the URL directly from the tweet and use it in a post, and I don&#8217;t want that link to be wasted.</li>
</ul>
<p><img class="aligncenter" src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/01/most-tweet-button-01.jpg" alt="Twitter Friendly Links" width="550" height="318" /></p>
<p>Now, to use these pretty branded URLs with your Twitter button you&#8217;ll need this plugin: <a href="http://wordpress.org/extend/plugins/wp-tweet-button/">WP Tweet Button</a>. It has plenty of cool features, and the rest of the post is dedicated to them, but for now the most important one is its support for Twitter Friendly Links. All you need to do is select &#8220;Twitter-Friendly Links Plugin&#8221; from the drop-down on its Settings page:</p>
<p><img class="aligncenter" src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/01/most-tweet-button-02.jpg" alt="Twitter Friendly Links" width="384" height="260" /></p>
<p>Now you are done: Readers are able to tweet your articles using the official Tweet button while using your pretty self-hosted URL shortener.</p>
<p><img class="aligncenter" src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/01/most-tweet-button-03.jpg" alt="Twitter Friendly Links" width="486" height="136" /></p>
<p><strong>Another option</strong>: <a href="http://wordpress.org/extend/plugins/yourls-wordpress-to-twitter/">YOURLS</a> which is also fun but somewhat hard to set up.</p>
<hr />
<h2>2. Promote Your Guest Authors and Contributors</h2>
<p><a href="http://wordpress.org/extend/plugins/wp-tweet-button/">WP Tweet Button</a> has another valuable option: you can set up your &#8220;default&#8221; Twitter username from the admin page, but it will be overridden by the Twitter name the author specifies in his profile.</p>
<p><img class="aligncenter" src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/01/most-tweet-button-05.jpg" alt="WP Tweet Button" width="483" height="95" /></p>
<p>What does it mean?</p>
<p>The post author&#8217;s Twitter name will be included in the Tweet itself. Besides, the author&#8217;s Twitter username will be recommended for following along with your default username:</p>
<p><img class="aligncenter" src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/01/most-tweet-button-04.jpg" alt="WP Tweet Button" width="547" height="420" /></p>
<p>Another great thing about the plugin is that it lets you specify the Tweet text as well as more accounts to &#8220;recommend&#8221; right on the &#8220;Edit post&#8221; level. This means you can get as creative as you want: add the author&#8217;s Twitter account to recommend, his custom hashtag, his name in the Tweet, etc:</p>
<p><img class="aligncenter" src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/01/most-tweet-button-06.jpg" alt="Custom Tweet text" width="283" height="440" /></p>
<p>Here you go!</p>
<p><img class="aligncenter" src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/01/most-tweet-button-07.jpg" alt="Custom Tweet text" width="530" height="196" /></p>
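<p>The plugin generates the button markup for you, so none of this has to be written by hand. Still, assuming the standard Tweet button data attributes, the result is roughly equivalent to the sketch below (the usernames, URL, and text are hypothetical):</p>
<pre><code>&lt;!-- data-via: account mentioned in the Tweet; data-related: accounts recommended afterwards --&gt;
&lt;a href=&quot;https://twitter.com/share&quot; class=&quot;twitter-share-button&quot;
   data-url=&quot;http://example.com/abc12&quot;
   data-text=&quot;Custom Tweet text for this post&quot;
   data-via=&quot;post_author&quot;
   data-related=&quot;post_author,blog_account&quot;&gt;Tweet&lt;/a&gt;
&lt;script src=&quot;https://platform.twitter.com/widgets.js&quot;&gt;&lt;/script&gt;</code></pre>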
<hr />
<h2>3. *Smartly* Auto-Tweet Your Articles</h2>
<p>Whether to auto-tweet your blog articles or not is up to you. I find it a great option for people on the go: blogging while traveling, scheduling articles for later, etc &#8211; in these cases, the ability to auto-tweet your posts that go live while you are away or busy is a life-saver.</p>
<p>Now, traditionally Twitterfeed is used to auto-tweet, but WP Tweet Button is so much better!</p>
<p><img class="aligncenter" src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/01/most-tweet-button-08.jpg" alt="*Smartly* Auto-Tweet" width="465" height="215" /></p>
<ul>
<li>Auto-tweet your own articles using your self-hosted URL shortener;</li>
<li>Auto-tweet your articles each time you update them.</li>
</ul>
<p>The smart plugin even removes the default Twitter username so that you don&#8217;t reference yourself in your own Tweets.</p>
<p><img class="aligncenter" src="http://www.internetmarketingninjas.com/blog/wp-content/uploads/2012/01/most-tweet-button-09.jpg" alt="Auto-tweet articles" width="505" height="216" /></p>
<hr />
<h2>So The Check-List Now!</h2>
<p>If you choose to try the tips above, here&#8217;s an actionable and easy-to-follow to-do list for you:</p>
<table border="1" cellpadding="10">
<tbody>
<tr>
<td><strong>Tool</strong></td>
<td><strong>Settings</strong></td>
</tr>
<tr>
<td rowspan="3" bgcolor="#66FFCC"><strong><a href="http://wordpress.org/extend/plugins/twitter-friendly-links/">Twitter Friendly Links</a></strong></td>
<td>Install and activate</td>
</tr>
<tr>
<td>Set the base URL (www-free)</td>
</tr>
<tr>
<td>Set the Redirection type (301)</td>
</tr>
<tr>
<td rowspan="5" bgcolor="#00CC33"><strong><a href="http://wordpress.org/extend/plugins/wp-tweet-button/">WP Tweet Button</a></strong></td>
<td>Install and activate</td>
</tr>
<tr>
<td>Select &#8220;Twitter-Friendly Links Plugin&#8221; as your URL shortener</td>
</tr>
<tr>
<td>Set the default Twitter username to reference and recommend</td>
</tr>
<tr>
<td>Ask all your contributors to update their profiles to add their Twitter usernames (to reference them in the Tweets)</td>
</tr>
<tr>
<td>Authorize the plugin to auto-tweet your articles when you publish and/or update them</td>
</tr>
</tbody>
</table>
<p><strong><em>Don&#8217;t forget to follow our official <a href="https://twitter.com/#!/NinjasMarketing">Ninja Twitter account</a> to have articles like this (and better) delivered right to your Twitter home page!</em></strong></p>
]]></content:encoded>
			<wfw:commentRss>http://www.internetmarketingninjas.com/blog/social-media/how-to-make-the-most-of-your-tweet-button-must-have-checklist-wordpress/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A Love Letter to the Site: Command</title>
		<link>http://www.internetmarketingninjas.com/blog/search-engine-optimization/the-site-command/</link>
		<comments>http://www.internetmarketingninjas.com/blog/search-engine-optimization/the-site-command/#comments</comments>
		<pubDate>Wed, 15 Feb 2012 21:44:12 +0000</pubDate>
		<dc:creator>Ninja Jen Van Iderstyne</dc:creator>
				<category><![CDATA[Search Engine Optimization]]></category>

		<guid isPermaLink="false">http://www.internetmarketingninjas.com/blog/?p=2705</guid>
		<description><![CDATA[We have a lot of tools here at Internet Marketing Ninjas. I mean A LOT. We have tools that crawl... <a href="http://www.internetmarketingninjas.com/blog/search-engine-optimization/the-site-command/" class="read-more">read on&#160;&#187;</a>]]></description>
			<content:encoded><![CDATA[<p>We have a lot of tools here at Internet Marketing Ninjas. I mean A LOT. We have tools that crawl sites, compare competitors and create charts and tables full of useful data. I’m such an isolated geek – I actually called someone a “tool” the other day, forgetting that, in some circles, that’s actually an insult. But honestly, the tools our reporting team gets to work with are so awesome and comprehensive that occasionally it makes me a little nostalgic. It’s kind of the way I imagine some people still fondly remember rotary phones when looking at an iPhone. No, the old technology wasn’t better. But we didn’t know of better technology at the time and the old tools remind us of when progress was younger and we made the best of what we had.</p>
<p>That’s how I feel about a site: command.</p>
<p>Whenever I’m taking an initial look at a site, I run a few tools, absolutely, but either by habit or by instinct, one of the first things I still do is run a site: command in Google. It’s a simple string:</p>
<pre>site:yoursite.com</pre>
<p>And maybe I’m a relic, but when I started doing this work, I built links using Internet Explorer and Notepad. I used the linkfromdomain: command in MSN to find out where a site was linking out to (remember that?!). I looked at backlinks manually (which I still like to do) and I used the site: command every day. So maybe I have issues letting go, but I still get a lot of insight from that old trick.</p>
<hr />
<h2>Pages Indexed in Google</h2>
<p>Obviously, a site: command tells me how many pages of a site Google has indexed. Most of the time this is background information that isn’t exactly earth-shattering. But sometimes you’ll find Google has indexed more pages than a client thinks they have on their site. That’s a bad sign. That could mean duplicate URLs or indexed search results. Of course the opposite can also happen when, of 10,000 pages, Google has only indexed 3,000. That means something, too. The pages-indexed figure doesn’t mean much when it’s right, but if it’s not what you anticipated, it instantly reveals that something is probably wrong.</p>
<h3>Title Tags</h3>
<p>Next up for scrutiny are title tags. How are they laid out? Are they the same on every page? Do they balance keywords and branding? A site: command tells you all of that, which also tells you if someone who knows SEO has worked on these before. Title tags are your most powerful on-page tool, so using those well is one of the most important SEO decisions a webmaster makes. By looking at them, laid out in a site: command, we can tell if those are locked and loaded, or if they are a starting point for better optimization.</p>
<h3>Meta Descriptions</h3>
<p>Right below the title tags are the snippets, which are often derived from meta descriptions. Even though meta descriptions aren’t exactly a ranking factor in terms of keywords, they are important because they convert SERP impressions into clicks. But sometimes they get mistreated. They may be keyword stuffed for no good reason or boiler-plated across a whole site. Sometimes they’re missing altogether. Depending on what else we find during an analysis, we may find that rewriting meta descriptions falls low on the SEO priority list. However, for the purposes of usability or conversions, we may want to get some revisions going ASAP.</p>
<h3>URL Structures</h3>
<p>You can also tell a lot about structure from a site: command. I look for things like subdomains, dynamic URLs and secure pages. You can also see tiny little things like underscores instead of hyphens or mixed-case URLs. In some instances you can’t see the entire URL because you see breadcrumb links instead, which also helps you get a sense of what a site is doing with its architecture.</p>
<h3>Google+ and Author</h3>
<p>If a site is socially active, particularly in Google+, you can see that in a site: command now, too. This is a bit more of a recent development as is seeing an author’s picture show up next to URLs. Ok, so this may not be strategy-altering information. But it’s pretty cool. If someone has a thriving Google+ presence and a trusted author set up, we’re already off to a REALLY good start.</p>
<hr />
<h2>The Unexpected</h2>
<p>I think this last category in a way encompasses every other one as well — I’m looking for the unexpected. Something that makes me go “Huh.” Maybe it’s the top pages that come up, or assets we can use in link building. Maybe it’s just a URL with no title or meta description. It could be anything. Ultimately the question you want to resolve is whether or not there is anything wildly out of the ordinary. Most of the time, nothing is going to jump out and slap you in the face with its incongruity. But when it does, it’s like wham, bam, thank you, site: command!</p>
<p>OK, so this may be a day late and a dozen roses short of a valentine, but it’s never too late for sentiment. And I know there are a lot of far superior tools for gaining insights, don’t worry, I use those too. A lot of them are based on recommendations from Ninjas Ann Smarty and Bonnie Stefanick, who are both tool gurus. Yet there’s still merit in the classics. A lot of the bigger, better, bad asser tools actually include site: commands in their complex series of functions. But whether your research methods are tried and true or new and shiny, if you don’t know what to make of the data in front of you, you may as well be empty-handed. It’s not so much how you gather your information, it’s how you interpret it and act on it that matters.</p>
<p>So what’s your favorite “old school” SEO method that you proudly (or with a little shame) still use today?</p>
]]></content:encoded>
			<wfw:commentRss>http://www.internetmarketingninjas.com/blog/search-engine-optimization/the-site-command/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>The Ultimate Guide to Blocking Your Content in Search</title>
		<link>http://www.internetmarketingninjas.com/blog/tips-and-tricks/the-ultimate-guide-to-blocking-your-content-in-search/</link>
		<comments>http://www.internetmarketingninjas.com/blog/tips-and-tricks/the-ultimate-guide-to-blocking-your-content-in-search/#comments</comments>
		<pubDate>Tue, 14 Feb 2012 18:18:30 +0000</pubDate>
		<dc:creator>Rick DeJarnette</dc:creator>
				<category><![CDATA[Tips and Tricks]]></category>

		<guid isPermaLink="false">http://www.internetmarketingninjas.com/blog/?p=2690</guid>
		<description><![CDATA[We all work so hard to make sure all of our content is crawled and indexed by the search engines.... <a href="http://www.internetmarketingninjas.com/blog/tips-and-tricks/the-ultimate-guide-to-blocking-your-content-in-search/" class="read-more">read on&#160;&#187;</a>]]></description>
			<content:encoded><![CDATA[<p>We all work so hard to make sure all of our content is crawled and indexed by the search engines. So it’s ironic when sometimes we must also struggle to remove or prevent some otherwise private content from getting into the indexes.</p>
<p>The process of blocking content from search can be frustrating, removal can be slow, and the whole experience exasperating – especially if you don’t know what options you have. Let’s talk about the various options you have for both removing content from the search indexes and preventing it from being indexed in the first place.</p>
<hr />
<h2>Find all of the affected URLs</h2>
<p>Before you leap into the URL removal process, look to see which URLs point to the content you want removed. Think in terms of reverse canonicalization. If the content is older, it might be indexed under multiple URLs, such as:</p>
<ul>
<li>xyz.com/mystuff</li>
<li>xyz.com/mystuff/</li>
<li>www.xyz.com/mystuff</li>
<li>www.xyz.com/mystuff/</li>
<li>www.xyz.com/mystuff/Index.htm</li>
<li>www.xyz.com/mystuff/index.htm</li>
</ul>
<p>and many other variations. Identify all of the URLs pointing to the content you want removed so you are ready to remove all references to it. For more information on canonicalization concepts, see this <a href="http://searchengineland.com/why-canonicalization-matters-from-a-linking-perspective-91227">helpful post on canonicalization</a>.</p>
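<p>If you’d rather not type out every permutation by hand, a short script can list the usual suspects for you. Here’s a minimal Python sketch of that idea. The function name url_variants and the default index file names are my own illustrative assumptions, so adjust them to match how your server actually responds:</p>

```python
from itertools import product

def url_variants(host, path, index_names=("Index.htm", "index.htm")):
    """Return likely duplicate URLs for one piece of content.

    host: bare domain without the "www." prefix, such as "xyz.com"
    path: directory path without a trailing slash, such as "/mystuff"
    index_names: hypothetical default documents the server might expose
    """
    hosts = [host, "www." + host]
    suffixes = ["", "/"] + ["/" + name for name in index_names]
    # Combine every host form with every path ending.
    return [h + path + s for h, s in product(hosts, suffixes)]

for variant in url_variants("xyz.com", "/mystuff"):
    print(variant)
```

<p>The list it prints is only a starting point. Check each candidate against what the search engines have actually indexed before you request any removals.</p>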
<hr />
<h2>Remove indexed content from search</h2>
<p>There are several ways to tell the search engines the content is no longer available. Let’s jump right in.</p>
<h3>Remove it from the web server</h3>
<p>The easiest way to remove content from the search indexes is simply to remove it from your site. When a search crawler comes back to your site to check the status of your published content, its request for the removed content will return an HTTP 404 status, which tells the crawler the file can’t be found. That result kicks off the automatic (albeit slow) process of removing the URL from the index.</p>
<h3>Set the web server to return a 404 (or 410) for the URL</h3>
<p>If you must leave the content on the server, you can configure the web server to still return either the 404 &#8220;Not Found&#8221; or 410 &#8220;Gone&#8221; status for the given URL. The process of configuring a specific, non-default HTTP status for a URL on your web server depends upon the platform used. See your web server documentation for details. Note that this technique won’t work for non-HTML content, such as PDFs and Microsoft Word DOC files.</p>
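<p>To make the idea concrete, here’s a minimal sketch using Python’s built-in http.server module. It is not how you would configure a production web server; it only shows the logic of answering selected URLs with a 410 while everything else gets a normal response. The GONE_PATHS set and the names status_for, GoneHandler, and serve are hypothetical:</p>

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical list of retired URL paths; replace with your own.
GONE_PATHS = {"/mystuff/", "/mystuff/index.htm"}

def status_for(path):
    """Return 410 for a retired path and 200 for anything else."""
    return 410 if path in GONE_PATHS else 200

class GoneHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path is the requested path, including any query string.
        status = status_for(self.path)
        body = b"Gone" if status == 410 else b"OK"
        self.send_response(status)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

def serve(port=8000):
    """Start the demo server on localhost; call this yourself to try it."""
    HTTPServer(("127.0.0.1", port), GoneHandler).serve_forever()
```

<p>Calling serve() and requesting /mystuff/ from a browser on the same machine should show the 410 response, which is the signal a crawler would receive.</p>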
<h3>Permanently redirect a URL</h3>
<p>Assigning a 301 (aka permanent) redirect to a URL tells the search crawler that the requested URL is no longer available and has been permanently replaced by a substitute (the URL receiving the redirect traffic).</p>
<p>All of the above methods take time to produce results. Each depends on the search crawler returning to the site and requesting the affected URL to receive the actionable HTTP status code, and then on the search engine algorithm eventually purging the content. If the issue is an emergency, such as when proprietary business or confidential personal information is accidentally exposed, you need immediate action to get that content purged. Here’s how to do that:</p>
<h3>Use the search engines’ webmaster tools to remove specific pages</h3>
<p>Both Google and Bing offer tools for requesting the immediate removal of indexed content. Before you can access them, you must be a registered user of <a href="http://www.google.com/webmasters/">Google Webmaster Tools</a> and <a href="http://www.bing.com/toolbox/webmaster/">Bing Webmaster Center Tools</a> (this alone is reason enough to register your site now before an urgent problem arises).</p>
<ul>
<li><strong>Google:</strong></li>
</ul>
<blockquote>
<ol>
<li>Log in to <a href="http://www.google.com/webmasters/">Google Webmaster Tools</a> and click <strong>Site configuration</strong> &gt; <strong>Crawler access</strong> &gt; <strong>URL removals</strong> tab.</li>
<li>Click <strong>Create a new removal request</strong>, type or paste the URL to be removed, and then click <strong>Continue</strong>. Remember that URLs are case sensitive, so I recommend copying and pasting the URL to be removed.</li>
<li>From the dropdown list, select the type of data removal you want (cache only, cache and SERP, or entire directory), and then click <strong>Submit Request</strong>. Your request will appear as a listing in the tool, where you can monitor the status of the request.</li>
</ol>
</blockquote>
<ul>
<li><strong>Bing (which includes organic SERPs in Yahoo!):</strong></li>
</ul>
<blockquote>
<ol>
<li>Log in to <a href="http://www.bing.com/toolbox/webmaster/">Bing Webmaster Center Tools</a> and click the <strong>Index</strong> tab &gt; <strong>Block URLs</strong>.</li>
<li>Select the type of data removal you want (click either <strong>Block URL and Cache</strong> or <strong>Block Cache</strong>).</li>
<li>Select what to block (page only, directory, or entire site).</li>
<li>Copy and paste the URL to be removed, click <strong>Next</strong>, click <strong>Confirm</strong>, and then click <strong>Finish</strong>.</li>
</ol>
</blockquote>
<p>Note that the search engine-provided URL removal tools are typically intended for urgently needed data removals. In addition to the above techniques, there are other ways to remove content that also proactively prevent it from being indexed in the first place. Let’s explore those.</p>
<hr />
<h2>Block URLs to prevent duplicate content in the index</h2>
<p>The most commonly used method of managing search crawler access to your site’s content is to use Robots Exclusion Protocol (REP) directives. This can be achieved through several methodologies:</p>
<h3>Use a robots.txt file on the site</h3>
<p>The robots.txt file is a plain text file containing crawling exclusion directives aimed at one or more REP-compliant crawlers (or most commonly, generic directives applicable to all REP-compliant crawlers). When the file is uploaded to the domain (or subdomain) root of a website, it will automatically be read by REP-compliant crawlers before any URLs are fetched (all major search engine crawlers are REP-compliant). If a targeted URL is blocked by a robots.txt directive, the URL is not fetched.</p>
<p>The robots.txt file (note that, by protocol, this file name always uses lower-case letters) enables webmasters to block crawlers from accessing one or more particular files in a directory, whole directories, or the entire site. (<strong>Note:</strong> Per Google, this is the only approved method for removing entire directories from their index.) It also supports wildcard characters to make it extremely versatile.</p>
<p>The most common robots.txt instruction targets all crawlers (referred to as “user-agents” in REP). It’s followed by a specific directive, such as blocking access to a file, directory, or the site. Sample robots.txt directive code for generic user-agents looks like this:</p>
<pre>User-agent: *
Disallow: /private.htm
Disallow: /offlimits/</pre>
<p>You can also use Allow directives to allow crawlers access to a specific file within an otherwise blocked directory, such as in the following example:</p>
<pre>Allow: /offlimits/index-me.htm
Disallow: /offlimits/</pre>
<p><strong>Note:</strong> <em>The Allow directive takes precedence in any logic conflicts between Allow and Disallow directives</em>, so be careful. It’s an SEO best practice to isolate allowed and disallowed files on a per directory basis to eliminate confusion.</p>
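<p>If you want a quick sanity check of directives like these before uploading the file, Python ships with a basic REP parser in urllib.robotparser. The sketch below feeds it the sample rules from above; the xyz.com host and the build_parser helper are just placeholders. One caveat worth flagging: this parser applies the first matching rule in file order and does not understand wildcards, so treat it as a rough check and not as a stand-in for how any particular search engine evaluates the file.</p>

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Allow: /offlimits/index-me.htm
Disallow: /private.htm
Disallow: /offlimits/
"""

def build_parser(robots_txt):
    """Parse robots.txt text that is already in memory (no network fetch)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)
for path in ("/private.htm", "/offlimits/secret.htm",
             "/offlimits/index-me.htm", "/public.htm"):
    allowed = parser.can_fetch("*", "http://www.xyz.com" + path)
    print(path, "allowed" if allowed else "blocked")
```

<p>Because the Allow line is listed before the Disallow line for the same directory, the one exempted file stays reachable while the rest of /offlimits/ is blocked.</p>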
<h4>Wildcards in directives</h4>
<p>The “*” wildcard matches any sequence of characters in a URL up to the point where it is used, meaning that the following directive,</p>
<pre>Disallow: *cars</pre>
<p>would block crawler access to a variety of content such as:</p>
<ul>
<li>/redcars.htm</li>
<li>/roadsters/blue-cars.htm</li>
<li>/cars/black-roadster.htm</li>
<li>/2012/cars/bmw</li>
</ul>
<p>and so on. Note that asterisks are not needed as a wildcard suffix, as the directive, by default, applies to any child content underneath the listed location in robots.txt.</p>
<p>The “$” character is used to filter by file name extension, such as in the following sample:</p>
<pre>Disallow: *.pdf$</pre>
<p>The sample code blocks crawlers from accessing all URLs that end with the file name extension “.pdf”. By comparison, omitting the “$” character would also block any URL path that merely contains the string “.pdf”, such as /docs.pdf/newcars.htm.</p>
<p>Wildcards can create very powerful, wide-reaching directives. However, wildcard usage in robots.txt often contains logical coding errors, which can result in unintended crawler behavior. It is extremely common for search engines to receive complaints about wildly incomplete site crawls when in fact a misconfigured robots.txt file is actually to blame, and the crawlers were simply abiding by the directives listed.</p>
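<p>One way to catch these mistakes early is to test a wildcard rule against a handful of real paths before it goes live. The following Python sketch models the “*” and “$” behavior described above by translating a rule into a regular expression. It is a simplified model under my own assumptions, not the matching code any search engine actually runs, and the names rule_to_regex and is_blocked are hypothetical:</p>

```python
import re

def rule_to_regex(rule):
    """Translate a robots.txt path rule into a compiled regular expression."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    # Escape each literal piece, then rejoin the pieces with "any characters".
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile(body + ("$" if anchored else ""))

def is_blocked(rule, path):
    """Return True when the rule matches the path from its first character."""
    return rule_to_regex(rule).match(path) is not None

for rule, path in [
    ("*cars", "/roadsters/blue-cars.htm"),
    ("*.pdf$", "/docs/brochure.pdf"),
    ("*.pdf$", "/docs.pdf/newcars.htm"),
    ("*.pdf", "/docs.pdf/newcars.htm"),
]:
    print(rule, path, is_blocked(rule, path))
```

<p>Running your own rules and a sample of your site’s URLs through a check like this makes it much easier to spot a directive that blocks more (or less) than you intended.</p>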
<h4>Don’t attempt to hide confidential content with robots.txt</h4>
<p>Some webmasters, in their effort to block search crawlers from accessing their business confidential files and directories, mistakenly list them in robots.txt. What they fail to realize is that the robots.txt file is always in the same location on a site, and is always available to be read, including by people. For example, let&#8217;s say you had a robots.txt file that contained the following code:</p>
<pre>User-agent: *
Disallow: /private/
Disallow: /client-list.php
Disallow: /secrets/</pre>
<p>You can rest assured that your competitors, the ones who know web technologies, are snooping around your site and will see these references. They will then attempt to browse to the listed files and directories to see what’s there, such as a client list or a business expansion plan. Listing such content in robots.txt is effectively advertising where you keep your confidential documents!</p>
<p>To block the snoopers from probing the depths of your website for business intelligence, you can protect the directory by restricting access to authenticated usernames with passwords. If the site structure or functionality prevents you from doing that, make sure you at least have an index page in the directory so the browser doesn’t return a directory listing showing all of the files up for grabs. You may even try renaming the directory to be blocked in robots.txt to a more innocuous name or burying it in a deep subdirectory (but first be sure any such change in the URL path won’t break any functionality within your site!).</p>
<h3>Use a &lt;meta&gt; robots tag on the page</h3>
<p>The &lt;meta&gt; tag (or “element” for you HTML grammarian purists) can be used with REP directives. These directives apply only to the page on which they appear. The following sample code demonstrates a common usage, in which the crawler is disallowed from both indexing the content and following any of the links on the page:</p>
<pre>&lt;meta name="robots" content="noindex, nofollow"&gt;</pre>
<p>Note the name attribute uses the generic value “robots”, which is applicable to all REP-compliant crawlers. You can alternatively choose to specify the exact name of a user agent as well, such as googlebot or bingbot. If you do choose to specify individual user agents, be sure the name is exactly right, or the directive may be ignored by the targeted crawler. Any crawler not identified by a specific or a generic &lt;meta&gt; robots directive will default to crawling the page for purposes of potentially indexing its content and following its links.</p>
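<p>If you need to audit which directives a page actually carries, you can read them back out of the markup. Here’s a minimal Python sketch using the standard html.parser module. The class name RobotsMetaParser and the helper directives_for are hypothetical, and combining the generic “robots” values with a crawler-specific tag is my simplifying assumption about how the two interact:</p>

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the content values of named meta tags, keyed by tag name."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        crawler = attributes.get("name", "").strip().lower()
        content = attributes.get("content", "")
        if crawler and content:
            # Values are comma-separated and not case-sensitive.
            values = {v.strip().lower() for v in content.split(",") if v.strip()}
            self.directives.setdefault(crawler, set()).update(values)

def directives_for(html, crawler="robots"):
    """Return the generic robots values plus any for the named crawler."""
    parser = RobotsMetaParser()
    parser.feed(html)
    generic = parser.directives.get("robots", set())
    return generic | parser.directives.get(crawler.lower(), set())
```

<p>For a page containing the sample tag above, directives_for(page_html) returns a set holding noindex and nofollow. Passing a crawler name such as googlebot adds whatever that crawler’s own &lt;meta&gt; tag specifies.</p>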
<p>The following values for the content attribute can be used in the &lt;meta&gt; robots tag:</p>
<table border="1" cellpadding="0">
<tbody>
<tr>
<td valign="top" width="159"><strong>Value</strong></td>
<td valign="top" width="489"><strong>Function</strong></td>
<td valign="top" width="134"><strong>Supported by</strong></td>
</tr>
<tr>
<td valign="top" width="159">noindex</td>
<td valign="top" width="489">Prevents the bot from indexing the contents of the page, but links on the page can be followed.</td>
<td valign="top" width="134">Bing, Google</td>
</tr>
<tr>
<td valign="top" width="159">nofollow</td>
<td valign="top" width="489">Prevents the bot from following the links on the page, but the page can be indexed.</td>
<td valign="top" width="134">Bing, Google</td>
</tr>
<tr>
<td valign="top" width="159">none</td>
<td valign="top" width="489">Equivalent to “noindex, nofollow”</td>
<td valign="top" width="134">Google</td>
</tr>
<tr>
<td valign="top" width="159">nosnippet</td>
<td valign="top" width="489">Prevents the display of the descriptive snippet text for that page in the SERPs.</td>
<td valign="top" width="134">Bing, Google</td>
</tr>
<tr>
<td valign="top" width="159">noarchive</td>
<td valign="top" width="489">Prevents the display of a cache link for that page in the SERP.</td>
<td valign="top" width="134">Bing, Google</td>
</tr>
<tr>
<td valign="top" width="159">nocache</td>
<td valign="top" width="489">Same as noarchive.</td>
<td valign="top" width="134">Bing</td>
</tr>
<tr>
<td valign="top" width="159">noodp</td>
<td valign="top" width="489">Instructs the bot to not use a title and snippet from the <a href="http://dmoz.org/">Open Directory Project (ODP)</a> for that page in the SERP.</td>
<td valign="top" width="134">Bing, Google</td>
</tr>
<tr>
<td valign="top" width="159">notranslate</td>
<td valign="top" width="489">Prevents translation of the page in the SERP.</td>
<td valign="top" width="134">Google</td>
</tr>
<tr>
<td valign="top" width="159">noimageindex</td>
<td valign="top" width="489">Prevents indexing of images on the page.</td>
<td valign="top" width="134">Google</td>
</tr>
<tr>
<td valign="top" width="159">unavailable_after: [date/time]</td>
<td valign="top" width="489">Prevents the page from showing in the SERPs after the specified date/time. The date/time data must be in <a href="http://www.ietf.org/rfc/rfc0850.txt">RFC 850 format</a>.</td>
<td valign="top" width="134">Google</td>
</tr>
</tbody>
</table>
<p><strong>Note: </strong>The attribute and value data fields are not case-sensitive.</p>
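<p>The date format for unavailable_after is the fiddly part. Assuming the RFC 850 layout referenced in the table (weekday name, DD-Mon-YY, HH:MM:SS, time zone), here’s a small Python sketch that builds the content value from a date. The function names rfc850 and unavailable_after are hypothetical, and the sketch assumes the default English month and day names:</p>

```python
from datetime import datetime, timezone

def rfc850(moment):
    """Format a timezone-aware datetime as an RFC 850 date string in GMT."""
    as_gmt = moment.astimezone(timezone.utc)
    return as_gmt.strftime("%A, %d-%b-%y %H:%M:%S GMT")

def unavailable_after(moment):
    """Build the content value for a meta robots unavailable_after directive."""
    return "unavailable_after: " + rfc850(moment)

expiry = datetime(2012, 12, 31, 23, 59, 59, tzinfo=timezone.utc)
print(unavailable_after(expiry))
```

<p>Paste the printed value into the content attribute of the &lt;meta&gt; robots tag, and double-check the result against the search engine’s current documentation before relying on it.</p>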
<h3>Use the HTTP header X-Robots-Tag on the web server</h3>
<p>For non-HTML-based content, such as TXT, DOC, and PDF documents, there is no way to apply REP directives via &lt;meta&gt; robots tags to them. Assuming you don’t use robots.txt for this, you can instead set REP directives for individual URLs using the HTTP header X-Robots-Tag. This header uses the same content values as shown in the table above for the &lt;meta&gt; robots tag. The following is an example of a commonly used X-Robots-Tag header that applies to all REP-compliant crawlers:</p>
<pre>X-Robots-Tag: noindex, nofollow</pre>
<p>You can optionally identify a specific crawler for a directive and pair that with a separate directive for another crawler (a header that names no crawler, like the previous sample, applies to all REP-compliant crawlers), as shown in the following sample:</p>
<pre>X-Robots-Tag: googlebot: noindex, nofollow
X-Robots-Tag: otherbot: noindex</pre>
<p>The process for implementing custom HTTP headers is dependent upon the web server platform used. Review your web server documentation for details.</p>
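<p>When you’re verifying that the header came through as intended, it helps to split each value into the crawler it targets and the directives it carries. The following Python sketch does that for a single header value. The function name parse_x_robots_tag is hypothetical, and the rule for telling a crawler name apart from the unavailable_after directive is my own simplification:</p>

```python
def parse_x_robots_tag(value):
    """Split one X-Robots-Tag value into (crawler, directives).

    crawler is None when the value applies to all crawlers.
    """
    crawler = None
    head, separator, tail = value.partition(":")
    # A leading "name:" is treated as a crawler name unless it is the
    # unavailable_after directive, which also uses a colon.
    is_directive = head.strip().lower() == "unavailable_after"
    if separator and "," not in head and not is_directive:
        crawler, value = head.strip().lower(), tail
    directives = [d.strip().lower() for d in value.split(",") if d.strip()]
    return crawler, directives

for header_value in ("noindex, nofollow", "googlebot: noindex, nofollow"):
    print(parse_x_robots_tag(header_value))
```

<p>A value with no crawler name comes back with None in the first position, which is a quick way to confirm a directive really is generic.</p>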
<h3>REP methodology precedence</h3>
<p>Generally speaking, it’s best to only use one REP method of controlling crawler access for your website. Redundant methods typically result in logic conflicts, crawler access problems, and indexing shortfalls, which can be difficult to resolve.</p>
<p>Note that from the search engine perspective, robots.txt blocking directives take precedence. This is because before a page on a site is accessed, the crawler first checks for the presence of a robots.txt file to see if access is blocked. If so, the page is not fetched. However, to read the directives in either &lt;meta&gt; robots tags or the HTTP header X-Robots-Tag, the page has to first be fetched. If blocking directives are found there, only then is the page discarded. As a result, the URL of the page may get indexed, but no content from that page with blocking directives will be included in the index.</p>
<p>There is one caveat to the precedence of robots.txt directives: when a crawler is specifically given access in robots.txt with an Allow directive but then encounters a blocking directive in either &lt;meta&gt; robots or X-Robots-Tag, the blocking directive overrides the Allow directive.</p>
<p>Lastly, REP directives do more than identify what content is off-limits to crawlers. If a new REP directive appears that blocks content that has already been indexed, that content is purged from the index. For more information on the robots.txt protocol, see <a href="http://www.robotstxt.org/">www.robotstxt.org</a>.</p>
<hr />
<h2>Require authentication</h2>
<p>Another method of preventing the search crawler from accessing content is to require authentication for access. If a password is required on a site, the search crawler will not be able to access its content. Note that using Secure HTTP (https) by itself (without requiring authentication) will not block the crawler. This is a common misunderstanding and is one of the reasons why so many duplicate pages are indexed by search.</p>
<hr />
<h2>Password protect a directory on the web server</h2>
<p>Alternatively, instead of requiring authentication to use the site, a webmaster can put content in a password-protected directory on the server to prevent crawler access. This method can be used for web server administrator-related content.</p>
<hr />
<h2>Block dynamic URL parameters</h2>
<p>For sites that use dynamic URL parameters to track referrer data to their pages, content duplication can become a significant problem. So to prevent URLs using specific URL parameters from being indexed, and thus avoid content duplication, you can tell the search engines via their webmaster tools to ignore specified URL parameters when indexing. Here’s how:</p>
<ul>
<li><strong>In Google:</strong></li>
</ul>
<blockquote>
<ol>
<li>Log in to <a href="http://www.google.com/webmasters/">Google Webmaster Tools</a> and click <strong>Site configuration</strong> &gt; <strong>URL parameters.</strong></li>
<li>Click <strong>Configure URL parameters</strong>, and then click <strong>Add parameter</strong>.</li>
<li>Type the parameter name, select whether the parameter changes what the user sees in the page, and then click <strong>Save</strong>.</li>
</ol>
</blockquote>
<ul>
<li><strong>In Bing (which also covers organic content found in Yahoo!):</strong></li>
</ul>
<blockquote>
<ol>
<li>Log in to <a href="http://www.bing.com/toolbox/webmaster/">Bing Webmaster Center Tools</a> and click <strong>Index</strong> tab &gt; <strong>URL Normalization</strong>.</li>
<li>Click <strong>Add Parameter</strong>, type the parameter name, and then click <strong>Submit</strong>.</li>
</ol>
</blockquote>
<p>Be careful of what you add to these lists. If your site uses URL parameters to define the page contents rather than to track referrers, you could accidentally purge a large number of pages from the search index.</p>
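<p>Before you add a parameter to either list, it’s worth confirming that URLs with and without it really do lead to the same page. Here’s a minimal Python sketch that strips a set of tracking parameters so you can compare what’s left. The TRACKING_PARAMS names are hypothetical examples, not a list either search engine uses, and strip_tracking is my own helper name:</p>

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical tracking-only parameters; substitute the ones your site uses.
TRACKING_PARAMS = {"ref", "utm_source", "utm_medium", "utm_campaign"}

def strip_tracking(url, ignored=TRACKING_PARAMS):
    """Return the URL with the ignored query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in ignored
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))

tracked = "http://www.xyz.com/page?" + urlencode({"id": "7", "ref": "newsletter"})
print(tracked)
print(strip_tracking(tracked))
```

<p>If two URLs collapse to the same string after stripping, the removed parameters are likely safe to ignore. If a parameter such as id changes what the page shows, leave it off the list.</p>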
<hr />
<h2>Canonicalization techniques</h2>
<p>Canonicalization is the process of redirecting unwanted URL variants for a given page to that page’s designated primary URL. Canonicalization effectively blocks those URL variants from the search engine indexes by using 301 permanent redirects. I discuss canonicalization techniques in detail, including the use of the &lt;link&gt; rel=canonical tag and how to set up 301 redirects, in the recent blog post, <a href="../../../../../search-engine-optimization/301-redirects/">The Ultimate How-To Guide on 301 Redirects</a>. This post is long enough as is. I’ll refer you there for those details. You’re welcome!</p>
<hr />
<h2>Ineffective methods</h2>
<p>Lastly, I’ll briefly mention what doesn’t work. For years it was a given that images, Flash, and other non-text media on a webpage were deal-breakers for crawling any text content embedded within them. Well, crawlers have come a long way in recent years. But don’t misunderstand me – I am not saying you should feel free to put text content you want indexed within these types of media. They are still very difficult to crawl and parse for content, and search engine success rates are not great. That all said, very difficult is not impossible.</p>
<p>For crawling efficiency purposes, always spoon-feed your content to crawlers as pure, on-page text. But thanks to the use of optical character recognition (OCR) technologies and improvements in crawling JavaScript and rich Internet application technologies, a portion of this once-lost content is today being crawled and indexed. As a result, you can’t depend upon these technologies to be impenetrable walls shielding content from the prying eyes of search. You can’t rely on it to work, and you can’t rely on it to fail. What a world we live in!</p>
<p>Getting content out of a search engine index can be a frustrating and time-consuming experience, but it can be done. By reviewing and implementing the techniques described above, you can get confidential content purged relatively quickly as well as prevent it from being indexed again.</p>
<p>Be careful out there. There’s very little that’s private anymore on the web.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.internetmarketingninjas.com/blog/tips-and-tricks/the-ultimate-guide-to-blocking-your-content-in-search/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

