Duplicate content is something that every SEO should be aware of. When a site has had its content scraped or copied to another site, or even to a different page on the same site, the original content risks not performing well in the results. This is because the search engines are looking for ways to make sure that duplicated pages aren’t showing up for the same queries. Therefore, when doing any type of basic SEO audit, it is important to quickly identify any duplicate content that may exist. Doing this can be harder than it seems, because the sheer volume of the internet allows for some content to be duplicated naturally. So, to help SEOs identify duplicated content, here are a few tricks for leveraging Google to find repeating pages.
Use The intitle: Search Operator
When you use the intitle: search operator, you can quickly find pages that duplicate the title tag of a page you are auditing. For example, let’s say we want to see if this page is being duplicated. We would run the following query: intitle:"Henri de Toulouse-Lautrec – Wikipedia, the free encyclopedia" Some of the pages in those results have legitimate reasons to have the same title tag, but if you look further down you can see that some sites are simply scraping Wikipedia. This is the type of duplicate content that SEOs should be concerned about.
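If you are checking a lot of pages, building these queries by hand gets tedious. Here is a minimal Python sketch that wraps a title in an exact-match intitle: query and turns it into a search link. The function names are my own for illustration, and the link assumes the standard https://www.google.com/search?q= URL format.

```python
from urllib.parse import urlencode


def intitle_query(title: str) -> str:
    """Wrap a page title in an exact-match intitle: query."""
    # Drop any straight quotes inside the title so they don't break the phrase match.
    cleaned = title.replace('"', "").strip()
    return f'intitle:"{cleaned}"'


def google_search_url(query: str) -> str:
    """Build a search link for the query (assumes the usual ?q= parameter)."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    query = intitle_query("Henri de Toulouse-Lautrec – Wikipedia, the free encyclopedia")
    print(query)
    print(google_search_url(query))
```

Paste the printed link into a browser and review the results by hand, the same way you would with a query typed into the search box.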
Use The inurl: Search Operator
Just like the intitle: operator above, the inurl: operator looks for strings of text found within a URL that has been indexed. Therefore, we can apply the same methodology as above by using a query like inurl:Toulouse-Lautrec This works because the string “Toulouse-Lautrec” is found in the URL we are auditing. Many times, automated scraper sites will simply replicate every aspect of a domain, even the URL structure, which allows us to easily hunt this scraped content down.
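The same idea can be scripted for inurl: checks. This rough Python sketch pulls the last path segment out of a URL and uses it as the inurl: string. The function name and the example.com address are placeholders I made up for illustration, and it assumes the last segment is the distinctive part of the URL, which won't be true for every site.

```python
from urllib.parse import urlparse, unquote


def inurl_query(page_url: str) -> str:
    """Build an inurl: query from the last path segment of a URL."""
    path = urlparse(page_url).path
    segments = [segment for segment in path.split("/") if segment]
    if not segments:
        raise ValueError("URL has no path segment to search for")
    # Decode percent-escapes so the query matches the readable form of the slug.
    slug = unquote(segments[-1])
    return f"inurl:{slug}"


if __name__ == "__main__":
    # Hypothetical URL, used only to show the shape of the output.
    print(inurl_query("https://example.com/wiki/Some_Page_Name"))
```

If the last segment is something generic like index.html, pick a more distinctive piece of the URL by hand instead.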
Use Webmaster Tools Alerts
I almost hate to recommend this one because, quite honestly, Google’s Webmaster Tools has a spotty history of reporting issues. But setting up a verified Webmaster Tools account will enable Google to notify you via internal message if it detects a duplicate content issue. This is a great little tool in theory; however, anyone who is familiar with messages from Google knows that they can be extremely vague and sometimes misleading. So I would take this approach with a grain of salt, but know that when a message does come through, it’s worth looking into.
Search With “Quoted” Text
This is probably the most used method of duplicate content analysis. Simply copy a bit of text from the page, paste it into the query box, and enclose the pasted text in quotation marks. A query in quotation marks tells Google to look for the exact text with no variations. This can be good for duplicate content checks, but can also lead to some suspicious activity. To get the best results, you are going to want to grab as much text as possible. This is because the smaller the sample is, the more probable it is that the text will be duplicated on the web naturally. The larger the text sample, the less likely it is that the text was duplicated naturally.
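To take the guesswork out of which text to grab, here is a small Python sketch that picks the longest sentence from a block of page text and wraps it in quotation marks. The function name is hypothetical, the sentence splitting is deliberately naive, and the 32-word cap is an assumption on my part about how long a query Google will accept, so adjust it if your results look truncated.

```python
import re

MAX_WORDS = 32  # assumed cap on query length; not confirmed by the article


def quoted_query(page_text: str, max_words: int = MAX_WORDS) -> str:
    """Return the longest sentence in page_text as an exact-match query."""
    text = page_text.strip()
    if not text:
        raise ValueError("page_text is empty")
    # Naive split: break after ., ! or ? when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text)
    longest = max(sentences, key=lambda sentence: len(sentence.split()))
    # Remove straight quotes so the phrase stays one quoted unit, then trim to the cap.
    words = longest.replace('"', "").split()[:max_words]
    return '"' + " ".join(words) + '"'


if __name__ == "__main__":
    sample = "Short one. This is a considerably longer sentence that should be picked. Tiny."
    print(quoted_query(sample))
```

A longer sentence gives a bigger sample, which fits the advice above: the bigger the sample, the less likely a match is a coincidence.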
If you have any other ways that you like looking for duplicated content, let us know in the comments. And until next time, happy auditing!