02 May 2011

Google Panda Update – User Behavior and Other Signals.

The Google Panda Update – Did User Behavior Signals Tell the Panda to Poop on Your Site?

The past few months I’ve been reading a lot of interesting theories about the Google Panda Update…and I’ve read a lot of noise (stuff I don’t believe)…even one of the top results in a Google search for “Google Panda Update” is a page talking about Panda and low quality backlinks…Panda has nothing to do with backlinks…trust me…if it did, I, of all people, would be shouting that my link builders have the solutions to Panda…my link builders can’t help with Panda (but my other teams can)…I haven’t heard Google talk about social signals…but I keep hearing people mention social being a “help” to Panda…um…I don’t think so…this is a content issue…not an internal/external link thing…not a social thing…I totally believe it all has to do with Google’s analysis of User Behavior in relation to each page of your website (or sets of pages on your site).

I was just reading a nice thread over at WebmasterWorld that was started by Brett Tabke called “Panda Metric : Google Usage of User Engagement Metrics”.

There, Brett nicely outlines all the things that Google knows about searches, including how you got to Google, where you’re from, your history, your browser, your tracking data, and cookies. Brett goes on to say:

At this point, Google knows who 70-75% (my guess) the users are and what they are doing on any given query, and can guess accurately at another 15-25% based on browser/software/system profiles (even if your ip changes and you are not logged in, Google can match all the above metrics to a profile on you)….

Finally, after all that data, the user probably types in a query: (if the search didn’t come from off site).

Then there’s the query entry, the SERP behavior, and then the click on a result. At that point, Brett says, Google looks at:

  • AdSense or DoubleClick serve ads?
  • Google Analytics running?
  • What does the user do while he is there?
  • +1 buttons to come.
  • How long does he stay on the page?
  • Does user visit other pages?
  • Does user hit the back button and return to Google, or does he wander off to parts unknown?
  • Toolbar data. Tracking and site blocking.

and then Brett sums it up with:

After all that, we can quantify a “metric” (I call, The Panda Metric). It is an amalgamation of the above inputs. This set of inputs would be relative to this query. They could also be weighted to relative queries.
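To make the idea of an “amalgamation” of inputs a little more concrete, here’s a minimal Python sketch of a weighted blend of a few of the signals from Brett’s list. Keep in mind that the signal names, the weights, and the 120-second cap are all made up by me for illustration…nobody outside Google knows what the real inputs or weights are.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical weights: invented for illustration, not known Google values.
WEIGHTS = {
    "dwell": 0.4,                 # longer stays count in the page's favor
    "visited_other_pages": 0.2,   # browsing deeper counts in its favor
    "returned_to_serp": -0.4,     # bouncing back to the results counts against it
}


@dataclass
class Visit:
    dwell_seconds: float
    visited_other_pages: bool
    returned_to_serp: bool


def engagement_score(visit: Visit, max_dwell: float = 120.0) -> float:
    """Blend one visit's signals into a single number (higher is better)."""
    # Cap dwell time and scale it to the range 0..1.
    dwell = min(visit.dwell_seconds, max_dwell) / max_dwell
    return (
        WEIGHTS["dwell"] * dwell
        + WEIGHTS["visited_other_pages"] * float(visit.visited_other_pages)
        + WEIGHTS["returned_to_serp"] * float(visit.returned_to_serp)
    )


def page_metric(visits: List[Visit]) -> float:
    """Average the per-visit scores for one page and one query."""
    if not visits:
        return 0.0
    return sum(engagement_score(v) for v in visits) / len(visits)
```

The point of the sketch is only that many small signals can roll up into one per-page, per-query number, which is how I read Brett’s “Panda Metric.”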

So far this thread has some great comments by the Admins and Senior Members…

Like Tedster with:

I think some of the delay in recalculating is that Panda works at a very basic level – it’s what Google calls the “document classifier”. I have a feeling that a particular type of routine does not run as often as the rest of the scoring that is built on top of it. My current research – looking through patents, papers, and posts that mention “document classifiers”.

And followed up by TheMadScientist:

I think this is probably a good time to clarify ‘document’ can refer to a page or collection of pages, and could easily be both, IMO. E.g. A page is an individual document and can be evaluated individually, but a site (or IMO, even a ‘section’ of a site) is also a document and can be evaluated as collective whole.

I would guess you’re right about classifications not happening as often Tedster and, of course, if only a portion of pages (sub-documents?) are changed you could end up with the same overall evaluation of the document (site) as a whole, even though there have been changes to a portion of it.

This comment is followed up with Walkman bringing up the fact that Matt Cutts, of all people, is outranked by scrapers for his own content, as had been brought up earlier here.

There is a lot of noise in there, but there are some great minds in there as well.

I agree with Brett and others that there are a lot of signals at play here…even though last week, when I wrote about the Google Panda Update, I theorized that “people who do a search at Google…go to your site…go back to the same Google search…click on another site…and not return to you” is the biggest factor at play with Panda…but keep in mind, I totally know that there are many additional signals that Google can tweak Panda with from all their data sources, and that they’ll continue to tweak their content analysis algorithms with every new signal that they can collect.
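Here’s a rough Python sketch of how that “search, click, go back, click someone else, never return” pattern could be counted from a click log. The log format (a list of sessions, each a list of query and clicked-site pairs) and the function name are my own assumptions for illustration, not anything Google has described.

```python
from typing import List, Tuple

# One click event: (query, clicked_site), in the order the searcher clicked.
Click = Tuple[str, str]


def pogo_stick_rate(sessions: List[List[Click]], site: str) -> float:
    """Share of visits to `site` where the searcher then clicked a different
    result for the same query and never clicked `site` again in that session."""
    visits = 0
    bounced = 0
    for clicks in sessions:
        for i, (query, clicked) in enumerate(clicks):
            if clicked != site:
                continue
            visits += 1
            later = clicks[i + 1:]
            # Did they pick another result for the same query afterwards?
            went_elsewhere = any(q == query and s != site for q, s in later)
            # Did they ever come back to this site later in the session?
            came_back = any(s == site for _, s in later)
            if went_elsewhere and not came_back:
                bounced += 1
    return bounced / visits if visits else 0.0
```

A site with a high rate here would be one that searchers keep leaving for a competitor on the same query, which is the behavior I think matters most.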

But When Can I Get Out of Panda??

This reminds me of another interesting comment in another WebmasterWorld Panda forum thread, where TheMadScientist brings up an interesting theory that I’m inclined to believe:

IMO it has less to do with the weight of the links changing and more to do with a ‘reverse scoring’ (for lack of a better phrase), meaning I think a page with links pointing to a thin page may have its quality scored lower; when a page where the link(s) are pointing to are determined to be lower quality.

IOW: If Page A links to Page B and Page B’s quality score is low, the overall quality score for Page A is lowered by linking to Page B.

We know link text counts forward (to the page the link is pointing to) I think part of what Panda does is reverses the scoring and the quality score of the linked page counts backwards (to the page doing the linking).

Keep in mind this is ‘speculation only’ ATM, but I really think people are looking in the wrong place when they’re simply looking at link based scoring ‘the old way’ … Simple link weight based scoring is soooo 2000, IMO.
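The ‘reverse scoring’ idea can be sketched in a few lines of Python. This is only my illustration of the speculation above: the quality scores, the 0.3 “low quality” cutoff, and the 0.5 penalty weight are all hypothetical numbers I picked, not anything known about Panda.

```python
from typing import Dict, List


def reverse_score(
    page: str,
    quality: Dict[str, float],
    links: Dict[str, List[str]],
    low_quality_below: float = 0.3,   # hypothetical cutoff for a "thin" page
    penalty_weight: float = 0.5,      # hypothetical strength of the penalty
) -> float:
    """Lower a page's quality score by the share of its links that point
    to low-quality pages (quality counting backwards along the link)."""
    targets = links.get(page, [])
    if not targets:
        return quality[page]
    low = sum(1 for t in targets if quality[t] < low_quality_below)
    return quality[page] - penalty_weight * (low / len(targets))


# Page A links to a thin Page B and a solid Page C.
quality = {"A": 0.8, "B": 0.1, "C": 0.9}
links = {"A": ["B", "C"]}
adjusted_a = reverse_score("A", quality, links)
```

In this toy example Page A starts at 0.8 and loses some of that score because half of its links go to a thin page, while Page C, which links nowhere, keeps its score.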

Google used to say that pages in the “Supplemental index” wouldn’t hurt your site…now I wonder if they could hurt your site.

Think of it this way…if you have 100 pages on your website…and if Google thinks that 70 of those are “Panda Poop Pages” (Yes, I’m coining a new phrase here…Panda Poop Pages)…and say they score each of those Panda Poop Pages a negative 10…then your site can get a negative score overall if all content is added up and scored across your site…beyond that, possibly, if you have an internal page that has 100 links on it going to other pages of your site, and if 80 of those links go to “Panda Poop Pages”, then that page might have a lowered ranking itself because a user has an 80% chance of going to a “Panda Poop Page” from that page…

If this is the case, then improving things on a page-by-page level will, in turn, tell Google that the page with 100 links is now only linking to 79 “Panda Poop Pages” instead of 80, and that page will increase ever so slightly…
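The arithmetic in that example can be worked through in Python. The negative 10 per Panda Poop Page comes from my example above; the plus 10 for each good page is an extra assumption I’m adding here just so the site total can be added up.

```python
def site_score(total_pages: int, poop_pages: int,
               poop_score: int = -10, good_score: int = 10) -> int:
    """Add up every page's score across the site.

    good_score is an assumed value; the example only fixes poop_score.
    """
    good_pages = total_pages - poop_pages
    return poop_pages * poop_score + good_pages * good_score


def poop_link_share(total_links: int, poop_links: int) -> float:
    """Chance that a random link on the page leads to a Panda Poop Page."""
    return poop_links / total_links


overall = site_score(total_pages=100, poop_pages=70)     # 70 * -10 + 30 * 10
before = poop_link_share(total_links=100, poop_links=80)
after = poop_link_share(total_links=100, poop_links=79)  # one page fixed
```

Under those assumptions the site adds up to a negative total, and fixing one page moves the link page’s share from 80% to 79%, which is why I expect any recovery to be slow and page-by-page.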

So when do I think that sites will “come back” from being Pandasized?

I have a feeling that sites that have been Panda Pooped on will not just get clean overnight…nor see any big “Voilà!…We’re back”…they’ll see slow, steady increases…page-by-page…which will, in turn, help the pages above those…and in turn, help the site as a whole…again, I don’t know…to this day no one has said “We came back from Panda,” and I don’t think you’ll ever hear that story unless it’s a story about a whole year of slowly bringing trust/rankings back…there are some stories of pages coming back…but there have also been stories of things bouncing around in rankings…I had one client who I spoke with today who prior to Panda II had ranked #4 for a major phrase. After April 11 he dropped to #15…then he dropped to the 50’s…then last Friday he was #12, and today he’s #8…and keep in mind that he hasn’t done a thing with the site since Panda II…there’s still some bouncing around, and threads where people are saying “hey, a page came back”…or is “recovering”…and then the next day it’s “sorry…it fell again”…

There’s another thread by Bill Slawski called “Just What User Behavior Data Does Google Use to Influence Search Rankings?”, where Bill nicely outlines several Google patents that mention some of the user behavior data that they might be looking at. One of the ones noted is from the patent “Information retrieval based on historical data”:

If a document is returned for a certain query and over time, or within a given time window, users spend either more or less time on average on the document given the same or similar query, then this may be used as an indication that the document is fresh or stale, respectively.

For example, assume that the query “Riverview swimming schedule” returns a document with the title “Riverview Swimming Schedule.” Assume further that users used to spend 30 seconds accessing it, but now every user that selects the document only spends a few seconds accessing it. Search engines may use this information to determine that the document is stale (i.e., contains an outdated swimming schedule) and score the document accordingly.
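The swimming schedule example boils down to comparing how long people used to stay with how long they stay now. Here’s a small Python sketch of that comparison; the function name and the “less than half the old average” cutoff are my own assumptions, since the patent text quoted above doesn’t give a specific threshold.

```python
from statistics import mean
from typing import Sequence


def looks_stale(
    past_dwell_seconds: Sequence[float],
    recent_dwell_seconds: Sequence[float],
    drop_ratio: float = 0.5,  # hypothetical cutoff, not from the patent
) -> bool:
    """Flag a document as possibly stale when the recent average time on
    the page falls well below the historical average for the same query."""
    if not past_dwell_seconds or not recent_dwell_seconds:
        return False  # not enough data to say anything
    return mean(recent_dwell_seconds) < drop_ratio * mean(past_dwell_seconds)


# Users used to spend about 30 seconds; now only a few seconds.
stale = looks_stale([30, 32, 28], [3, 4, 2])
```

A real system would presumably compare against similar queries and time windows too, as the patent language suggests, but the basic signal is just this before-and-after drop.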

The past few months I’ve been digesting everything I can on Panda…and I’ve been looking at analytics, usability analyses, on-page analyses, and talking with clients about Panda…for every site I can find possible reasons…and great possible solutions…at the very least, these clients are getting a great look at things like usability, on-page SEO, analytics analysis…I wish I could tell them…“Just make these changes…and just wait a little bit…and Bam! You’ll be back!”…but I don’t think it works this way…There’s still more that I have to say about Panda, but I’ll save that for another post on another day…

Reminder: We Build Pages will be changing names to Internet Marketing Ninjas in a few months.

