The Google Panda Update – Did User Behavior Signals Tell the Panda to Poop on Your Site?
The past few months I’ve been reading a lot of interesting theories about the Google Panda Update…and I’ve read a lot of noise (stuff I don’t believe). Even one of the top results in a Google search for “Google Panda Update” is a page talking about Panda and low-quality backlinks. Panda has nothing to do with backlinks…trust me…if it did, I, of all people, would be shouting that my link builders have the solutions to Panda. My link builders can’t help with Panda (but my other teams can). I haven’t heard Google talk about social signals, but I keep hearing people mention social being a “help” with Panda…um…I don’t think so. This is a content issue…not an internal/external link thing…not a social thing. I totally believe it all has to do with Google’s analysis of user behavior in relation to each page of your website (or sets of pages on your site).
I was just reading a nice thread over at WebmasterWorld, started by Brett Tabke, called “Panda Metric : Google Usage of User Engagement Metrics”.
There, Brett nicely outlines all the things that Google knows about searchers, including how you got to Google, where you’re from, your history, your browser, your tracking data, and your cookies. Brett goes on to say:
At this point, Google knows who 70-75% (my guess) of the users are and what they are doing on any given query, and can guess accurately at another 15-25% based on browser/software/system profiles (even if your IP changes and you are not logged in, Google can match all the above metrics to a profile on you)….
Finally, after all that data, the user probably types in a query: (if the search didn’t come from off site).
Then there’s the query entry, the SERP behavior, and then the click on a result. At that point, Brett says, Google looks at:
- AdSense or DoubleClick serve ads?
- Google Analytics running?
- What does the user do while he is there?
- +1 buttons to come.
- How long does he stay on the page
- Does user visit other pages?
- Does user hit the back button and return to Google, or does he wander off to parts unknown?
- Toolbar data. Tracking and site blocking.
and then Brett sums it up with:
After all that, we can quantify a “metric” (I call, The Panda Metric). It is an amalgamation of the above inputs. This set of inputs would be relative to this query. They could also be weighted to relative queries.
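To make the idea of an “amalgamation” of inputs concrete, here is a minimal sketch of a weighted blend of signals for one query. The signal names, the weights, and the 0-to-1 scaling are all assumptions made up for illustration; nobody outside Google knows what the real inputs or weights are.

```python
# Hypothetical weights for illustration only; not Google's actual signals or values.
SIGNAL_WEIGHTS = {
    "dwell_time": 0.4,          # how long the user stays on the page
    "visits_more_pages": 0.2,   # does the user visit other pages?
    "no_return_to_serp": 0.3,   # user did NOT hit back and return to Google
    "not_blocked": 0.1,         # toolbar / site-blocking data
}


def panda_metric(signals):
    """Blend signals (each already scaled to the range 0..1) into one score."""
    total_weight = sum(SIGNAL_WEIGHTS.values())
    weighted = sum(
        weight * signals.get(name, 0.0)  # a missing signal counts as 0
        for name, weight in SIGNAL_WEIGHTS.items()
    )
    return weighted / total_weight


engaging_page = {
    "dwell_time": 0.9,
    "visits_more_pages": 0.8,
    "no_return_to_serp": 0.9,
    "not_blocked": 1.0,
}
thin_page = {
    "dwell_time": 0.1,
    "visits_more_pages": 0.0,
    "no_return_to_serp": 0.2,
    "not_blocked": 0.5,
}

print(panda_metric(engaging_page))
print(panda_metric(thin_page))
```

The engaging page scores well above the thin page. Weighting “relative to this query” could be sketched the same way by keeping a separate weight table per query type, which is again only a guess at how such a thing might be organized.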
So far this thread has some great comments by the Admins and Senior Members…
Like Tedster with:
I think some of the delay in recalculating is that Panda works at a very basic level – it’s what Google calls the “document classifier”. I have a feeling that a particular type of routine does not run as often as the rest of the scoring that is built on top of it. My current research – looking through patents, papers, and posts that mention “document classifiers”.
And followed up by TheMadScientist
I think this is probably a good time to clarify ‘document’ can refer to a page or collection of pages, and could easily be both, IMO. E.g. A page is an individual document and can be evaluated individually, but a site (or IMO, even a ‘section’ of a site) is also a document and can be evaluated as collective whole.
I would guess you’re right about classifications not happening as often Tedster and, of course, if only a portion of pages (sub-documents?) are changed you could end up with the same overall evaluation of the document (site) as a whole, even though there have been changes to a portion of it.
This comment is followed up with Walkman bringing up the fact that Matt Cutts, of all people, is outranked by scrapers for his own content, as had been brought up earlier here.
There is a lot of noise in there, but there are some great minds in there as well.
I agree with Brett and others that there are a lot of signals at play here. Last week, when I wrote about the Google Panda Update, I theorized that the biggest factor at play with Panda is “people who do a search at Google…go to your site…go back to the same Google search…click on another site…and not return to you.” That’s still my theory…but keep in mind, I totally know that there are many additional signals from all their data sources that Google can tweak Panda with, and that they’ll continue to tweak their content analysis algorithms with every new signal they can collect.
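That search, click, back, click-someone-else pattern is simple enough to sketch against a click log. The log format here (one tuple of searcher, query, and clicked site per click, in time order) and the site names are made up for the example; it says nothing about what Google’s logs really look like.

```python
def count_pogosticks(click_log, my_site):
    """Count searches where a user clicked my_site, then clicked a different
    result for the same query, and never clicked my_site again afterwards."""
    # Group clicked sites by (user, query), preserving click order.
    sessions = {}
    for user, query, site in click_log:
        sessions.setdefault((user, query), []).append(site)

    pogosticks = 0
    for clicks in sessions.values():
        if my_site not in clicks:
            continue
        # Position of the LAST click on my_site within this search.
        last_visit = len(clicks) - 1 - clicks[::-1].index(my_site)
        # Any click after that point is on another site, so the searcher
        # went back to the results, picked someone else, and stayed away.
        if last_visit < len(clicks) - 1:
            pogosticks += 1
    return pogosticks


click_log = [
    ("u1", "panda update", "mysite.example"),
    ("u1", "panda update", "other.example"),   # u1 left and did not return
    ("u2", "panda update", "mysite.example"),  # u2 stayed
    ("u3", "panda update", "other.example"),
    ("u3", "panda update", "mysite.example"),  # u3 ended on mysite
]

print(count_pogosticks(click_log, "mysite.example"))
```

Only the first searcher counts here: the second never went back to the results, and the third ended the search on the site rather than leaving it.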
But When Can I Get Out of Panda??
This reminds me of another interesting comment in another WebmasterWorld Panda forum thread, where TheMadScientist brings up a theory that I’m inclined to believe:
IMO it has less to do with the weight of the links changing and more to do with a ‘reverse scoring’ (for lack of a better phrase), meaning I think a page with links pointing to a thin page may have its quality scored lower; when a page where the link(s) are pointing to are determined to be lower quality.
IOW: If Page A links to Page B and Page B’s quality score is low, the overall quality score for Page A is lowered by linking to Page B.
We know link text counts forward (to the page the link is pointing to) I think part of what Panda does is reverses the scoring and the quality score of the linked page counts backwards (to the page doing the linking).
Keep in mind this is ‘speculation only’ ATM, but I really think people are looking in the wrong place when they’re simply looking at link based scoring ‘the old way’ … Simple link weight based scoring is soooo 2000, IMO.
Google used to say that pages in the “Supplemental index” wouldn’t hurt your site…now I wonder if they could hurt your site.
Think of it this way…if you have 100 pages on your website, and Google thinks that 70 of those are “Panda Poop Pages” (yes, I’m coining a new phrase here…Panda Poop Pages), and say they score each of those Panda Poop Pages a negative 10, then your site can get a negative score overall if all content is added up and scored across your site. Beyond that, possibly, if you have an internal page that has 100 links on it going to other pages of your site, and 80 of those links go to “Panda Poop Pages”, then that page might have a lowered ranking itself, because a user has an 80% chance of going to a “Panda Poop Page” from that page…
If this is the case, then improving things on a page-by-page level will, in turn, tell Google that the page with 100 links is now only linking to 79 “Panda Poop Pages” instead of 80, and that page will increase ever so slightly…
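The arithmetic in that example can be worked through in a few lines. The negative 10 comes from the example above; the positive 10 for a good page is purely an assumption, since the example never says what a good page scores, and none of this is a known Google formula.

```python
POOP_PAGE_SCORE = -10  # the "negative 10" from the example above
GOOD_PAGE_SCORE = 10   # assumption: the example gives no score for good pages


def site_score(total_pages, poop_pages):
    """Add up per-page scores across the whole site."""
    good_pages = total_pages - poop_pages
    return good_pages * GOOD_PAGE_SCORE + poop_pages * POOP_PAGE_SCORE


def poop_link_share(total_links, links_to_poop_pages):
    """Chance that a random link on the page leads to a Panda Poop Page."""
    return links_to_poop_pages / total_links


print(site_score(100, 70))       # 30 good pages and 70 Panda Poop Pages
print(poop_link_share(100, 80))  # before fixing one linked page
print(poop_link_share(100, 79))  # after fixing one linked page
```

With these made-up numbers the site as a whole nets out negative, and fixing a single linked page moves the hub page’s share from 80% to 79%, which is the “ever so slightly” improvement described above.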
So when do I think that sites will “come back” from being Pandasized?
I have a feeling that sites that have been Panda Pooped on will not just get clean overnight…nor see any big “Voilà!…We’re back”…they’ll see slow, steady increases…page by page…which will, in turn, help the pages above those…and in turn, help the site as a whole. Again, I don’t know…to this day no one has said “We came back from Panda,” and I don’t think you’ll ever hear that story unless it’s a story about a whole year of slowly bringing trust/rankings back. There are some stories of pages coming back…but there have also been stories of things bouncing around in rankings.
I had one client who I spoke with today who, prior to Panda II, had ranked #4 for a major phrase. After April 11 he dropped to #15…then he dropped to the 50’s…then last Friday he was #12, and today he’s #8…and keep in mind that he hasn’t done a thing with the site since Panda II. There’s still some bouncing around, and threads where people are saying “hey, a page came back”…or is “recovering”…and then the next day it’s “sorry…it fell again”…
There’s another thread by Bill Slawski called “Just What User Behavior Data Does Google Use to Influence Search Rankings?”, where Bill nicely outlines several Google patents that mention some of the user behavior data Google might be looking at. One of the ones noted is from the patent “Information retrieval based on historical data”:
If a document is returned for a certain query and over time, or within a given time window, users spend either more or less time on average on the document given the same or similar query, then this may be used as an indication that the document is fresh or stale, respectively.
For example, assume that the query “Riverview swimming schedule” returns a document with the title “Riverview Swimming Schedule.” Assume further that users used to spend 30 seconds accessing it, but now every user that selects the document only spends a few seconds accessing it. Search engines may use this information to determine that the document is stale (i.e., contains an outdated swimming schedule) and score the document accordingly.
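The swimming-schedule example boils down to comparing recent time-on-page against the historical average for the same query. Here is a minimal sketch of that comparison; the function name, the 50% threshold, and the sample numbers are assumptions for illustration and do not come from the patent.

```python
from statistics import mean


def looks_stale(past_dwell_seconds, recent_dwell_seconds, drop_ratio=0.5):
    """Return True when the average recent dwell time has fallen below
    drop_ratio times the historical average for the same query."""
    if not past_dwell_seconds or not recent_dwell_seconds:
        return False  # not enough data to judge either way
    return mean(recent_dwell_seconds) < drop_ratio * mean(past_dwell_seconds)


# Users used to spend about 30 seconds on the page; now only a few seconds.
past = [30, 28, 33, 31]
recent = [3, 4, 2, 5]

print(looks_stale(past, recent))
```

A real system would presumably also need enough visits in each window before trusting the averages, which is why the sketch refuses to judge when either list is empty.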
The past few months I’ve been digesting everything I can on Panda…and I’ve been looking at analytics, usability analyses, on-page analyses, and talking with clients about Panda. For every site I can find possible reasons…and great possible solutions. At the very least, these clients are getting a great look at things like usability, on-page SEO, and analytics analysis. I wish I could tell them, “Just make these changes…and just wait a little bit…and Bam! You’ll be back!”…but I don’t think it works this way. There’s still more that I have to say about Panda, but I’ll save that for another post on another day…
Reminder: We Build Pages will be changing names to Internet Marketing Ninjas in a few months.
Here are some more of my blog posts about the Panda Update, check ’em out:
- Thoughts and Solutions from Jim Boykin – Post discussing the background of the Panda Update, including the supplemental results, Caffeine, and beyond. This post also looks at “if I were Google, what I would look at,” and at solutions.
- Google Panda Update – A Overview of Analytics of 5 Panda II Affected Sites – This post discusses the analytics of five affected sites.
- Google Panda Update – Google’s Content Guidence and Jim’s Take – This article goes over the list of 23 questions of the ‘Google Mindset’ for the Panda Update and outlines Jim’s thoughts on each.
- Google Panda Update Panda’s Punitive Punishment of Good Content – This post discusses how the Panda will punish your good content if you have bad content as well.
- Google Panda Update – Short Clicks and Long Clicks / Pogosticking – This post talks about how Google uses its logs and click information, particularly for short clicks, long clicks, and pogosticking, to help evaluate and rerank search results. The post further discusses potential implications for the Panda update.
- Losing Clients to Panda. I Just Lost $17,500/Month – Sharing my experience of the Panda Update, as well as reflections on previous updates.
- Google Agent Rank and Reputational Scores…It’s About Content and Writers and Panda! – Breakdown of what ‘agent rank’ means, viewed through a Google Panda lens.
- Google Panda Update: Content + Design = Usable, Trustworthy Websites – Discussion of website usability and its implications for the Google Panda Update.
- Google Panda Update: Your Site is Going to Survive (funny) – Jim’s Panda Update remake of “Country Boy Can Survive”.
Awesome post Jim,
I had similar speculations since Panda hit the street and discussed that with a lot of fellow SEOs in many places, including SES NYC…
It’s ALL user-behaviour – love watching SERP CTRs for over a year now – and its effects 🙂
I definitely think that user-behavior data plays a role in the Panda updates, though I’m not sure if that data is being used to drive changes directly or to measure the impact of changes.
If the “Panda” the update is named after is indeed Biswanath Panda, and there seems to be a good chance of it, then one of the approaches used is quite possibly based upon a decision tree approach like that described in the paper he co-authored, PLANET: Massively Parallel Learning of Tree Ensembles with MapReduce (pdf). The techniques described in that paper were followed in a study described in Predicting bounce rates in sponsored search advertisements to find “quality” based features for sponsored ads and landing pages.
The PLANET paper notes that a similar approach could be used in other situations involving very large data sets, like organic search. What quality signals might be used to predict the kinds of user behavior data Google might collect about pages?
Bill, thanks for those links!
Wow. That’s pretty much what I wrote last week in my longwinded attempt to analyze why Google Panda impacted Squidoo very little, as opposed to Hubpages and a lot of the other user-submitted content sites.
As one who’s still an SEO journeyman, it feels vindicating to see that others have pretty much drawn the same conclusions.
I can tell you that on panda 1 my site wasn’t affected, panda 2 hit my blog like a ton of bricks – to the point where we are getting close to 0 traffic from google – and it’s a blog that had a great authority before – all original in-depth content, good links inbound, no scraping, etc. It’s good to hear that there might be some upward movement soon.
I agree too that the 1,000s of sites that scrape my content rank for my content and i don’t – seriously wtf.
thanks for the post.
It makes you wonder what would happen to a page that links to ‘bad’ pages if you just dropped those links from the page. You wouldn’t be linking to a page that can negatively influence that specific page…
Further, the part about internal click-through seems a bit weird, given that we all use more and more usability components where we don’t load new pages but show the data on the page itself.
Nevertheless, good article, and it gave some promising thoughts.
PS. How about crawling Google queries, tracking high-ranking pages, and linking to those the most internally? Based on the article above, could that be a good influence on your ‘document’ quality? Any thoughts on that matter?
Our biggest site got hit hard when Panda was rolled out here in Australia. We are an IYP site, and a lot of our Google traffic comes from profile pages where visitors are looking for a phone number or address of a particular business. Because they often find what they are looking for quickly, our time on site is often low and our bounce rate high. It sounds like this could well have contributed to our drop in traffic following Panda.
I have tried +1 buttons; it’s good for getting out of Google Panda effects…
So do you think Google is zeroing in on the bounce rate for a given page or the click-through rate for a given impression? Or maybe a little bit of both? I think the reason that many information-based sites got hit so hard was because even the good ones will often have higher bounce rates than sites that offer products or services. In many cases this can be attributed to a lower-quality page or site, but in some cases it may just come down to the user getting what they want and leaving. And that isn’t necessarily a bad thing. It looks like Google is walking a tightrope. Isn’t the whole concept of AdSense based upon the bounce rate?
Well regardless, great post. Maybe the old motto of “Content is King” should be changed in 2011 to “Quality Content is King” 😉
I was very happy when Google Panda was applied, because my blog has reached the top position in the Google SERPs. I think good content is the most important factor in getting the best position. Thanks for inspiring me with your amazing post.
I don’t buy it. The fact is that in my verticals, which I have been tracking for the past 12 years, I see sites ranking for the terms that provide no value to the user. Our website had a 34% bounce rate. I get 5-10 emails on a daily basis from professors at universities who want to use and cite our resources, and we still got hit.
This metric could be in play somewhere, but by and large this is a messy update with lots of false positives.
If you do a search now for “Panda update solutions” there are 15 – 20 listings for an auto generated article on crappy article submissions sites! hahahahaha yeah great job with torching the dup content – keep up the fine work!