The difference between Google's Panda, Penguin and Hummingbird

02 July 2014

Google never misses a chance to tell users that its goal is simple: to give people the most relevant answers to their queries as quickly as possible. Yet this requires constant tuning of its algorithms.

Many of the changes are so subtle that few people notice them, but one thing the Expedia mishap has taught us is that being unaware of them could hurt your stock price and long-term value, not to mention earn you a whopping Google penalty.

So let’s go back to 2011, when Google started naming algorithm updates after animals.

As sites began using “black hat” SEO to boost their way up the rankings, Google struggled to keep giving users quality content relevant to their searches. It became important to reward high-quality sites, and that’s exactly what Panda strives to do.

Panda was designed to reduce rankings for low-quality sites that add little value for users. Essentially, the update is all about assessing overall website quality.

A famous blog post by Google employee Amit Singhal provides guidelines on what Panda looks for through a set of questions:

  • Would you trust the information presented in this article?
  • Is this article written by an expert or enthusiast who knows the topic well, or is it more shallow in nature?
  • Does the site have duplicate, overlapping, or redundant articles on the same or similar topics with slightly different keyword variations?
  • Would you be comfortable giving your credit card information to this site?
  • Does this article have spelling, stylistic, or factual errors?
  • Are the topics driven by genuine interests of readers of the site, or does the site generate content by attempting to guess what might rank well in search engines?
  • Does the article provide original content or information, original reporting, original research, or original analysis?
  • Does the page provide substantial value when compared to other pages in search results?
  • How much quality control is done on content?
  • Does the article describe both sides of a story?
  • Is the site a recognized authority on its topic?
  • Is the content mass-produced by or outsourced to a large number of creators, or spread across a large network of sites, so that individual pages or sites don’t get as much attention or care?
  • Was the article edited well, or does it appear sloppy or hastily produced?
  • For a health related query, would you trust information from this site?
  • Would you recognize this site as an authoritative source when mentioned by name?
  • Does this article provide a complete or comprehensive description of the topic?
  • Does this article contain insightful analysis or interesting information that is beyond obvious?
  • Is this the sort of page you’d want to bookmark, share with a friend, or recommend?
  • Does this article have an excessive amount of ads that distract from or interfere with the main content?
  • Would you expect to see this article in a printed magazine, encyclopedia or book?
  • Are the articles short, unsubstantial, or otherwise lacking in helpful specifics?
  • Are the pages produced with great care and attention to detail vs. less attention to detail?
  • Would users complain when they see pages from this site?

Singhal also explained that these questions all contribute to how real-life users rate the quality of your site.
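
To make one of these questions concrete, take the one about duplicate, overlapping or redundant articles with slightly different keyword variations. The sketch below shows a common, generic way to spot such pages on your own site: break each article into overlapping three-word “shingles” and measure how many shingles two articles share (Jaccard similarity). This is purely an illustration for auditing your own content. It is not Google’s method, and the function names and the 0.5 threshold are assumptions chosen for the example.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short texts are treated as a single shingle (or none at all).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(first, second, size=3):
    """Share of shingles two texts have in common, from 0.0 to 1.0."""
    a, b = shingles(first, size), shingles(second, size)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def near_duplicates(articles, threshold=0.5):
    """List (title, title, score) for every pair at or above the threshold.

    `articles` maps a title to its body text. The threshold is an
    arbitrary illustrative value, not a figure published by Google.
    """
    titles = sorted(articles)
    pairs = []
    for i, first in enumerate(titles):
        for second in titles[i + 1:]:
            score = jaccard_similarity(articles[first], articles[second])
            if score >= threshold:
                pairs.append((first, second, score))
    return pairs
```

Two pages that differ only in a swapped keyword, such as a city name, score very high here, which is exactly the “slightly different keyword variations” pattern the question warns about.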

Penguin came out in April 2012 as a follow-up to Panda. The underlying idea behind the update was to penalise sites breaching Google’s Webmaster Guidelines by lowering their rankings. Although the algorithm’s primary focus is unnatural links, Penguin was launched with the broader intention of better catching sites deemed to be spamming Google’s search results.

So what is an unnatural link? A Google Quality Guidelines article states that “creating links that weren’t editorially placed or vouched for by the site’s owner on a page, otherwise known as unnatural links, can be considered a violation of our guidelines.”

This was echoed by Google employee John Mueller, who explained that if the Penguin algorithm determines that several links to your site are untrustworthy, then this reduces Google’s trust in your entire site.

But there’s a way to get back on top if you’ve been demoted due to Penguin. Much like its animal predecessor, the algorithm re-evaluates sites each time it is updated. If you’ve cleaned up your site by the time it re-runs, you can regain Google’s trust.

Google threw a spanner in the works, however, when it introduced Hummingbird, a completely separate algorithm. By the time Google made the announcement, Hummingbird had already been live for about a month. But what makes Hummingbird so important? It was a complete overhaul of the entire Google algorithm.

In that sense, Panda and Penguin are merely filters, whereas Hummingbird could be seen as a brand new engine.

Hummingbird’s goal is to better understand a user’s query. A great explanation comes from Bill Slawski, who said: “The kind of query where it might potentially work best upon could be something like [What is the best place to find and eat Chicago deep dish style pizza?], where Google might use synonyms and substitute query rules in combination with analysing other non-skip words within the query itself to understand the context of a query term and a potential replacement for that query to reformulate (or replace) the terms being searched upon and provide potentially better results.

“Google might look at the query [What is the best place to find and eat Chicago deep dish style pizza?], and understand that a searcher looking for results for that query would likely be more satisfied with the use of ‘restaurant’ instead of ‘place.’ The use of ‘restaurant’ instead of ‘place’ might be considered as a potential synonym or substitute based upon substitution rules which focus upon co-occurring terms that might show up in search results when those terms are searched upon, or co-occurring terms in query sessions.”
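
The substitution Slawski describes can be pictured with a toy example. The sketch below swaps a query term for a substitute only when another non-skip word in the same query provides the right context. Everything in it is hypothetical: the rule table, the skip-word list and the function names are invented for illustration, whereas the rules Slawski refers to would be derived from co-occurrence data in search results and query sessions. It should not be read as how Google actually implements Hummingbird.

```python
import re

# Hypothetical rules: (term, context word in the same query) -> substitute.
# Hand-written here; real rules would come from co-occurrence statistics.
SUBSTITUTION_RULES = {
    ("place", "eat"): "restaurant",
    ("place", "pizza"): "restaurant",
    ("place", "sleep"): "hotel",
}

# Hypothetical list of words too common to provide useful context.
SKIP_WORDS = {"what", "is", "the", "to", "and", "a", "of", "for"}


def tokenize(query):
    """Lower-case the query and split it into simple word tokens."""
    return re.findall(r"[a-z0-9]+", query.lower())


def rewrite_query(query, rules=SUBSTITUTION_RULES):
    """Replace a term when a non-skip word in the query matches a rule."""
    tokens = tokenize(query)
    # Keep query order so the first matching context word always wins.
    context = [t for t in tokens if t not in SKIP_WORDS]
    rewritten = []
    for token in tokens:
        substitute = token
        for other in context:
            if other != token and (token, other) in rules:
                substitute = rules[(token, other)]
                break
        rewritten.append(substitute)
    return " ".join(rewritten)


query = "What is the best place to find and eat Chicago deep dish style pizza?"
print(rewrite_query(query))
# prints: what is the best restaurant to find and eat chicago deep dish style pizza
```

Note that ‘place’ is left alone in a query such as “place value chart”, because no other word in that query supplies a matching context. That is the point of looking at the surrounding terms rather than swapping synonyms blindly.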

Slawski continued: “Google’s analysis of different search entities such as the relationships between queries might be identified in some cases as improving searcher satisfaction for search results based upon things such as how long someone might dwell on a page when they select it in a set of search results.”