2013 has gone, and what a year it has been in the SEO industry. Marketers have had to adjust to Pandas and Penguins, and just when we thought there would be a respite, Google Hummingbird came along and shifted the playing field to another level. Experts in the SEO field have widely stated that the Hummingbird algorithm update is just the beginning of many more changes Google will roll out in 2014, one of which is the advancement of machine learning. In this blog post we will discover what machine learning is and the impact it could have on your website.

Image by Koushikchakraborty13

So What Is Machine Learning?

Machine learning is appropriate when there is a problem that needs solving but computers do not have an exact set of rules for producing a solution. A good example is a spammy email or web page. Most marketers can identify a spammy page fairly quickly, by noticing irrelevant keyword stuffing for example. However, if they were then asked to write a series of rules that could be applied to every website in the world, it suddenly becomes much more difficult.

From a marketer's perspective it is also worth considering that, although we might think a web page is spam, some visitors who are not as technically savvy may land on the page and actually find the information they want. This highlights the importance of machine learning: it is needed by systems such as Google to determine the best results for users when the answer is not clear cut.

Supervised Learning

We now know why machine learning is needed by computers and search engines, but how do we teach them? One method programmers use is called supervised learning, described as follows: "Supervised learning is the machine learning task of inferring a function from labelled training data." In layman's terms, this means that as computers have advanced, we can teach them algorithms that allow them to make judgements based on information humans have given them.

Using the example above, it is highly probable that Google's engineers have identified several weighting factors they consider important when ranking websites. Once the factors have been determined and rules have been set, the computer is shown a series of test web pages, some of which are considered spam or low quality and some of which are not. This process is repeated, and the weighting factors modified, until the machine is accurate in judging which pages are high quality and which are low quality or spammy.

To prove the algorithm is working correctly, another test is run using the same set of rules the computer has learnt, but applied to new web pages it has never seen before. Once the algorithm has been proven on a new set of data, it can be tested on a larger number of pages, with any necessary tweaks made until Google's programmers are happy for it to be used on live websites.
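The train-then-verify loop described above can be sketched in a few lines of code. To be clear, everything here is invented for illustration — the features (keyword-stuffing score, ad-to-content ratio, original-content score), the labels, the learning rate and the pages are all made-up toy data, since Google's real ranking signals are not public. The sketch uses a simple perceptron-style weight update: adjust the weighting factors whenever a prediction is wrong, then check the learnt rules still hold on pages the machine has never seen.

```python
def classify(weights, features):
    """1 = spam, 0 = quality, based on a weighted score of the features."""
    score = sum(w * f for w, f in zip(weights, features))
    return 1 if score > 0 else 0

def train(pages, labels, epochs=50, lr=0.1):
    """Repeatedly adjust the weighting factors until predictions match the labels."""
    weights = [0.0] * len(pages[0])
    for _ in range(epochs):
        for features, is_spam in zip(pages, labels):
            error = is_spam - classify(weights, features)  # 0 when the guess is right
            # nudge each weighting factor towards the correct answer
            weights = [w + lr * error * f for w, f in zip(weights, features)]
    return weights

# features per page: [keyword-stuffing, ad-to-content ratio, original-content score]
train_pages  = [[0.9, 0.8, 0.1], [0.8, 0.9, 0.2], [0.1, 0.2, 0.9], [0.2, 0.1, 0.8]]
train_labels = [1, 1, 0, 0]  # 1 = spam, 0 = quality

weights = train(train_pages, train_labels)

# prove the learnt rules on pages the machine has never seen before
unseen_pages  = [[0.7, 0.9, 0.3], [0.2, 0.2, 0.9]]
unseen_labels = [1, 0]
accuracy = sum(classify(weights, f) == y
               for f, y in zip(unseen_pages, unseen_labels)) / len(unseen_labels)
print(f"held-out accuracy: {accuracy:.0%}")
```

The important point is the second stage: accuracy on the training pages alone proves nothing, which is why the held-out pages matter — exactly the "new web pages that it has never seen before" step described above.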

At this point you might be thinking, "That's all well and good, but how does it impact my website? I'm sitting pretty on the first page of search results; I have nothing to worry about." Well, you might want to reconsider.

Google Caffeine, Penguin & Panda

In order to predict what's going to happen in the future, first we must take a trip back to the past. In 2010 Google decided to modify its algorithm because it couldn't update individual PageRank scores independently of the rest of the link graph. Very slow and cumbersome, I know. Fortunately Google decided it needed waking up and rolled out the aptly named Google Caffeine. This allowed Google to run a live index, with websites changing ranking positions daily based on authority links gained or lost, rather than in set updates every week or month.

The Google Caffeine update had a profound impact on search results, as it gave Google the ability to apply machine learning far more quickly and effectively. Although websites were still being ranked by authority, Google now had the capability to spot websites using tactics that could be considered spam and rank them accordingly.

Image by Siddartha Thota

Following the Google Caffeine update we have since seen numerous Google Penguin and Panda updates which have targeted the following aspects of websites:

  • Websites with poor quality content and content farms
  • Websites with high ad-to-content ratios
  • Websites that were considered over-optimised
  • Websites that used keyword stuffing techniques

As you cannot write a set of rules for the above factors that can realistically be applied across all websites, it all but confirms Google is using machine learning to improve its algorithm. It also provides an insight into why there are continuous Google Penguin and Panda updates: whenever the machine (Google) has been taught an improved set of rules, an update to the algorithm can be completed.

A good example of machine learning potentially being put into effect was the Google Penguin 2.1 update, where it was widely believed that websites with over 25% exact-match keyword anchor text would be targeted.
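You can audit your own link profile against that rumoured threshold with a very simple calculation. This is only a rough sketch under illustrative assumptions — the 25% figure was never confirmed by Google, and the helper name and inbound link data below are entirely made up — but it shows the kind of ratio check involved:

```python
def keyword_anchor_ratio(anchors, keyword):
    """Fraction of inbound links whose anchor text exactly matches the keyword."""
    matches = sum(1 for anchor in anchors if anchor.lower() == keyword.lower())
    return matches / len(anchors)

# hypothetical inbound link profile for a site targeting "cheap widgets"
inbound_anchors = [
    "cheap widgets", "cheap widgets", "cheap widgets", "cheap widgets",
    "cheap widgets", "cheap widgets", "example.com", "great read on widgets",
]

ratio = keyword_anchor_ratio(inbound_anchors, "cheap widgets")
print(f"{ratio:.0%} exact-match anchors")  # 6 of 8 links = 75%
if ratio > 0.25:
    print("This profile would fall foul of the rumoured 25% threshold")
```

A real audit would of course pull the anchor list from a backlink tool rather than a hard-coded list, and would look at partial matches as well as exact ones.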

Google Hummingbird – Only The Beginning?

It has been only a couple of months since the Google Hummingbird update was launched, and it is predicted that more machine learning will be tested in the coming year. This will allow Google to continue to update and improve its algorithm with greater speed and accuracy in order to provide end users with a better service. So it might be worth checking your rankings on a far more regular basis!

Can the SEO industry combat machine learning?

Throughout this blog I have discussed machine learning and how Google has used this technique to improve its search results. However, as website owners, I'm sure you will want to know how to avoid being negatively affected by Google updates in the future.

Here are some of the main digital marketing tactics you should take into account in 2014 to combat machine learning.

Content Strategy

When discussing a content strategy, I think my MD Helen summed it up perfectly when she stated, 'I will create the best content on the planet'. The reason is that as machine learning becomes more advanced, it will be easier for Google to spot a website with low quality content, which will have a negative impact on that website's ranking.

However, when discussing content, the first thing you should ask is, 'Will your audience find your content engaging?' Over the last few years people have become better at searching online, so if they visit your website from a search query and you don't provide them with the correct information, your website will not convert, regardless of what Google thinks.

Link Building

I highlighted the fact that Google has been targeting websites with a high percentage of exact-match anchor text links for a reason: Google is cracking down on it and will almost certainly continue to do so. This is not to say you should stop building links, because relevant links do add value to a website. When you are obtaining links for your site, think first: 'If someone clicked on this link and arrived at my site, would they find it useful?' If the answer is no, it is likely Google's algorithm will at some point think the same, and you can expect negative ranking changes.

As technology continues to advance, so will machine learning. This means we can expect Google to update its algorithms more frequently in 2014; consequently, as marketers, we will have to up our game and create the best pieces of content to promote our websites. This will ultimately give us the best opportunity to run successful digital marketing campaigns.

Follow my contributions to the blog to find out more about Digital Strategy, or sign up to the ThoughtShift Guest List, our monthly email, to keep up-to-date on all our blogposts, guides and events.
