2013-12-04

Hyve.com – SEO Backlink Case Study about Penguin 2.1 Victim

This analysis was created using our Superhero Plan and the new Competitive Link Detox tool extensively. The Superhero plan allows you to perform professional SEO and backlink analysis for your own or your competitor’s sites. If you only need Competitive Link Detox (CDTOX), that’s also included in all new Link Detox Pro plans and up.




How to analyse your Backlink Profile… the way Google does

I’m proud to present our third Penguin 2.1 case study, written by our newest Certified LRT Xpert, Derek Devlin, who is the head of digital marketing strategy at Made By Crunch, the 2nd Certified LRT Agency worldwide and the first in the UK!

This case study looks into one of the UK’s leading hosting providers, hyve.com, which was hit hard by Panda in 2012 and then hit again by Penguin in 2013.

We look forward to your feedback and always appreciate you sharing the work of our Certified LRT Professionals.

- Enjoy & Learn!

Christoph C. Cemper

PS: if you really want to dive deep, we also have a 96 page PDF version with five more chapters, which you can download here

Table of Contents

1.0 Organic Visibility

1.1 Hyve.com Observations

1.2 Impact of Penguin 2.0 “Suppression”

1.3 Hummingbird – August 21/22 – Mildly Arousing

1.4 Penguin 2.1 – The Misery Continues

1.5 Organic Visibility Summary

2.0 Competitive Landscape Overview

2.1 Topline Observations of “Cloud Hosting”

2.1.1 Observations from the QDC Analysis

2.1.2 Site-wide Ratio

2.2 Topline Observations of “Cloud Hosting UK”

2.2.1 Observations from the QDC Analysis

2.2.2 Site-wide Ratio

2.3 Top line Comparison Summary

3.0 Assessing the Impact of Disavowed links – Competitor Analysis

3.1 Working with disavow files in CLA

3.1.1 Step One – Upload your disavow:

3.1.2 Step Two – Run the Report:

3.1.3 Step Three – View your Report

3.1.4 Step Four – Use the filter slice function to compare your old / new link graph

3.2 Anchor Text

3.2.1 Anchor Text WITHOUT the Disavow File Used

3.2.2 Anchor Text INCLUDING the Disavowed file

3.3 Link Status (followed versus no-followed)

3.3.1 Link Status WITHOUT the Disavow File Used

3.3.2 Link Status WITH the Disavow File Used

3.4 Link Type (text, image, redirect, Etc)

3.4.1 Link Type WITHOUT the disavow

3.4.2 Link Type WITH the disavow

3.5 Deep Link Ratio

3.5.1 Deep Link Ratio WITHOUT the Disavow

3.5.2 Deep Link Ratio WITH the Disavow

3.6 Geographic Location of Links

3.6.1 Geographic Location of Links WITHOUT the Disavow

3.6.2 Geographic Location of Links WITH the Disavow

3.7 Power*Trust of Links

3.7.1 Power*Trust of Links WITHOUT the Disavow

3.7.2 Power*Trust of Links WITH the Disavow

3.8 Competitor Summary

4.0 Link Risk

4.1 Link Detox Setup

4.1.1 Factoring your Disavow File into Link Detox

4.2 Link Detox Risk

4.3 Providing Context with CDTOX – Hyve.com’s Risk Profile vs. Competitors

4.3.1 Hyve.com’s Risk Profile vs. “Cloud Hosting” Competitors

4.3.2 Hyve.com’s Risk Profile vs. “Cloud Hosting UK” Competitors

4.4 Hyve.com Link Risk after submission of the Disavow File

4.4.1 Links from De-indexed Sites

4.4.2 Links from Weak & Low Quality Sites

4.4.3 Unnatural Linking Practices

4.4.4 Link Networks & Link Diversity

4.4.5 Abandoned Domains

5.0 Round-Trip Disavow – Building An Action Plan

6.0 Link Velocity Trend

7.0 Findings & Action Plan

7.1 Attributes of the Hyve link profile that need to be corrected

7.2 Did the disavow go far enough?

7.3 Should some links be “un-disavowed”?

7.4 Is Hyve.com worth fighting for?

7.4 In practical terms, where does this leave Hyve.com

7.5 Results of taking decisive action

7.6 Future Linking Strategies for Hyve.com

8.0 Thanks

In September 2013, I was commissioned to audit the offsite strategy of Hyve.com.

Hyve have built a reputation as one of the UK’s leading Cloud Hosting providers, delivering mission-critical managed hosting to blue-chip clients such as British Airways, LG and Tesco.

Despite this impressive track record, Hyve were not able to escape the wrath of Google’s second generation Penguin algorithm and, like many other legitimate businesses, their website lost a significant amount of organic traffic from Google, seemingly overnight, following the Penguin 2.0 update of May 22nd 2013.

For companies that rely on leads generated from Google, this type of hit can have devastating consequences, so it was imperative that the situation be resolved.

Hyve’s in-house team acted quickly, taking steps to disavow 674 domains. However, three months later the site’s organic visibility had still not improved and organic visits from Google had flatlined.

It was at this point that Hyve approached me to conduct an offsite audit.

My objectives were to:

Study the impact Penguin had on the site.

Identify hypotheses as to why the site had been hit by contrasting the link graph with top ranking competitors.

Assess the merits of links pointing at the site.

Formulate an action plan for recovery.

Above all, the key question was:

Did the disavow file submitted by Hyve go far enough?

In addition, Hyve senior management wanted to understand the opportunity cost of keeping the existing domain:

Would switching to another domain be a better use of resources?

This case study presents a summary of my findings extracted from the offsite audit.

In addition, I aim to provide you with some tips and strategies to help you get the most out of the advanced features within Link Research Tools.

Before we get into the meat of the analysis, let’s get some background…

1.0 Organic Visibility

After conducting an in-depth analysis of Hyve.com’s organic visibility, I was able to draw the following observations…

1.1 Hyve.com Observations

Hyve was largely unscathed by Google updates in 2011:

Nothing too exciting happened in 2011… traffic was moving up and to the right and the site appeared not to have taken any significant drops in visibility.

Panda hit the site hard in 2012:

2012 was a different story. There was a massive drop followed by a sustained period of low search visibility. This was most likely a big hit from Panda #12 (Point A); however, it is possible that this was a Penguin 1.0 (Point B) precursor.

A lot of sites reported large fluctuations around this time – it’s feasible that Hyve were part of a “test case” segment and that this was actually the cause for the drop. On balance, I would speculate that Panda was most likely to blame here.

March / April 2013 saw a mini-recovery, only for the site to suffer another hit – this time from Penguin 2.0.

From March 2013, the site looked to be improving only to be slapped again, this time by Penguin 2.0. Thereafter, search visibility has remained at an all-time low.

Hyve.com had not been completely de-indexed (184 pages still cited as being in the Google index).

The issues were purely algorithmic because Hyve.com had no manual actions in force.

Considering that the sharp decline in organic visibility coincides with the release date for Penguin 2.0, it’s pretty clear that the site was a victim of the update.

With no official ‘manual’ penalty in play the issue, as with the vast majority of sites that get hit by Penguin, is down to the algorithm.

The Penguin algorithm was designed to catch out sites that utilise “unnatural” linking practices to try and game the search engine. In essence, it looks for commonly used spam techniques, particularly in the way that sites link to one another. We can therefore speculate that Hyve were somehow guilty of spammy or manipulative techniques to try and gain an edge in the rankings.

This explains the loss in visibility coinciding with the update. The site or portions of the site are clearly being suppressed.

1.2 Impact of Penguin 2.0 “Suppression”

Although not as severe as the hit the site took from Panda in 2012, the impact of Penguin 2.0 was still significantly damaging (based on 12 weeks before, versus 12 weeks after the update):

Unique visitors to the site dropped by 12%.

Non-paid visits from Google were down 47%.

Landing pages saw a 22% decrease in visits.

29 pages stopped receiving any traffic at all from Google.

The homepage was the biggest loser, down 50%.

No page made a significant increase in visits since the update.

Branded keyword traffic dropped by 44%.

Visits from unbranded keywords dropped by 48%.
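The before-and-after comparison above boils down to two equal windows either side of the update date and a percentage change. Here is a minimal Python sketch of that arithmetic. It assumes 84-day windows with the update day counted in the “after” window (the audit does not say how the boundary was handled), and the visit totals are made up purely to show the calculation:

```python
from datetime import date, timedelta

PENGUIN_2_0 = date(2013, 5, 22)  # update date quoted in the text


def comparison_windows(update_day, weeks=12):
    """Return ((before_start, before_end), (after_start, after_end)), inclusive.

    Assumption: the update day itself belongs to the "after" window.
    """
    span = timedelta(weeks=weeks)
    one_day = timedelta(days=1)
    before = (update_day - span, update_day - one_day)
    after = (update_day, update_day + span - one_day)
    return before, after


def percent_change(before_total, after_total):
    """Percentage change from before to after, rounded to one decimal place."""
    if before_total == 0:
        raise ValueError("before_total must be non-zero")
    return round(100 * (after_total - before_total) / before_total, 1)


before, after = comparison_windows(PENGUIN_2_0)
print("before:", before[0], "to", before[1])
print("after: ", after[0], "to", after[1])

# Hypothetical visit totals, NOT Hyve.com's real analytics figures.
print(percent_change(1000, 530), "%")
```

Keeping both windows the same length matters: comparing 12 weeks against, say, a calendar quarter would bake a small bias into every percentage.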

The loss in visits correlated with ranking drops from many keywords; “dedicated web hosting”, “secure ftp server” and “cloud hosting” saw the biggest decline.

For brand terms such as “Hyve”, “Hyve Hosting” and “Hyve managed hosting”, the site still ranked in position 1, so it is surprising that visits from brand keywords dropped. The loss in visits here was most likely attributable to factors outwith search rankings.

You could be forgiven for thinking that the decrease in branded and unbranded keyword traffic was just the result of Google increasing the proportion of “not provided” keyword data in Google Analytics.

However, this appears to have not been the case…

“Not provided” traffic to Hyve.com saw the biggest absolute drop in visits, almost halving from Pre-Penguin 2.0 levels.

All in all, the negative impact suffered by Hyve.com is conclusive – the site is a clear Penguin 2.0 victim.

Since then there have been a number of confirmed Google algorithm updates.

1.3 Hummingbird – August 21/22 – Mildly Arousing…

A lot of hype and misinformation surrounded the launch of Google’s new core algorithm – Hummingbird. The impact in terms of ranking changes appears to have been negligible.

Looking specifically at Hyve.com, “Hummingbird” had a mildly positive impact:

Overall traffic increased by 24%.

Non-paid visits from Google took a short-lived boost of 52%.

The top 5 pages that lost visits after Penguin started to regain visits.

All this looks encouraging until you put it into perspective…

Organic visits from Google were still down 30% on pre-Penguin levels and landing pages, with the exception of one, were still pulling fewer visits than before Penguin.

1.4 Penguin 2.1 – The Misery Continues

Penguin 2.1 provided the opportunity for Hyve.com to recover.

Penguin does not work in real-time; affected sites need to wait for a refresh of the Algorithm to gauge whether the site is still under some form of suppression.

Despite the fact that 674 domains had been disavowed, there was no improvement in the site’s visibility around October 4th, the date of Penguin 2.1:

This lack of improvement must mean one of three things:

The suppression still exists – Hyve.com is still considered by Penguin to have too many unnatural links pointing at the site and, as a result, the site is still being filtered. The disavow, therefore, didn’t go far enough.

The disavow file did go far enough, so the suppression no longer exists, but the PageRank lost by disavowing the links from 674 domains has decimated Hyve.com’s ranking potential.

The disavow file was not processed fully or fast enough by Google. LRT have some great news coming up to possibly exclude this possibility in the future with Link Detox Boost. We don’t know about this yet.

Which is the more likely scenario of the first two?

This is the question I intend to answer in due course throughout this investigation.

The fact remains… neither Hummingbird, nor Penguin 2.1 correlated with widespread recovery of rankings for key terms.

However, I did observe something interesting happening with internal pages…

Over time, internal pages appear to have made a number of ranking fluctuations in keyword verticals where the homepage had previously ranked well prior to Penguin 2.0.

It looks clear-cut that the homepage remains suppressed: it has pretty much tanked for all but one query, with no sign of reprieve.

On the plus side though, the internal pages look to still be in the mix as far as competing for the SERPs goes.

When I first analysed the site in September the internal pages had made significant gains in verticals where the homepage had previously ranked well. In November, many of these pages remain placed.

Combining these insights with the knowledge that some internal pages started to gain visits after Hummingbird (one particular internal page that was hit by Penguin is now pulling more visits than pre-Penguin levels), I would speculate that the internal pages are not encumbered in the way that the homepage appears to be.

I believe that it would be worthwhile to try and promote the internal pages to earn positive ranking signals to see how the pages respond. Given that no proactive link building has been carried out in the last 6 months, the visibility of these pages looks encouraging.

1.5 Organic Visibility Summary

Penguin 2.1 presented an opportunity for Hyve.com to recover. Organic visibility failed to return and weighing up all the evidence, I don’t believe that the Penguin ‘flag’ has been lifted from the site.

I am of the opinion though that the algorithmic suppression is not site wide. It appears to be mainly focused on the homepage. Hummingbird did nothing to change this fact but it did have a positive effect on certain parts of the site. It looks like the internal pages have gained the most. One internal page that was hit by Penguin is now bringing in more visits than pre-Penguin levels, which supports the theory that internal pages still have the potential to rank.

The homepage suffered the most from Penguin and still continues to rank poorly for the main terms that the site is targeting, with the exception of “cloud servers uk”, for which Google still gives it a ranking. This term is at the bottom of the 2nd page, so I think that still supports the theory that the filter is page based. It is possible though that it’s working on a keyword level, i.e. suppressing those particular keywords that have been used excessively in link anchor text.

Matt Cutts talks a lot about Google taking ‘targeted action’ and to me it makes sense that Penguin could work on the keyword level. So it is plausible that the keywords where Hyve were overly aggressive with anchor text were suppressed and that the rankings have not returned yet because the anchor text has not been removed to an acceptable level.

It’s common for most of the aggressive anchor text to be pointed at the homepage so it is feasible that there is still a negative ranking factor in place holding down rankings for the main terms and that this is in turn weighing the homepage down.

In the coming sections I intend to look for anomalies in the site and to try and substantiate these claims.

First, let’s look at the competitive landscape within which Hyve is trying to compete.

2.0 Competitive Landscape Overview

I want to assess Hyve.com’s link profile against competition in keyword verticals that the site is trying to rank highly in.

I have therefore selected two keyword verticals to analyse:

Cloud Hosting

Cloud Hosting UK

These keywords also make good candidates for analysis because they saw two of the most significant ranking drops and were two of Hyve.com’s most trafficked terms.

2.1 Topline Observations of “Cloud Hosting”

The top ranking sites in the “Cloud Hosting” keyword vertical on Google.co.uk are:

www.rackspace.co.uk

www.elastichosts.com

www.pulsant.com

www.fasthosts.co.uk

Using the Quick Domain Compare tool we can get an overview of how Hyve.com stacks up against these sites.

2.1.1 Observations from the QDC Analysis

First, it’s important to note that we are viewing Hyve.com as it was prior to links being disavowed. Second, QDC works in ad hoc mode pulling links from the LRT database to provide us with sample metrics.

My first impression is that Hyve has a strong Power & Trust profile, outperformed only by Rackspace and Fasthosts.

Power and Trust are high in equal measure, which is an early indicator that Hyve has some well-trusted, high quality links, so it’s not just bad stuff here.

Sites with significantly lower trust scores have the most toxic link profiles, consisting mainly of weak and low quality links, whereas Hyve clearly have some high authority links present.

Hyve.com has the lowest number of keywords ranking according to the visibility indicator.

Based on what I have seen in the analytics this figure is a bit misleading as Hyve is generating most of their traffic from one or two clicks here and there on a number of long-tail / random phrases.

QDC uses data pulled from SEMrush so they will most likely be tracking the head terms in the web hosting niche rather than the more obscure long-tail phrases. This metric is still an insightful measure of performance because it indicates the site’s ability to compete for the high volume verticals. The higher the number here, the more likely the site will be able to rank for big terms.

532 domains are linking to Hyve, whereas Rackspace and Fasthosts have several thousand domains linking!

Pulsant, ranked third, have 442. This would suggest they are doing something different by having far fewer linking domains but still meriting a place in the top three spots… perhaps a lot of site wide links? We will see.

Breaking the domains down and looking first at unique class C IPs, the number of linking sites drops to 350. It’s normal to see a decrease here, but it’s still worth assessing which of these sites share the same class C block. Do they belong to networks of sites?
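To make the class C check concrete, here is a small Python sketch that groups linking domains by their /24 block (the usual reading of “class C” in SEO tools) and lists the blocks shared by more than one domain. The domain names and IP addresses are placeholders from the reserved documentation ranges, and the input format (a domain-to-IP mapping) is an assumption, not an LRT export format:

```python
import ipaddress
from collections import defaultdict


def group_by_class_c(domain_ips):
    """Map each /24 block to the list of linking domains hosted inside it."""
    blocks = defaultdict(list)
    for domain, ip in domain_ips.items():
        # strict=False lets us pass a host address and get its enclosing /24
        network = ipaddress.ip_network(f"{ip}/24", strict=False)
        blocks[str(network)].append(domain)
    return dict(blocks)


# Hypothetical linking domains and IPs (documentation address ranges).
sample = {
    "blog-a.example": "192.0.2.10",
    "blog-b.example": "192.0.2.77",
    "news.example": "198.51.100.5",
    "forum.example": "203.0.113.9",
}

blocks = group_by_class_c(sample)
shared = {block: names for block, names in blocks.items() if len(names) > 1}

print(len(sample), "linking domains,", len(blocks), "unique class C blocks")
for block, names in shared.items():
    print(block, "is shared by", ", ".join(names))
```

A shared block is only a prompt for a closer look: client sites on one host’s servers and a link network can look identical at this level.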

The number of .edu and .gov links pointing at Rackspace is an eye opener!

85 .edu and .gov links is a lot for a commercial enterprise to have. It would be interesting to look at these and see if they are earned links or manufactured – most likely the latter. Hyve have 5 .gov links, which may go some way to explaining the high trust score.

Total links including site wide links (links in the side bar and footer that appear on many pages) for Hyve.com are 115,962 coming from 532 domains.

This seems quite high – I can do a quick check to see how this compares to the other sites in the cloud-hosting vertical by using the Juice tool.

2.1.2 Site-wide Ratio

Here is the site wide ratio of linking domains versus total linking pages.

We can see that Hyve has the second highest out of the top 4 sites in the cloud hosting vertical, so there is a reasonable degree of site wide links in play here.

Very interestingly, Pulsant, who if you remember are ranked third in this vertical with the lowest number of unique domains, don’t have the highest site wide ratio, so it’s not site wide links that are giving Pulsant the edge to place amongst the top ranked sites.

Hyve.com have a higher than average site wide ratio at 217. Fasthosts stand out with the highest ratio at 265.
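The ratio itself is simple arithmetic. A quick Python sketch, assuming the site wide ratio is total links divided by linking domains (which matches the figures quoted above for Hyve.com):

```python
def site_wide_ratio(total_links, linking_domains):
    """Average number of links per linking domain."""
    if linking_domains <= 0:
        raise ValueError("linking_domains must be positive")
    return total_links / linking_domains


# Figures quoted in the text for Hyve.com: 115,962 links from 532 domains.
hyve_ratio = site_wide_ratio(115_962, 532)
print(f"Hyve.com site wide ratio: {hyve_ratio:.2f}")
```

The result is just under 218, consistent with the 217 shown in the report if the tool drops the decimals (that rounding behaviour is an assumption).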

2.2 Topline Observations of “Cloud Hosting UK”

The top ranking sites in the “Cloud Hosting UK” keyword vertical on Google.co.uk are:

www.rackspace.co.uk

www.nimbushosting.co.uk

www.tsohost.com

www.cloudhosts.co.uk

2.2.1 Observations from the QDC Analysis

Instantly, we can see that “cloud hosting uk” is a far less competitive vertical.

Rackspace are the only company to appear in both verticals, holding the top spot for both “cloud hosting” and “cloud hosting UK”.

Cloud Hosts, who are sitting 4th, have weak Power and Trust indicators and in theory should be outranked by Hyve, all other factors being equal.

Hyve have the third highest number of head keywords ranking. We can surmise that far fewer linking domains are needed to do well in this vertical. Once again there is an anomaly; this time it’s Cloud Hosts with only 17 linking domains! Seems odd!!

Nimbus and TSO have 120 and 194 unique domains linking respectively, still dwarfed by Hyve.com’s 532. Similarly, .edu and .gov domains have not been used to the extent that they were in the “cloud hosting” vertical. Positions 2, 3 and 4 have been secured without the need for these “hard-to-get” domain extensions.

With the exception of Rackspace, I’m struck by how few unique class-C links the competitors have. This is a vertical that’s wide open for a half decent site with a diversified and clean link graph.

2.2.2 Site-wide Ratio

I can look and see the extent to which site wide links are a factor in ranking for “cloud hosting UK” by doing the same check with the juice tool for site wide ratio.

Wow! Nimbus Hosting has a very high ratio of total links versus linking domains.

Site wide links are clearly influencing their ranking. I suspect they are using a “powered by” footer link on their clients’ sites, but I will check this later.

Checking back to the QDC, I see that Nimbus have 1.2 million total links:

Hence the very high site wide ratio. Hyve has the second highest ratio of site wide links, but it doesn’t look so high as to be a concern. Looking at Nimbus, though, it doesn’t seem like Google are seeing this as a negative signal!

Just a side note… It will be interesting to see what anchor text the Nimbus links have… any form of money anchor text would surely have been caught by Penguin?

2.3 Top line Comparison Summary

I’ve taken a high level look at 2 separate keyword verticals that sent traffic to Hyve.com pre-Penguin.

Based on my analysis I have made the following observations about Hyve.com’s position prior to disavowing links:

The site had a strong link profile in terms of the Power and Trust being passed.

Site wide links, although higher than most of the competition, look to be acceptable based on the thresholds being tolerated for Nimbus.

About 1/3 of Hyve.com’s linking domains have duplicate Class C IP addresses, possibly from client sites hosted on Hyve’s own servers? Or is there a network in play here?

Rackspace are the leading authority in both verticals – by a long margin – looking at the metrics it would take a number of years for a new site to even get close to Rackspace.

Cloud Hosting is a competitive vertical, with a high number of unique linking domains and also high authority links required to rank well.

“Cloud Hosting UK” is much less competitive.

I would speculate that it’s possible to rank in the top 4 positions without .gov or .edu links and with between 100 and 200 unique class C links.

3.0 Assessing the Impact of Disavowed links – Competitor Analysis

Now that I have a feel for the competitive landscape, I am going to compare Hyve.com’s link profile in more detail. The objective is to identify anomalies against the top-ranking competition in each keyword vertical.

This has two purposes:

It allows us to identify potential hypotheses as to why Hyve.com saw a decrease in organic search visibility after the Penguin 2.0 update.

It allows us to identify thresholds in the link graph that Google finds acceptable, given that the competing sites are sites that continue to rank well post Penguin 2.0.

This is where things start to get interesting because I am now dealing with two link graphs for Hyve.com.

First, I want to assess how the link graph looked before any links were disavowed. This is effectively the link graph that Penguin demoted, so we should expect to see heavy use of keyword anchor texts with commercial intent and also some spammy linking techniques.

Second, I want to look at the link graph with the disavowed links taken into consideration. This is how Google now sees Hyve.com’s links.

Comparing and contrasting the link profile as it was before versus how it is now should provide a lot of insight. Especially when set against the context of high-ranking competitors.

This should tell me if the disavow has gone far enough in remodelling the site’s link graph.

For this part of the analysis, I will mainly be using the Competitive Landscape Analyzer (CLA) to do the competitive analysis and the Backlink Profiler (to analyse anchor text).

3.1 Working with disavow files in CLA

It is now commonplace for site owners to disavow links to their site in an effort to cleanse their link profile. Quite often this is a reactive process, carried out because the site has experienced an algorithmic demotion or in the most serious cases, a manual action.

There is also a growing segment of site owners who realise they have to manage the risk associated with their links. They take a more proactive approach, disavowing links to stave off any potential penalties that may be lurking around the corner.

Whatever the reason, SEOs, site owners and webmasters should learn how to analyse their site whilst also factoring the disavowed links into the analysis.

There’s little value in running link reports for your site unless you are able to assess the link graph as Google sees it.

This is where Link Research Tools excels. BLP, CLA and DTOX / CDTOX can all factor your disavowed links into the equation, giving you the complete 360 view of your link graph. The process for working with your disavow files is similar across all the tools in LRT.

Here’s how to do it in CLA – follow these 4 steps:

3.1.1 Step One – Upload your disavow:

Navigate to your “settings” folder and upload your disavow file – (do this before running your CLA report).

CLA and BLP differ slightly from Link Detox: DTOX requests your disavow file on the “start report” page, whereas with CLA and BLP you need to have already uploaded the file in the “settings” section.

NB: It’s important to make sure that your disavow file is properly formatted according to the convention set out by Google.
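As a pre-upload sanity check, a short Python sketch like the one below can flag lines that don’t follow Google’s disavow convention: one entry per line, either a full URL or a `domain:` rule, with `#` marking comments. The function name and the sample entries are hypothetical, and the checks are deliberately loose; this is not how LRT or Google validate the file:

```python
from urllib.parse import urlparse


def check_disavow_lines(text):
    """Return (line_number, line) pairs that do not look like valid entries."""
    problems = []
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are fine
        if line.startswith("domain:"):
            host = line[len("domain:"):]
            # a domain rule should be a bare host name: no scheme, path or spaces
            if host and "." in host and "/" not in host and " " not in host:
                continue
        else:
            parsed = urlparse(line)
            if parsed.scheme in ("http", "https") and parsed.netloc:
                continue
        problems.append((number, raw))
    return problems


# Hypothetical file contents for illustration only.
sample = "\n".join([
    "# example comment line",
    "domain:spammy-directory.example",
    "http://blog.example/post-with-paid-link.html",
    "domain:http://bad.example/",
    "www.missing-scheme.example/page",
])

problems = check_disavow_lines(sample)
for number, line in problems:
    print(f"line {number}: check formatting -> {line}")
```

The two flagged lines show the most common slips: a `domain:` rule that still carries the scheme and path, and a URL entry missing `http://`.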

Once the file is uploaded, you will see the file saved on your settings page:

3.1.2 Step Two – Run the Report:

Next, you need to run your CLA report.

Navigate to the “Start Report” page > select the Competitive Landscape Analyzer > configure your report options:

Pro Tips for CLA:

Tip: Use your ‘canonical’ domain

To www. or not? The first decision you need to make is whether to compare the www. version of your domain with your competition or the root domain without the www. version.

My advice here is to use the canonical version of your domain; i.e. whatever version your domain resolves to – your “preferred domain” as Google calls it – you have got a 301 redirect in place to set your canonical version…right? If you don’t – read this.

You’ll notice when you use the “Find Competing Pages” tool it will pull in the top 10 competitors from the keyword vertical and engine you specify:

The tool is pulling your competitors from the Google search results, so it will pull the canonical version of your competitors’ domains, i.e. the version that Google believes to be their preferred domain.

You can see here that cloudhosting.co.uk are the only site to use the root domain as their preferred version; the rest use the www. version:

The result is that you should get a like-for-like comparison and ultimately, a more accurate CLA report.

Tip: Site-wide links filter – Set to 5

It’s logical to presume that Google devalues site wide links to an extent. It’s therefore advisable to set a threshold after which the tool ignores the site wide links. In most cases, setting the site wide link filter to 5 will be appropriate. This is enough to give the site wide links extra credence in the report but not enough to skew your results.

Remember, it can be very worthwhile to run two CLA reports and compare how your link graph looks with site wide links included versus with them set to 5, as you will see in my Hyve analysis coming up. You can glean some actionable insights from using this technique, for example… How do site wide links impact our anchor text profile?

Tip: Choose a descriptive name for your report

This sounds obvious, but start as you mean to go on by making things simple for yourself. When running a lot of reports, it can be very easy to lose track of what report you ran and why, especially if you have to go back and re-trace your steps a month or so later. Take the time to think of the best descriptive name you can to ensure that this doesn’t become an issue for you.

3.1.3 Step Three – View your Report

Once your report is ready to view, you will have a report that shows your link graph as Google is viewing it, that is, with the disavowed links taken into account: the links in the disavow file are ignored at this stage.

3.1.4 Step Four – Use the filter slice function to compare your old / new link graph

A quick way to check that your disavow file was formatted correctly and your disavowed links were activated in the report is to check the stored slices box on the CLA report page.

If your report populated properly, you will find a new “Stored Slice” called “Without Disavowed and Ignored links”:

Selecting this slice will open a new CLA report in another tab of your browser. This new report re-calculates all the metrics and graphs ignoring that you have used a disavow file. This effectively shows you how your link-graph should look to Google if they honor 100% of the disavow commands.

You can confirm that the report you are viewing is “without the disavowed links” because there is a notification showing you the rules used to create the slice:

Viewing your link profile ‘without the disavowed and ignored’ links is very useful for measuring the impact that your disavowed links have had on your link graph. This can be a great technique to use before you formally upload your disavow file to Google or as in the case of my Hyve.com example, you can use this to gauge the effectiveness of the disavowed links after the event. Did the disavow do a good job of remodelling your link graph to blend in with top ranking competitors?
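If you want to reproduce the same before / after split outside the tool, the idea can be sketched in a few lines of Python: parse the disavow rules, then filter a list of backlink URLs against them. The link list and rules below are hypothetical, and treating a `domain:` rule as covering subdomains is an assumption about how the rules are applied:

```python
from urllib.parse import urlparse


def parse_disavow(text):
    """Split a disavow file into a set of domains and a set of exact URLs."""
    domains, urls = set(), set()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("domain:"):
            domains.add(line[len("domain:"):].lower())
        else:
            urls.add(line)
    return domains, urls


def is_disavowed(link_url, domains, urls):
    """True if the linking URL is covered by a URL rule or a domain rule."""
    if link_url in urls:
        return True
    host = (urlparse(link_url).hostname or "").lower()
    # Assumption: a domain rule also covers subdomains such as www.
    return any(host == d or host.endswith("." + d) for d in domains)


# Hypothetical backlinks and disavow rules for illustration only.
links = [
    "http://spam-directory.example/listing/42",
    "http://www.spam-directory.example/listing/43",
    "http://blog.example/paid-post.html",
    "http://blog.example/honest-review.html",
    "http://news.example/hosting-roundup",
]
disavow_text = "\n".join([
    "# hypothetical disavow file",
    "domain:spam-directory.example",
    "http://blog.example/paid-post.html",
])

domains, urls = parse_disavow(disavow_text)
kept = [link for link in links if not is_disavowed(link, domains, urls)]
print(f"before disavow: {len(links)} links, after disavow: {len(kept)} links")
```

Running the same metrics (anchor text, link status, deep link ratio) over `links` and over `kept` gives the two link graphs that the rest of this section compares.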

Let’s jump back to my example case study – Hyve.com… starting with Anchor Text.

3.2 Anchor Text

It has been well documented that excessive use of commercial anchor texts on inbound links is one of the easiest identifiers of a self-engineered link profile since ‘natural’ link profiles do not contain a high density of these “money” keywords. This was one of the issues identified in the LRT Meta Case Study on Penguin victims.

Penguin 1.0 was developed to penalize aggressive use of money keywords as anchor text by webmasters and SEOs. Penguin 2.0 has likely been more aggressive in terms of the threshold tolerance and it makes sense that the algorithm would consider the spread of money keywords across all the pages of the site as well.

It’s therefore important to manage the proportion of money anchor text in the context of your competitive verticals.

First, I’m going to assess Hyve.com’s link graph WITHOUT the disavow file so this is how the link graph looked when Hyve got hit by Penguin…

3.2.1 Anchor Text WITHOUT the Disavow File Used

Here is Hyve.com’s link profile anchor text in totality. This word cloud is a snapshot of the site’s anchor text, first including all site wide links.

Straightaway “cloud hosting” is jumping right out at me, as is “hp and vmware cloud hosting”.

The brand term “hyve” is visible, but I would prefer to see a mix of brand terms as the dominant force in the anchor text cloud.

We can deduce from this that there are clearly site wide links present that say, “cloud hosting”.

Here’s how the keyword cloud actually breaks down:

Links with the exact match anchor text “cloud hosting” account for 24.1% of the site’s backlink profile. This is tied with “powered by hyve vmware cloud hosting” as the most prominent anchor text.

“Powered by hyve vmware cloud hosting” is most likely coming from site wide links on client sites.

Let’s look at the word cloud for Hyve.com’s backlinks with the site wide links restricted to 5.

The reason for doing this is I believe that it would be logical for Google to look at backlinks in totality and also in isolation with some type of threshold. It’s reasonable to think that Google won’t count more than 5 links from any given site when it comes to calculating PageRank but in my view they probably consider anchor text of site wide links differently.

It makes sense to me that if you have a lot of site wide links with exact match anchor text, Google won’t simply ignore this anchor text.
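To make the "site wide links restricted to 5" idea concrete, here is a minimal Python sketch. It assumes each backlink is a simple (linking domain, anchor text) pair; the function names and the sample data are hypothetical and are not taken from the tool used in this study.

```python
from collections import Counter, defaultdict


def cap_sitewide(links, max_per_domain=5):
    """Keep at most max_per_domain links from each linking domain."""
    kept = []
    seen = defaultdict(int)
    for domain, anchor in links:
        if seen[domain] < max_per_domain:
            seen[domain] += 1
            kept.append((domain, anchor))
    return kept


def anchor_share(links):
    """Return {anchor: percentage of all links}, with anchors lower-cased."""
    counts = Counter(anchor.lower() for _, anchor in links)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {a: round(100 * n / total, 1) for a, n in counts.items()}


# Hypothetical data: one client site linking site wide, plus single links.
links = [("client-site.example", "cloud hosting")] * 20 + [
    ("blog-a.example", "hyve"),
    ("blog-b.example", "hyve"),
    ("news.example", "hyve.com"),
    ("forum.example", "web hosting"),
    ("dir.example", "hyve"),
]

full = anchor_share(links)                  # site wide links counted in full
capped = anchor_share(cap_sitewide(links))  # at most 5 links per domain
```

With this made-up data the site wide anchor drops from 80% to 50% of links once capped, while anchors that only appear on single links grow as a share of the total.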

Here’s how the link profile looks with site wide links set to 5:

This is a far more acceptable keyword cloud.

The brand term “hyve” dominates, which is much more natural looking.

The URL of the site is also prevalent. “Cloud hosting” and “web hosting” are still visible, which to my mind means that they are probably still a little top-heavy in terms of their proportion to total links.

Let’s look at the breakdown:

“Cloud Hosting” is still the highest money keyword but has reduced from being 24.1% down to 6.3% of links. “Powered by hyve vmware cloud hosting” has disappeared and therefore must be under 2.9% of links, so these links are indeed mostly site wide links.

As indicated by the tag cloud, “Web hosting” has actually increased as a proportion of total links after restricting the site wide links. It’s the second highest of the money keywords visible, sitting at 3.7%. This term was our 7th biggest loser in terms of lost keyword visibility after Penguin 2.0. It’s also interesting to note that “web hosting” is clearly still suppressed since it’s not ranked in the top 100 positions on Google or Bing for that matter.

On the flip side, what is pleasing about the breakdown is that when we take site wide links out of the equation by reducing them to 5 per site, the brand keywords have become a lot more visible.

What I have learned from all this is that there are a number of site wide links with the exact match anchor text “cloud hosting” that need to be dealt with. It would be much more preferable to change these to “hyve” if that was at all possible.

The overall usage of money keywords is also important. We can assess the extent to which money keywords have been over or under-utilised by comparing the proportion of money anchor text against the top performing competition in the core verticals.
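As a rough illustration of how anchors can be bucketed before comparing proportions, here is a Python sketch. The keyword lists and the substring matching are my own assumptions for illustration only; they are not how any particular tool classifies anchor text.

```python
from collections import Counter

# Assumed keyword lists, for illustration only.
BRAND_TERMS = ("hyve",)
MONEY_TERMS = ("cloud hosting", "web hosting", "cloud servers")

CATEGORIES = ("money", "compound", "brand", "other")


def classify(anchor):
    """Bucket one anchor text as money, compound, brand or other."""
    text = anchor.lower()
    has_brand = any(term in text for term in BRAND_TERMS)
    has_money = any(term in text for term in MONEY_TERMS)
    if has_brand and has_money:
        return "compound"  # brand and money keyword combined
    if has_money:
        return "money"
    if has_brand:
        return "brand"
    return "other"


def category_share(anchors):
    """Return the percentage of anchors that fall in each category."""
    if not anchors:
        return {}
    counts = Counter(classify(a) for a in anchors)
    return {c: round(100 * counts[c] / len(anchors), 1) for c in CATEGORIES}


# Hypothetical sample of anchors.
sample = [
    "cloud hosting",
    "Hyve",
    "powered by hyve vmware cloud hosting",
    "click here",
    "hyve.com",
    "web hosting",
]
shares = category_share(sample)
```

Running the same function over each competitor's anchors gives figures that can be put side by side, either as percentages or as absolute counts.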

Anchor Text Profile Before Disavow Versus the “Cloud Hosting” Competitors

This graph shows the proportion of money anchor text used versus competitors in the Cloud Hosting vertical:

In comparison to the top ranked sites, Hyve clearly overdid it with the money anchor text.

Not enough “brand” or “other” generic anchor text has been used. The big disparity is “branded” keywords; these should represent a much higher proportion of total links.

Here is the same graph just looking at the absolute number of links:

Just look at the gulf in brand links!

This amplifies my previous point about brand anchor texts being far too low, but what is really interesting is that the use of “money” and “compound” anchors, in terms of absolute number of links, is pretty much bang on.

In theory, Hyve could just have built 700 brand links and 80 “other” links and been in the clear from the anchor text thresholds the next time Penguin comes round. Of course, this would not have protected them from being caught if the links are completely toxic. However, from an anchor standpoint we know that these are thresholds that Google is happy to tolerate.
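The dilution arithmetic behind this kind of estimate can be sketched in a few lines of Python. The function name and the example figures are hypothetical; the point is only that adding n non-money links changes the money share from money / total to money / (total + n).

```python
import math
from fractions import Fraction


def non_money_links_needed(money_links, total_links, target_share):
    """Smallest n such that money_links / (total_links + n) <= target_share.

    target_share is a proportion between 0 and 1, e.g. 0.10 for 10%.
    """
    target = Fraction(str(target_share))  # exact arithmetic, no float drift
    if not 0 < target <= 1:
        raise ValueError("target_share must be in the range (0, 1]")
    needed = math.ceil(Fraction(money_links) / target - total_links)
    return max(needed, 0)


# Hypothetical profile: 60 money links out of 250 (24%), target of 10%.
extra = non_money_links_needed(60, 250, 0.10)
```

This only addresses proportions. It says nothing about whether the existing links are toxic.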

Anchor Text Profile Before Disavow Versus the “Cloud Hosting UK” Competitors

It’s the same story in the “Cloud Hosting UK” vertical: way too high a proportion of money keywords used…

Here’s the link profile comparison looking at the percentage of anchor text used:

Again, far too much money anchor text and not enough brand and generic terms used. The use of compound keywords, i.e. a combination of money and brand anchor text, is also a little high in comparison to the top ranking sites for “cloud hosting UK”.

Here’s the same graph looking at the absolute number of links:

It’s a surprise that even in this vertical (cloud hosting UK), where sites in the main have far fewer links, Hyve still has far fewer brand links.

This was clearly one of the main flaws with the original link building strategy.

That being said, a disavow file has already been submitted so the graphs we have been looking at represent the way the link profile used to look.

I now want to look at the link profile from the perspective of anchor text, but with the disavow file taken into consideration. We will now be ignoring the links that were disavowed, just as Google would. This is effectively what the link graph looks like now.

How did the anchor text profile of Hyve change once Hyve disavowed 674 domains? Let’s see…
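For readers who want to reproduce the "ignore the disavowed links" step on their own data, here is a minimal Python sketch. It relies on the standard disavow file layout (one entry per line, either a full URL or a `domain:` entry, with `#` for comments). Treating subdomains as covered by a `domain:` entry is an assumption of this sketch, and the function names and sample URLs are hypothetical.

```python
from urllib.parse import urlparse


def parse_disavow(text):
    """Split a disavow file into a set of domains and a set of exact URLs."""
    domains, urls = set(), set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.lower().startswith("domain:"):
            domains.add(line[len("domain:"):].strip().lower())
        else:
            urls.add(line)
    return domains, urls


def is_disavowed(link_url, domains, urls):
    """True if the linking page is covered by the disavow file."""
    if link_url in urls:
        return True
    host = (urlparse(link_url).hostname or "").lower()
    # Assumption: a domain: entry also covers subdomains of that domain.
    return any(host == d or host.endswith("." + d) for d in domains)


# Hypothetical disavow file and backlink list.
disavow_text = """# spammy directory
domain:spam.example
http://blog.example/bad-page.html
"""
backlinks = [
    "http://spam.example/a",
    "http://www.spam.example/b",
    "http://blog.example/bad-page.html",
    "http://blog.example/good.html",
    "http://notspam.example/c",
]

domains, urls = parse_disavow(disavow_text)
remaining = [u for u in backlinks if not is_disavowed(u, domains, urls)]
```

The remaining links are the ones that feed the anchor text breakdown once the disavow file is honoured.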

3.2.2 Anchor Text INCLUDING the Disavowed file

Here’s the anchor text displayed as a word cloud ignoring the disavowed links just as Google would.

The site wide filter is off for this report so any site wide links that were not disavowed are included here:

“Powered by hyve vmware cloud hosting platform” and “cloud hosting” still dominate the anchor text for the site.

Site wide links with the anchor text “cloud hosting” have not been dealt with sufficiently by the disavow file.

This can be confirmed by looking at the breakdown:

Wow! Following disavow, the proportion of links with the anchor text “cloud hosting” has actually increased!

“Cloud servers” is prominent also; it would be preferable to see only brand terms here in this breakdown. Also, there are no URL links, which are what you expect to see in natural link profiles.

Let’s take a look at how the link graph looks with site wide links restricted to 5 links per site – still with the disavowed links ignored:

The brand remains the main term, which is very positive. “Cloud Hosting” is still there but I can now see a lot of URL links as well, which is an improvement.

Here’s the breakdown:

“Cloud Hosting” has diluted a little from 6.3% to 4.3%, which is probably still too high for one single term but I can check this against competition later.

Changing some of the site wide links to “hyve” would fix this.

Anchor Text After Disavow Versus the “Cloud Hosting” Competitors

I can compare the use of money anchor text as a whole for Hyve versus the top ranking sites for “Cloud Hosting”. Continuing to honour the disavow file (ignoring those links), here is what the overall anchor text profile looks like:

As before, the share of “brand” keywords is substantially lower, but we already knew this.

Pleasingly, money and compound anchor text is now completely in line with the competitors, as a proportion of links.

Adding brand links will dilute this but that in my view is not a negative because it’s better to be in a position where you can add exact match anchors rather than having to take them away.

Looking at the absolute number of links remaining against the Cloud Hosting competitors we get a better perspective on the gulf in links.

Hyve is at least 800 to 1000 domains short of being able to compete with the top 3 sites in the “cloud hosting” niche!

CLA works on an ad-hoc basis, so we can only look at a snapshot of data at any one point in time. No backlink tool can feasibly index the whole web, so it’s likely that the gulf in link volume is even bigger than we are seeing here.

Anchor Text After Disavow Versus the “Cloud Hosting UK” Competitors

Here’s the absolute link graph versus the Cloud Hosting UK competitors.

Finally, this graph confirms it again: even after the disavow file has been used, there are too few brand links, and this needs to be the priority for Hyve.com going forward.

In comparison to the highest ranked sites in the “Cloud Hosting UK” vertical the site is 300 to 400 links away from being able to compete for the top three positions.

To give you an idea of the type of anchor text profile Google is favouring, I have profiled the top 3 ranking sites in the Cloud Hosting UK niche in Appendices One, Two and Three. For comparison, Hyve’s anchor text profile is in Appendix four.

What you will see when you look at these profiles is just how little money anchor text is needed to rank well for big terms.

Especially of note is Nimbus. Remember they had a crazily high number of site wide links?

As I suspected, not a single one is a money keyword. Hence, they have flown under the Penguin radar unimpeded. This would suggest that a site wide link on its own is not inherently bad; it’s the anchor text placement that is the issue with these links.

3.3 Link Status (followed versus no-followed)

A very high proportion of no-followed links is synonymous with SPAM tactics since a lot of automation software places links in volume on locations that are usually nofollow, such as blog comments.

Whilst Matt Cutts says that a site can’t be harmed by no-follow links, I still believe that it’s important to make sure that they fall within an acceptable range versus your competitors.

3.3.1 Link Status WITHOUT the Disavow File Used

Let’s now look at how the link profile looked when Hyve got hit.

With site wide links included, followed versus no-followed links is a 59% / 39% split.

With site wide links set to 5, followed links decrease to 50% and no-followed links increase to 47%.
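The followed versus no-followed split, with and without the site wide cap, can be sketched in Python as follows. It assumes each link is a (linking domain, is_nofollow) pair; the names and sample data are hypothetical, and any other link statuses a tool may report are left out for simplicity.

```python
from collections import Counter, defaultdict


def status_split(links, max_per_domain=None):
    """Return the follow / nofollow percentages, optionally capping each domain."""
    seen = defaultdict(int)
    counts = Counter()
    for domain, is_nofollow in links:
        if max_per_domain is not None and seen[domain] >= max_per_domain:
            continue  # ignore links beyond the per-domain cap
        seen[domain] += 1
        counts["nofollow" if is_nofollow else "follow"] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {k: round(100 * counts[k] / total, 1) for k in ("follow", "nofollow")}


# Hypothetical data: one followed site wide link (20 pages),
# five followed single links and five nofollow blog comments.
links = (
    [("client-site.example", False)] * 20
    + [(f"site-{i}.example", False) for i in range(5)]
    + [(f"blog-{i}.example", True) for i in range(5)]
)

uncapped = status_split(links)
capped = status_split(links, max_per_domain=5)
```

In this made-up example the site wide links are followed, so capping them lowers the followed share, which is the same direction of movement seen in the figures above.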

3.3.2 Link Status WITH the Disavow File Used

Taking into consideration the links that have been disavowed (viewing the link graph as it is now) and with site wide links included we have a 50% followed to 48% nofollowed ratio:

Setting site wide links to 5 so as to try and mitigate the chances of them skewing our data, increases the proportion of nofollow links to 58.9% leaving followed in the minority at 37.2%:

This analysis may seem a little confusing, but what it is essentially telling us is that although there may have been a small number of no-follow links in the disavow file, they have not been disavowed in any great volume.
