Using Data Visualization to Analyze SEO Performance

 

Organic search is almost always one of the top sources of traffic to a website, so if you want to understand traffic acquisition, you need to understand the performance of this marketing channel.

The volume of organic search traffic your website gets still depends, to a large extent, on the volume and quality of its back links.

Though link building is a by-product of content marketing, analyzing the back link profile of a website is still important for understanding overall SEO performance.

For example, if the majority of back links are coming from directories and article websites, you know that old-school link building took place, or that the link builder is still engaged in outdated and dangerous link building practices.

But scanning each and every back link of a website is not practical. You need a tool through which you can visualize the overall back link profile of a website, something like the one below:

back-link-profile

From the chart above, we can conclude that the majority of back links are from blogs, which is a sign of a healthy back link profile.

Now if we look at the back link profile of another website, we can see that the majority of back links are from directories, which means the link builder/content marketer is doing a crappy job:

back-link-profile2

There are a lot of back link analysis tools available in the market, but they don’t do a very good job of visualizing the back link data. Most of them just give you a very long list of back links to analyse.

Now, when I am trying to understand SEO performance, I am not going to analyse each and every back link. I want an X-ray report of a website’s back link profile which I can show to the client and say:

here is why your organic traffic is so low or

here is why your organic traffic is less than that of your competitor or

here is why your link builder/content marketer needs to be fired

No back link analysis tool on the market currently provides this type of functionality.

Imagine how quickly you could assess the performance of any link building / content marketing campaign if you could visualize the back link data.

This is the power of data visualisation.

Since I couldn’t find any such tool, I built my own. My tool can automatically segment the back link profile of any website into the following types of links:

1. Article links – These are mainly the back links from article directories.
2. Blog links – These are the back links from blogs.
3. Directory links – These are the back links from directory websites.
4. Edu links – These are the back links from educational institutions, i.e. .edu websites.
5. Forum links – These are the back links from forums.
6. Gov links – These are the back links from government websites.
7. News links – These are the back links from news websites like the BBC, the Guardian, the New York Times, etc.
8. Non-English links – These are the back links from non-English websites, such as Japanese, Chinese or Russian websites.
9. PR links – These are the back links from press release websites.
10. Social media links – These are the back links from social media sites like Twitter, Facebook, LinkedIn, etc.
11. Wiki links – These are the back links from wiki sites like Wikipedia.

When you run this tool for a website, it automatically segments the website’s back links in a meaningful way. Instead of getting a long list of, say, 5k or 10k back links, you get several reports, and each report contains a list of a particular category of links (like blog links).
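The segmentation idea can be sketched in code. Here is a minimal Python sketch that buckets each back link by matching keywords against its URL; the category keywords below are illustrative, not the tool's exact rules:

```python
# Minimal sketch of the segmentation idea: assign each back link URL
# to the first category whose keyword appears in it. The keyword lists
# are illustrative; the actual tool uses more elaborate SQL conditions.

CATEGORY_KEYWORDS = [
    ("Articles", ["article"]),
    ("Blogs", ["blog"]),
    ("Directories", ["directory"]),
    ("Edu", [".edu", ".ac."]),
    ("Forums", ["forum"]),
    ("Gov", [".gov"]),
    ("News", ["news", "bbc", "guardian", "telegraph"]),
    ("PR", ["pressrelease", "wire"]),
    ("Wiki", ["wiki"]),
]

def segment(urls):
    """Group back link URLs into category buckets."""
    buckets = {name: [] for name, _ in CATEGORY_KEYWORDS}
    buckets["Other"] = []
    for url in urls:
        u = url.lower()
        for name, keywords in CATEGORY_KEYWORDS:
            if any(k in u for k in keywords):
                buckets[name].append(url)
                break
        else:
            buckets["Other"].append(url)
    return buckets

profile = segment([
    "http://seoblog.example.com/post/1",
    "http://www.webdirectory.example.net/listing",
    "http://news.example.org/story",
])
```

Each bucket then becomes one report, and the bucket sizes feed the pie chart.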

If you are using Open Site Explorer to get the back links of a website, then you can also see metrics like PA (Page Authority), DA (Domain Authority), number of root domains, etc. in the reports.

 

How I developed this tool

For ages I looked for a tool like this. The functionality of the tool was clear to me, but not the logic. I wrestled with Excel for weeks to create this tool, as we analysts generally don’t tend to think beyond Excel spreadsheets.

Then one day my brainy, wonderful wife saw me struggling with the tool and suggested that I develop the functionality using a database.

So I brushed up my MS Access skills and started working on it. It took me a couple of hours to develop the functionality I wanted so badly, and man, it was so easy. It also highlighted some serious limitations of MS Excel:

First of all, Excel is not designed to manage large amounts of information, and because of this it is very easy to introduce errors (especially duplication problems) into a spreadsheet, which then makes analysis pretty challenging.

And we analysts deal with a lot of data. I analyse anywhere from 40k to 50k back links in one go.

Excel is a good tool if you want to retrieve the data in one way. But this is not generally the case. We often need data which satisfies multiple conditions, and Excel falls flat there.

You can’t run complex queries in Excel without using even more complex formulas.

Because of the two-dimensional design of a spreadsheet (data stored in rows and columns), you can’t easily analyze the multi-dimensional nature of the data.

I don’t claim that you can’t develop my tool using Excel.

“You can drive nails with a crescent wrench, but that doesn’t make it a hammer.” 🙂

Excel is simply not designed for data management, yet we use it for everything from back link analysis to keeping records of our clients.

If you are primarily dealing with numeric calculations, then use Excel. For database management, especially relational data, use MS Access.

 

How my tool works

The tool works on some simple and some complex SQL queries.

The logic is based on boolean algebra. Once you understand how the tool works, you can customize it to suit your requirements.

So, for example, if I want to filter out article sites from a list of websites, I will use the following boolean expressions:

Statement 1 = (URL contains ‘article’ AND NOT blog AND NOT directory AND NOT …)

Statement 2 = (Title contains ‘article’ AND URL NOT blog AND NOT directory …)

Statement 3 = Statement 1 OR Statement 2

Statement 1 means retrieve those records where the websites contain the word ‘article’ somewhere in the URL but do not contain the word ‘blog’, the word ‘directory’, the word ‘wiki’……..

When this statement is evaluated the value returned will be either True or False.

Statement 2 means retrieve those records where the websites contain the word ‘article’ somewhere in the Title tag but do not contain the word ‘blog’, the word ‘directory’, the word ‘wiki’…… in the URL.

When this statement is evaluated the value returned will be either True or False.

Statement 3 means a record is retrieved if either Statement 1 or Statement 2 evaluates to true. If both statements evaluate to false, the record is not retrieved.

So basically I am looking for all those websites which contain the word ‘article’ either in the URL or in the title tag but which do not contain words like ‘blog’, ‘directory’, ‘wiki’ etc.

I used the same logic to develop SQL queries to filter out the different categories of back links. For example, the following is the SQL query that retrieves all the back links from article websites:

SELECT [Site Back links].URL, [Site Back links].Title, [Site Back links].[Anchor Text],
       [Site Back links].[Page Authority], [Site Back links].[Domain Authority],
       [Site Back links].[Number of Links], [Site Back links].[Number of Domains Linking to Domain],
       [Site Back links].Followable, [Site Back links].[301], [Site Back links].Origin,
       [Site Back links].[Target URL]

FROM [Site Back links]

WHERE (([Site Back links].URL Like "*article*"
        And [Site Back links].URL Not Like "*blog*"
        And [Site Back links].URL Not Like "*directory*"
        And [Site Back links].URL Not Like "*.edu*"
        And [Site Back links].URL Not Like "*.ac.*"
        And [Site Back links].URL Not Like "*.cc./*"
        And [Site Back links].URL Not Like "*.tec./*"
        And [Site Back links].URL Not Like "*forum*"
        And [Site Back links].URL Not Like "*.gov*"
        And [Site Back links].URL Not Like "*.nic*"
        And [Site Back links].URL Not Like "*News*"
        And [Site Back links].URL Not Like "*telegraph*"
        And [Site Back links].URL Not Like "*dailymail*"
        And [Site Back links].URL Not Like "*bbc*"
        And [Site Back links].URL Not Like "*guardian*"
        And [Site Back links].URL Not Like "*ft.com*"
        And [Site Back links].URL Not Like "*times*"
        And [Site Back links].URL Not Like "*pressrelease*"
        And [Site Back links].URL Not Like "*wire*")
    OR ([Site Back links].URL Not Like "*blog*"
        And [Site Back links].URL Not Like "*directory*"
        And [Site Back links].URL Not Like "*.edu*"
        And [Site Back links].URL Not Like "*.ac.*"
        And [Site Back links].URL Not Like "*.cc./*"
        And [Site Back links].URL Not Like "*.tec./*"
        And [Site Back links].URL Not Like "*forum*"
        And [Site Back links].URL Not Like "*.gov*"
        And [Site Back links].URL Not Like "*.nic*"
        And [Site Back links].URL Not Like "*News*"
        And [Site Back links].URL Not Like "*telegraph*"
        And [Site Back links].URL Not Like "*dailymail*"
        And [Site Back links].URL Not Like "*bbc*"
        And [Site Back links].URL Not Like "*guardian*"
        And [Site Back links].URL Not Like "*ft.com*"
        And [Site Back links].URL Not Like "*times*"
        And [Site Back links].URL Not Like "*pressrelease*"
        And [Site Back links].URL Not Like "*wire*"
        And [Site Back links].Title Like "*article*"));

Note: Don’t try to write SQL queries manually. Instead use the design view to write SQL queries.

MS Access has three views: Datasheet View, Design View and SQL View. Right-click on a tab in MS Access to see these views:

datasheet, design sql view

Here is how these views look:

three views

Use the design view to write SQL queries.
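If you want to experiment with the same query pattern outside Access, any SQL engine will do. Here is a minimal sketch using Python's built-in sqlite3 module; the table name, columns and abbreviated exclusion list are illustrative, and note that SQLite's LIKE uses % as the wildcard where Access uses *:

```python
import sqlite3

# In-memory database with a simplified back links table (illustrative schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE backlinks (url TEXT, title TEXT)")
conn.executemany("INSERT INTO backlinks VALUES (?, ?)", [
    ("http://www.articlebase.example.com/seo", "SEO tips"),
    ("http://example.com/page", "Great article on SEO"),
    ("http://myblog.example.com/article-roundup", "Article roundup"),
])

# Same OR-of-two-statements pattern as the Access query, with an
# abbreviated exclusion list. SQLite's LIKE wildcard is %, not *.
rows = conn.execute("""
    SELECT url FROM backlinks
    WHERE (url LIKE '%article%'
           AND url NOT LIKE '%blog%'
           AND url NOT LIKE '%directory%'
           AND url NOT LIKE '%forum%')
       OR (title LIKE '%article%'
           AND url NOT LIKE '%blog%'
           AND url NOT LIKE '%directory%'
           AND url NOT LIKE '%forum%')
""").fetchall()
```

This is handy for sanity-checking the filtering conditions before rebuilding them in the Access Design View.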

 

Requirements for using this tool

#1 You need access to a tool like Open Site Explorer, or another similar tool, which can fetch the back links of a website.

#2 You need MS Access installed on your computer. I developed this tool using MS Access 2013 and have not tested its compatibility with other versions of Access.

 

Requirements for customizing / fixing this tool

You may need to customize this tool to meet your specific requirements, or you may find that it does not work for you out of the box. I designed the database tables with Open Site Explorer data in mind.

If you are using another back link analysis tool (Majestic, Ahrefs, etc.), then you may need to change the table design in order to import and retrieve the data correctly.

However, no matter which back link analysis tool you use, the underlying logic behind the SQL queries remains the same. To customize the tool, you need a basic understanding of SQL and of how MS Access works and how its databases are structured.

 

Instructions for using this tool

Step-1: Download all the back links of your website from Open Site Explorer, or another similar tool, into Excel.

Step-2: Remove duplicate domains using the LEFT and SEARCH functions, for example: =LEFT(B2,SEARCH(“/”,B2,1)). You can choose not to remove duplicate domains. I remove them because I want to see only one link from a particular domain.
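The same de-duplication step can be sketched outside Excel. A minimal Python version, assuming each URL includes the scheme; it uses urllib.parse to extract the domain rather than searching for the first ‘/’:

```python
from urllib.parse import urlparse

def dedupe_by_domain(urls):
    """Keep only the first back link seen from each domain."""
    seen, kept = set(), []
    for url in urls:
        domain = urlparse(url).netloc.lower()
        if domain not in seen:
            seen.add(domain)
            kept.append(url)
    return kept

unique = dedupe_by_domain([
    "http://example.com/page-1",
    "http://example.com/page-2",
    "http://other.example.org/post",
])
```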

Step-3: Download the database file named OptimizeSmart-backLinkAnalysisTool-V5.accdb, unzip it, and then double-click on it to open it.

Step-4: Right-click on the table named ‘Site Back Links:Table‘ and then select ‘Import’ > ‘Excel’ from the drop-down menu:

import excel

Step-5: Browse to the Excel spreadsheet you want to import, keep all other settings intact, and then click the ‘OK’ button as shown below:

get external data

Step-6: In the next step, make sure that the ‘First Row Contains Column Headings‘ check box is checked. Then click on the ‘next’ button as shown below:

first row contains

Step-7: In the next step, keep the default settings intact and again click on the ‘next’ button as shown below:

another step

Step-8: In the next step, make sure that the ‘Let Access add primary key‘ checkbox is checked. Then again click on the next button as shown below:

let access add

Step-9: In the next and final step, change the name of the table to ‘Site Back Links’ in the ‘Import to Table‘ text box as shown below:

site back links

Make sure that you change the table name, otherwise this tool won’t work.

Step-10: Click on the ‘Finish‘ button.  You will be asked to “overwrite existing table or query ‘Site Back Links‘ “. Click on the ‘yes’ button and then click on the ‘close’ button to close the ‘Import Spreadsheet Wizard’ dialog box.

Now you will see a screen like the one below:

starting screen

Step-11: Double click on the table ‘Site Back Links‘ (on the left hand side) to open it. Similarly, double click on each report (like ‘Articles Links’, ‘Blog Links’, ‘Directory Links’ etc) to open them one by one:

double click to open

Note: Depending upon the format in which you imported the Excel data into MS Access, you may see an ‘Enter Parameter Value‘ box as shown below:

enter parameter value

Keep clicking the ‘OK’ button until the box goes away. If you don’t see any data in the Datasheet View, you may need to go to the Design View to fix the problem. There is also a possibility that no data exists for a particular report.

For example, if your website has no back links from directory websites, then your ‘Directory Links’ report will not show any data.

Once you have opened all the query reports, the screen will look like the one below:

double click chart

Step-12: Double click on the ‘chart’ object to see the back link profile of your website:

back-link-profile

Once you can see the chart, you can also click on a tab to see a particular category of back links (like social media links).

Since MS Access is designed to handle large volumes of data, you can analyse even 1 million+ back links in one go.

Now you can get a much clearer picture of which categories of links are driving organic search traffic.


This tool is free to use, and I would have happily shared it via a direct link, but the database file size is very big even after compression (930 KB compressed, 25 MB uncompressed).


Quick announcement about my new book on Web Analytics and Conversion Optimization

The role of maths and statistics in the world of web analytics is not clear to many marketers and web analysts. Not many talk or write about the use of statistics and data science in conversion optimization. That is why I have written the book Maths and Stats for Web Analytics and Conversion Optimization, to fill this knowledge gap.

This expert guide will teach you how to leverage maths and statistics to accurately interpret web analytics data and make business and marketing decisions that can quickly improve the bottom line of your online business. Read the first few chapters of this book for free here.


 

Himanshu Sharma

Certified web analyst and founder of OptimizeSmart.com

My name is Himanshu Sharma and I help businesses in finding and fixing their Google Analytics and conversion issues.
  • More than ten years' experience in SEO, PPC and web analytics
  • Certified web analyst (master level) from MarketMotive.com
  • Google Analytics certified
  • Google AdWords certified
  • Nominated for Digital Analytics Association Award for Excellence
  • Bachelor's degree in Internet Science
  • Founder of OptimizeSmart.com and EventEducation.com
I am also the author of the book Maths and Stats for Web Analytics and Conversion Optimization. If you have any questions or comments, please contact me.