# Bare Minimum Statistics for Web Analytics

The role of statistics in web analytics is not clear to many marketers. Few people talk or write about the use of statistics and data science in web analytics.

Unfortunately, the analytics industry is still largely dominated by data collection methodologies and tools. We are all obsessed with collecting more data, and many different types of data. But we rarely focus on analysing and interpreting the data we already have.

Someone learns a new hack for collecting a particular type of data and then blogs about it in the name of analytics. Then there are people who talk about Excel hacks in the name of analytics. But neither Excel hacks nor data collection tips and tricks will improve your business bottom line.

What will really improve your business bottom line is the accurate interpretation of the data and the actions you take on the basis of that interpretation.

Only by applying a knowledge of statistics and understanding the context can you accurately interpret data and take actions that improve your business bottom line.

I have spent an awful lot of time reading books and articles on stats and data science in the hope of finding something that might help me in my career. And I must admit that the majority of the topics I read about did not, at first, seem to have anything to do directly with my job.

This could be one reason why statistics is not taken seriously in the internet marketing industry. But overall, stats knowledge has improved my interpretation of data. I even have a section on this blog dedicated to the use of statistics in web analytics, called Analytics Maths & Stats.

I am constantly looking for new ways to implement statistics in web analytics.

This article covers the bare minimum of statistics that I think every internet marketer should be familiar with in order to get the best results from their analysis and campaigns.

I will explain some of the most useful stats terms and concepts one by one and show their practical use in web analytics, so that you can take advantage of them straight away.

## Statistical inference

Statistical inference is the process of drawing conclusions from data that is subject to random variation. Dealing with observational error is one example of where statistical inference is needed.

Practical use in web analytics

For example, consider the performance of three campaigns A, B and C over the last month.

Here campaign ‘B’ seems to have the highest conversion rate. Does that mean campaign B is performing better than campaign A and campaign C?

The answer is that we don’t know for sure. We are assuming that campaign B has the highest conversion rate only on the basis of our observation. If there is an observational error, our assumption could be wrong.

Observational error is the difference between the collected data and the actual data.

In order to minimize observational error, we need to break the ecommerce conversion rate down into visits and transactions:

Now we can see that campaign B’s high conversion rate is not reliable, because its sample size is too small. More about sample size later.

## Population and Sub-Population

Population – the set of entities from which statistical inferences are drawn. Also known as the statistical population.

Sub-population – a subset of the population.

Practical use in web analytics

If you consider ‘Campaign C’ above to be a PPC campaign, then its ad groups can be considered sub-populations. In order to understand the properties of a statistical population, statisticians first try to understand the properties of its individual sub-populations.

For the same reason, analysts recommend segmenting the data. So if you want to understand the performance of ‘Campaign C’, you should first try to understand the performance of its individual ad groups.

Similarly, if you want to understand the performance of individual ad groups, you should first try to understand the performance of the individual keywords and ad copies in each ad group.

## Sample, Bad Sample and Sample Size

Sample – a subset of the population which represents the entire population. Analysing the sample should therefore produce similar results to analysing the whole population. Sampling is carried out to analyse large data sets in a reasonable amount of time and in a cost-efficient manner.

Bad Sample – a subset of the population which doesn’t represent the entire population. Analysing a bad sample will not produce similar results to analysing the whole population.

Sample size – the size of the sample. The larger the sample size, the more reliable the analysis.

Practical use in web analytics

Consider the following three campaigns:

Here campaign B’s conversion rate cannot be trusted as the highest because its sample size is too small: just 4 transactions out of 20 visits. Had campaign B got 1 transaction out of 1 visit, its conversion rate would be 100%. Would that make its performance even better? No.
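To see how little 4 transactions out of 20 visits actually tells us, here is a minimal sketch that puts a confidence interval around a conversion rate. It uses the Wilson score interval, which is my choice for illustration and not something the tools mentioned in this article necessarily use. The function name `wilson_interval` and the second, larger campaign (400 transactions out of 2,000 visits) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(conversions, visits, confidence=0.95):
    """Return the (low, high) Wilson score interval for a conversion rate."""
    if visits <= 0:
        raise ValueError("visits must be positive")
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = conversions / visits
    denom = 1 + z ** 2 / visits
    centre = (p + z ** 2 / (2 * visits)) / denom
    half_width = z * sqrt(p * (1 - p) / visits + z ** 2 / (4 * visits ** 2)) / denom
    return centre - half_width, centre + half_width


# Campaign B from the text: 4 transactions out of 20 visits (20% observed)
small = wilson_interval(4, 20)
# Hypothetical campaign with the same 20% observed rate but 100x the traffic
large = wilson_interval(400, 2000)

print(f"20 visits:   {small[0]:.1%} to {small[1]:.1%}")
print(f"2000 visits: {large[0]:.1%} to {large[1]:.1%}")
```

Both campaigns show the same observed conversion rate, but the interval for the 20-visit campaign is many times wider, which is another way of saying that its conversion rate is not reliable yet.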

Google Analytics is notorious for its data sampling issues. When you have data sampling issues, the reported data/metrics can be anywhere from 10% to 80% off the mark, because the sample selected by GA for its analysis may be a bad sample (one that doesn’t represent the entire population/traffic on your site).

So you need to avoid data sampling issues as much as possible before you interpret your data.

## Statistical significance

Statistical significance means statistically meaningful.

Statistically significant result – a result which is unlikely to have occurred by chance.

Statistically insignificant result – a result which is likely to have occurred by chance.

Practical use in web analytics

The term ‘statistical significance’ is used a lot in conversion optimization and especially in A/B testing. If the result of your A/B test is not statistically significant, then any uplift you see in your A/B test results is unlikely to translate into increased sales.

Another example:

Consider the following campaigns:

Here, statistical significance refers to the significance of the difference in conversion rates between the two campaigns ‘A’ and ‘C’. It is calculated by conducting a statistical test such as a ‘T’ test or a ‘Z’ test.

You can use this bookmarklet (based on ‘Z’ test) or this chrome extension from Lunametrics (based on ‘T’ test) to calculate the statistical significance in Google Analytics.

In this case the statistical significance turned out to be 98%. This means that we are 98% confident that the difference in conversion rates between the two campaigns ‘A’ and ‘C’ is not due to chance.

That means the conversion rate of campaign ‘A’ is actually higher than the conversion rate of campaign ‘C’, and the difference is not just an observational error.
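If you want to see roughly what a ‘Z’ test on two conversion rates does, here is a minimal sketch of a two-proportion z-test with a pooled standard error. It is not the code behind the bookmarklet or the Chrome extension mentioned above, and the visit and transaction counts are hypothetical, since they are made up purely for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, visits_a, conv_b, visits_b):
    """Return (z, two-sided p-value) for the difference between two conversion rates."""
    rate_a = conv_a / visits_a
    rate_b = conv_b / visits_b
    # Pooled rate under the null hypothesis that both campaigns convert equally
    pooled = (conv_a + conv_b) / (visits_a + visits_b)
    std_error = sqrt(pooled * (1 - pooled) * (1 / visits_a + 1 / visits_b))
    z = (rate_a - rate_b) / std_error
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical numbers: campaign A converts at 12%, campaign C at 9%
z, p_value = two_proportion_z_test(120, 1000, 450, 5000)
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"Reported as 'significance': {1 - p_value:.1%}")
```

Many tools report 1 minus the p-value as a percentage, which I assume is where a figure like ‘98%’ comes from. A p-value below 0.05 corresponds to a ‘significance’ above 95%.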

## Effect, Effect Size and Noise

Effect – in statistics effect is the result of something.

Effect Size (or Signal) - it is the magnitude of the result and is calculated as:

Examples of effect size – sales, orders, leads, profit, etc.

Noise – the amount of unexplained variation/randomness in a sample.

Confidence (or Statistical Confidence) – the confidence that the result has not occurred by chance.

Practical use in web analytics

What this means is that if your A/B test reports an uplift of 5% in conversion rate, it doesn’t automatically result in an actual uplift of 5% in conversion rate.

Had increasing the conversion rate been that easy, every website owner running A/B tests would be a millionaire by now. So you need to calculate the effect size.

Consider the following campaigns:

From the table above, you can conclude that the effect size (revenue) of campaign C is much higher than the effect size of campaign A.

So even though we are now statistically confident that campaign A has a higher conversion rate than campaign C, we should still be investing more in campaign C because it has a much larger effect size.

In the real world, what really matters is the effect size, i.e. sales, orders, leads, profits, and not the lame conversion rate.

It is the effect size which puts food on the table. It is the effect size which generates salaries for the employees. It is the effect size which runs business operations.

So whatever you do under conversion optimization must have a considerable impact on the effect size. Impact on the conversion rate is secondary.

So if you are running A/B tests, they must considerably improve sales and gross profit over time. A double or triple digit increase in conversion rate is meaningless otherwise.
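The gap between conversion rate and effect size is easy to see with a few lines of code. The sketch below ranks two campaigns by conversion rate and by revenue. The `Campaign` class and all of the numbers are hypothetical and only serve to illustrate the point.

```python
from dataclasses import dataclass


@dataclass
class Campaign:
    name: str
    visits: int
    transactions: int
    revenue: float

    @property
    def conversion_rate(self) -> float:
        # Guard against campaigns that have no traffic yet
        return self.transactions / self.visits if self.visits else 0.0


# Hypothetical figures for illustration only
campaigns = [
    Campaign("A", visits=1_000, transactions=120, revenue=6_000.0),
    Campaign("C", visits=5_000, transactions=450, revenue=40_500.0),
]

best_by_rate = max(campaigns, key=lambda c: c.conversion_rate)
best_by_revenue = max(campaigns, key=lambda c: c.revenue)

for c in campaigns:
    print(f"{c.name}: conversion rate {c.conversion_rate:.1%}, revenue {c.revenue:,.0f}")
print(f"Best by conversion rate: {best_by_rate.name}")
print(f"Best by effect size (revenue): {best_by_revenue.name}")
```

With these made-up numbers, campaign A wins on conversion rate while campaign C wins on revenue, which is the situation described above.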

## Hypothesis

Null Hypothesis – according to the null hypothesis, any difference you see in a data set is due to chance and not due to a particular relationship.

A null hypothesis can never be proven. A statistical test can only reject a null hypothesis or fail to reject it. It cannot prove a null hypothesis.

Alternative Hypothesis – the opposite of the null hypothesis. According to the alternative hypothesis, any difference you see in a data set is due to a particular relationship and not due to chance.

In statistics, the only way to support your hypothesis is to reject the null hypothesis. You don’t prove the alternative hypothesis directly.

Solid Hypothesis – a hypothesis based on qualitative data and not on personal opinion.

Practical use in web analytics

Before you conduct any test (A/B, multivariate or a statistical test like a ‘t’ or ‘z’ test), you need to form a hypothesis. This hypothesis is based on your understanding of the client’s business and on qualitative data.

For example:

The null hypothesis can be something like: changing the colour of the ‘order now’ button to red will not improve the conversion rate.

The alternative hypothesis can be something like: changing the colour of the ‘order now’ button to red will improve the conversion rate.

Once you have formed your hypothesis, you conduct a test with the aim of rejecting your null hypothesis.

## False Positive and False Negative

False positive – a positive test result which is actually false. For example, an A/B test which shows that one variation is better than the other when that is not really the case.

False negative – a negative test result which is actually false. For example, an A/B test which shows no statistically significant difference between the two variations when there actually is one.

Type I error is the incorrect rejection of a true null hypothesis. It represents a false positive error.

Type II error is the failure to reject a false null hypothesis. It represents a false negative error.

All statistical tests have a probability of making type I and type II errors.

The probability of a test making a type I error is known as the false positive rate or significance level and is denoted by the Greek letter alpha (α). A significance level of 0.05 means that there is a 5% chance of a false positive when the null hypothesis is true.

The probability of a test making a type II error is known as the false negative rate and is denoted by the Greek letter beta (β). A false negative rate of 0.05 means that there is a 5% chance of a false negative when a real effect exists.

Statistical power is the probability that a statistical test correctly detects an effect (i.e. correctly rejects the null hypothesis) if the effect actually exists. It is expressed as a percentage.

Statistical power (or power of a statistical test) = 1 − false negative rate = 1 − β

So if the statistical power of a test is 95%, there is a 95% probability that the test will correctly detect an effect and a 5% probability that it won’t.

This 5% probability that the statistical test fails to detect an effect is the false negative rate.

Practical use in web analytics

Lots of A/B test gurus and A/B testing software will tell you to stop your test once you have reached a statistical significance of 95% or more. The problem with this approach is that you keep testing until you get a statistically significant result, choosing the sample size as you go.

The consequence of this approach is that your probability of getting a statistically significant result by coincidence will go much higher than 5%. That means you will increase your chance of getting type I error in your test. That means your test will increase the rate of false positives.
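You can check this effect for yourself with a small simulation. The sketch below runs many A/A tests, where both variations share the same true conversion rate, so every ‘significant’ result is a false positive by construction. It compares checking for significance after every batch of visitors (‘peeking’) with checking once at a fixed sample size. The 10% conversion rate, the batch size, the number of tests and the function names are all assumptions made for illustration.

```python
import random
from math import sqrt

Z_CRIT = 1.96  # two-sided threshold for a 5% significance level


def z_stat(conv_a, conv_b, n):
    """z statistic for two variations with n visitors each (pooled standard error)."""
    pooled = (conv_a + conv_b) / (2 * n)
    std_error = sqrt(pooled * (1 - pooled) * 2 / n)
    return 0.0 if std_error == 0 else (conv_a - conv_b) / n / std_error


def simulate(n_tests=300, max_n=2000, step=100, rate=0.10, seed=1):
    """Return (false positive rate with peeking, false positive rate at fixed size)."""
    rng = random.Random(seed)
    peeking_hits = fixed_hits = 0
    for _ in range(n_tests):
        conv_a = conv_b = 0
        ever_significant = False
        for n in range(1, max_n + 1):
            # Both variations convert at the same true rate (an A/A test)
            conv_a += rng.random() < rate
            conv_b += rng.random() < rate
            if n % step == 0 and abs(z_stat(conv_a, conv_b, n)) > Z_CRIT:
                ever_significant = True  # a peeker would have stopped here
        peeking_hits += ever_significant
        fixed_hits += abs(z_stat(conv_a, conv_b, max_n)) > Z_CRIT
    return peeking_hits / n_tests, fixed_hits / n_tests


peeking_rate, fixed_rate = simulate()
print(f"False positive rate when peeking every 100 visitors: {peeking_rate:.1%}")
print(f"False positive rate with a fixed sample size:        {fixed_rate:.1%}")
```

The fixed sample size approach should land close to the nominal 5%, while stopping at the first significant peek flags far more tests as winners even though no real difference exists.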

The fundamental problem with statistics is that if you want to reach the conclusion you really want (maybe deep down, on a subconscious level), you can always find some way to do it.

To reduce the rate of false positives, decide your test sample size in advance and then just stick to it.
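Deciding the sample size in advance requires a baseline conversion rate, the smallest uplift you care about, a significance level (α) and a target statistical power (1 − β). Here is a minimal sketch using a common normal-approximation formula for comparing two proportions. The function name and the example rates (5% baseline, 6% expected) are hypothetical, and dedicated sample size calculators may use slightly different formulas.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variation(baseline_rate, expected_rate, alpha=0.05, power=0.80):
    """Approximate visitors needed per variation for a two-sided test of two proportions."""
    if baseline_rate == expected_rate:
        raise ValueError("the two rates must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # controls false positives (type I)
    z_beta = NormalDist().inv_cdf(power)           # controls false negatives (type II)
    variance = (baseline_rate * (1 - baseline_rate)
                + expected_rate * (1 - expected_rate))
    effect = expected_rate - baseline_rate
    return ceil((z_alpha + z_beta) ** 2 * variance / effect ** 2)


# Hypothetical test: 5% baseline conversion rate, hoping to detect a lift to 6%
n_80 = sample_size_per_variation(0.05, 0.06)              # 80% power
n_95 = sample_size_per_variation(0.05, 0.06, power=0.95)  # 95% power
print(f"Visitors per variation at 80% power: {n_80}")
print(f"Visitors per variation at 95% power: {n_95}")
```

Asking for higher power or trying to detect a smaller uplift both push the required sample size up, which is why small tests so often produce unreliable results.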

Don’t use statistical significance alone to decide whether your test should continue or stop. A statistical significance of 95% or higher doesn’t mean anything if there is little to no impact on effect size (conversion volume).

Don’t believe in any uplift you see in your A/B test until the test is over. Focus on the effect size per variation while the test is running.

Any uplift you see in your A/B test results will not translate into actual sales, even after conducting several A/B tests and getting statistically significant results each time, if:

• There is little to no impact on effect size (conversion volume).
• You declare success and failure on the basis of statistical significance alone.

## Correlation and Causation

Correlation – It is a statistical measurement of relationship between two variables. Let us suppose ‘A’ and ‘B’ are two variables. If as ‘A’ goes up, ‘B’ goes up then ‘A’ and ‘B’ are positively correlated. However if as ‘A’ goes up, ‘B’ goes down then ‘A’ and ‘B’ are negatively correlated.
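The strength and direction of such a relationship is usually summarised by the Pearson correlation coefficient, which ranges from −1 (perfect negative correlation) to +1 (perfect positive correlation). Here is a minimal sketch of how it can be computed. The function name `pearson_r` and the weekly figures are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a series with no variation has no defined correlation")
    return covariance / sqrt(var_x * var_y)


# Hypothetical weekly data: conversion rate (%) and acquisition cost
conversion_rate = [1.0, 1.5, 2.0, 2.5, 3.0]
acquisition_cost = [20, 24, 31, 33, 40]

r = pearson_r(conversion_rate, acquisition_cost)
print(f"Correlation coefficient: {r:.3f}")
```

A value close to +1 or −1 indicates a strong linear relationship, while a value close to 0 indicates a weak one or none at all.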

Causation is the theory that something happened as a result of something else. For example, a fall in temperature increased the sale of hot drinks.

Practical use in web analytics

The most important correlations that I have found so far are:

1. Negative Correlation between Conversion Rate and Average Order Value
2. Negative Correlation between Conversion Rate and Transactions
3. Positive Correlation between Conversion Rate and Acquisition Cost

These three correlations have completely changed the way I think about conversion optimization for good.

You can get more details about these correlations from the post: Case Study: Why you should Stop Optimizing for Conversion Rate

The whole conversion optimization process is based on correlation analysis. Correlation based observations help you in coming up with a hypothesis. This is the hypothesis without which you can’t conduct any statistical tests and thus improve conversions.

Correlation is also widely used in predictive analytics and predictive marketing. Before you can predict the value of a dependent variable from an independent variable, you first need to show that the correlation between the two variables is not weak or zero. Otherwise the relationship is no good for predicting anything.

Keep in mind that the mere presence of a relationship between two variables/events doesn’t imply that one causes the other. For example, we cannot automatically assume that an increase in social shares has resulted in an improvement in search engine rankings.

Before we can establish a correlation between social shares and rankings, we first need to show that a linear relationship exists between them, i.e. that any increase or decrease in social shares is accompanied by a corresponding increase or decrease in search engine rankings.

Without first establishing a linear relationship, you could end up forming and testing the wrong hypothesis. Once you have established the correlation between social shares and rankings, you determine the correlation coefficient to measure the strength and direction of this linear relationship.

If the linear relationship is strong, you go ahead and conduct regression analysis to predict the value of one variable from another. Needless to say, correlation and regression are two strong pillars of conversion optimization and are very important for you as a digital marketer.
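For readers who want to see what the regression step looks like, here is a minimal sketch of simple linear regression using ordinary least squares with a single independent variable. The function name `fit_line` and the data are hypothetical, and a real analysis would also check how well the line fits before trusting any prediction.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("x values must not all be identical")
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = covariance / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical weekly data: conversion rate (%) and acquisition cost
conversion_rate = [1.0, 1.5, 2.0, 2.5, 3.0]
acquisition_cost = [20, 24, 31, 33, 40]

slope, intercept = fit_line(conversion_rate, acquisition_cost)
# Predict the acquisition cost at a conversion rate of 3.5%
predicted_cost = intercept + slope * 3.5
print(f"cost = {intercept:.2f} + {slope:.2f} * conversion_rate")
print(f"Predicted acquisition cost at 3.5%: {predicted_cost:.2f}")
```

The prediction is only as good as the underlying linear relationship, so it makes sense to run this step after the correlation check and not before.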


• Noah Haibach

This is a great read, thanks! I especially appreciate your comments on type I error and effect size. Too often do we see A/B interpretation without consideration of these factors.

And I think it’s also easily forgotten to test for the significance of a correlation – also a good point!

• optimizesmart