Google Analytics 4 Averages – Learn to Analyze & Report above Average
Nobody wants to be average, and yet we all love averages.
That is why our analytics reports are all jam-packed with averages.
In order to analyze and report above average, we first need to stop obsessing over all the metrics that are ‘average’ and take the insight they provide with a huge grain of salt.
Any set of measurements has two important properties:
- The central value
- The spread about that value.
We calculate the central value with the aim of determining a typical value in a data set.
A data set is a set of observed values for a particular variable (say avg. time on site).
We measure the spread with the aim of determining how similar or varied the observed values in a data set are.
If the set of observed values is similar, then the average (or mean) can be a good representative of all the values in the data set.
If the set of observed values varies by a large degree, then the average (or mean) is not a good representative of all the values in the data set.
We calculate the central value through Mean, Median and Mode.
We measure the spread of data values through Range, Interquartile Range (IQR), Variance and Standard Deviation.
Mean
The mean (also known as arithmetic mean or population mean) is simply an average of the numbers. It is denoted by the Greek letter µ (“mu”).
It is calculated as
Mean = sum of numbers / count of numbers
For example, let us suppose a website has got five web pages with the following engagement rate for each page:
- Page 1: 35%
- Page 2: 40%
- Page 3: 0%
- Page 4: 48%
- Page 5: 100%
Now engagement rate of the website = (35+40+0+48+100)/5 = 223/5 = 44.6%
But is 44.6% a true engagement rate?
No.
Look at the distribution of engagement rate across all the web pages.
Two web pages, page 3 and page 5, have extreme values of 0% and 100%.
We call such values ‘outliers’ in statistics.
Outliers have the sadistic ability to skew ‘averages’.
Another example:
Let us suppose a website has got five web pages with the following average engagement time (in seconds) for each page:
- Page 1: 350
- Page 2: 400
- Page 3: 500
- Page 4: 480
- Page 5: 36000
Now average engagement time on the website = (350+400+500+480+36000)/5 = 37730/5 = 7546 seconds ≈ 2 hrs 6 minutes
But is ‘2 hrs 6 minutes’ a true average engagement time on the website?
No.
Look at the distribution of average time across all the web pages.
The web page ‘page 5’ has an extreme value of 36000 seconds (10 hours).
Again, the outlier ‘36000’ is skewing our average metric.
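The arithmetic in the two examples above can be checked with a short Python sketch. The variable names are my own, and the last step (recomputing the mean without page 5) is an extra illustration rather than something taken from a GA4 report:

```python
from statistics import mean

# Per-page values taken from the two examples above.
engagement_rate = [35, 40, 0, 48, 100]          # percent
engagement_time = [350, 400, 500, 480, 36000]   # seconds

rate_mean = mean(engagement_rate)   # 223 / 5
time_mean = mean(engagement_time)   # 37730 / 5

# Recompute the average time with the outlier (36000) left out.
time_mean_without_outlier = mean([t for t in engagement_time if t != 36000])

hours, remainder = divmod(time_mean, 3600)
print(f"Mean engagement rate: {rate_mean}%")
print(f"Mean engagement time: {time_mean} s (about {hours} h {remainder // 60} min)")
print(f"Mean engagement time without the outlier: {time_mean_without_outlier} s")
```

A single page pulls the mean from roughly 7 minutes to more than 2 hours, which is the skew described above.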
This is the fundamental problem with averages and the tragedy is that GA4 uses this metric throughout its reports.
You can’t really escape from ‘averages’.
As long as you keep analyzing and reporting these average metrics, you will get average results.
So what is the solution then?
Calculate Median
Median is a middle number in a sorted list of numbers.
For example, let us suppose a website has got five web pages with the following engagement rate for each web page:
Let us first sort the list: 0%, 35%, 40%, 48%, 100%
If we calculate the median (instead of the mean) of this data set then it will be 40%.
Now is 40% a true representative of a typical engagement rate of each web page?
Yes.
This is because, unlike the mean, the median (or middle value) is not impacted by outliers (in our case: 0% and 100%).
Similarly,
Let us suppose a website has got five web pages with the following average engagement time for each page:
Let us first sort the list: 350, 400, 480, 500, 36000
Here the middle number is 480. So the median of the data set is 480.
Now is 480 seconds (or 8 minutes) a true representative of a typical engagement time on a web page?
Yes.
This is because, unlike the mean, the median (or middle value) is not impacted by outliers (in our case: 36000)
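If you prefer a script over a spreadsheet, the same comparison can be sketched in Python with the standard `statistics` module. The variable names are illustrative; the values are the ones used in the two examples above:

```python
from statistics import mean, median

# Same per-page values as in the examples above.
engagement_rate = [35, 40, 0, 48, 100]          # percent
engagement_time = [350, 400, 500, 480, 36000]   # seconds

# median() sorts the values internally, so the input order does not matter.
rate_median = median(engagement_rate)
time_median = median(engagement_time)

print(f"Engagement rate: mean {mean(engagement_rate)}%, median {rate_median}%")
print(f"Engagement time: mean {mean(engagement_time)} s, median {time_median} s")
```

The median stays at 40% and 480 seconds even though the means are pulled towards the extreme values.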
Note: You can always download your Analytics data/report into Excel and calculate the median of any data set (no matter how large) with the MEDIAN function.
However, calculating the median of each and every data set all day long can be very time-consuming and is not practical for many.
So what is the solution?
The solution is that you first measure the spread of the data values in a data set and then decide whether or not you can trust the average value reported by your analytics tool, like Google Analytics.
There are two ways of measuring the spread:
1. You look at the distribution of values in a data set and find and eliminate outliers (or extreme values).
2. You calculate spread through IQR, variance or standard deviation.
Look at the distribution of values in a data set and find and eliminate outliers
I use this method the majority of the time.
We measure the spread by calculating the ‘Range’, which is simply the difference between the maximum value and minimum value in a data set.
Suppose the minimum value of average engagement time is something like 4 minutes and 30 seconds and the maximum value is something like 9 minutes and 30 seconds.
The Range is calculated as:
9 minutes 30 seconds – 4 minutes 30 seconds = 5 minutes
Let us suppose that the average engagement time of 9 minutes and 30 seconds is an outlier.
This outlier is skewing the ‘average engagement time’ because it has increased the value of the range.
If we discount this outlier, then the new maximum value would be something like 5 minutes 30 seconds.
So now the Range would be:
5 minutes 30 seconds – 4 minutes 30 seconds = 1 minute
A small range indicates that the central value (in our case, the average engagement time) is a better representative of the typical value in a data set.
So if we discount the outlier and then calculate the average engagement time, then we will get a better central value or typical value.
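Here is a minimal Python sketch of that reasoning. The six values are hypothetical; I picked them only so that the minimum, the maximum and the second-largest value match the 4:30, 9:30 and 5:30 figures used above:

```python
from statistics import mean

def value_range(values):
    """Range = maximum value minus minimum value."""
    return max(values) - min(values)

# Hypothetical average engagement times in seconds (4:30 up to 9:30).
times = [270, 285, 300, 315, 330, 570]

# Discount the value we decided to treat as an outlier (9:30 = 570 s).
trimmed = [t for t in times if t != 570]

print(f"With outlier:    range {value_range(times)} s, mean {mean(times)} s")
print(f"Without outlier: range {value_range(trimmed)} s, mean {mean(trimmed)} s")
```

With these assumed values the range drops from 5 minutes to 1 minute once the outlier is discounted, and the mean moves from 345 to 300 seconds.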
That’s why it is important that we look at the distribution of values, calculate the spread and identify and discount outliers before we choose to trust an average metric/value.
Granted, this is not the most accurate way to measure spread and determine the central value, but it is practical and it works, especially when you have to look at hundreds of reports day and night and you don’t have time to calculate the median or the spread through IQR.
So instead of blindly relying on averages, you look at the distribution of data points.
Determine how narrow or widespread the distribution of values is in a data set by calculating the ‘Range’.
A very widespread distribution means you can’t rely on the average metric.
Another example: Average Rank
Average metrics do not haunt only Google Analytics reports; you can also find them haunting Google Search Console reports:
Here is the actual distribution of ranking positions:
If you are ranking anywhere from position 2 to the third page or beyond (say, position 2 to position 30+) for a search query, then you cannot rely on an average value.
This is because the range of ranking positions is too large.
You don’t need to manually calculate the range here. It is quite evident from the distribution.
That’s why I urge you to look at the distribution.
If you don’t measure the spread of data values, you will never know whether or not your average value is a true representative of the typical value in a data set.
That’s why it is important that you calculate both the central value and the spread of the data values.
Note: You can calculate the range in Excel by using the MAX and MIN functions.
For example: =MAX(F4:P4)-MIN(F4:P4). Here F4:P4 is a cell range.
MAX() returns the largest value and MIN() returns the smallest value in a set of values.
Calculating spread through IQR, Variance or Standard Deviation
The more difficult and time-consuming way of calculating spread is through IQR, variance or standard deviation.
If you have a very large data set with a lot of outliers, then you can’t depend upon the visual method I explained above to determine the spread of data values.
You then use IQR, variance or standard deviation to calculate the spread.
I recommend using IQR because it is a better measure of spread than the range or standard deviation, as it is less likely to be distorted by outliers.
So you calculate the IQR and then decide whether you can rely on the average value reported by your analytics tools.
In order to understand IQR, you first need to understand quartiles.
Quartiles are the values that divide an ordered data set into four equal groups. For example, consider the following ordered data set:
4 6 10 14 15 16 17 17 18 20 20
Here the value below which the lowest 25% of the values fall is called the 25th percentile or the lower quartile.
The lower quartile is denoted by Q1.
The value below which the lowest 50% of the values fall is called the 50th percentile or the second quartile.
This second quartile is actually the median. So the median is also denoted by Q2.
The value below which the lowest 75% of the values fall is called the 75th percentile or the upper quartile.
The upper quartile is denoted by Q3.
The difference between the Upper quartile and lower quartile is called the Interquartile Range.
So, IQR = Q3-Q1
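As a rough illustration, the quartiles and the IQR of the data set above can be computed in Python with `statistics.quantiles` (available from Python 3.8). I am assuming `method="inclusive"` here because it interpolates between neighbouring values the same way Excel's QUARTILE does; the default `"exclusive"` method, or a manual calculation based on the medians of the two halves, can give different quartiles for the same data:

```python
from statistics import quantiles

# Ordered data set from the example above.
data = [4, 6, 10, 14, 15, 16, 17, 17, 18, 20, 20]

# n=4 asks for the three cut points that split the data into four groups.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

print(f"Q1 = {q1}, Q2 (median) = {q2}, Q3 = {q3}, IQR = {iqr}")
# Q1 = 12.0, Q2 (median) = 16.0, Q3 = 17.5, IQR = 5.5
```

Whichever method you pick, use the same one every time you compare spreads, because the quartile values depend on it.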
In Excel 2013 and beyond, there is a function called QUARTILE through which you can calculate Q1, Q3 and eventually IQR.
Syntax: =QUARTILE(array, quart)
Here ‘array’ is the range of cells that contain the data set.
‘Quart’ is the parameter that is used to specify which quartile to return.
It accepts the values 0 to 4: 0 returns the minimum value, 1 the first quartile, 2 the median value, 3 the third quartile and 4 the maximum value.
Through the QUARTILE function, you can calculate the first and third quartiles.
Once you have done this, find the IQR using the formula Q3-Q1.
The data values that deviate from the middle value (the median) by more than twice the IQR are called outliers.
The data values that deviate from the middle value (the median) by more than 3.5 times the IQR are called ‘far outliers’.
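The rule above can be sketched in Python as follows. The function name `classify_outliers` is hypothetical, and I am assuming that ‘middle value’ means the median and that the quartiles are computed with the inclusive method; the data is the engagement time example from earlier in this post:

```python
from statistics import median, quantiles

def classify_outliers(values):
    """Flag values by how far they sit from the median, measured in IQRs."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    middle = median(values)
    # Rule from this post: more than 2 x IQR from the median is an outlier,
    # more than 3.5 x IQR from the median is a far outlier.
    outliers = [v for v in values if abs(v - middle) > 2 * iqr]
    far_outliers = [v for v in values if abs(v - middle) > 3.5 * iqr]
    return iqr, outliers, far_outliers

engagement_time = [350, 400, 500, 480, 36000]  # seconds
iqr, outliers, far_outliers = classify_outliers(engagement_time)

print(f"IQR = {iqr}, outliers = {outliers}, far outliers = {far_outliers}")
```

For this data the IQR is 100 seconds, so only 36000 is flagged. A widely used alternative is Tukey's rule, which flags values below Q1 - 1.5 x IQR or above Q3 + 1.5 x IQR; the sketch follows the rule stated in this post instead.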
In order to get a better understanding of how IQR works, you must know how it is calculated manually.
The following video explains calculating IQR manually.
Segmentation – Powerful Method to fight ‘AVERAGES’
Another powerful method to reduce the negative impact of ‘average’ metrics on your analysis and business decisions is ‘Data Segmentation’.
Segment like Hell.
The more you segment the data, the smaller each data set becomes, and the data values within each segment tend to sit closer to that segment’s mean or average value.
In layman’s language, the more you segment the data, the better your average metrics represent the values behind them.
For this reason, you will get better insight if you analyze the goal conversion rate of organic search for each of your goals in your target market (say New York) than if you analyze the conversion rate of organic search for all of the locations from which your site gets traffic.
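To make the idea concrete, here is a small Python sketch with made-up numbers. The locations and conversion rates are purely hypothetical, and `spread` is just the range described earlier in this post:

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (location, organic conversion rate in %) observations.
observations = [
    ("New York", 2.1), ("New York", 2.4), ("New York", 1.9),
    ("London", 0.4), ("London", 0.6),
    ("Sydney", 5.8), ("Sydney", 6.2),
]

def spread(values):
    """Range of a list of values: maximum minus minimum."""
    return max(values) - min(values)

all_rates = [rate for _, rate in observations]

# Group the observations into one list of rates per location.
segments = defaultdict(list)
for location, rate in observations:
    segments[location].append(rate)

print(f"All locations: mean {mean(all_rates):.2f}%, range {spread(all_rates):.2f}")
for location, rates in segments.items():
    print(f"{location}: mean {mean(rates):.2f}%, range {spread(rates):.2f}")
```

In this made-up data the overall range is far wider than the range inside any single location, so each segment average describes its own values much better than the blended average does.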
Like it or not, you learned a lot of statistics in this post to fight averages.
In order to become above average in marketing or analytics, you need to learn even more statistics.
“Analyzing data without a basic understanding of statistics will almost always result in erroneous conclusions.” – That’s my theory
I have proved this theory time and again in my posts:
- Is your conversion Rate Statistically Significant?
- Here is Why Conversion Volume Optimization is better than CRO
- Beginners Guide to Maths and Stats behind Web Analytics
- Predictive Analytics & Marketing – The Next Stage of Business Optimization
I can’t stress enough the importance of statistics, and of related fields such as econometrics and data science, in solving real-life problems.
Let me give you one good example.
According to the law of diminishing marginal utility, the first unit of consumption of a good/service produces more utility than the second and subsequent units.
This means the very first article that you will read on a topic, say ‘Google authorship’, will produce more benefits than the second and subsequent articles on the same topic.
So the more articles you read on ‘Google Authorship’, the less you will benefit from each additional one.
Then soon, you will reach the point of diminishing returns, and once you cross this point, your efficiency will start decreasing and you will be less productive.
Needless to say, in our industry, every new shiny thing/topic (from Pinterest to Google Authorship) is tortured to death in the name of blogging and thought leadership, and we tend to read every new article on the same topic in the hope of gaining something new.
But at the same time, we forget how the law of diminishing marginal utility is making us less and less productive with each additional unit of consumption.
Reading more will give you less time to do more, and if you do less and read more then you will learn less.
So reading more doesn’t always mean you learn more. It generally means you learn less. That’s why I suggest reading less.
So here we go. I just applied the law of diminishing marginal utility in solving a real-life problem (increasing productivity) and saving you countless hours.
Nobody wants to be average, and yet we all love averages.
That is why our analytics reports are all jam-packed with averages:
In order to analyze and report above average, we would first need to stop being obsessed about all the metrics which are ‘average’ and take the insight they provide with a huge grain of salt.
Any set of measurements has two important properties:
- The central value
- The spread about that value.
We calculate the central value with the aim of determining a typical value in a data set.
A data set is a set of observed values for a particular variable (say avg. time on site).
We measure the spread with the aim to determine how similar or varied the set of observed values are in a data set.
If the set of observed values is similar, then the average (or mean) can be a good representative of all the values in the data set.
If the set of observed values varies by a large degree, then the average (or mean) is not a good representative of all the values in the data set.
We calculate the central value through Mean, Median and Mode.
We measure the spread of data values through Range, Interquartile Range (IQR), Variance and Standard Deviation.
Mean
The mean (also known as arithmetic mean or population mean) is simply an average of the numbers. It is denoted by the Greek letter µ (“mu”).
It is calculated as
Mean = sum of numbers /count of numbers
For example, let us suppose a website has got five web pages with the following engagement rate for each page:
Now engagement rate of the website = (35+40+0+48+100)/5 = 223/5 = 44.6%
But is 44.6% a true engagement rate?
No.
Look at the distribution of engagement rate across all the web pages.
Two web pages, page 3 and page 5 have extreme values of 0% and 100%.
We call such values ‘outliers’ in statistics.
Outliers have the sadistic ability to skew ‘averages’.
Another example:
Let us suppose a website has got five web pages with the following average engagement time for each page:
Now average engagement time on the website = (350+400+500+480+36000)/5 = 37730/5 = 7546 = 2 hrs 6 minutes
But is ‘2hrs 6 minutes’ a true average engagement time on the website?
No.
Look at the distribution of average time across all the web pages.
The web page ‘page 5’ has extreme values of 36000.
Again, the outlier ‘36000’ is skewing our average metric.
This is the fundamental problem with averages and the tragedy is that GA4 uses this metric throughout its reports.
You can’t really escape from ‘averages’.
As long you keep analyzing and reporting these average metrics, you will get average results.
So what is the solution then?
Calculate Median
Median is a middle number in a sorted list of numbers.
For example, let us suppose a website has got five web pages with the following engagement rate for each web page:
Let us first sort the list: 0%, 35%, 40%, 48%, 100%
If we calculate the median (instead of the mean) of this data set then it will be 40%.
Now is 40% a true representative of a typical engagement rate of each web page?
Yes.
This is because, unlike the mean, the median (or middle value) is not impacted by outliers (in our case: 0% and 100%).
Similarly,
Let us suppose a website has got five web pages with the following average engagement time for each page:
Let us first sort the list: 350, 400, 480, 500, 36000
Here the middle number is 480. So the median of the data set is 480.
Now is 480 seconds (or 8 minutes) a true representative of a typical engagement time on a web page?
Yes.
This is because, unlike the mean, the median (or middle value) is not impacted by outliers (in our case: 36000)
Note: You can always download Analytics data/report into excel and calculate the median of any data set (no matter how large) through MEDIAN excel function.
However, calculating the median of each and every data set all day long can be very time consuming and not practical for many.
So what is the solution?
The solution is that you first measure the spread of the data values in a data set and then decide whether or not you can trust the average value reported by your analytics tool, like Google Analytics.
There are two ways of measuring the spread:
1. You look at the distribution of values in a data set and find and eliminate outliers (or extreme values).
2. You calculate spread through IQR, variance or standard deviation.
Look at the distribution of values in a data set and find and eliminate outliers
I use this method majority of the time.
We measure the spread by calculating the ‘Range’, which is simply the difference between the maximum value and minimum value in a data set.
If the minimum value of average engagement time is something like 4 minutes and 30 seconds and the maximum value is something like 9 minutes and 30 seconds.
The Range is calculated as:
9 minutes 30 seconds – 4 minutes 30 seconds = 5 minutes
Let us suppose that the average engagement time of 9 minutes and 30 seconds is an outlier.
This outlier is skewing the ‘average engagement time’ because it has increased the value of the range.
If we discount this outlier, then the new maximum value would be something like 5 minutes 30 seconds.
So now the Range would be:
5 minutes 30 seconds – 4 minutes 30 seconds = 1 minute
A small range indicates that the central value (in our case, the average engagement time) is a better representative of the typical value in a data set.
So if we discount the outlier and then calculate the average engagement time, then we will get a better central value or typical value.
That’s why it is important that we look at the distribution of values, calculate the spread and identify and discount outliers before we choose to trust an average metric/value.
Granted, this is not the most accurate way to measure spread and determine the central value, but it is practical and works, esp. when you have to look at hundreds of reports day and night, and you don’t have time to calculate median or spread through IQR.
So instead of blindly relying on averages, you look at the distribution of data points.
Determine how narrow or widespread the distribution of values is in a data set by calculating the ‘Range’.
A very widespread distribution means you can’t rely on the average metric.
Another example: Average Rank
Not only do average metrics haunt Google Analytics reports; but you can also find them haunting Google Search Console reports:
Here is the actual distribution of ranking positions:
If you are ranking from position 2 to 3rd+ page (or better say position 2 to 30+ position) for a search query then you can not rely on an average value.
This is because the range of ranking positions is too large.
You don’t need to manually calculate the range here. It is quite evident from the distribution.
That’s why I urge you to look at the distribution.
If you don’t measure the spread of data values, you will never know whether or not your average value is a true representative of the typical value in a data set.
That’s why it is important that you calculate both the central value and the spread of the data values.
Note: You can calculate range in excel by using the formulas Max and Min.
For example: =MAX(F4:P4)-MIN(F4:P4). Here F4:P4 is a cell range.
Max() returns the largest value and Min() returns the lowest value in a set of values.
Calculating spread through IQR, Variance or Standard Deviation
The more difficult and time-consuming way of calculating spread is through IQR, variance or standard deviation.
If you have a very large data set with a lot of outliers, then you can’t depend upon the visual method I explained above to determine the spread of data values.
You then use IQR, variance or standard deviation to calculate the spread.
I recommend using IQR because it is a better measure of a spread than the range or standard deviation, as it is less likely to be distorted by outliers.
So you calculate the IQR and then decide whether you can rely on the average value reported by your analytics tools.
In order to understand IQR, you first need to understand quartiles.
A quartile is one of the four equal groups in which a data set can be divided. For example, consider the following ordered data set:
4 | 6 | 10 | 14 | 15 | 16 | 17 | 17 | 18 | 20 | 20 |
Here the point between the lowest 25% of the values is called the 25th Percentile or the lower Quartile.
The lower quartile is denoted by Q1.
The point between 50% of the values is called the 50th Percentile or the second Quartile.
This second quartile is actually the median. So median is also denoted by Q2.
The point between the lowest 75% of the values is called the 75th Percentile or the Upper Quartile.
The upper quartile is denoted by Q3.
The difference between the Upper quartile and lower quartile is called the Interquartile Range.
So, IQR = Q3-Q1
In Excel 2013 and beyond, there is a function called QUARTILE through which you can calculate Q1, Q3 and eventually IQR.
Syntax: =QUARTILE (array, quart)
Here ‘array’ is the range of cells that contain the data set.
‘Quart’ is the parameter that is used to specify which quartile to return.
It can have three values: First Quartile, Median Value and Third Quartile, as shown below:
Through the QUARTILE function, you can calculate the first and third quartiles.
Once you have done this, then find IQR using the formula Q3-Q1:
The data values that deviate from the middle value by more than twice the IQR are called outliers.
The data values that deviate from the middle value by more than 3.5 times the IQR are called ‘far outliers’.
In order to get a better understanding of how IQR works, you must know how it is calculated manually.
The following video explains calculating IQR manually.
Related Post: Common Google Analytics Mistakes that kill your Analysis, Reporting and Conversions
Segmentation – Powerful Method to fight ‘AVERAGES’
Another powerful method to reduce the negative impact of ‘average’ metrics on your analysis and business decision is ‘Data Segmentation’.
Segment like Hell.
The more you will segment the data, the smaller will be the data set, and the data values will be more close to the mean or average value.
In layman’s language, the more you will segment the data; the more accurate your average metrics will be.
Because of this reason, you will get a better insight if you analyze the Goal conversion rate of organic search for each of your goals in your target market (say New York) than the conversion rate of the organic search for all of the locations from which your site gets traffic.
Like it or not but you learned a lot of statistics in this post to fight averages.
In order to become above average in marketing or analytics, you need to learn even more statistics.
“Analyzing data without a basic understanding of statistics will always almost result in erroneous conclusions. “ – That’s my theory
I have proved this theory time and again in my posts:
- Is your conversion Rate Statistically Significant?
- Here is Why Conversion Volume Optimization is better than CRO
- Beginners Guide to Maths and Stats behind Web Analytics
- Predictive Analytics & Marketing – The Next Stage of Business Optimization
I can’t stress enough the importance of statistics and its supersets econometrics and data science in solving real-life problems.
Let me give you one good example.
According to the law of diminishing marginal utility, the first unit of consumption of a good/service produces more utility than the second and subsequent units.
This means the very first article that you will read on a topic, say ‘Google authorship’, will produce more benefits than the second and subsequent articles on the same topic.
So more articles you will read on ‘Google Authorship’, the less you will benefit from it.
Then soon, you will reach the point of diminishing returns and once you crossed this point, your efficiency will start decreasing, and you will be less productive.
Needless to say, in our industry, every new shiny thing/topic (from Pinterest to Google Authorship) is tortured to death in the name of blogging and thought leadership, and we tend to read every new article on the same topic in the hope of gaining something new.
But at the same time, we forget how the law of diminishing marginal utility is making us less and less productive with each additional unit of consumption.
Reading more will give you less time to do more, and if you do less and read more then you will learn less.
So reading more doesn’t always mean you learn more. It generally means you learn less. That’s why I suggest reading less.
So here we go. I just applied the law of diminishing marginal utility to solve a real-life problem (increasing productivity) and save you countless hours.