Understanding Data Sampling in Google Analytics 4 (GA4)

Table of Contents for Understanding Data Sampling in Google Analytics 4 (GA4)

  1. Data sampling in Google Analytics 4 
  2. Sampling differences in Google Analytics 4 Vs Universal Analytics
  3. Hit limits in Google Analytics 4
  4. Thresholding in Google Analytics 4
  5. Cardinality in Google Analytics 4
  6. Summary

This article explains how data sampling works in Google Analytics 4 (GA4). It also covers hit limits, thresholding and cardinality in GA4.

In data analysis, sampling is the practice of analyzing a subset of data and using it to report on the larger data set, on the assumption that the subset resembles the whole.

For example, suppose you want to estimate the number of cars parked in a 1000 square meter area where the cars are distributed fairly uniformly. You could count the cars parked in 10 square meters and multiply by 100, or count the cars parked in 5 square meters and multiply by 200, to get a reasonable estimate for the entire 1000 square meters.
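The car park example can be sketched in a few lines of Python. This is only an illustration of the scaling idea: the function name and the car counts are made up for the example and do not come from Google Analytics.

```python
def estimate_total(sample_count: int, sample_area_m2: float, total_area_m2: float) -> float:
    """Scale a count observed in a sample area up to the whole area."""
    if sample_area_m2 <= 0 or sample_area_m2 > total_area_m2:
        raise ValueError("sample area must be positive and no larger than the total area")
    scale_factor = total_area_m2 / sample_area_m2  # e.g. 1000 / 10 = 100
    return sample_count * scale_factor


# Hypothetical counts: 3 cars seen in a 10 m² patch, 2 cars seen in a 5 m² patch.
print(estimate_total(3, 10, 1000))  # 300.0
print(estimate_total(2, 5, 1000))   # 400.0
```

The two samples give different estimates, which is the practical cost of sampling: the result is only as reliable as the assumption that the sample resembles the whole.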

In Google Analytics 4, some reports are always unsampled and others are sampled under certain conditions. Let’s look at how sampling happens in GA4 in more detail.


Data sampling in Google Analytics 4 

In Google Analytics 4, reporting falls into two categories: standard reports and the advanced reports found in the ‘Analysis’ tab.

Standard reports are always unsampled in GA4 (based on 100% of data for the selected date range). Advanced reports are sometimes sampled, depending on the data you choose to see.

The image below shows the standard reporting options in GA4, which are unsampled.

Standard reports in ga4

The next image shows the advanced reporting options in GA4 which are sometimes sampled.

Sometimes sampled reports

These advanced reports include the following techniques: cohort analysis, exploration, segment overlap, funnel analysis, path analysis, and user explorer.

In Universal Analytics, if you apply a secondary dimension or segment to the standard reports, the data may be sampled. In GA4, by contrast, you can apply comparisons, add secondary dimensions and filter your standard reports, and everything will continue to be unsampled.

If you are viewing an unsampled GA4 report then you will see a green reporting icon with a checkmark at the top of the report:

green symbol

If you hover your mouse over the green reporting icon you will see the following message “This report is based on 100.0% of available data.”

100 percent data

If you are viewing a sampled GA4 report then you will see a yellow reporting icon with a % symbol at the top of the report:

Yellow symbol

If you hover your mouse over the yellow reporting icon you will see the following message “This report is based on XX% of available data.” (In our case XX represents 95.28%).

sampled data

Sampling differences in Google Analytics 4 Vs Universal Analytics

In Universal Analytics, default reports (standard reports) are not subject to sampling. But if you run ad-hoc queries on your data (for example by applying secondary dimensions or segments), those queries are subject to the following general sampling thresholds:

  • Standard Analytics: 500k sessions at the property level for the date range you are using
  • Analytics 360: 100M sessions at the view level for the date range you are using

If you want to know more about sampling in Universal Analytics you can read it here: Understanding Data Sampling in Google Analytics

In the case of Google Analytics 4, the default reports (standard reports) are always unsampled. You can apply comparisons and custom parameters to your report and all the reports will continue to be unsampled. 

Only the advanced reports in the ‘Analysis’ tab may sometimes be sampled. In general, sampling occurs in advanced reporting when the data behind the report exceeds a count of 10 million and the report you are creating does not replicate a standard report.
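The thresholds described in this section can be written down as a simple rule of thumb. The sketch below is a reading of the figures quoted above, not Google's actual logic: the function names and parameters are hypothetical, and real sampling decisions are made by Google Analytics itself.

```python
# Thresholds as quoted in this article.
UA_STANDARD_SESSION_LIMIT = 500_000      # sessions, property level
UA_360_SESSION_LIMIT = 100_000_000       # sessions, view level
GA4_ADVANCED_REPORT_LIMIT = 10_000_000   # count of data behind an advanced report


def ua_may_be_sampled(sessions_in_range: int, is_ad_hoc_query: bool, is_360: bool = False) -> bool:
    """Universal Analytics: only ad-hoc queries above the session limit may be sampled."""
    limit = UA_360_SESSION_LIMIT if is_360 else UA_STANDARD_SESSION_LIMIT
    return is_ad_hoc_query and sessions_in_range > limit


def ga4_may_be_sampled(count_in_range: int, is_advanced_report: bool,
                       replicates_standard_report: bool = False) -> bool:
    """GA4: only advanced reports that do not mirror a standard report may be sampled."""
    return (is_advanced_report
            and not replicates_standard_report
            and count_in_range > GA4_ADVANCED_REPORT_LIMIT)


print(ua_may_be_sampled(750_000, is_ad_hoc_query=True))          # True
print(ga4_may_be_sampled(750_000, is_advanced_report=True))      # False
print(ga4_may_be_sampled(12_000_000, is_advanced_report=False))  # False
```

The same 750,000-session date range that could trigger sampling for an ad-hoc query in standard Universal Analytics stays well below the GA4 threshold, and a standard GA4 report is never sampled regardless of volume.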

Hit limits in Google Analytics 4

In the case of Universal Analytics (standard), there is a hit limit of 10 million per month per account. Google Analytics 4, by contrast, is a free tool and does not appear to have a hit limit: I searched extensively and could not find one mentioned anywhere in the documentation. If that holds, you get a capability normally associated with a premium analytics tool at no cost.

Thresholding in Google Analytics 4

In Google Analytics 4, thresholds are applied to prevent anyone viewing a report from inferring the demographics or interests of individual users of the website.

When a report contains age, gender, or interest categories (e.g. as a primary or secondary dimension, a data comparison, or a segment), a threshold may be applied and some data may be kept hidden (unknown) from the report. 

These thresholds are defined by Google and you cannot adjust them. If a threshold has been applied to a report, you will see unknown values in the report: the affected values are replaced by “unknown” to keep user identity and basic information hidden.
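To make the idea concrete, here is a minimal sketch of how a threshold could hide small groups. Google does not publish the real threshold or its exact mechanism, so the minimum of 50 users, the function name and the sample rows are all assumptions made purely for illustration.

```python
from collections import Counter

# Assumption: the real threshold is set by Google and is not published.
HYPOTHETICAL_MIN_USERS = 50


def apply_threshold(rows, min_users=HYPOTHETICAL_MIN_USERS):
    """Relabel any dimension value with too few users as "unknown"."""
    report = Counter()
    for value, users in rows:
        label = value if users >= min_users else "unknown"
        report[label] += users
    return dict(report)


# Made-up age report: (age bracket, number of users)
rows = [("18-24", 320), ("25-34", 410), ("35-44", 12), ("65+", 7)]
print(apply_threshold(rows))
# {'18-24': 320, '25-34': 410, 'unknown': 19}
```

Groups small enough to point to individual users lose their label, while larger groups are reported as usual.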

thresholding

Cardinality in Google Analytics 4

Each report in Google Analytics 4 is built on dimensions, and each dimension can take a number of values. For example, the gender dimension has three potential values (male, female or other), so the cardinality for that dimension is three.

The total number of unique values for a dimension is known as its cardinality. 

Dimensions with a large number of possible values are known as high-cardinality dimensions. For example, the page dimension has a different value for every URL on your website.

If a report contains high-cardinality dimensions, it may be affected by Google Analytics system limits (defined by Google), which results in rolled-up “(other)” entries in the report.

Cardinality limits can affect standard reports as well as advanced reports in the ‘Analysis’ tab.

Google does not publish an exact limit at which this roll-up appears, but in general it may occur if you have more than 25,000 to 30,000 unique values for a dimension in the selected date range.
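The following sketch shows what cardinality and an “(other)” roll-up look like on a tiny data set. It assumes that the most frequent values are kept and the rest are merged, which is an illustrative simplification rather than Google's documented behaviour; the function names, the `max_rows` parameter and the page paths are hypothetical.

```python
from collections import Counter


def cardinality(values):
    """Number of unique values seen for a dimension."""
    return len(set(values))


def roll_up(values, max_rows):
    """Keep the max_rows most frequent values; merge the rest into "(other)"."""
    counts = Counter(values)
    top = counts.most_common(max_rows)
    other = sum(counts.values()) - sum(n for _, n in top)
    report = dict(top)
    if other:
        report["(other)"] = other
    return report


# Made-up page views for the page dimension
pages = ["/home", "/home", "/home", "/pricing", "/pricing", "/blog/a", "/blog/b"]
print(cardinality(pages))          # 4
print(roll_up(pages, max_rows=2))  # {'/home': 3, '/pricing': 2, '(other)': 2}
```

With a row limit of 2, the two blog pages disappear into a single “(other)” row, which is what happens at a much larger scale when a dimension has tens of thousands of unique values.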

cardinality

Summary

GA4 always shows unsampled data in standard reports. Only the advanced reporting options in the ‘Analysis’ tab (cohort analysis, exploration, segment overlap, funnel analysis, path analysis, and user explorer) might be sampled.

Frequently Asked Questions about Understanding Data Sampling in Google Analytics 4 (GA4)

What is data sampling?

In data analysis, sampling is the practice of analyzing a subset of data and using it to report on the larger data set, on the assumption that the subset resembles the whole.

Are all reports in GA4 sampled?

No. Standard reports are always unsampled in GA4 (based on 100% of data for the selected date range). Advanced reports are sometimes sampled, depending on the data you choose to see.

What is thresholding in GA4?

When a report contains age, gender, or interest categories (e.g. as a primary or secondary dimension, a data comparison, or a segment), a threshold may be applied and some data kept hidden (unknown) from the report. 

These thresholds are defined by Google and you cannot adjust them. If a threshold has been applied to a report, you will see unknown values in the report: the affected values are replaced by “unknown” to keep user identity and basic information hidden.

What is cardinality in GA4?

The total number of unique values for a dimension is known as its cardinality. For example, the gender dimension has three potential values (male, female or other) so the cardinality for that dimension is three. 

About the Author

Himanshu Sharma

  • Founder, OptimizeSmart.com
  • Over 15 years of experience in digital analytics and marketing
  • Author of four best-selling books on digital analytics and conversion optimization
  • Nominated for Digital Analytics Association Awards for Excellence
  • Runs one of the most popular blogs in the world on digital analytics
  • Consultant to countless small and big businesses over the decade