GA4 Behavioral, Conversion Modeling and Consent Mode Guide
What is modeling in GA4?
“Modelling in GA4” actually refers to different types of data modelling techniques used within Google Analytics 4.
GA4 offers the following types of data modelling techniques:
#1 Behavioural modeling: Estimates the behaviour of users who don’t consent to cookies.
#2 Conversion modeling: Estimates the impact of marketing when conversions cannot be directly attributed to a traffic source.
#3 Attribution modeling: Determines credit for conversions across touchpoints.
#4 Predictive metrics: Anticipates user behaviour like purchase or churn.
Observed data vs. Training data vs. Modelled data.
There are three categories of data in the context of GA4 data modelling:
- Observed data.
- Training data.
- Modelled data.
#1 Observed data
It is the actual data which comes directly from users who granted consent for GA4 to track their behaviour using identifiers like cookies or app IDs.
It provides precise and reliable information about their behaviour, including metrics like user counts, sessions, page views, events, and conversions.
#2 Training data
It is a combination of observed data and labelled data (also known as ‘labelled examples’) used to train the machine learning algorithms behind modelled data.
Labelled data are data points with assigned labels/categories. They are used to guide and improve machine learning algorithms.
Examples of labelled data in GA4:
- Labelling specific events like “Add to cart” and “Checkout completed” as “conversion steps” to train the algorithm about your conversion funnel.
- Identifying user sessions with high page views and long average session time as “engaged users” to help predict future engagement patterns.
- Categorizing users based on demographics and past purchasing behaviour to improve user segmentation and personalization efforts.
The training data directly influences the accuracy and effectiveness of modelled data.
Biases within the training data can be reflected in the modelled data, leading to inaccurate predictions or insights.
Therefore, labelling should be done accurately and consistently to avoid confusing the algorithm.
By understanding the role and importance of labelled data, you can actively contribute to improving the effectiveness of GA4 data modelling and gain more accurate and actionable insights for optimizing your website or app performance.
#3 Modelled data
It is the estimated data for users who did not grant consent (opt-out users).
The modelled data is not collected from opt-out users. It is derived from the data of users who granted consent for GA4 to track their behaviour using identifiers like cookies or app IDs.
In other words, the modelling itself leverages observed data.
Machine learning algorithms analyze patterns and behaviour from users who did consent and use these insights to estimate the behaviour of similar opt-out users.
Therefore, modelled data isn’t directly collected from opt-out users but inferred from observed data with similar characteristics.
This distinction is crucial for interpreting reports in GA4.
While modelled data helps fill in data gaps and provide insights into opt-out user behaviour, it is important to remember that it’s an estimation and may not be as accurate as observed data.
Note: GA4 strives to report modelled data only when it has a high degree of confidence in its accuracy. This helps avoid misleading users with potentially inaccurate insights.
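To make the idea of inferring opt-out behaviour from observed data concrete, here is a deliberately simplified sketch. It is not GA4's actual algorithm, which is far more sophisticated. It assumes opt-out users send events at the same rate as consenting users, and every name and number in it is hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DailyTraffic:
    consented_users: int    # observed: users who granted consent
    consented_events: int   # observed: events sent by those users
    cookieless_events: int  # cookieless pings from users who denied consent


def estimate_opted_out_users(day: DailyTraffic) -> float:
    """Toy estimate: assume opt-out users send events at the consenting users' rate."""
    if day.consented_users == 0 or day.consented_events == 0:
        raise ValueError("no observed data, so there is nothing to model from")
    events_per_user = day.consented_events / day.consented_users
    return day.cookieless_events / events_per_user


def blended_users(day: DailyTraffic) -> float:
    """Observed users plus the modelled estimate for opt-out users."""
    return day.consented_users + estimate_opted_out_users(day)


# Hypothetical day: 1,200 consenting users sent 9,600 events (8 per user),
# and 4,000 cookieless events arrived from users who denied consent.
day = DailyTraffic(consented_users=1200, consented_events=9600, cookieless_events=4000)
print(estimate_opted_out_users(day))  # 500.0
print(blended_users(day))             # 1700.0
```

If opt-out users actually behave differently from consenting users, the estimate drifts away from reality, which is why modelled figures should be read as estimates.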
Why is data modeling needed in GA4?
Modelling is needed whenever user data is partially missing or unavailable due to privacy regulations, restrictions or technical limitations (like restricted third-party cookies and identifiers). A typical example is a consent banner: users who opt out leave gaps in your data.
In short, GA4’s modelling helps website/app owners gain insights while respecting user privacy in a changing data landscape.
How does modeling in GA4 impact your data and reports?
#1 Modeling in path and funnel explorations is applied differently than in standard reports.
#2 Behavioural Modeling is not supported for the following GA4 features:
- Audiences: You can’t create audiences based on modelled data.
- Most Explorations: Except for free-form tables, other exploration types (user explorer, cohort, user lifetime explorations, etc.) won’t include modelled data.
- Retention reports: These reports focus on user behaviour over time and don’t currently incorporate modelling.
- Segments with sequences: Segments involving user actions across multiple sessions don’t work with modelled data.
- Predictive metrics: Features like predicting future conversions haven’t been integrated with modelling yet.
#3 Modeled data is not automatically included in BigQuery exports. However, if needed, you can access and utilize modelled data through the GA4 API.
What is Behavioural modeling in GA4?
Behavioural modelling uses machine learning to estimate the behaviour of users who opt out of cookies based on similar users who opt in.
It estimates user behaviour metrics like daily active users, new users, etc.
Eligibility criteria for using behavioural modeling in GA4
- Consent Mode must be enabled across all pages/apps so that it can communicate the user’s cookie/app identifier consent status to Google and send cookieless pings when users deny consent.
- Consent Mode for web pages must load tags before the consent dialog appears.
- Your GA4 property needs 1,000+ events/day with analytics_storage='denied' for at least 7 days.
- Your GA4 property also needs 1,000+ daily users with analytics_storage='granted' for at least 7 of the previous 28 days.
- The ‘Blended’ reporting identity is enabled in your GA4 property.
While meeting the data thresholds (1,000+ daily users with consent granted and 1,000+ daily events with consent denied) triggers model training, it may take longer than seven days for the process to complete and for modelled data to become available.
The complexity of your website/app, data volume, and chosen conversion paths can all affect the training time.
Modelled data becomes available only after your GA4 property meets the eligibility requirements and remains accessible as long as those requirements are maintained.
If your property falls below the data thresholds or consent rates dip, you might lose access to modelled data in your reports.
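The two volume thresholds above can be checked against your own daily counts with a small helper. This is a minimal sketch under stated assumptions: the function name and input shape are hypothetical, the daily counts are assumed to come from your own reporting, and the "denied" threshold is counted over whatever history you pass in because the criteria above do not name a window for it. It does not check the Consent Mode or reporting identity requirements.

```python
DENIED_EVENTS_MIN = 1000  # events/day with analytics_storage='denied'
GRANTED_USERS_MIN = 1000  # daily users with analytics_storage='granted'
REQUIRED_DAYS = 7         # days on which each threshold must be met
LOOKBACK_DAYS = 28        # window for the 'granted' threshold


def meets_volume_thresholds(denied_events_per_day, granted_users_per_day):
    """Both arguments are lists of daily counts, oldest first, most recent day last."""
    # Assumption: count qualifying 'denied' days over the whole supplied history.
    denied_days = sum(1 for n in denied_events_per_day if n >= DENIED_EVENTS_MIN)
    # Only the most recent 28 days count towards the 'granted' threshold.
    recent_granted = granted_users_per_day[-LOOKBACK_DAYS:]
    granted_days = sum(1 for n in recent_granted if n >= GRANTED_USERS_MIN)
    return denied_days >= REQUIRED_DAYS and granted_days >= REQUIRED_DAYS


# Hypothetical property: plenty of denied events, and 7 strong 'granted' days
# inside the last 28 days.
print(meets_volume_thresholds([1500] * 10, [800] * 21 + [1200] * 7))  # True

# Same strong days, but they fall outside the 28-day window.
print(meets_volume_thresholds([1500] * 10, [1200] * 7 + [800] * 28))  # False
```

Passing this check only means the volume criteria are met. As noted above, model training can still take additional time before modelled data shows up.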
Note: Behavioural modelling is only available in GA4, not Universal Analytics.
What is Conversion modeling in GA4?
Conversion modeling uses machine learning to estimate the impact of traffic sources when conversions cannot be directly tied to the traffic sources due to privacy regulations, restrictions or technical limitations (like restricted third-party cookies and identifiers).
Conversion modelling automatically mixes observed and modelled data in reports to give a full picture of conversion attribution.
Where to find conversion modeling data in GA4?
You can find conversion modelling data in GA4 within specific reports like ‘Events’, ‘Conversions’, and explorations using event scope dimensions.
What is the difference between behavioural modeling and conversion modeling?
While conversion modeling in GA4 focuses on estimating specific conversion events and their attribution to user journeys, behavioural modeling estimates broader user behaviour patterns and engagement on your website or app.
FAQ: If I implement advanced consent mode v2, will I see events of users who denied consent in GA4?
You will be able to see events (some but not all) of users who denied consent.
When consent is denied, GA4 can use behavioural modelling and conversion modelling to estimate user behaviour, plugging any gaps in your reporting.
Let us suppose you want to see the revenue metric for all users, whether the user consent is granted or not.
With advanced consent mode v2 in GA4, you can see some, but not all, revenue metrics for users who denied consent.
However, the level of visibility and accuracy will depend on the following factors:
1) Quality and quantity of the consenting user data.
The effectiveness and accuracy of data modelling in GA4, particularly when using advanced consent mode, depend heavily on the quality and quantity of the data from users who have given consent.
This includes the relevance, completeness, and reliability of the data.
High-quality data leads to better modelling and more accurate predictions.
The quantity of data also plays a crucial role. Models are generally more accurate when they are trained on larger datasets.
A substantial amount of data from consenting users can provide a more representative sample of the overall user base, leading to better modelling outcomes.
2) Type of revenue event (direct purchase events or indirect purchase indicators).
The distinction between direct revenue events (like a “Complete purchase” button click) and indirect indicators (like adding an item to a shopping cart) is a key point.
Direct actions (like direct purchases) by non-consenting users are typically not tracked, so the system relies more on indirect indicators and data modelling.
Google can only estimate these events based on similar users who granted consent.
3) Accuracy of modelling.
The effectiveness of modelling depends on the availability of similar user data with complete revenue information.
The estimates for non-consenting users might be less accurate if you have a diverse user base with varied purchase patterns.
The granularity of your revenue data also plays a role.
Aggregate revenue figures might be more reliable than individual purchase details through modelling.
4) Expectations from Modeled Data.
You will likely see aggregated revenue figures with annotations like “modeled” or “(modeled)” indicating they include estimates for non-consenting users.
You might be able to see trends and general patterns in revenue generation even if individual purchase details are unavailable.
Remember:
1) Modelling provides valuable insights, but it’s crucial to remember that estimates may not be exact.
2) Non-consenting users might exhibit different behaviours compared to those who consent, leading to model inaccuracies.
3) Treat modelled data cautiously and avoid drawing definitive conclusions based solely on it.
If you operate a website which gets fewer than 1,000 visitors/day, your GA4 data collection will be significantly impacted in the near future.
Your tracking will be skewed because of a lack of consented/observed data in your GA4 property.
For data modelling to kick in, your GA4 property needs 1,000+ daily users with analytics_storage='granted' for at least 7 of the previous 28 days.
So, in real life, you will need a lot more than 1,000 visitors/day because most of them will likely deny consent. And the population of users who deny consent will only increase in the future.
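How much more traffic you need depends on your consent rate. The arithmetic can be sketched as below. The function name is hypothetical, and the consent rates are illustrative values rather than measured figures. The sketch covers only the consenting-users threshold, not the separate requirement for 1,000+ daily events with consent denied.

```python
import math


def visitors_needed(consent_rate: float, granted_users_target: int = 1000) -> int:
    """Daily visitors required so that consenting users reach the target.

    consent_rate is the share of visitors who grant consent, e.g. 0.25 for 25%.
    """
    if not 0 < consent_rate <= 1:
        raise ValueError("consent_rate must be greater than 0 and at most 1")
    return math.ceil(granted_users_target / consent_rate)


# With a hypothetical 50% consent rate you need twice the threshold;
# with 25% you need four times as many daily visitors.
print(visitors_needed(0.5))   # 2000
print(visitors_needed(0.25))  # 4000
```

The lower your consent rate, the faster the required traffic grows, which is why improving acceptance rates matters so much for smaller websites.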
You will struggle to find many consenting audiences, especially if you are EU-based. Using BigQuery won’t save you either, as modelled data is not available in BigQuery export.
No observed data = no modelled data.
Without enough observed data from consenting users, GA4’s data modelling techniques won’t have enough information to generate reliable estimates for opt-out user behaviour.
So what can you do then?
Find ways to maximize observed data collection.
1) Review your consent messaging and design to improve user acceptance rates.
2) Offer incentives or rewards for users who consent, such as exclusive content, discounts, or early access to features.
3) Focus on first-party data collection, like collecting email addresses.
By collecting and storing first-party data, you can build a richer user profile even with limited consent rates. This helps overcome data gaps caused by opt-out users and provides valuable insights into your audience.
4) Use server-side tagging.
Server-side tagging can reduce reliance on user consent in several ways. The most obvious one is converting third-party data into first-party data.
During server-side processing, you can analyze and transform third-party data points like device IDs or campaign identifiers into first-party data elements like anonymous user IDs. This conversion gives you direct ownership and control over the data.
5) Focus on qualitative data collection.
Gather qualitative data about user behaviour and preferences directly from your audience.
6) Utilize feedback from customer support channels or social media to gain insights into user experience and pain points.