Impact of Google Advanced Consent Mode on BigQuery & GDPR

Impact of Google Advanced Consent Mode on BigQuery

When you use Google Advanced Consent Mode, the data discrepancy between your GA4 property and your GA4 BigQuery export increases considerably, and you start collecting junk data in BigQuery.

What you see in your GA4 reports can differ notably from what is available in your BigQuery export tables.

For example, you could see far more conversions and purchase events in BigQuery than in your GA4 property.

This happens because, by default, the GA4 BigQuery export does not fully honour Google Advanced Consent Mode. It continues to receive event data (some, but not all) from your GA4 property even if users decline consent.

GA4 BigQuery export includes the cookieless pings collected by GA4 when user consent is not given.

These “cookieless pings” in GA4 from unconsented users contain limited, anonymized event data. They generally include:

  • Event names and details (page views, clicks, etc.).
  • Limited device and browser information.
  • IP-based geolocation (anonymized).
  • Consent status.
  • Events without personal identifiers (such as cookies, device IDs, user IDs, client IDs, advertising IDs, etc.).
  • Data related to anonymized sessions (session start and end, session metadata, etc.). But without consistent identifiers, it’s difficult to link events within a session.

All of this data is pretty much useless to analyze.

There might be a few cases where some high-level or aggregated insights can be squeezed out.

But by and large, it is junk data.

For example,

1) The user_pseudo_id for unconsented users in GA4 “cookieless pings” is either missing or different for each session, making it impossible to track the same user across multiple sessions or to link events to a single user over time.

Without personal identifiers, tracking individual user journeys or behaviors across sessions is almost impossible.

2) Without identifying users, you can’t reliably attribute conversions, purchases, or events to specific campaigns, ads, or referral sources.

3) Without consistent identifiers for unconsented users, the data becomes fragmented and unreliable at the user level. You lose the ability to track users across different devices and platforms. Each session or even each event might appear as a unique user, severely distorting user-based metrics.

4) Without identifiers, it is hard to deduplicate events, which can lead to over-reporting of user actions and makes the data less reliable.

When the data is not reliable either at the user or the session level, it is pretty much junk data.
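
To see how much of your own export is affected, you can count the events that arrive without identifiers. The query below is a minimal sketch, not a definitive check: the table ID is a placeholder you would replace with your own, and it assumes the session ID is stored in the standard `ga_session_id` event parameter.

```sql
-- Sketch: count exported events that lack the identifiers needed for
-- user-level and session-level analysis, split by analytics consent status.
-- Assumption: `your-project.analytics_XXXXXXXXX.events_*` is a placeholder table ID.
SELECT
  COALESCE(privacy_info.analytics_storage, 'unset') AS analytics_storage,
  COUNT(*) AS event_count,
  -- Events with no pseudonymous user ID
  COUNTIF(user_pseudo_id IS NULL) AS events_without_user_pseudo_id,
  -- Events with no session ID in event_params
  COUNTIF(
    (SELECT ep.value.int_value
     FROM UNNEST(event_params) AS ep
     WHERE ep.key = 'ga_session_id') IS NULL
  ) AS events_without_session_id
FROM
  `your-project.analytics_XXXXXXXXX.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN
    FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
    AND
    FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
GROUP BY
  1
ORDER BY
  event_count DESC
```

If most of the events without identifiers sit in the rows where consent was not granted, that is a strong hint that cookieless pings are the source of the junk data.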

So what can you do?

If you use Google Advanced Consent Mode, consent signal values (yes, no, null) – which reflect the user’s consent status – are often included in the BigQuery export.

#1 You can use ‘privacy_info.analytics_storage’ and ‘privacy_info.ads_storage’ fields to filter out events based on user consent for analytics and ads tracking.

SELECT 
  event_date,
  event_name,
  privacy_info.analytics_storage,
  privacy_info.ads_storage,
  COUNT(*) AS event_count
FROM 
  `<Use your table id_*>`
WHERE 
  _TABLE_SUFFIX BETWEEN 
    FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
    AND 
    FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
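  -- Keep events where at least one consent type was granted.
  -- LOWER() makes the match case-insensitive, since the export may store 'Yes'/'No'.
  AND (LOWER(privacy_info.analytics_storage) = 'yes' OR LOWER(privacy_info.ads_storage) = 'yes')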
GROUP BY 
  1, 2, 3, 4
ORDER BY 
  event_date DESC, event_count DESC

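#2 You can also use the consent signals to isolate data from users who haven’t provided consent, which is likely the source of anonymized or incomplete data. The query below counts those unconsented events, so you can see how much of your export they make up before filtering them out.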

SELECT 
  event_date,
  event_name,
  privacy_info.analytics_storage,
  privacy_info.ads_storage,
  COUNT(*) AS event_count
FROM 
  `<Use your table id_*>`
WHERE 
  _TABLE_SUFFIX BETWEEN 
    FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
    AND 
    FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
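  -- Keep events where neither consent type was granted (denied or not set).
  -- LOWER() makes the match case-insensitive, since the export may store 'Yes'/'No'.
  AND (privacy_info.analytics_storage IS NULL OR LOWER(privacy_info.analytics_storage) = 'no')
  AND (privacy_info.ads_storage IS NULL OR LOWER(privacy_info.ads_storage) = 'no')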
GROUP BY 
  1, 2, 3, 4
ORDER BY 
  event_date DESC, event_count DESC

#3 Since the junk data you see likely comes from sessions or events without user identifiers, you can filter out rows that lack these identifiers.
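
A minimal sketch of such a filter is shown here. It assumes that "identifiers" means the `user_pseudo_id` column and the `ga_session_id` event parameter, and the table ID is a placeholder you would replace with your own.

```sql
-- Sketch: keep only events that carry both a user identifier and a session ID.
-- Assumption: `your-project.analytics_XXXXXXXXX.events_*` is a placeholder table ID.
SELECT
  event_date,
  event_name,
  COUNT(*) AS event_count,
  COUNT(DISTINCT user_pseudo_id) AS users
FROM
  `your-project.analytics_XXXXXXXXX.events_*`
WHERE
  _TABLE_SUFFIX BETWEEN
    FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
    AND
    FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 1 DAY))
  -- Drop rows with no pseudonymous user ID
  AND user_pseudo_id IS NOT NULL
  -- Drop rows with no session ID in event_params
  AND (SELECT ep.value.int_value
       FROM UNNEST(event_params) AS ep
       WHERE ep.key = 'ga_session_id') IS NOT NULL
GROUP BY
  1, 2
ORDER BY
  event_date DESC, event_count DESC
```

This filter looks only at identifiers, not at consent status, so it can be combined with the consent conditions from #1 if you want both checks.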

#4 You can create custom views or tables in BigQuery that exclude unconsented data by default.

This way, you always work with consented data in your analysis.

The SQL below creates a new view containing only consented events data:

CREATE OR REPLACE VIEW `dbrt-ga4.analytics_207472454.consented_events` AS
SELECT 
  *
FROM 
  `<Use your table id_*>`
WHERE 
  (_TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
  AND FORMAT_DATE('%Y%m%d', CURRENT_DATE()))
  -- LOWER() makes the match case-insensitive, since the export may store 'Yes'/'No'.
  AND (LOWER(privacy_info.analytics_storage) = 'yes'
  OR LOWER(privacy_info.ads_storage) = 'yes')

The following SQL creates a materialized table to store the consented data:

CREATE OR REPLACE TABLE `dbrt-ga4.analytics_207472454.consented_events_table` AS
SELECT 
  *
FROM 
  `dbrt-ga4.analytics_207472454.events_*`
WHERE 
  _TABLE_SUFFIX BETWEEN FORMAT_DATE('%Y%m%d', DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY))
  AND FORMAT_DATE('%Y%m%d', CURRENT_DATE())
  -- LOWER() makes the match case-insensitive, since the export may store 'Yes'/'No'.
  AND (LOWER(privacy_info.analytics_storage) = 'yes'
  OR LOWER(privacy_info.ads_storage) = 'yes')
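
Once the view exists, your analysis queries can read from it instead of the raw export. The query below is only a sketch of that usage: it assumes the `consented_events` view above has already been created, and the `purchase` event name is used purely as an illustration.

```sql
-- Sketch: daily purchase events and purchasers, read from the consented view
-- instead of the raw events_* tables.
-- Assumption: the consented_events view from the example above already exists.
SELECT
  event_date,
  COUNT(*) AS purchase_events,
  COUNT(DISTINCT user_pseudo_id) AS purchasers
FROM
  `dbrt-ga4.analytics_207472454.consented_events`
WHERE
  event_name = 'purchase'
GROUP BY
  event_date
ORDER BY
  event_date DESC
```

Because the consent filter lives in the view, you don't have to repeat it in every query.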

Impact of Google Advanced Consent Mode on GDPR

The Google Advanced Consent Mode is not fully GDPR compliant because BigQuery collects event data (some but not all) from your GA4 property even if users decline consent.

There are three categories of data under GDPR that everyone should be aware of:

#1 Consented data – the user gave you the permission to process the data.

#2 Unconsented data – the user did not give you permission to process the data. Even if the user disables the cookie consent banner on your website or does not participate in providing consent, it is still unconsented data. Lack of consent status means no consent.

#3 Data you are allowed to process under legitimate interest – no consent required.

There should not be any overlap between these three categories as far as compliance is concerned. This is a very important piece of information.

For example, if you are asking for consent, then it means that without consent, you are not allowed to process the data under legitimate interest.

So if a user denies consent, you cannot continue to process their unconsented data under legitimate interest.

If an organization seeks consent for data processing, then that action in itself implies that consent is the appropriate legal basis.

There needs to be a clear distinction between consented data, unconsented data and the data you are allowed to process under legitimate interest.

Anonymized and encrypted unconsented data is still unconsented data.

Even if the data has been anonymized and/or encrypted to protect user privacy, the core issue remains: users did not provide explicit consent for the collection, use, or analysis of their data.

It is a clear-cut violation of GDPR.

Play with unconsented data at your own risk.

About the Author

Himanshu Sharma

  • Founder, OptimizeSmart.com
  • Over 15 years of experience in digital analytics and marketing
  • Author of four best-selling books on digital analytics and conversion optimization
  • Nominated for Digital Analytics Association Awards for Excellence
  • Runs one of the most popular blogs in the world on digital analytics
  • Consultant to countless small and big businesses over the decade