How to find and remove PII from Google Analytics

Last Updated: September 30, 2022

Table of contents for how to find and remove PII from Google Analytics

  1. Introduction to PII
  2. How to identify if PII is captured in Google Analytics
  3. Steps to create a custom report to identify the PII
  4. How to remove PII from Google Analytics
  5. Understanding the custom task code

Introduction to PII

PII stands for personally identifiable information. Google considers any information that reveals the identity, contact details or location of an individual to be PII.

To protect users’ privacy, Google’s policies prohibit passing this information into Google Analytics reports.

Below are some examples of information that is considered PII:

  • Email address
  • Phone number
  • Social security number
  • Full names of the users
  • Precise location of users

Note: PII is not the same as ‘personal data’ under the EU General Data Protection Regulation (GDPR).

If Google identifies that PII is being collected or sent to your reports, it can terminate your account. It would first notify you that your Google Analytics account is collecting PII, and if the data is not deleted, it can then terminate the account.

Depending on your location, your users’ location and the type of breach, there can be fines as well. In that case, you should contact your legal team.

How to identify if PII is captured in Google Analytics

Step-1: Log in to your Google Analytics account.

Step-2: Click on the ‘Behaviour’ tab on the left-hand menu.

Behaviour menu GA

Step-3: Now select ‘Site Content’ and then ‘All Pages’.

Site content all pages

Step-4: Then filter for ‘@’ in the pages report to check if any email addresses have been captured.

PII Validation in reports

Step-5: If any email addresses have been captured, they will show up in the results.

You can run the same validation in the events reports as well.

Below are some regex patterns that you can use to identify PII in Google Analytics reports:

To identify full email ID formats:

([a-zA-Z0-9_\.-]+)@([\da-zA-Z\.-]+)\.([a-zA-Z\.]{2,6})

To identify social security numbers:

(\d{3}-?\d{2}-?\d{4})

To identify addresses:

(drive|street|road|dr\.|po box|rd\.)

To identify phone numbers:

(\d{3}-?\d{3}-?\d{4})

To identify names:

(fn|ln|lastname|firstname|name|fullname)

You can use these regex patterns in different reports, such as the ‘All Pages’ report or the events reports, to check whether any PII has been captured. You can also create a custom report for the same purpose.
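Before pasting these patterns into a report filter, it can help to try them against a few sample page paths. The sketch below is only an illustration: the names piiPatterns and findPii are hypothetical, the ‘i’ flag is an assumption that mimics case-insensitive filter matching, and the address pattern escapes its dots so that they match a literal period. The patterns are deliberately broad (for example, a 10-digit phone number also matches the SSN pattern), so treat a match as something to review rather than as proof of PII.

```javascript
// Patterns taken from this article. The 'i' flag is an assumption that
// mimics case-insensitive matching in report filters.
var piiPatterns = [
  { name: 'EMAIL', regex: /([a-zA-Z0-9_\.-]+)@([\da-zA-Z\.-]+)\.([a-zA-Z\.]{2,6})/i },
  { name: 'SSN', regex: /(\d{3}-?\d{2}-?\d{4})/ },
  // Dots are escaped here so that 'dr.' and 'rd.' match a literal period.
  { name: 'ADDRESS', regex: /(drive|street|road|dr\.|po box|rd\.)/i },
  { name: 'PHONE', regex: /(\d{3}-?\d{3}-?\d{4})/ },
  { name: 'NAME', regex: /(fn|ln|lastname|firstname|name|fullname)/i }
];

// Hypothetical helper: returns the names of all patterns that match a page path.
function findPii(pagePath) {
  var decoded;
  try {
    // Decode first, because '@' usually appears as '%40' in a raw URL.
    decoded = decodeURIComponent(pagePath);
  } catch (e) {
    decoded = pagePath; // keep the raw value if it is not valid URI encoding
  }
  return piiPatterns
    .filter(function (p) { return p.regex.test(decoded); })
    .map(function (p) { return p.name; });
}

console.log(findPii('/thank-you?email=jane.doe%40example.com')); // [ 'EMAIL' ]
console.log(findPii('/products/shoes'));                         // []
```

You could run a list of exported page paths through a helper like this to decide which reports need a closer look.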

Let’s look at an example of identifying captured PII by using a custom report.

Steps to create a custom report to identify the PII

Step-1: Log in to your Google Analytics account.

Step-2: Click on the ‘Customization’ tab from the left-hand menu.

customization In GA

Step-3: Select ‘Custom reports’ from the options available.

custom reports

Step-4: Click on the ‘New custom report’ tab, as shown below:

new custom report

Step-5: In ‘Report content’, select ‘Page’ as the dimension and ‘Pageviews’ as the metric. Note that we are checking page reports for PII here; you can check events and other reports as well.

Report content

Step-6: We can use any of the regex patterns mentioned above to check for PII. Let’s use the email address regex:

([a-zA-Z0-9_\.-]+)@([\da-zA-Z\.-]+)\.([a-zA-Z\.]{2,6})

In the filter, set up the configuration as shown below.

filter configuration

Step-7: If any PII has been captured, this report will help you identify it.

How to remove PII from Google Analytics

The best way to remove PII from Google Analytics is to follow the custom task implementation suggested by Simo Ahava, which redacts PII from each hit before it is sent.

Follow the steps below to add this custom task in Google Tag Manager:

Step-1: Log in to your Google Tag Manager account.

Step-2: Click on ‘Variables’ from the left-hand menu.

Variables in GTM 2

Step-3: Click on ‘New’ under ‘User-Defined Variables’.

user defined variables

Step-4: Click on the pencil icon in the variable configuration.

Variable configuration

Step-5: From the variable type, choose ‘Custom JavaScript’.

custom javascript

Step-6: Name this variable ‘JS-Remove PII load from Hit Payload’, then copy the custom JavaScript from GTM guru Simo and paste it into the variable.

Custom JS Variable

Step-7: Now, you can navigate to the global Google Analytics variable in GTM.

Global Google analytics Variable

Step-8: Click on ‘Edit’ and under ‘More settings’, select ‘Add Field’ as shown below:

Add Field in GTM

Step-9: In the field name, enter ‘customTask’, and in the field value, select the variable that we created.

custom task GTM 1

Step-10: Click ‘Save’ and publish the changes in Google Tag Manager.
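To give a rough idea of what a customTask variable does, here is a minimal sketch. It is not Simo’s code. It assumes the analytics.js task model, where model.get('sendHitTask') returns the function that sends the hit and ‘hitPayload’ holds the URL-encoded hit string. The name removePiiCustomTask and the crude placeholder pattern are hypothetical: the pattern redacts an entire parameter value whenever it contains an encoded ‘@’ (‘%40’).

```javascript
// In GTM, the Custom JavaScript variable body would be just the anonymous
// function on the right-hand side. It is assigned to a (hypothetical) name
// here only so that the sketch can run on its own.
var removePiiCustomTask = function () {
  return function (model) {
    // Keep a reference to the original task that sends the hit.
    var originalSendHitTask = model.get('sendHitTask');

    // Replace it with a wrapper that edits the payload before sending.
    model.set('sendHitTask', function (sendModel) {
      var payload = sendModel.get('hitPayload');

      // Placeholder redaction (an assumption for illustration): replace any
      // whole parameter value that contains an encoded '@' sign.
      var cleaned = payload.replace(/[^=&]+%40[^=&]+/g, '%5BREDACTED%20EMAIL%5D');

      // The third argument marks the value as temporary for this hit.
      sendModel.set('hitPayload', cleaned, true);
      originalSendHitTask(sendModel);
    });
  };
};
```

Because the wrapper calls the original sendHitTask at the end, the hit is still sent; only its payload is changed.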

Understanding the custom task code

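The first part of the code is the configuration object, as shown below:

 var piiRegex = [{
       name: 'EMAIL',        // label used in the '[REDACTED EMAIL]' replacement
       regex: /.{4}@.{4}/g   // matches '@' plus the four characters on either side
 }];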

It is an array of objects, each with two properties: name and regex. In the above example, the name is ‘EMAIL’ and the regex checks for email addresses.

The first property, name, is the label that appears after ‘REDACTED’ in the replacement string. If the name is ‘EMAIL’ and an email address is captured, then once the custom task is implemented it will be shown as ‘[REDACTED EMAIL]’ in your Google Analytics reports wherever PII was removed.

The second property is the regex. You can create your own regex or use the patterns listed earlier in this article to find any kind of PII (email addresses, phone numbers, SSNs, etc.).

When this custom task code is implemented in GTM, it runs on every hit payload, checks whether any PII has been captured and replaces each match with the ‘[REDACTED name]’ string built from the configuration array.
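The replacement step described above can be sketched as a standalone function. This is an illustration under stated assumptions, not the original custom task: redactPayload is a hypothetical helper, and the sample payload is made up. It splits a hit payload into key=value pairs, decodes each value, applies every pattern in the configuration array and re-encodes the result.

```javascript
// Configuration array in the shape described in this article.
var piiRegex = [{
  name: 'EMAIL',
  regex: /.{4}@.{4}/g
}];

// Hypothetical helper: redacts PII from a URL-encoded hit payload string.
function redactPayload(hitPayload, patterns) {
  return hitPayload.split('&').map(function (pair) {
    var idx = pair.indexOf('=');
    if (idx === -1) { return pair; } // no value to inspect
    var key = pair.slice(0, idx);
    var val = pair.slice(idx + 1);
    try {
      val = decodeURIComponent(val);
    } catch (e) {
      return pair; // leave malformed values untouched
    }
    patterns.forEach(function (pii) {
      val = val.replace(pii.regex, '[REDACTED ' + pii.name + ']');
    });
    return key + '=' + encodeURIComponent(val);
  }).join('&');
}

var cleaned = redactPayload('v=1&t=pageview&dp=%2Fthanks%3Femail%3Djane%40example.com', piiRegex);
console.log(decodeURIComponent(cleaned));
// v=1&t=pageview&dp=/thanks?email=[REDACTED EMAIL]ple.com
```

Note that this regex only replaces the four characters on either side of the ‘@’, so part of the domain (‘ple.com’) is left behind. If you need the whole address removed, you would use a stricter pattern.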

I hope this article helps you find any PII captured in your Google Analytics reports and remove it. Also, make sure that you audit your accounts on a regular basis.

About the Author

Himanshu Sharma

  • Founder, OptimizeSmart.com
  • Over 15 years of experience in digital analytics and marketing
  • Author of four best-selling books on digital analytics and conversion optimization
  • Nominated for Digital Analytics Association Awards for Excellence
  • Runs one of the most popular blogs in the world on digital analytics
  • Consultant to countless small and big businesses over the decade
