How to query Google Analytics data in BigQuery

  1. Understanding the SQL editor in BigQuery
  2. Introduction to SQL statements
  3. The structure of Google Analytics data table in BigQuery
  4. Introduction to the SELECT statement
  5. How to retrieve all the columns of a data table
  6. How to sort a result set in descending order
  7. How to sort a result set in ascending order
  8. How to use the WHERE conditions to retrieve data
  9. How to use both WHERE conditions and sorting the result set
  10. How to retrieve only particular columns from a data table
  11. How to retrieve only a specific number of records from a data table
  12. How to execute multiple SQL statements

We use SQL to query Google Analytics data in BigQuery.

SQL stands for Structured Query Language. It is a programming language which is used to store, access and manipulate data in a database. 

BigQuery supports two types of SQL: 

  1. Standard SQL
  2. Legacy SQL

Legacy SQL is the current name for what was originally called BigQuery SQL. It is a non-standard SQL dialect which BigQuery used before the launch of BigQuery 2.0. 

Google now recommends that you use standard SQL for querying the data stored in BigQuery.

We will be using standard SQL to query Google Analytics data in BigQuery.

Understanding the SQL editor in BigQuery

The SQL editor is used to add, edit and run SQL statements.

Here is what the empty SQL editor looks like in BigQuery:

Following is an example of the SQL editor which contains one SQL statement:

Following is an example of the SQL editor which contains two SQL statements:

Introduction to SQL statements

A SQL statement is used to perform certain operations on a database. 

These operations can be (but are not limited to):

  • Extract data from a database
  • Insert new data into a database
  • Update data in a database
  • Delete data from a database

Following is an example of a SQL statement:

SELECT event_date
FROM
  `dbrt-ga4.analytics_207472454.events_20210124`
LIMIT
  1000

When you add this SQL statement to the SQL editor, the editor looks like the one below:

SQL statements ignore line breaks

This means the same SQL statement can be written on several lines or on a single line, and it returns the same result either way.

The reason we write a SQL statement on multiple lines is to make it more readable. 
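To illustrate the point about line breaks, here is a short sketch that reuses the example statement from earlier in this article. Both forms are expected to return the same result set; the second one is just harder to read.

```sql
-- Multi-line form (easier to read)
SELECT event_date
FROM
  `dbrt-ga4.analytics_207472454.events_20210124`
LIMIT
  1000;

-- The same statement written on a single line
SELECT event_date FROM `dbrt-ga4.analytics_207472454.events_20210124` LIMIT 1000;
```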

SQL statements are case insensitive

This means the keywords of a SQL statement can be written in uppercase, in lowercase or in a mix of both without changing the result.
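As a sketch of case insensitivity, the two statements below differ only in the case of their keywords and should behave identically. The table path is deliberately left unchanged in both, because BigQuery treats dataset and table names as case sensitive by default, and text values inside quotes are also compared case-sensitively.

```sql
-- Keywords in uppercase
SELECT event_date
FROM
  `dbrt-ga4.analytics_207472454.events_20210124`
LIMIT
  1000;

-- Keywords in lowercase: same result
select event_date
from
  `dbrt-ga4.analytics_207472454.events_20210124`
limit
  1000;
```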

Following is another example of a SQL statement:

You can enter and run multiple SQL statements in your SQL editor

When you add two or more SQL statements to your SQL editor, make sure that they are separated by a semicolon (;).
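A minimal sketch of two statements separated by a semicolon is shown below. It reuses the example table from this article; the choice of columns is only for illustration.

```sql
-- First statement ends with a semicolon
SELECT event_date
FROM `dbrt-ga4.analytics_207472454.events_20210124`
LIMIT 1000;

-- Second statement starts after the semicolon
SELECT event_name
FROM `dbrt-ga4.analytics_207472454.events_20210124`
LIMIT 1000;
```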

The structure of Google Analytics data table in BigQuery

A relational database is made up of one or more tables called data tables.

Each table is identified by a name and is made up of records and fields.

A record corresponds to the row of a table and a field corresponds to the column of a table. 

Following is an example of a Google Analytics data table in BigQuery:

This data table is made up of two records (one for each ‘event_date’) and six columns (‘Row’, ‘event_date’, ‘event_timestamp’, ‘event_name’, ‘event_params.key’ and  ‘event_params.value.string_value’).

Introduction to the SELECT statement

The SQL SELECT statement is used to retrieve records (aka rows) from one or more data tables in BigQuery. The records that are retrieved are known as a result set.

Following is the syntax for the SELECT statement:

SELECT expressions
FROM tables
[WHERE conditions]
[ORDER BY expression [ ASC | DESC ]]
[LIMIT number_rows [ OFFSET offset_value ]]

‘SELECT’, ‘FROM’, ‘WHERE’, ‘ORDER BY’, ‘ASC’, ‘DESC’, ‘LIMIT’ and ‘OFFSET’ are called keywords because they carry special meaning. 

Though you can write keywords in all lowercase (remember SQL is case insensitive), we write keywords in all uppercase to improve the readability of our SQL statements and to make the keywords stand out.

The following are the parameters of the SELECT statement:

  1. expressions
  2. tables
  3. conditions
  4. expression 
  5. number_rows
  6. offset_value

Parameters are like user inputs. These are the values that you provide before you run a SQL statement. Thus a SQL statement is made up of keywords and parameters. 

SELECT expressions (required) => The name of one or more columns that you want to retrieve from a data table. Use * if you want to retrieve all the columns of a data table.

FROM tables (required) => The name of one or more data tables from which you want to retrieve data in BigQuery.

[WHERE conditions] (optional to use) => One or more conditions that must be satisfied for the records to be retrieved. 

Note: Use the ‘AND’ or ‘OR’ operator if you want to specify more than one condition with the WHERE keyword. If no condition is provided then all the records will be retrieved. 

[ORDER BY expression [ ASC | DESC ]]  (optional to use) => Sort the records in the result set by the specified expression. 

Use the modifier ‘ASC’ if you want the result set to be sorted in ascending order by expression.

Use the modifier ‘DESC’ if you want the result set to be sorted in descending order by expression.

Note(1): If no modifier is used with the order by expression then your result set by default would be sorted in ascending order by expression.

Note(2): If you want to use multiple expressions with the ORDER BY keyword then they should be comma-separated.

LIMIT number_rows (optional to use) => Use this parameter to specify the maximum number of rows the result set should return.

[ OFFSET offset_value ] (optional to use) => Use this parameter to skip the first offset_value record(s) of the result set. The rows counted by the LIMIT keyword start after the skipped records. 
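To tie the keywords and parameters together, here is a sketch of a SELECT statement that uses every clause described above. It assumes the example GA4 table used earlier in this article; the condition value 'page_view' and the numbers 10 and 5 are illustrative choices, not values taken from the original case study.

```sql
SELECT event_date, event_name                        -- expressions
FROM `dbrt-ga4.analytics_207472454.events_20210124`  -- tables
WHERE event_name = 'page_view'                       -- conditions (illustrative value)
ORDER BY event_timestamp DESC                        -- expression with the DESC modifier
LIMIT 10 OFFSET 5;                                   -- skip 5 rows, then return at most 10
```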

How to retrieve all the columns of a data table

Consider the following data table named ‘Results_Traffic_Data_Table’:

Follow the steps below to retrieve all the columns of the data table:

Step-1: Click on the ‘QUERY TABLE’ button: 

Once you click on this button, BigQuery will automatically create a SQL statement for you:

Here, `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table` is the name of the data table. 

However, this SQL statement is not complete as it is missing the names of the columns that need to be retrieved. 

Step-2: Let’s retrieve all the columns of the table by typing the character * next to the SELECT keyword:

Step-3: Let’s format the existing query first to improve readability:
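After Step-2 and Step-3, the statement in the editor should look roughly like the sketch below. The table name is the one used in this case study. Whether the automatically created statement also ends with a LIMIT clause is not shown here, so the sketch leaves it out.

```sql
SELECT
  *
FROM
  `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table`
```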

Step-4: Click on the ‘Run’ button:

You should now see all the columns of the table:

How to sort a result set in descending order

Let’s retrieve all the columns of the data table and also sort the records in the result set by ‘gasessions’ field in descending order.

Again we are using the same data table named ‘Results_Traffic_Data_Table’ for this case study:

Follow the steps below to retrieve all the columns of the data table and also sort the records in the result set by ‘gasessions’ in descending order:

Step-1: Modify your SQL statement by typing the following ‘order by gasessions desc’:
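Put together, the modified statement might look like the sketch below. It assumes the statement has no LIMIT clause; if yours ends with one, the ORDER BY clause has to be placed before it.

```sql
SELECT
  *
FROM
  `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table`
ORDER BY
  gasessions DESC
```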

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

EXERCISE: Let’s retrieve all the columns of the data table and also sort the records in the result set but this time by ‘gausers’ field in descending order:

Follow the steps below:

Step-1: Modify your SQL statement like the one below:
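A sketch of the statement for this exercise, assuming the same table and a column named gausers as described above:

```sql
SELECT
  *
FROM
  `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table`
ORDER BY
  gausers DESC
```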

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

How to sort a result set in ascending order

Let’s retrieve all the columns of the data table and also sort the records in the result set by ‘gasessions’ in ascending order:

Again we are using the same data table named ‘Results_Traffic_Data_Table’ for this case study.

Follow the steps below to retrieve all the columns of the data table and also sort the records in the result set by ‘gasessions’ in ascending order:

Step-1: Modify your SQL statement by typing the following ‘order by gasessions asc’:

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

EXERCISE: Let’s retrieve all the columns of the data table and also sort the records in the result set but this time by ‘gapageviews’ field in ascending order:

Follow the steps below:

Step-1: Modify your SQL statement like the one below:
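A sketch of the statement for this exercise, assuming the same table and a column named gapageviews as described above. The ASC modifier is written out for clarity, although ascending order is the default.

```sql
SELECT
  *
FROM
  `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table`
ORDER BY
  gapageviews ASC
```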

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

How to use the WHERE conditions to retrieve data

Again we are using the same data table named ‘Results_Traffic_Data_Table’ for this case study.

EXERCISE #1: Let’s retrieve all the columns where the value of ‘gasourcemedium’ field is ‘google / organic’

Follow the steps below:

Step-1: Modify your SQL statement by typing the following: WHERE gasourcemedium = 'google / organic'
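The complete statement might look like the sketch below. Note that the text value is wrapped in straight single quotes; curly quotes copied from a web page are likely to cause a syntax error.

```sql
SELECT
  *
FROM
  `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table`
WHERE
  gasourcemedium = 'google / organic'
```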

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

EXERCISE #2: Let’s retrieve all the columns where the value of ‘gadate’ field is ‘2021-01-08T00:00:00’ or where the value of ‘gasourcemedium’ field is ‘google / organic’

Follow the steps below:

Step-1: Modify your SQL statement by typing the following: WHERE gadate = '2021-01-08T00:00:00' OR gasourcemedium = 'google / organic'
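A sketch of the complete statement with two conditions joined by OR. A record is returned when at least one of the two conditions is true. The date value is written exactly as it appears in this exercise.

```sql
SELECT
  *
FROM
  `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table`
WHERE
  gadate = '2021-01-08T00:00:00'
  OR gasourcemedium = 'google / organic'
```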

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

EXERCISE #3: Let’s retrieve all the columns where the value of ‘gadate’ field is ‘2021-01-07T00:00:00’ and where the value of ‘gasessions’ field is greater than 100

Follow the steps below:

Step-1: Modify your SQL statement by typing the following: WHERE gadate = '2021-01-07T00:00:00' AND gasessions > 100

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

How to use both WHERE conditions and sorting the result set

Again we are using the same data table named ‘Results_Traffic_Data_Table’ for this case study.

EXERCISE #1: Let’s retrieve all the columns where the value of ‘gadate’ field is ‘2021-01-07T00:00:00’ and also sort the records in the result set by ‘gasessions’ field in descending order

Follow the steps below:

Step-1: Modify your SQL statement by typing the following:

WHERE gadate = '2021-01-07T00:00:00'
ORDER BY gasessions DESC
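Combined with the rest of the statement, the query for this exercise might look like the sketch below. The WHERE clause has to come before the ORDER BY clause.

```sql
SELECT
  *
FROM
  `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table`
WHERE
  gadate = '2021-01-07T00:00:00'
ORDER BY
  gasessions DESC
```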

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

EXERCISE #2: Let’s retrieve all the columns where the value of ‘gadate’ field is ‘2021-01-07T00:00:00’ and where the value of ‘gasessions’ field is greater than 100. Also sort the records in the result set by ‘gasessions’ field in ascending order

Follow the steps below:

Step-1: Modify your SQL statement by typing the following:

WHERE gadate = '2021-01-07T00:00:00' AND gasessions > 100
ORDER BY gasessions ASC
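A sketch of the complete statement for this exercise. Both conditions must be true for a record to be returned, and the matching records are then sorted from the lowest to the highest number of sessions.

```sql
SELECT
  *
FROM
  `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table`
WHERE
  gadate = '2021-01-07T00:00:00'
  AND gasessions > 100
ORDER BY
  gasessions ASC
```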

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

How to retrieve only particular columns from a data table

Again we are using the same data table named ‘Results_Traffic_Data_Table’ for this case study:

EXERCISE #1: Let’s retrieve only the column named ‘gadate’ from the data table

Follow the steps below:

Step-1: Modify your SQL statement by typing the following:

SELECT gadate 
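As a sketch, the complete statement replaces the * character with the column name and keeps the FROM clause unchanged:

```sql
SELECT
  gadate
FROM
  `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table`
```

To retrieve more than one column, separate the column names with commas, for example gadate, gasourcemedium.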

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

EXERCISE #2: Let’s retrieve the following two columns from the data table: ‘gadate’ and ‘gasourcemedium’

Follow the steps below:

Step-1: Modify your SQL statement by typing the following:

SELECT gadate, gasourcemedium

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

EXERCISE #3: Let’s retrieve the following two columns from the data table: ‘gasourcemedium’ and ‘gasessions’. Also sort the records in the result set by ‘gasessions’ field in descending order

Follow the steps below:

Step-1: Modify your SQL statement by typing the following:

SELECT gasourcemedium, gasessions
ORDER BY gasessions DESC

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

EXERCISE #4: Let’s retrieve the following two columns from the data table: ‘gasourcemedium’ and ‘gasessions’ where ‘gasessions’ > 2000. Also sort the records in the result set by ‘gasessions’ field in ascending order

Follow the steps below:

Step-1: Modify your SQL statement by typing the following:

SELECT gasourcemedium, gasessions
WHERE gasessions > 2000
ORDER BY gasessions ASC
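The lines above are typed around the existing FROM clause. A sketch of the complete statement, with the clauses in the order BigQuery expects (SELECT, FROM, WHERE, ORDER BY):

```sql
SELECT
  gasourcemedium,
  gasessions
FROM
  `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table`
WHERE
  gasessions > 2000
ORDER BY
  gasessions ASC
```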

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

How to retrieve only a specific number of records from a data table

Again we are using the same data table named ‘Results_Traffic_Data_Table’ for this case study:

EXERCISE #1: Let’s retrieve all the columns but only the first three records from the data table

Follow the steps below:

Step-1: Modify your SQL statement by typing the following:

SELECT *
LIMIT 3
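A sketch of the complete statement is shown below. Because it has no ORDER BY clause, which three records come back is not guaranteed to be the same on every run; add an ORDER BY clause if you need a predictable "first three".

```sql
SELECT
  *
FROM
  `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table`
LIMIT
  3
```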

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

EXERCISE #2: Let’s retrieve only the column named ‘gadate’ and only the first three records from the data table

Follow the steps below:

Step-1: Modify your SQL statement by typing the following:

SELECT gadate
LIMIT 3

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

EXERCISE #3: Let’s do the following:

1) Retrieve the following two columns from the data table: ‘gasourcemedium’ and ‘gapageviews’.

2) Retrieve only those records where ‘gapageviews’ > 300

3) Sort the records in the result set by ‘gapageviews’ field in descending order

4) Retrieve only the first four records from the data table

Follow the steps below:

Step-1: Modify your SQL statement by typing the following:

SELECT gasourcemedium, gapageviews
WHERE gapageviews > 300
ORDER BY gapageviews DESC
LIMIT 4
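With the FROM clause included, the statement for this exercise might look like the sketch below. Each numbered requirement maps to one clause, noted in the comments.

```sql
SELECT
  gasourcemedium,
  gapageviews        -- 1) the two columns to retrieve
FROM
  `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table`
WHERE
  gapageviews > 300  -- 2) keep only records with more than 300 pageviews
ORDER BY
  gapageviews DESC   -- 3) highest pageviews first
LIMIT
  4                  -- 4) return at most four records
```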

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

EXERCISE #4: Let’s do the following:

1) Retrieve the following two columns from the data table: ‘gasourcemedium’ and ‘gapageviews’.

2) Retrieve only those records where ‘gapageviews’ > 300

3) Sort the records in the result set by ‘gapageviews’ field in descending order

4) Retrieve at most four records from the data table

5) Skip the first two records of the sorted result set (OFFSET), so that the four records returned start from the third record

Follow the steps below:

Step-1: Modify your SQL statement by typing the following:

SELECT gasourcemedium, gapageviews
WHERE gapageviews > 300
ORDER BY gapageviews DESC
LIMIT 4
OFFSET 2
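A sketch of the complete statement for this exercise. In BigQuery the OFFSET rows are skipped first and the LIMIT is applied afterwards, so this query is expected to return the third to sixth records of the sorted result set (fewer if not enough records match).

```sql
SELECT
  gasourcemedium,
  gapageviews
FROM
  `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table`
WHERE
  gapageviews > 300
ORDER BY
  gapageviews DESC
LIMIT
  4
OFFSET
  2  -- skip the first two sorted records, then return up to four
```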

Step-2: Click on the ‘Run’ button. 

You should now see the query results like the one below:

How to execute multiple SQL statements

You can enter two or more SQL statements in the SQL editor. You just need to make sure that they are separated by a semicolon (;).
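As a sketch, the two statements below could be entered together and run in one go. They reuse the case study table; the particular columns and clauses are illustrative.

```sql
-- Statement 1: a small sample of all columns
SELECT *
FROM `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table`
LIMIT 3;

-- Statement 2: sessions by source / medium, highest first
SELECT gasourcemedium, gasessions
FROM `gsheets-ivory-enigma4567.Google_Sheets_Dataset.Results_Traffic_Data_Table`
ORDER BY gasessions DESC;
```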

When you run multiple SQL statements at the same time, you don’t see query results straightaway. What you do see are the links to view the individual query results:

Click on the ‘VIEW RESULTS’ link to see the results for a particular query:

Click on the arrow button to navigate back to the page which lists the links to view individual query results:

Frequently asked questions about how to query Google Analytics data in BigQuery

Which database language is used to query Google Analytics data in BigQuery?

We use SQL to query Google Analytics data in BigQuery. SQL is a programming language which is used to store, access and manipulate data in a database.

Which flavours of SQL does BigQuery support?

BigQuery supports two types of SQL: Standard SQL and Legacy SQL (formerly known as BigQuery SQL).

Which type of SQL should you use with BigQuery?

Google now recommends that you use standard SQL (and not Legacy SQL) for querying the data stored in BigQuery.

Where can I find Google Analytics 4 (GA4) data in BigQuery?

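The following two types of data tables contain your GA4 data in BigQuery:
1) events_YYYYMMDD (one table for each day of exported data, for example events_20210124)
2) events_intraday_YYYYMMDD (the data collected so far for the current day)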

About the Author

Himanshu Sharma

  • Founder, OptimizeSmart.com
  • Over 15 years of experience in digital analytics and marketing
  • Author of four best-selling books on digital analytics and conversion optimization
  • Nominated for Digital Analytics Association Awards for Excellence
  • Runs one of the most popular blogs in the world on digital analytics
  • Consultant to countless small and big businesses over the decade