Getting Started with BigQuery for Marketers

Estimated Reading Time: 8 minutes
November 10, 2021
Getting Started with BigQuery for Marketers

Intro

For many marketers, their first exposure to Google BigQuery (BQ) is through the integration with Google Analytics (GA). In addition to providing ownership of the complete, hit-level dataset from GA, connecting GA to BQ provides numerous opportunities for analysis, modeling, and activation and can help teams grow in their analytics maturity. In this post, we’ll cover the basic concepts you’ll need to get started analyzing your analytics data in BQ.  

Note: This post covers the integrations with BQ for both Google Analytics 360 (GA360) and Google Analytics 4 (GA4), the next generation of GA. When we refer to GA, the information is relevant for both platforms; when there are differences between the platforms, we’ll call them out explicitly.

For more advanced topics, see our other posts: Data Concepts and How They Apply to BQ, and An Overview of the Benefits of BigQuery. Stay tuned for these upcoming topics: I Just Connected Google Analytics to BigQuery – Now What?, and Getting Started with Audience Targeting using GA Data.  

Basics

BigQuery is an enterprise data warehouse within Google Cloud Platform (GCP). Aside from being used to store massive quantities of data, BQ is designed to support both data warehouses and data lakes and provide great performance at scale. For analysts, this means they can query both structured and unstructured data and work more efficiently since it’s possible to query billions of rows within seconds. 

Structurally, BQ has a three-tiered hierarchy of projects, datasets, and tables/views. Across GCP, all actions and configurations must belong to a project, which you can think of as the top-level container. Within the project, you will need at least one dataset to hold the data. Finally, at the lowest level, there are the tables and views. Tables are used to store the data as a set of rows and columns and views are virtual tables defined by an SQL query. To analyze data, you’ll query either tables or views (or both). When setting up the integration between GA and BQ, a new dataset and two tables are created inside your GCP project.

BQ Hierarchy

The actions you take in BQ are called jobs and there are four types: load, copy, query, and extract. We’ll touch on each type of job below.

Permissions

To do any work in BQ, your account needs to have the right permissions. Permissions can be granted at the project or dataset, or individual resource level within datasets. Google provides thorough documentation of the dozens of permissions that exist and the simplified predefined IAM roles. For analysts, we typically recommend being assigned two roles: bigquery.dataViewer and bigquery.jobuser. This gives the account access to see the tables and associated metadata and the ability to run queries against them. 

Accessing Data

There are four main ways that you can interact with your data stored in BQ. The most common approach is through the web browser by visiting console.cloud.google.com. Developers can use the bq command-line tool or the BQ API. Additionally, there are integrations with third-party tools such as Data Studio, R, and Tableau.  

Billing & Pricing

When you create a new project in GCP, you’ll need to set up a Cloud Billing Account. The Cloud Billing Account is responsible for paying for the services and resources used by the project.  However, Google Cloud does have a free tier of service that currently includes 10GB of data storage per month and 1T of query data processing per month. 

If your needs exceed the free tier, there are options for how to manage the costs associated with storing and analyzing your data. Storage costs depend on the quantity of data and how recently the data was modified. For tables that have not been modified for 90 consecutive days, storage costs drop by approximately 50 percent. When analyzing data, you can pay on-demand for queries or opt for flat-rate pricing if you are a high-volume customer. If you are streaming data into or out of BQ, there are also costs associated with streaming inserts and streaming reads.  

With GA360, if you’ve set up the default export of data to BQ which includes a batch export three times per day, you’ll accrue storage and analysis costs. The default export for GA4 occurs once per day and similarly contributes to storage and analysis costs. For both GA360 and GA4, you have the option to upgrade to a streaming export which will add cost for streaming inserts. 

The Four Types of Jobs

Loading & Copying

There are several ways to load data into BQ. You can load data in batches (such as importing a .csv file), via streaming, by using queries to create new data, or through a third-party application or service. The integration between GA and BQ handles loading your analytics data into BQ and new data is loaded into BQ every morning (corresponding to the data from the previous day).

You also have the ability to copy tables in BQ.

Exporting

There are also multiple ways to export data from BQ. From the UI, you can export data to Google Cloud Storage or Google Data Studio. If the table is less than 1GB, you can export it to a single file; otherwise, BQ will export larger tables to multiple files. BQ supports .csv, JSON, Avro, and Parquet file types.

You can also export data using the bq command-line tool, BQ API, or a service like Dataflow. Additionally, you have the ability to export the results of a query (instead of an entire table) by including the EXPORT DATA statement or saving the data directly from the UI.

Querying

As a marketer, you’ll probably spend the majority of your time in BQ writing queries and analyzing the results. Each query produces a result set composed of rows and columns. The easiest way to get started is by running queries from the web browser (console.cloud.google.com) but you can also run queries using the bq command-line tool, BQ API, or third-party tools. From the UI, you can save, share, and schedule queries to run in the future.

Brush Up on SQL

BQ uses its own dialect of SQL called Standard SQL. If you or other members of your team are already familiar with SQL, there’s generally not a steep learning curve. Although there are some features of Standard SQL that may be new or different, most analysts find their previous SQL skills to be very transferable to BQ.

If you need to brush up on your SQL, there are several interactive tutorials from qwiklabs that you can use to get familiar with BQ. Check out BigQuery Basics for Data Analysts and Insights from Data with BigQuery.

Learn the GA Schema

If you plan to work with your analytics data, you’ll need to get familiar with the fields that are exported via the integration with GA. As a first step, we recommend refreshing yourself on how GA collects data, the data model, and some key concepts such as users, sessions, and attribution. Check out the Google Analytics Academy courses (GA360) or playlists (GA4) to brush up on these skills.

Next, review and bookmark these resources from Google to help you get started writing your own queries.

Finally, the real way to learn the schema is by working with it: choose a report or set of metrics from GA, write the query, and compare the results to the GA UI. If they match, congratulations!  If not, review the fields you used and the calculations and try again. It’s typically an iterative process and with the instant feedback from the GA UI, you’ll know when you’re headed in the right direction.

More questions on how to get started with your Google Analytics data in BigQuery?

Submit a Contact Us form to talk with one of our activation specialists.

Author

  • Pam Castricone

    Pam Castricone is currently an Emerging Solutions Strategist at InfoTrust. Specializing in statistical and machine learning models, Pam is passionate about helping her clients uncover greater insights and value from their data assets. As a Google Cloud Certified Professional Data Engineer, Pam also helps her clients put their models into production in the cloud to drive long-term usability and success. When she isn’t analyzing data, Pam enjoys reading, the arts, and going out for brunch.

    View all posts
Last Updated: January 17, 2023

Get Your Assessment

Thank you! We will be in touch with your results soon.
{{ field.placeholder }}
{{ option.name }}

Talk To Us

Talk To Us

Receive Book Updates

Fill out this form to receive email announcements about Crawl, Walk, Run: Advancing Analytics Maturity with Google Marketing Platform. This includes pre-sale dates, official publishing dates, and more.

Search InfoTrust

Leave Us A Review

Leave a review and let us know how we’re doing. Only actual clients, please.