As businesses transition from Universal Analytics to Google Analytics 4 (GA4) and familiarize themselves with its features, many need to access their data outside the GA4 UI, for example to build custom reporting or to work with unaggregated analytics data. Google BigQuery and the GA4 Data API are the two options to consider. This article highlights the main differences between them and offers insight into when to use each, since the right choice depends on the specific business case.
GA4 Data API and Google BigQuery (GBQ) Overview
The GA4 Data API allows you to access your GA4 data programmatically using client libraries for languages such as Python, Node, Java, and more. For businesses, this is a key step in automating and customizing business reports using the same data available within the UI. The key point is that data from the API should always match the GA4 UI, even when Consent Mode is enabled.
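To make the programmatic access concrete, here is a minimal Python sketch of the JSON body for the Data API's `runReport` method and of flattening a response into plain rows. It uses only the standard library and does not send a request; the property ID and the sample response values are made up for illustration, and the helper names `build_report_request` and `flatten_report` are hypothetical, not part of any Google library.

```python
import json

# Hypothetical property ID; replace with your own GA4 property.
PROPERTY_ID = "123456789"
ENDPOINT = (
    "https://analyticsdata.googleapis.com/v1beta/"
    f"properties/{PROPERTY_ID}:runReport"
)


def build_report_request(dimensions, metrics, start_date, end_date):
    """Build the JSON body for a runReport call."""
    return {
        "dateRanges": [{"startDate": start_date, "endDate": end_date}],
        "dimensions": [{"name": name} for name in dimensions],
        "metrics": [{"name": name} for name in metrics],
    }


def flatten_report(response):
    """Turn a runReport response into a list of flat dicts, one per row."""
    dim_names = [h["name"] for h in response.get("dimensionHeaders", [])]
    met_names = [h["name"] for h in response.get("metricHeaders", [])]
    records = []
    for row in response.get("rows", []):
        record = {}
        for name, cell in zip(dim_names, row.get("dimensionValues", [])):
            record[name] = cell["value"]
        for name, cell in zip(met_names, row.get("metricValues", [])):
            record[name] = cell["value"]  # metric values arrive as strings
        records.append(record)
    return records


body = build_report_request(["country"], ["activeUsers"], "7daysAgo", "today")
print(ENDPOINT)
print(json.dumps(body, indent=2))

# Illustrative response in the runReport shape (values are invented).
sample_response = {
    "dimensionHeaders": [{"name": "country"}],
    "metricHeaders": [{"name": "activeUsers", "type": "TYPE_INTEGER"}],
    "rows": [
        {"dimensionValues": [{"value": "Japan"}], "metricValues": [{"value": "2541"}]},
        {"dimensionValues": [{"value": "Canada"}], "metricValues": [{"value": "1873"}]},
    ],
}
print(flatten_report(sample_response))
```

In practice the body would be sent with an authenticated POST to the endpoint, or the same report would be requested through one of Google's client libraries; authentication is out of scope for this sketch.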
You may already know that the GA4 UI displays aggregated information for daily reporting. GBQ is a cloud-hosted data warehouse. When it is linked with GA4, you can access your data in its raw form, which supports more complex use cases and gives you more flexibility and control over how you analyze and use your data.
Although both the API and GBQ allow you to access GA4 data, the structure and flexibility of the data differ. The table below highlights some of the main differences:
GA4 Data API | Google BigQuery |
---|---|
Data is aggregated | Raw data exports |
Subject to sampling, cardinality, and data retention limits when quota limits are exceeded | No sampling, cardinality, or data retention limits |
Results from Google signals and modeled data are accessible. | Google signals and modeled data are not exported to BQ. |
Only parameters registered in the UI are accessible through the API. | Event-, user-, and item-scoped custom parameters are exported without limits, even if they are not registered. |
Just like the UI, it might be difficult or impossible to combine different parameter scopes without compromising the accuracy of your results. | Ability to blend different scopes after doing some ETL on the fields |
Session-level metrics are available through the API in third-party tools such as Looker Studio. | As of the time of this article, session-level parameters are not available by default. Some work is needed to create queries to mimic session-scoped behaviors similar to the GA4 UI. |
Token quotas restrict the number of calls to the API per day, after which you can’t pull data. | Google BQ is not token-restricted. |
Since the data pulled is similar to results from the UI, its use cases are limited/basic. | Better suited to running advanced analyses (e.g., propensity modeling), building custom attribution models, creating audiences, and pushing data to other tools |
GA4 Data API and Google BigQuery (GBQ) Use Cases
Reporting / Visualization
For a report that mirrors the GA4 UI from an external platform, use the Data API. Reports from the API match GA4 Exploration reports in their dimensions, metrics, date ranges, and filters. Two general strategies for building your reports are:
- Native Integrations with a BI Tool: Some BI tools, like Looker Studio, support integration with GA4. Valid credentials are required to access the GA4 dataset to build reports. However, every dashboard load requires data to be fetched from the API source. Frequent access to the dashboard by many users may cause delays and restrictions.
- Custom Integrations with a BI Tool: You can use SQL to create reports from Data API data within a BI platform, even one without a GA4 integration, by regularly loading the data into an intermediate SQL database. This requires custom data engineering or a third-party ETL tool like Fivetran or OWOX BI. Though more technical, this solution is more durable if done correctly.
Both approaches offer parity with GA4 UI, but the aggregated data is difficult to customize. For instance, Page Exit Rate (as of this article) is unavailable through the API. To get more customization and granularity, use the BigQuery export dataset.
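For the custom integration strategy, the loading step can be sketched as follows. This minimal example assumes the Data API rows have already been fetched and flattened into dicts, and uses an in-memory SQLite database as a stand-in for the intermediate SQL database; the table name `ga4_daily`, its columns, and the sample values are hypothetical.

```python
import sqlite3


def load_rows(conn, rows):
    """Upsert daily report rows so that reruns do not create duplicates."""
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS ga4_daily (
            date TEXT NOT NULL,
            country TEXT NOT NULL,
            active_users INTEGER NOT NULL,
            PRIMARY KEY (date, country)
        )
        """
    )
    conn.executemany(
        "INSERT OR REPLACE INTO ga4_daily (date, country, active_users) "
        "VALUES (?, ?, ?)",
        [(r["date"], r["country"], int(r["activeUsers"])) for r in rows],
    )
    conn.commit()


conn = sqlite3.connect(":memory:")

# First scheduled run (invented values; metrics arrive as strings).
load_rows(conn, [
    {"date": "20240101", "country": "Japan", "activeUsers": "100"},
    {"date": "20240101", "country": "Canada", "activeUsers": "40"},
])

# A later run restates one day and adds a new one.
load_rows(conn, [
    {"date": "20240101", "country": "Japan", "activeUsers": "120"},
    {"date": "20240102", "country": "Japan", "activeUsers": "90"},
])

row_count = conn.execute("SELECT COUNT(*) FROM ga4_daily").fetchone()[0]
jan1_total = conn.execute(
    "SELECT SUM(active_users) FROM ga4_daily WHERE date = '20240101'"
).fetchone()[0]
print(row_count, jan1_total)
```

Keying the table on the report's dimensions lets a scheduled job reload recent days safely, which matters because recent GA4 figures can still change after the first pull.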
Audience Analysis
Collecting and analyzing user audience lists for remarketing purposes is a common practice in analytics. This involves tracking data points for each user’s website journey. Because the Data API returns only aggregated data, BigQuery exports are the recommended way to access event-level data for user audience analysis. GA4 User Data export enables syncing of audience data with email services and CRMs, improving remarketing optimization.
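As a small illustration of why event-level data matters here, the sketch below builds a simple remarketing audience: users who added an item to the cart but never purchased. It runs on a few invented rows shaped like the BigQuery export, and the audience definition itself is an assumption chosen for the example.

```python
def cart_abandoners(events):
    """Return user IDs with an add_to_cart event but no purchase event."""
    added = {e["user_pseudo_id"] for e in events if e["event_name"] == "add_to_cart"}
    purchased = {e["user_pseudo_id"] for e in events if e["event_name"] == "purchase"}
    return sorted(added - purchased)


# Invented event rows; real export rows carry many more fields.
sample_events = [
    {"user_pseudo_id": "A", "event_name": "page_view"},
    {"user_pseudo_id": "A", "event_name": "add_to_cart"},
    {"user_pseudo_id": "A", "event_name": "purchase"},
    {"user_pseudo_id": "B", "event_name": "add_to_cart"},
    {"user_pseudo_id": "C", "event_name": "page_view"},
    {"user_pseudo_id": "D", "event_name": "add_to_cart"},
]

audience = cart_abandoners(sample_events)
print(audience)
```

An aggregated report can tell you how many users added to cart and how many purchased, but not which users fall into the gap between the two, which is what an audience list needs.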
Advanced Modeling / Machine Learning
Google Analytics data can be used to predict which campaigns and channels are most effective in driving conversions. Two data sources can be used: Data API for aggregate data and BigQuery for events. BigQuery is ideal for multi-touch attribution, which tracks user interaction with your website and other channels. In contrast, the Data API is great for Media Mix Modeling and other regression-based models that require aggregated spending and interaction data.
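To show what multi-touch attribution on event-level data can look like, here is a minimal sketch of linear attribution, one simple rule among many, in which every touchpoint on a converting path receives an equal share of the credit. The paths and channel names are invented, and in practice they would be derived from BigQuery events grouped by user.

```python
from collections import defaultdict


def linear_attribution(paths):
    """Split each path's conversions equally across its touchpoints."""
    credit = defaultdict(float)
    for touchpoints, conversions in paths:
        if not touchpoints:
            continue  # nothing to credit on an empty path
        share = conversions / len(touchpoints)
        for channel in touchpoints:
            credit[channel] += share
    return dict(credit)


# (ordered touchpoints, number of conversions on that path); invented data.
sample_paths = [
    (["Paid Search", "Email"], 1),
    (["Organic Search"], 1),
    (["Paid Search", "Organic Search", "Email", "Email"], 2),
]

credit = linear_attribution(sample_paths)
for channel, value in sorted(credit.items()):
    print(f"{channel}: {value:.2f}")
```

A channel that appears twice on a path is credited twice, and the credited total always equals the total number of conversions. Media Mix Modeling, by contrast, would start from aggregated spend and interaction totals per period rather than per-user paths.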
Conclusion
This article covered the differences between the Data API and GBQ, and their use cases, for accessing GA4 data. For data science and customization, GBQ is preferred, while the API is suitable for general reporting needs.
To learn more about limitations and best practices when using these platforms to access GA4 data, check out our sister article, which delves deeper into the topic.