GDPR and Google Analytics: Is It Really Illegal?

January 20, 2022

**Important – This is not legal counsel. The materials provided are for informational purposes only and not for the purpose of providing legal advice. Final decisions must be made by your own legal representation.**

UPDATE: Additional opinions have been issued by other EU DPAs which update the guidance provided in this piece. Reference the continuation of this story for the current status.

The analytics and advertising worlds have been awash with privacy news about the Austrian Data Protection Authority’s ruling that a website’s use of Google Analytics (GA) was a General Data Protection Regulation (GDPR) violation due to the transfer of EU residents’ personal data to the United States. Media headlines (and email subjects from every EU-based technology provider) are quick to declare flatly that “Google Analytics is illegal”; this misses a lot of the nuance from the ruling and is not necessarily true for an organization today. As with most things legal, there is a lot of gray area.

Let’s talk about it! 

So what exactly was the ruling?

To very briefly summarize, in at least the instance of the complaint made on August 14, 2020, it was found that the data in question collected via GA did constitute personal data. Further, the safeguards in place to protect this personal data were insufficient to protect against identification of the user by U.S. intelligence agencies. The processing was therefore unlawful due to the transfer of the personal data outside the EU to the United States. The data exporter (website) was found to be responsible while the data importer (Google) was not found to be responsible as they are a processor; it is the responsibility of the controller (the website) to ensure the proper protections are in place with the processors being worked with. No legal remedies were imposed (no fine) as the website in question has changed ownership to a German entity, leaving the decision of further remedies to the German DPA. 

What was the data which was collected via Google Analytics?

The website in question was using the free version of Google Universal Analytics. They had accepted all standard contractual terms with Google, Google Signals was not enabled, and no “User ID” was being collected in the GA configuration. Based upon the data provided in the complaint, it appears the website was collecting just base pageview hits, as well as scroll tracking with custom dimensions appearing to be configured. Importantly, the ruling calls out that the IP Anonymization setting was not properly configured in the GA tagging.

Due to how GA was configured, at minimum, IP address, Google ID (id value stored in _gid cookie when user is logged in to Google services in their browser), Client ID (cid), and browser characteristics were being collected.

Why was the data considered “personal data” as defined by GDPR?

This ruling specifically leaves open the question of whether an IP address in isolation is personal data, although this has been determined to be the case in some past judgments. Beyond the IP address, the unique identifiers are also considered. These identifiers are the Google Analytics ID (_gid) and the Client ID (cid).

Let’s take a quick detour to explain these two identifiers:

  1. Client ID (cid) – this is a unique identifier assigned to a user on a single website. It is stored in a first-party cookie and is collected in all GA hits. This identifier allows GA to associate hits together to report on session metrics and sessions together to report on user metrics.
  2. Google Analytics ID (_gid) – this is a unique identifier assigned to a user on a single website when the user is logged in to a Google service in the same browser with which they are accessing the website. When this condition is true it will be assigned to the user, stored in a first-party cookie, and collected in all GA hits. If the user has Ads Personalization enabled in their Google account, this ID is used to associate pages viewed and actions taken with the user to enhance targeting and personalization by Google. When a website has Google Signals enabled in their GA property, this ID and associated data is used to enhance the audience creation and demographics reporting available to the website owner in GA.
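
To make the role of the Client ID concrete, here is a minimal Python sketch of how hits that share a cid can be grouped together. The hit records, field names, and the `pages_by_client` helper are invented for illustration; this is not GA's actual wire format or processing pipeline.

```python
from collections import defaultdict

# Hypothetical hits; the field names are illustrative, not GA's wire format.
hits = [
    {"cid": "1111.1600000000", "page": "/home"},
    {"cid": "2222.1600000500", "page": "/home"},
    {"cid": "1111.1600000000", "page": "/pricing"},
]


def pages_by_client(hits):
    """Group page paths by Client ID, preserving the order of the hits."""
    grouped = defaultdict(list)
    for hit in hits:
        grouped[hit["cid"]].append(hit["page"])
    return dict(grouped)


users = pages_by_client(hits)
print(len(users))                # 2 distinct browsers
print(users["1111.1600000000"])  # ['/home', '/pricing']
```

Nothing in this grouping names a person, but it does tie a sequence of page views to one browser, which is why identifiers like these come under scrutiny.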

Google makes the argument that each of these identifiers is unique to the site in question and therefore is not sufficient to identify the user. Further, no means are used to link the identification numbers in the case of the complaint because the user did not have Ads Personalization enabled in their Google account.

The ruling counters that the possibility of associating these identifiers clearly exists, as evidenced by the fact that a single settings change is enough for them to be associated. Google therefore cannot take the position that, due to “general discretion,” no means are used to identify the natural person. As a result, the identification numbers at issue may constitute personal data.

Beyond this, due to the presence of additional data points (IP address is specifically cited), available data can be combined to identify the user. This is true for Google and, more importantly for this judgment, for the U.S. authorities as well.

How does this apply to the data that I’m collecting with Google Analytics?

Put simply, it means that according to at least the Austrian DPA, GA used with the settings as outlined earlier is collecting personal data and is therefore in scope of GDPR.

But we have different settings applied … it’s not the same scenario, right?

The complaint in this case was made against a site using free Google Universal Analytics with basically an out-of-the-box implementation. At least as of the writing of this article, the _gid and cid are still being collected in the same manner as raised in this case when the user is logged in to a Google service in the same browser with which a website using default Google Universal Analytics is accessed. It is therefore possible that these identifiers would be considered personal data. 

[Important call-out: this does not mean that they for sure would be. The ruling states “the Google Analytics identification numbers at issue here may constitute personal data.” It is the presence of these in addition to the IP address and other browser identifiers which makes the collection for sure in scope of the definition.]

The additional presence of the IP address in this case is a result of Anonymize IP being improperly configured on the site in question. The ruling therefore does not consider an outcome where this setting is properly configured and the IP address is not processed and stored by GA. This is where things get a bit murky. If this had been properly configured and the IP address was not available to be combined, would the result have been the same? We can’t know definitively until a case is considered including the use of this GA configuration. 

Cool, we use Anonymize IP so we’re in the clear at least for now, right?

Not exactly. The explanation above is for an implementation of free Google Universal Analytics with just base tracking configured. Most of you reading this will have far more data being collected than simply basic GA tracking. 

If you are assigning a User ID when a user registers on your site and collecting that in GA, that would likely fall in the scope of personal data.

If you are collecting a handful of additional identifiers and combining that information together, that would likely fall in the scope of personal data.

If you have Google Signals enabled and are using GA data for audience creation and targeting, that will likely put data in scope of personal data.

If you are doing downstream processing of GA data where it is combined with other datasets which have identifier information, that could qualify as personal data.

This list could get long …

Ultimately the determination here depends upon what you are collecting, what types of integration you are doing, and what settings you have configured. As a general rule of thumb, you should run on the assumption that yes, data collection with GA will be classified as personal data and is therefore in scope of GDPR.
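
As a rough illustration of the downstream-processing scenario, the Python sketch below joins a pseudonymous analytics export with a second dataset that carries the same identifier. All records, field names, and the `join_with_crm` helper are hypothetical; the point is only that the join turns an otherwise opaque ID into data about a named person.

```python
# Hypothetical analytics export keyed by a pseudonymous identifier.
analytics_rows = [
    {"user_id": "u-42", "page": "/checkout"},
    {"user_id": "u-77", "page": "/blog"},
]

# Hypothetical CRM table that maps the same identifier to a named person.
crm = {"u-42": {"email": "anna@example.com"}}


def join_with_crm(rows, crm):
    """Attach CRM fields to every analytics row whose identifier is known."""
    joined = []
    for row in rows:
        match = crm.get(row["user_id"])
        if match is not None:
            joined.append({**row, **match})
    return joined


identified = join_with_crm(analytics_rows, crm)
print(identified)
# [{'user_id': 'u-42', 'page': '/checkout', 'email': 'anna@example.com'}]
```

The analytics rows on their own contain no name or email, yet one lookup is enough to attach both to a browsing record.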

Ok, so there’s personal data there and we’re in scope of GDPR—why was this found to be non-compliant?

Non-compliance in this case was the finding that there was a violation of Article 44 of GDPR which deals with any transfers of personal data to a third country. Specifically at issue here was the question of appropriate safeguards of data protection considering the personal data was being transferred to the United States. 

GDPR allows for international agreements on data transfers and a system of standards to verify appropriate safeguards are in place. Originally, U.S. companies were able to rely on one of these international agreements, the EU-US Privacy Shield Framework. This framework was invalidated with the Schrems II decision by the Court of Justice of the European Union in July 2020. The core driver of the invalidation was that U.S. surveillance programs are not limited in their access to personal data as required by EU law. The decision leaves businesses to verify the legal basis for EU-US data transfers on a case-by-case basis. Demonstrable protections ensuring adequate protection of the data being transferred must be in place. These protections include things like Standard Contractual Clauses, binding corporate rules with respect to data, technical protections, operational protections, etc.

In this case, the DPA found that the protections Google had in place as of the date of the complaint were insufficient to adequately protect an EU user’s personal data when transferred to the US. Specifically discussed was protection from identification by U.S. intelligence agencies.

But I have all kinds of contracts and Google has protections in place, is this not sufficient?

According to the Austrian Data Protection Authority, no, the standard contracts and Google’s protections as of August 14, 2020 were not sufficient. 

The date here is important. Google can, and likely will, update some of their protections and contractual clauses in response to this (and other potential) rulings. For example, as noted in the judgement, the fact that Google changed their service of GA to be provided by Google Ireland Ltd (EU entity) instead of Google LLC (US entity) for European customers was not considered as the change happened in April 2021 and the complaint was for August 2020. Changes like this could impact future rulings. 

Speaking of future rulings, there are many similar complaints raised with respect to many analytics and advertising services used across the web today. Additionally, many more are likely following this decision. This is but the first of many coming data points to be used in the ultimate compliance assessment.

Sounds like a mess, what should I be doing now?

It is our opinion that there are still far too many outstanding questions to justify a rash decision like discontinuing the use of GA. Consider:

  • The specific set of conditions in this ruling is likely to be different from the conditions true for your organization today
  • Google has already modified, and has the opportunity to further modify, how it treats EU user data for compliance considerations
  • Additional rulings are expected in other similar complaints
  • A new EU-US data transfer standard is possible at any point (even if unlikely) given that discussions are ongoing

Taken together, it is unclear where the question of compliance and GA will ultimately land. Today, what is important is to take the steps necessary to ensure that compliance risk is mitigated as much as possible. Our recommendations:

Ensure that Anonymize IP is properly configured across all GA Properties and live on your websites.

As mentioned above, this is not a “silver bullet” to solve this problem. It does, however, remove one of the core data points mentioned which can be combined to constitute personal data and therefore increase risk. This should be a standard across every GA property and is actually enabled automatically in Google’s newest version of analytics, Google Analytics 4 (GA4).
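
For a sense of what this setting does to the data, Google documents IP anonymization in Universal Analytics as setting the last octet of an IPv4 address, or the last 80 bits of an IPv6 address, to zeros. The Python sketch below reproduces that masking with the standard library. It is an illustration of the documented behavior, not Google's code, and the `anonymize_ip` name is our own.

```python
import ipaddress


def anonymize_ip(raw: str) -> str:
    """Zero the last octet of an IPv4 address or the last 80 bits of an IPv6 address."""
    addr = ipaddress.ip_address(raw)
    # Keep the first 24 of 32 bits for IPv4, the first 48 of 128 bits for IPv6.
    keep_bits = 24 if addr.version == 4 else 48
    network = ipaddress.ip_network(f"{raw}/{keep_bits}", strict=False)
    return str(network.network_address)


print(anonymize_ip("203.0.113.57"))                          # 203.0.113.0
print(anonymize_ip("2001:db8:85a3:8d3:1319:8a2e:370:7348"))  # 2001:db8:85a3::
```

The masked address still narrows a visitor down to a network range, which is one reason the setting reduces risk rather than eliminating it.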

Ensure that GA is properly configured to respect the consent selections of users

Consent was irrelevant for the compliance decision in this case. Even for consenting users, their data must be protected to a sufficient level with respect to data transfers which was the core of this ruling. That said, it’s a foundational requirement for GDPR compliance to ensure the consent selections of users are being properly respected.

Treat GA as in scope of GDPR

This means ensuring proper disclosures are in place, data sharing agreements are in place, standard contractual clauses for compliant EU-US data transfers are sufficient, data retention settings are properly configured, data deletion ability is there for users, etc.

[For more information on compliant Google Analytics tracking, check out our guides for Universal Analytics and for GA4]

Consider the “worst-case scenario” of having to discontinue the use of any third-party platforms in light of compliance when thinking about longer-term strategy

As this ruling shows, surprises are inevitable. Even if you think you are doing everything right, it’s possible that something unforeseen can make you change the way your business operates overnight. Consider how to take more control of your first-party data architecture and rely less on third-party partners for critical business processes. Some ways of doing this:

  • Set up an event-based data collection model. GA is migrating to this model already with GA4 while many other third-party analytics platforms are already using this structure. An event-based model allows for standardization and more site behavior analysis from truly anonymous datasets.
  • Begin the process of standing up server-side data distribution (or as more commonly marketed, server-side tag management). With this type of configuration you have full control over what data is collected and sent to each of the third-party platforms you are working with.
  • Maintain your own first-party data warehouse including for digital interaction data (web analytics). The barrier to entry with this type of data capability is being lowered quickly. Combined with a server-side data distribution system, you can very easily send your web interaction data to a cloud database such as BigQuery in Google Cloud. Unlike GA, you have full control over where the data in your owned cloud environment is processed and stored. Assert this control to ensure EU users’ data is maintained within the EU.
  • Ensure the underlying data architecture is able to be “plugged in” to any number of partner solutions. This means maintaining your own data standards and taxonomies. In the event of a worst-case scenario, your data should be interoperable across platforms.
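
The control that server-side data distribution offers can be sketched in a few lines of Python. In this minimal example an event is collected once, and each destination receives only the parameters on its allowlist. The destination names, parameter names, and the `payload_for` helper are all hypothetical and do not correspond to any specific tag management product.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    name: str
    params: dict = field(default_factory=dict)


# Hypothetical per-destination allowlists: only these keys leave your server.
ALLOWED_PARAMS = {
    "analytics_vendor": {"page_path", "scroll_depth"},
    "eu_warehouse": {"page_path", "scroll_depth", "client_id"},
}


def payload_for(destination: str, event: Event) -> dict:
    """Build the payload for one destination, dropping non-allowlisted keys."""
    allowed = ALLOWED_PARAMS[destination]
    params = {k: v for k, v in event.params.items() if k in allowed}
    return {"event": event.name, "params": params}


event = Event("page_view", {
    "page_path": "/pricing",
    "scroll_depth": 75,
    "client_id": "1111.1600000000",
    "ip": "203.0.113.57",
})

print(payload_for("analytics_vendor", event))
# {'event': 'page_view', 'params': {'page_path': '/pricing', 'scroll_depth': 75}}
print(payload_for("eu_warehouse", event))
```

An allowlist is used here rather than a blocklist so that a newly added parameter is withheld from third parties until someone deliberately approves it.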

A few final items to be aware of relevant for GA and compliance:

Consent Mode

Consent Mode is a new (beta) feature available for GA. When Consent Mode is properly configured and a user has not consented to ads and analytics tracking, none of the identifiers mentioned in this ruling are collected. No cookies are set, and no identifiers are set or collected which could even be associated with a user from one page to another, let alone to their identity. This is a privacy-focused feature that should be evaluated and adopted as a part of the GA strategy for any EU business.
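
The behavior described here can be modeled conceptually in Python. The sketch below attaches an identifier to a hit only when analytics storage has been granted. `analytics_storage` and `ad_storage` are the consent type names Consent Mode uses; the `build_hit` function and the hit structure are hypothetical and do not represent Google's implementation.

```python
def build_hit(consent: dict, event_name: str, client_id: str) -> dict:
    """Return a hit that carries an identifier only when analytics storage is granted."""
    hit = {"event": event_name}
    # A missing consent value is treated the same as "denied".
    if consent.get("analytics_storage") == "granted":
        hit["client_id"] = client_id
    return hit


denied = {"ad_storage": "denied", "analytics_storage": "denied"}
granted = {"ad_storage": "denied", "analytics_storage": "granted"}

print(build_hit(denied, "page_view", "1111.1600000000"))
# {'event': 'page_view'}
print(build_hit(granted, "page_view", "1111.1600000000"))
# {'event': 'page_view', 'client_id': '1111.1600000000'}
```

Without an identifier, hits from a non-consenting user cannot be stitched together from one page to the next.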

GA4

The Austrian DPA’s ruling considers a complaint in which Google Universal Analytics is used. Google introduced its next version of analytics, GA4, in October 2020. Organizations are slowly migrating to this solution. GA4 has more privacy-focused functionality built in (Anonymize IP always on, Consent Mode support, event-based collection model, etc.). Any modifications or customizations introduced by Google, such as the potential to select where data is processed and stored, would likely only be introduced in GA4 and not the legacy Universal Analytics solution. If you are on the fence about migrating, now is the time.

Good chat! 

This ruling is the first to come from a large volume of complaints filed following the Schrems II decision in 2020. Many more decisions are expected in the coming months with respect to data transfers and compliant data processing. Stay tuned as the rules to be followed for data analytics and activation are refined. In the meantime, mitigate risk as much as possible, both for compliance and for your business processes.

Interested in discussing a privacy-centric first-party data strategy?

We’d love to chat. Reach out to our team of data governance experts!

Author

  • Lucas Long

    Lucas Long is co-author of the Amazon best-selling book, Crawl, Walk, Run: Becoming a Privacy-Centric Marketing Organization. He is also the Director of Privacy Strategy at InfoTrust, working with global organizations at the intersection of digital strategy, privacy regulations, and technical data collection architecture. Through these efforts, Lucas helps companies understand their limitations for data enablement due to privacy challenges and design optimal ways to accomplish core use cases in a compliant manner.

    When not discussing the intricacies of GDPR and cookie laws with clients, Lucas enjoys traveling and exploring new cultures, one bite at a time. Based in Barcelona, he is also a presenter, featured at industry events organized by Google, the Digital Analytics Association, the American Marketing Association, and the Journal of Applied Marketing Analytics.

Last Updated: January 17, 2023
