After the Italian Data Protection Authority suspended data transfers to Google Universal Analytics, companies quickly sought solutions. However, there is still confusion about the new version, Google Analytics 4 (GA4), and whether it is compliant. This article addresses two questions, whether to keep GA4 and whether it makes sense to migrate to it now, and highlights the technical solutions available for using GA4.
Table of Contents:
- The Context
- Compliance Requirements
- GA4 - Yes or No?
- The Solution
- Requirements for the server-side proxy
- Which Data to Remove for GDPR Compliance?
In Italy, France, and Austria, the respective data protection authorities have declared that the use of Google Analytics is not compliant with the GDPR.
In Italy, following investigations triggered by several reports, the Italian data protection authority warned a company to align with the European regulation within 90 days. This injunction effectively implies either discontinuing the use of Google Universal Analytics or configuring it to comply with GDPR standards.
The incident in question concerns the use of Google Universal Analytics (GA), the older version of Google's renowned tool, and does not refer to the newer GA4 (Google Analytics 4 Properties).
The fundamental issue is that data processed by GA are forwarded to Google LLC's servers in the United States, where they can be accessed by US government agencies such as the CIA and NSA.
This transfer to US servers occurs continuously when using Google Universal Analytics. The most intuitive and secure solution, therefore, is to disable this measurement tool.
But what's the verdict for the new version, GA4? Some, including Google itself, claim that Google Analytics 4 is GDPR compliant and safe for European companies to use. Others, however, are more skeptical and seek clarification before continuing to use this tool.
Although it's true that Google has increased its focus on privacy with the advent of GA4, there are still some critical issues to consider.
To delve deeper, let's start with:
What is NOT enough to be compliant
Many companies have set up their tracking by integrating security measures that are actually outdated in terms of GDPR compliance. Here are some of the measures that are not sufficient to be truly compliant:
Encrypting data before sending it
Encryption is an effective security measure to prevent third parties from "spying" on data during transfer, but it's completely insufficient for GDPR compliance, as the company subject to the requests of US authorities (in this case, Google) can access the personal data in clear text.
Indeed, Google LLC itself handles the data encryption and is obliged to grant access to, or hand over, the imported data in its possession, including the encryption keys.
Asking for user consent
User consent and data transfer to the United States are two different things. It's not enough for users to consent to analytics cookies by visiting your site and agreeing to cookie use in the appropriate banner. On this point, we quote the FAQs of CNIL (the French data protection authority):
Explicit consent of the data subjects is one of the possible derogations provided for some specific cases by Article 49 of the GDPR. However, as stated in the guidelines of the European Data Protection Board on these derogations, they can only be used for non-systematic transfers and cannot constitute a long-term and permanent solution, as resorting to a derogation cannot become the general rule.
Anonymizing IP addresses in Google Analytics
Google has integrated some features into the new version of Analytics, Google Analytics 4 Properties (GA4), probably also with a view to compliance with GDPR. Already in Google Universal Analytics, it was possible to set "anonymize_ip" to anonymize user IP addresses.
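As a point of reference, the masking applied by "anonymize_ip" (zeroing the last octet of an IPv4 address, or the last 80 bits of an IPv6 address) can be sketched in a few lines of Python. The function name `mask_ip` and the sample addresses are illustrative assumptions, not part of any Google library.

```python
import ipaddress


def mask_ip(address: str) -> str:
    """Zero the last octet of an IPv4 address or the last 80 bits of an IPv6 address."""
    ip = ipaddress.ip_address(address)
    # Keep the first 24 bits of IPv4 (drop 8) or the first 48 bits of IPv6 (drop 80).
    prefix = 24 if ip.version == 4 else 48
    network = ipaddress.ip_network(f"{address}/{prefix}", strict=False)
    return str(network.network_address)


# Documentation-range sample addresses (illustrative only).
print(mask_ip("203.0.113.57"))
print(mask_ip("2001:db8:85a3::8a2e:370:7334"))
```

Note that the masked value still identifies the visitor's network block, so the IP address is only coarsened rather than removed.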
Google Analytics 4 does not record or store IP addresses during data collection. However, before discarding the IP addresses of users, GA4 derives some important location information, such as latitude, longitude, and city.
This functionality is not enough.
In fact, according to the authorities, IP address anonymization has no real effect in preventing user identification. With the wide variety of identifying data at its disposal, Google might still be able to trace back to the identity of an individual user.
Moreover, in the settings of GA4, it's possible to:
- Disable the collection of Google Signals data based on geographic area.
- Disable the collection of granular data on locality and device based on geographic area.
These settings may still prove insufficient as we do not have control over some user-identifying parameters that still pass through Analytics and that Google could use to trace the identity of individual users.
For this reason we believe, and we are not alone in this, that even GA4 in its default configuration is not GDPR compliant.
GA4: yes or no?
If your company has not yet migrated to GA4, you're likely considering various alternatives to Google's Analytics tool.
Migrating outside the Google ecosystem might prove to be less efficient and, in some cases, more costly in terms of setup and maintenance fees.
Google Analytics 4 is a powerful and flexible tool, constantly updated with new features. It's probably the most "future-proof" solution on the market, especially considering the imminent disappearance of third-party cookies and what a recent Deloitte guide describes as an increasing "signal loss."
However, if you have already set up GA4 on your company's site(s) and are considering removing it to switch to alternative measurement software, wait a moment:
There are technical solutions we can use to make GA4 GDPR compliant, following the guidelines of the data protection authorities!
Data protection authorities have indicated the use of a European proxy server as a possible solution.
But doesn’t GA4 already use an EU proxy automatically?
Yes, that's correct. Here’s what happens in broad terms:
- If the GA4 instance set up on the site you are browsing detects that your session comes from the European Union, your request will automatically be processed by an EU server. We could call this GA4's "automatic proxy."
- As noted earlier, GA4 automatically discards user IP addresses after extracting metadata for location purposes.
- The data are then collected by Google's proxy located in the EU and are subsequently processed on other specific Google servers (whose location is not specified).
Since we do not know where that subsequent processing takes place, we cannot rely on GA4's "automatic proxy" alone: we need to set up an additional proxy, directly controlled by us.
This way, we can control the data that the authorities explicitly ask us to pseudonymize.
Functioning of the GA4 server-side proxy
Here's broadly what happens with a GA4 setup using a GTM server-side proxy:
- The user begins browsing your site.
- The analytics request goes through our server, which is physically located in Europe.
- The data is cleaned by us on this first server using a Google Tag Manager server-side container.
- The cleaned data then passes to GA4's automatic proxy in the European Union.
In this way, we can mask or remove the identifying data we need to manage for GDPR compliance even before they reach Google’s automatic proxy.
For example, we can remove IP addresses even before Google can extract metadata related to the geographical location of users.
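The cleaning step described above can be sketched as a small function that runs on our own server before anything is forwarded. This is a minimal illustration in Python, not Google Tag Manager code (a server-side container uses its own sandboxed JavaScript); the event structure and the key names in `IP_KEYS` are assumptions made for the example.

```python
from typing import Any

# Assumed key names: a real container may expose the client IP under other fields.
IP_KEYS = {"ip_override", "client_ip", "x-forwarded-for"}


def drop_ip_fields(event: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the event without any field that carries the visitor's IP."""
    return {key: value for key, value in event.items() if key.lower() not in IP_KEYS}


# Hypothetical incoming hit, as it might arrive at the EU proxy.
incoming = {"event_name": "page_view", "ip_override": "203.0.113.57", "page_title": "Home"}
outgoing = drop_ip_fields(incoming)
print(outgoing)
```

Because the field is dropped before the hit is forwarded, the downstream service never receives an address from which it could derive location metadata.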
Requirements for the server-side proxy
CNIL, regarding the proxy solution, states the following:
The proxy server must also be hosted under conditions that ensure the data it processes are not transferred outside the European Union to a country that does not provide a level of protection substantially equivalent to that within the European Economic Area.
We can say that the use of a server-side proxy must meet three requirements to be an adequate solution to the problem:
- The server must be physically located in the EU.
- The company owning the server must be duly registered in the EU.
- Any user data that could be used to trace back to their identity or the characteristics of their device must be removed; it is also advisable to modify some data before sending it to the US-based tool.
Which data to remove to be GDPR compliant?
Here is a non-exhaustive list of the types of data to be removed to make GA4 GDPR compliant, according to CNIL (the French data protection authority):
IP addresses should not reach the servers of analytics tools (GA4 in this case).
Parameters in URLs
It is necessary to remove any parameters contained in the collected URLs (e.g., UTM, but also URL parameters that allow internal routing of the site).
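A minimal sketch of this step, using Python's standard `urllib.parse` module. The function name `strip_url_parameters` and the sample URL are illustrative assumptions; the sketch also drops the fragment, on the assumption that it may carry routing information too.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_url_parameters(url: str) -> str:
    """Keep scheme, host and path; discard the query string and the fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Hypothetical page URL with a UTM parameter and an internal routing parameter.
print(strip_url_parameters("https://example.com/shop/item?utm_source=newsletter&route=42#reviews"))
```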
Fingerprinting Information
All information typically used to generate a “digital fingerprint” of the user should be removed: even information about a user's device such as screen size or operating system version can be used in some cases to trace back to their identity.
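One way to approach this is to drop device-specific fields and replace the full user agent string with a coarse device class. The sketch below is only an illustration in Python: the key names in `FINGERPRINT_KEYS`, the `device_class` field, and the matching rule are all assumptions, and whether a coarse class is acceptable at all should be checked with your DPO.

```python
import re
from typing import Any

# Assumed key names for device details that could feed a fingerprint.
FINGERPRINT_KEYS = {"screen_resolution", "viewport_size", "os_version", "browser_version"}


def coarse_device_class(user_agent: str) -> str:
    """Reduce a full user agent string to 'mobile' or 'desktop'."""
    return "mobile" if re.search(r"Mobi|Android|iPhone", user_agent) else "desktop"


def reduce_fingerprint(event: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the event without fingerprinting fields or the raw user agent."""
    cleaned = {key: value for key, value in event.items() if key not in FINGERPRINT_KEYS}
    user_agent = cleaned.pop("user_agent", None)
    if user_agent is not None:
        cleaned["device_class"] = coarse_device_class(user_agent)
    return cleaned


# Hypothetical hit containing device details.
hit = {
    "event_name": "page_view",
    "screen_resolution": "1920x1080",
    "user_agent": "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X)",
}
print(reduce_fingerprint(hit))
```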
Finally, any other data that could lead to re-identification should also be deleted.
Does your company need support in web analytics?
We have seen that it is possible to use the technical solution of a server-side proxy to make Google Analytics 4 GDPR compliant.
Keep in mind that the data collected will necessarily be less rich than in a configuration not limited by GDPR rules. This is a reality that companies now have to adapt to, with some precautions.
We recommend evaluating in detail, not only with your privacy consultants and DPOs, but also with the help of an expert agency on the subject, the impact of a possible GA4 configuration with a server-side proxy.
If you think the solution proposed in this article is interesting, leave a comment. Would you like further clarification? Do not hesitate to contact us!