What Is a Data Clean Room?
At the original time of writing, August 2022, “Data Clean Room” (DCR) was a growing buzzword in the marketing, analytics, and advertising industries. Although data clean rooms weren’t an entirely new concept, they hadn’t been widely adopted, with Google’s Ads Data Hub (ADH) and Infosum’s offering the only major players in the market. With the February release of the IAB Tech Lab’s data clean room standards, I’ve been tasked with revisiting the concept and updating this article.
A Simple Definition of a Data Clean Room
Data clean rooms are cloud-based data warehouses where multiple companies, or business units in different countries, can bring data for joint analysis without exposing the data brought by each party to the others. Each party can query both data sets through the data clean room environment and get an output of aggregated results according to guidelines set by the data clean room provider (like Google’s ADH) or agreed to by the parties using a third-party solution (like Infosum, Habu, or Snowflake). This generally means having strict privacy controls which do not allow any party to view or export any customer-level or personally identifiable data.
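To make the "aggregated results only" idea concrete, here is a minimal sketch in Python. It is not how any particular provider implements it: the function name, the data shapes, and the minimum group size of 3 are all assumptions for illustration. The point is that the two data sets are matched inside the environment, and only counts that clear a threshold come out.

```python
from collections import Counter

# Hypothetical threshold; real providers and agreements set their own rules.
MIN_GROUP_SIZE = 3


def aggregate_overlap(party_a, party_b, min_group_size=MIN_GROUP_SIZE):
    """Count matched customers per segment, suppressing small groups.

    party_a: dict mapping a shared customer key to a segment label
    party_b: set of shared customer keys
    """
    # Match the two data sets on the shared key, keeping only the segment label.
    matched_segments = (seg for key, seg in party_a.items() if key in party_b)
    counts = Counter(matched_segments)
    # Only aggregates at or above the threshold are returned to the caller.
    return {seg: n for seg, n in counts.items() if n >= min_group_size}


# Made-up example data for two collaborating parties.
retailer = {"u1": "loyalty", "u2": "loyalty", "u3": "loyalty",
            "u4": "guest", "u5": "guest"}
advertiser = {"u1", "u2", "u3", "u4", "u9"}

result = aggregate_overlap(retailer, advertiser)
print(result)  # {'loyalty': 3}
```

Note that the single matched "guest" customer never appears in the output: a group of one would effectively be customer-level data, so the sketch drops it.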
Why Use a Data Clean Room?
There are plenty of reasons to start using a data clean room:
- Audience exploration and analysis: explore how audiences and customers overlap, providing valuable aggregate insights without the underlying data being visible to either party
- Reach and frequency measurement: deduplicate campaign reach and frequency for better media planning and measurement
- Cross-platform attribution: let partners run their own self-service multi-platform attribution across retail media, online, and physical stores, based on advertiser data
- Forecasting: customer lifetime value and propensity-to-purchase models used by one party could be adapted and run on the DCR collaborator’s data set to understand consumer reaction to, and the performance of, future campaigns
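As a rough illustration of the reach and frequency use case above, the sketch below deduplicates reach across two platforms. It assumes each platform's impression log is simply a list of shared user keys, one entry per impression; the function name and the sample data are hypothetical.

```python
from collections import Counter


def deduplicated_reach_and_frequency(*impression_logs):
    """Return (unique users reached, average impressions per reached user)."""
    per_user = Counter()
    for log in impression_logs:
        # Each log holds one user key per impression served.
        per_user.update(log)
    reach = len(per_user)
    total_impressions = sum(per_user.values())
    avg_frequency = total_impressions / reach if reach else 0.0
    return reach, avg_frequency


# Made-up impression logs from two platforms.
platform_a = ["u1", "u1", "u2", "u3"]
platform_b = ["u2", "u3", "u3", "u4"]

# Adding up each platform's own reach double counts u2 and u3.
naive_reach = len(set(platform_a)) + len(set(platform_b))
reach, avg_frequency = deduplicated_reach_and_frequency(platform_a, platform_b)

print(naive_reach)    # 6
print(reach)          # 4
print(avg_frequency)  # 2.0
```

The gap between 6 and 4 is the overlap that neither party could see from its own data alone.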
Is a Data Clean Room Privacy Compliant?
Something Craig Sullivan said at MeasureCamp London has stuck with me: “‘It depends’ is the refuge of a coward.” Yet it’s the first answer that comes to mind when I ask myself this question. It depends on what data you’re using, what you’re using it for, and your (and your legal department’s) interpretation of the privacy law of the land you’re operating in.
Generally speaking, data clean rooms do not store personally identifiable information (PII) within their environment, and most platforms will not allow certain data points to leave their clean room environment. These safeguards are what make a data clean room privacy-centric and help organizations follow laws such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA).
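One common way to keep raw identifiers out of the environment is to hash them before upload, so both parties can match on the hash rather than on the email address itself. The sketch below assumes SHA-256 over a trimmed, lower-cased email; the function name and the normalization rules are assumptions, and each platform documents its own requirements.

```python
import hashlib


def to_match_key(email: str) -> str:
    """Turn an email address into a hashed key for matching."""
    # Normalize first so both parties derive the same key for the same person.
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Made-up addresses: the same person, formatted differently by each party.
key_a = to_match_key("  Jane.Doe@Example.com ")
key_b = to_match_key("jane.doe@example.com")

print(key_a == key_b)  # True
```

Whether a hashed identifier is enough for your purposes is still a question for the legal interpretation discussed in this section, not something the code settles.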
A common theme, at least of EU privacy legislation, is that you must give people a clear explanation, in plain language, of what data you’re collecting and how you intend to use it, so that they can agree (read: explicit consent) to the data being collected and used that way. I don’t know about you, but I’ve not seen Data Clean Rooms or any of the abovementioned use cases in any Consent Management Platform pop-ups yet. Of course, many would agree this falls under the “improve advertising and targeting” category, as do most digital marketing use cases, but this brings us back to whether people can consent to something that isn’t explained clearly in plain language.
As with most uses of people’s data, it’s fine until it’s not, and finding out it’s not requires someone to challenge the use in court and a decision to be reached, which, as we’ve just seen with the recent Meta decision, can take up to a decade.
Challenges of a Data Clean Room
First and foremost are the legal and logistical challenges involved in two companies agreeing to share what many consider one of their most prized assets: first-party data.
A close second is the cost of an agreement, which, from personal experience client-side in a previous life, can be eye-watering. And that’s before considering whether, as a company, you have the skills to get insights from the data and can act on those insights effectively enough to justify the time, effort, and cost of standing up a DCR environment with a third party. While the respective legal teams hash out a deal, spend the time defining your hypotheses, queries, downstream testing, and activation together with those responsible for, and familiar with, the technology and data required; it will go a long way to ensuring you make the most of the engagement.
What’s Next for Data Clean Rooms?
The impending deprecation of third-party cookies is driving huge interest in Retail Media Networks, their first-party data, and their ability to provide full-funnel attribution.
Numerous leading retailers are offering advertising solutions on their e-commerce properties. Amazon Ads kickstarted things and has since been joined by retailer offerings from Best Buy Ads, Costco Wholesale, eBay Ads, The Home Depot Retail Media+, Instacart Ads, Kroger Precision Marketing, Macy’s Media Network, Roundel (Target), Walmart Connect, and Wayfair Media Solutions in the United States. In Europe, Otto, Zalando, Tesco, Sainsbury’s, and Boots are some of the retailers with their own offerings. Even Uber is in on the game. Many of these platforms offer either an in-house data clean room built on proprietary technology (like Amazon Marketing Cloud) or a pre-built solution from a DCR technology provider (like Infosum, Habu, or Snowflake).
As the technology becomes more common, it should become easier to get up and running as legal and logistical processes move from custom or novel to defined and familiar. This evolution should also go some way toward bringing down the cost and, with the establishment of accepted practices, reduce the level of effort required to deliver a positive return on investment.