We built a system that allows a single team to implement data privacy changes across a suite of over 70 products.
By Kelsey Johnson
When I first started working on the Data Governance team at The New York Times in 2017, I would often be met by blank stares when I tried to explain my job. Over time, I perfected my elevator pitch: I work on privacy, ethics and governance at The Times, and it’s a bit like herding cats.
As the second member of The Times’s Data Governance team, I dove straight into the high-profile General Data Protection Regulation (G.D.P.R.) project, which required that our organization follow strict ground rules for handling the personal data of our users located in the European Economic Area. Our team of two had to simultaneously discover every way our business operations were processing the data of our users, all the while creating and implementing rules to honor our users’ G.D.P.R. rights.
Two people, over 70 product teams and five months to get it all in place. Don’t let anyone tell you herding cats is easy.
G.D.P.R. was just the beginning. There are now over 100 privacy laws in other countries. In the United States, there are numerous bills that have either been introduced into state legislatures or have been signed into law.
Not only does each of these laws apply to a specific geographic region, but each also tends to have its own unique requirements. This means that companies like The Times have to implement complex sets of rules to ensure their websites and apps are in compliance no matter where users are located. New regulations often go into effect with short grace periods, so companies have very little time to make changes to their technology stacks.
Because technology is constantly evolving, legislation will keep changing to keep pace with it. This means companies need to be ready to modify their interpretations of the law anytime a law gets amended. That leaves companies with two options: to respond reactively any time a change is required, or to invest in privacy as part of their business strategy and dedicate resources to the task.
The Times has chosen the latter.
What we made
It was 2019 when we retired our cat-herding hats and harnessed our brain power to build PURR: Privacy, Users, Rules and Regulations.
PURR is our homegrown system that operationalizes The Times’s privacy offerings. The system centralizes our business logic and rules, and it communicates with our front-end products, instructing them on how to carry out the privacy rights of each unique user that visits us. It also connects to an internal preferences system that allows us to securely save and sync logged-in users’ privacy preferences across all of our products.
This enables us to easily and efficiently adjust our interpretations of data regulations and it simplifies the implementation of such changes across our suite of products.
Because of PURR, we have said goodbye to year-long privacy projects and roadmap disruptions whenever a new regulation is passed. Now, a single team can independently implement most changes. Purr-fect, right?
How it works
I like to think of PURR as a privacy machine. It consumes information about a reader, analyzes that information based on knowledge it has been given about our company’s interpretations of privacy regulations, and then outputs instructions on how that reader should be treated from a privacy perspective when they interact with one of our products.
Each Times product, such as News, Cooking or Games, is required to ask PURR for instructions on how to treat every single reader that visits. These instructions are called Directives.
There are several types of directives that a product needs to be given when it comes to privacy — right now, there are eight. We have broken them down into two categories: user interface directives and data handling directives.
The former tell products when they need to show a certain element on the page, such as a cookie banner, a marketing consent checkbox or an opt-out button. The latter tell products how to handle a user’s personal data. For example, a data handling directive might block certain third-party tracking mechanisms from collecting data about a user.
Directives are sent via three different methods: request headers, a cookie called `nyt-purr` and JSON. These give Times products the flexibility to choose how they want to consume the information from PURR.
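To make the JSON option concrete, here is a minimal sketch of how a product might read directives from a JSON payload. The payload shape (a flat object mapping directive names to string values) and the function name `parse_directives` are assumptions made for illustration; the actual format PURR emits is not described here.

```python
import json
from typing import Dict


def parse_directives(payload: str) -> Dict[str, str]:
    """Parse a JSON string into a directive-name -> value mapping.

    Assumption: directives arrive as a flat JSON object whose keys
    start with "PURR_" and whose values are strings.
    """
    data = json.loads(payload)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object of directives")
    # Keep only PURR_-prefixed string entries and ignore anything else.
    return {
        name: value
        for name, value in data.items()
        if name.startswith("PURR_") and isinstance(value, str)
    }


# Hypothetical payload, for illustration only.
example = '{"PURR_DataProcessingConsentUI": "show", "unrelated": 1}'
directives = parse_directives(example)
```

In this sketch, `directives` ends up holding only the `PURR_DataProcessingConsentUI` entry, because the unrelated key is filtered out.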
Let’s say you are in Europe and you visit our NYT Cooking homepage to get some inspiration for your weeknight dinner. Upon loading the page, you will be shown a banner, like this one.
This is what we call our G.D.P.R. Tracker Banner. NYT Cooking showed it to you because it received the “user-interface” directive called PURR_DataProcessingConsentUI from PURR. When you loaded the page, PURR knew you were in Europe and G.D.P.R.-eligible, so it delivered a value of show with the PURR_DataProcessingConsentUI directive, which told our products to present you with a banner that allows you to opt in or opt out. If, however, you visited NYT Cooking from New York City, the directive would have delivered a value of hide.
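On the product side, acting on that directive could be as small as the sketch below. The directive name and the show and hide values come from the example above; the function name and the choice to treat a missing directive like hide are assumptions, not a description of how NYT Cooking is built.

```python
from typing import Mapping

CONSENT_UI = "PURR_DataProcessingConsentUI"


def should_show_consent_banner(directives: Mapping[str, str]) -> bool:
    """Return True only when the directive's value is exactly "show"."""
    # Assumption: a missing or unrecognized value is treated like "hide".
    return directives.get(CONSENT_UI) == "show"


# A reader in Europe with no saved preference would receive "show".
europe = {CONSENT_UI: "show"}
# A reader in New York City would receive "hide".
new_york = {CONSENT_UI: "hide"}
```

A real product might prefer a different fallback when no directive arrives at all, which is a policy decision rather than a coding one.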
This is just one example of a directive. Some other examples include PURR_AcceptableTrackers, which tells products what type of trackers can be fired when a user visits a page; PURR_DeleteIPAddress, which instructs our first-party analytics tool whether to delete the IP Address of a user; and PURR_AdConfiguration, which tells products what types of ads are permitted on the page, such as non-personalized ads instead of behaviorally targeted ads.
For PURR to know what directives to issue to products, it needed to be programmed with privacy logic, which we refer to as Rules. The Data Governance team works in collaboration with the Times Legal and Engineering departments to write these rules.
Rules are written expressions that PURR must evaluate. They can become quite complex, but a rule basically says, “Hey PURR, you need to look at those inputs, and when these circumstances are met, you need to issue these specific directives.” It’s because of rules that we can provide a custom privacy experience to every user who visits our site.
Let’s go back to the NYT Cooking example. In order for PURR to know whether a user needs to be shown our G.D.P.R. Tracker Banner, it first needs to know whether the user is located in a country that is subject to the G.D.P.R. and whether they’ve already opted in or opted out.
In plain text, the rule would be written like this: IF user is in a G.D.P.R.-eligible country AND user has no set preferences with NYT, THEN show banner.
In SQL, the rule gets translated into the following expression:
PURR_DataProcessingConsentUI SQL Expression Rules
CASE WHEN ( -- people who are gdpr eligible and have not opted out or in
    gdpr_pref_regi = "none" AND
    gdpr_pref_agent = "none"
ELSE -- people who are gdpr eligible and have either opted out or in
ELSE -- people who are not gdpr eligible
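The plain-text rule above can also be sketched as a small function. This is an illustration, not the expression PURR runs: the field `gdpr_eligible` is a hypothetical input name, and the value returned for each branch is an assumption based on the show and hide values described earlier, since the SQL excerpt does not show them.

```python
from dataclasses import dataclass


@dataclass
class Reader:
    gdpr_eligible: bool            # hypothetical input name
    gdpr_pref_regi: str = "none"   # "none" means no preference has been set
    gdpr_pref_agent: str = "none"


def data_processing_consent_ui(reader: Reader) -> str:
    """Evaluate the banner rule and return "show" or "hide"."""
    if not reader.gdpr_eligible:
        # People who are not gdpr eligible.
        return "hide"
    if reader.gdpr_pref_regi == "none" and reader.gdpr_pref_agent == "none":
        # People who are gdpr eligible and have not opted out or in.
        return "show"
    # People who are gdpr eligible and have either opted out or in.
    # Assumption: the banner is hidden once a choice has been recorded.
    return "hide"
```

Under these assumptions, an eligible reader with no saved preferences gets `show`, and everyone else gets `hide`.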
Rules Producer and Executors
In order to operationalize Rules and Directives, we needed systems that could bring them to fruition. We first had to find a place where we could write and save the rules — a rules producer, of sorts.
Luckily for us, we didn’t have to build anything from scratch. In recent years, The Times had built ABRA (which is short for “Abra Basically Reports ABtests”), our homegrown testing and targeting platform. Because ABRA provides the ability to translate human-friendly rules into a widely adopted, open-source and machine-friendly format, it was a simple engineering choice to expand its capabilities to support PURR.
In addition to a system that would allow us to write and save the rules, we needed a way to run the rules. We refer to these as Rules Executors, which ingest the ABRA rules and execute on those rules in order to send directives to our products.
Because The Times has over 70 products and a diverse landscape of technical stacks, we wrote three separate Rules Executors: one for products under the nytimes.com domain that use Fastly; another for products that are under the nytimes.com domain but do not use Fastly; and one for products not on the nytimes.com domain, such as our vendor-managed sites.
It’s important to note that the PURR system does not stand on its own. It relies on several other capabilities within The Times’s data and core platforms, such as a homegrown system that keeps track of our users’ data privacy preferences. It also interacts with Samizdat, another homegrown system that our native apps leverage to integrate with PURR. And lastly, we built a customized AMP-PURR endpoint which allows us to integrate with our Google AMP products.
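The three-way split can be pictured as a simple decision on two facts about a product: its hostname and whether it sits behind Fastly. The sketch below is only an illustration of that decision. The enum, the function name and the hostnames in the examples are hypothetical, and nothing here reflects how products are actually assigned to an executor.

```python
from enum import Enum


class Executor(Enum):
    FASTLY = "nytimes.com products that use Fastly"
    NON_FASTLY = "nytimes.com products that do not use Fastly"
    OFF_DOMAIN = "products not on the nytimes.com domain"


def pick_executor(hostname: str, uses_fastly: bool) -> Executor:
    """Choose which of the three Rules Executors would serve a product."""
    host = hostname.lower().rstrip(".")
    # Match the apex domain or any subdomain, but not look-alike hosts.
    on_domain = host == "nytimes.com" or host.endswith(".nytimes.com")
    if not on_domain:
        return Executor.OFF_DOMAIN
    return Executor.FASTLY if uses_fastly else Executor.NON_FASTLY
```

For example, `pick_executor("vendor.example", False)` returns `Executor.OFF_DOMAIN`, while a hostname ending in `.nytimes.com` is routed by its `uses_fastly` flag.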
The first of many lives
We have proven that PURR can adapt as new privacy laws are passed. When the United Arab Emirates passed a data privacy law in the summer of 2020, we were able to bring over 60 products into compliance within four working days and with zero roadmap disruptions to the product teams. When Brazil’s privacy law came into effect shortly after, it took us only one day to ensure our products were compliant.
Over time, we hope to implement new directives that foster privacy-forward features in our products. And as The Times explores new product offerings, such as NYT Kids, we believe PURR will be a reliable way to ensure such products are ethically built while also meeting strict legal requirements.
Although our cat herding hasn’t been reduced to zero, we have learned that implementing data governance regulations doesn’t need to be a resource-zapping, roadmap-destroying burden. It can be an opportunity to innovate and provide users with a privacy-focused digital experience.
If you’d like to learn more about PURR, check out a full talk we gave at a recent Privacy_Infra() event. And if you are seeking to expand your network of privacy peers, don’t hesitate to email us: [email protected]!
Kelsey Johnson works in Data Governance and Privacy at The New York Times, where she and her team are tasked with leading their company’s data privacy and protection efforts. She likes to think of her work as “ethics at scale.” She is a civil engineer by trade (a proud Bucknell Bison and VTech Hokie) and has past experience working in the engineering and insurance sectors. Prior to joining The Times she spent a year in Seattle pursuing her passion for female empowerment through education. Kelsey loves running (a proud marathoner), spending summers by the ocean or in the mountains, and is a devoted sea glass collector. Follow her on Twitter and LinkedIn.
How We Manage New York Times Readers’ Data Privacy was originally published in NYT Open on Medium, where people are continuing the conversation by highlighting and responding to this story.
Source: New York Times