Centre for Internet & Society

In its efforts to deprecate third-party cookies, Google announced an alternative plan in August 2019: the new Privacy Sandbox platform. The plan promises to preserve anonymity while serving tailored advertising. While unveiling the initiative, Google explained that even though advertising is necessary to keep the web available to everyone, the web ecosystem is at risk if privacy practices do not keep pace with evolving expectations. But does this new framework help users in any way?

 

Introduction

Advertising is the core of the revenue model of major corporations like Google and Facebook. In 2020, Google's annual ad revenue amounted to 146.92 billion US dollars. Companies like Google collect data through their services, such as Gmail, YouTube, and Google Search, and through third-party cookies on other websites. Google then serves targeted ads based on the vast amount of data it collects on an individual. Google AdSense is the company's advertisement service, which supports the marketing efforts of its customers: it allows advertisers to promote ads, list products, and offer services to web users through Google's vast ad network (properties, affiliate pages, and apps). Until now, third-party cookies, which enable companies to track users' browsing habits, have been an enabling force for targeted advertisements. However, fears about data collection without consent via cookies have prompted information privacy laws such as the GDPR in Europe. Notably, web browsers like Apple's Safari and Mozilla Firefox deprecated third-party cookies in March 2020 and September 2019, respectively. In January 2020, Google also decided to phase out third-party cookies in Chrome.

In its effort to deprecate third-party cookies, Google announced an alternative in August 2019: the Privacy Sandbox, a platform that promises to preserve anonymity while serving tailored advertising. Under this umbrella, Google has proposed a dynamic and evolving range of bird-themed targeted-advertising and measurement approaches that strive to uproot third-party cookies; Federated Learning of Cohorts (FLoC) and TURTLEDOVE are among the most popular. Google envisions FLoC as the industry norm for serving advertisements on the internet and is rolling it out in the new version of its web browser, Chrome, where it seems bound eventually to replace third-party cookies.

This article explains how FLoC works and how it differs from conventional third-party cookies for online targeted advertising. It goes on to evaluate Google's claim that FLoC protects user privacy. Finally, it assesses whether FLoC will allow Google to further entrench its position in the digital market.

How does FLoC operate? 

FLoC is only one component of Google's Privacy Sandbox program, which consists of a series of measures and updates that aim to transform the existing ad-tech ecosystem and adopt a privacy-first approach to the internet. FLoC itself aims to deliver personalized advertising to large groups of users with common interests.

Chrome puts users into 'cohorts' with the help of on-device machine learning based on their browsing behavior. These clusters of large groups of people with similar interests on the web make the individual user indistinguishable from others in the cohort. In this manner, an individual is placed in a cohort of like-minded users, such as a 'car cohort' or a 'cooking cohort.' FLoC infers this similarity of interests by observing the pattern of websites and pages a user visits. Advertisers then identify the groups (FLoCs) and show ads to those FLoCs. These changes, nevertheless, will take time, with an anticipated period of two years for the elimination of third-party cookies. Google claims that FLoC enhances consumer privacy while still allowing personalized ads, by targeting user groups instead of individuals.
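To make the clustering concrete: Chrome's origin trial reportedly derived the cohort ID with SimHash, a locality-sensitive hash over the domains in the browsing history, so that users with overlapping histories tend to land in the same cohort. The sketch below illustrates that idea only; the hash width, feature set, and example domains are assumptions, not Chrome's actual implementation.

```ts
// Simplified SimHash over visited domains: users with similar browsing
// histories tend to get similar (often identical) cohort IDs.
// Illustrative only; not Chrome's actual algorithm or parameters.
import { createHash } from "crypto";

function simHashCohort(domains: string[], bits = 16): number {
  const votes = new Array(bits).fill(0);
  for (const domain of domains) {
    const digest = createHash("sha256").update(domain).digest();
    for (let i = 0; i < bits; i++) {
      // Each domain votes +1/-1 on every bit position of the fingerprint.
      const bit = (digest[i >> 3] >> (i & 7)) & 1;
      votes[i] += bit ? 1 : -1;
    }
  }
  // Majority vote per bit yields the cohort fingerprint.
  return votes.reduce((id, v, i) => (v > 0 ? id | (1 << i) : id), 0);
}

// Two users with largely overlapping histories usually share a cohort:
console.log(simHashCohort(["cars.example", "racing.example", "news.example"]));
console.log(simHashCohort(["cars.example", "racing.example", "blog.example"]));
```

The proposal also describes suppressing any cohort below a minimum size before it is exposed to websites, which is where the anonymity claim discussed below comes from.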

 

How is FLoC different from a Cookie?

 

Before drawing out the difference between a cookie and FLoC, let us first discuss what a cookie is and how a third-party cookie works in advertising. A cookie is a small piece of text that a website sends to the browser of a visiting user. It helps the site remember information about the visit, which can make return visits easier and the site more useful. Third-party cookies provide similar functionality, but they are set by, and enable tracking from, websites other than the one the user is currently on.

Advertising cookies are designed primarily to collect information about users. Advertisers can only place these cookies on a website with the consent of the website owner. The information that cookies gather on users acts as a digital footprint that marketers and businesses weave into an integrated network: a profile with thorough information about a user's tastes, shopping habits, and other preferences. These cookies are generally third-party or persistent cookies. Google uses Google Ads (formerly AdWords) and AdSense for advertising, and ad banners are most commonly used for retargeting purposes. Google, not the website owner, deposits these cookies, hence the term "third-party cookies." In this way, a company pays Google to show a specific visual ad to all the people who have visited its website, across all the other websites in the Google AdSense network.
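As a minimal illustration of the underlying mechanism (the cookie names and values below are hypothetical), a site and an embedded ad server each set cookies through ordinary HTTP response headers; the SameSite=None; Secure attributes are what modern browsers require before a cookie may be sent in a third-party context:

```ts
// Minimal sketch using Node's built-in http module. "prefs" behaves as
// a first-party cookie; "ad_uid" mimics the kind of cross-site identifier
// an ad server would set. Names and values are hypothetical.
import { createServer } from "http";

createServer((req, res) => {
  res.setHeader("Set-Cookie", [
    // First-party: remembers the visitor's settings on this site only.
    "prefs=dark_mode; Path=/; SameSite=Lax",
    // Third-party style: sent on cross-site requests, enabling tracking.
    "ad_uid=u-8f3e21; Path=/; SameSite=None; Secure; HttpOnly",
  ]);
  res.end("cookies set");
}).listen(8080);
```

When the same ad server is embedded on many sites, the browser returns the identifier cookie with every ad request, letting the server stitch those visits into one profile. This cross-site flow is precisely what FLoC is meant to replace.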

Google FLoC differs from third-party cookies in several ways. FLoC takes the user's browsing history in Chrome and analyzes it to assign the user to a category or "cohort." Most importantly, it does not give individual users a unique identifier. Instead, each user exists only as part of a larger cohort, with at least a thousand users, to sustain anonymity. In FLoC-enabled Chrome, the user's web browser shares a "cohort ID" with websites and marketers. Consequently, advertisers have to target users based on the category to which they belong.
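During the origin trial, websites read this cohort ID through a JavaScript call, document.interestCohort(), which resolves to the cohort identifier and the version of the clustering algorithm that produced it. A hedged sketch follows; the typings are assumptions reconstructed from the public proposal, since the API never shipped beyond the trial.

```ts
// Sketch of the interestCohort() API from the FLoC origin trial.
// The API was experimental (and has since been withdrawn), so the
// interface below is an assumption based on the public proposal.
interface InterestCohort {
  id: string;      // the cohort identifier shared with the site
  version: string; // which clustering algorithm produced it
}

async function logCohort(): Promise<void> {
  const doc = document as Document & {
    interestCohort?: () => Promise<InterestCohort>;
  };
  if (doc.interestCohort) {
    const cohort = await doc.interestCohort();
    console.log(`cohort ${cohort.id}, algorithm ${cohort.version}`);
  } else {
    console.log("FLoC is not available in this browser");
  }
}
```

Note that any script on a page, including an embedded third-party ad script, could make this call: the cohort is broadcast to every site that asks, a point the criticism below returns to.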

Furthermore, FLoC keeps the user's information on the user's device rather than circulating it on the internet. Google explains that the idea is to prevent the reconstruction of cross-site browsing history. The information the browser derives locally stays on the device; only the cohort is shared. The concept behind this interest-based targeting strategy is to conceal people "in the crowd": keeping their browsing history confidential and protecting their identity by restricting individual profiling.

It is still unclear whether FLoC will replace third-party cookies completely, since FLoC currently requires third-party cookies to work: the FLoC GitHub documentation states that the user must not block third-party cookies on the device for cohort data to be logged and synced. On the surface, the new technology moves less data across the web and enhances user privacy, but there is more to it.

Potential privacy issues with Google FLoC 

Despite Google's plan to replace third-party cookies and not create alternative identifiers, the company itself still has access to the user's search and browsing history. In effect, the new rules do not apply to Google. The change does not mean that Google will stop tracking users: it still tracks them when they use Google websites, Google Docs, Gmail, YouTube, and Google Search. This tracking takes the form of first-party cookies that Google deposits on its own services, including applications using the AdSense network.

 

Based on the browsing behavior of an individual, Google FLoC would put users into ever-changing categories, or 'cohorts,' recalculated on a weekly basis.

 

Along with these developments comes a hidden risk: the ability of automated systems to perpetuate bias. FLoC's clustering algorithm may replicate the potentially illegal discriminatory behavior that results from algorithmic behavioral targeting. The concern is that the clustering algorithm may group people by sensitive attributes such as race, sexual orientation, or disability. For example, in 2019, the US Department of Housing and Urban Development charged Facebook over ads that discriminated against people based on their race, sex, and disability, alleging that the platform allowed home sellers and landlords to discriminate among users. Researchers also claim that a company's advertising algorithm can exacerbate gender bias: University of Southern California researchers found that men were more likely to see Domino's pizza-delivery job ads on Facebook, while women were more likely to see Instacart shopper ads. Google acknowledges the risk of algorithmic bias but fails to articulate safeguards robust enough to mitigate it.

 

The Google FLoC documentation states that Google monitors cohorts through auditing, checking for the usage of sensitive data like race, religion, gender, age, health, and financial status. It plans to analyze the correlation between each resulting cohort and "sensitive" categories. If it finds that too many users belonging to a cohort visit a specific type of "sensitive" website, Google will either block the cohort or change the cohort-forming algorithm. Google has also said that serving personalized ads based on sensitive categories is against its ad policies. However, by collating people's general behaviors and interests, the system may still infer sensitive information. Google, through its services and the cohort ID, will therefore have access to more personal data. The technology works against the very objective it claims to achieve, i.e., putting an end to individual profiling and the revelation of sensitive attributes. Moreover, the accusation that the company allows advertisers to discriminate against users renders it more sinister.
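One plausible reading of this audit (Google's whitepaper describes a t-closeness-style test; the categories and threshold below are assumptions for illustration) is a comparison between how often a cohort's members visit sensitive-category sites and the population-wide baseline:

```ts
// Hedged sketch of a cohort-sensitivity audit: flag any cohort whose
// rate of visits to a sensitive category diverges too far from the
// population baseline. Categories and threshold are assumptions.
type Category = "health" | "religion" | "finance";

interface CohortStats {
  id: number;
  // Fraction of the cohort's members who visited each sensitive category.
  sensitiveVisitRate: Record<Category, number>;
}

function cohortPasses(
  cohort: CohortStats,
  baseline: Record<Category, number>, // population-wide visit rates
  maxExcess = 0.1 // tolerated divergence above baseline (assumed)
): boolean {
  // A cohort "leaks" a sensitive attribute when its members visit a
  // sensitive category far more often than the general population does.
  return (Object.keys(baseline) as Category[]).every(
    (cat) => cohort.sensitiveVisitRate[cat] - baseline[cat] <= maxExcess
  );
}

// A cohort where 40% visited health sites against a 5% baseline fails:
const ok = cohortPasses(
  { id: 31415, sensitiveVisitRate: { health: 0.4, religion: 0.02, finance: 0.06 } },
  { health: 0.05, religion: 0.03, finance: 0.07 }
);
console.log(ok ? "cohort allowed" : "cohort blocked or re-clustered");
```

Such an audit can only cover the categories Google thinks to enumerate; correlated but unlisted attributes can still leak through the cohort ID, which is the inference problem described above.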

 

The vital question is whether a FLoC cohort ID constitutes "personal data" and complies with privacy laws. The European Union's General Data Protection Regulation, 2016 (GDPR), one of the strictest privacy laws, clarifies what "personal data" is: data is "personal" when an individual is identifiable, directly or indirectly, through identifiers such as a name, an identification number, location data, or online identifiers like an IP address. The FLoC proposal itself highlights this concern. It reads that websites that know a person's PII (personally identifiable information), e.g., when a person signs in using their email address, can record and reveal their cohort. Once a cohort is tied to a known identity in this way, the "anonymous" cohort ID becomes one more attribute of an identifiable person.

 

Therefore, FLoC can erode anonymity and privacy online if FLoC data is combined with information such as site sign-ins to trace an individual. In this manner, it can reveal sensitive information that allows advertisers to misuse it and discriminate against users. With this change in the advertising ecosystem, the browser generates the FLoCs, with advertisers merely at the receiving end.

The Electronic Frontier Foundation (EFF) compares FLoC to a "behavioral credit score" and calls it a "terrible idea." It poses new privacy threats: websites can uniquely fingerprint FLoC users and access more sensitive information than is needed to serve relevant advertising. A retail website a user visits should not know what other websites the user has visited earlier, the user's political inclinations, or whether the user is being treated for depression. Yet the Chrome browser observes the browsing pattern and categorizes the user by the "type of person" they are and the 'group' they belong to. Via FLoC, Google will therefore share the user's online behavioral patterns with every website the user visits.
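The fingerprinting worry can be quantified roughly. The origin trial reportedly used on the order of 33,000 cohorts, so the cohort ID alone narrows a user down by about 15 bits of identifying information, a large head start for a fingerprinter that then needs only a few more signals (screen size, fonts, time zone) to single someone out. A back-of-the-envelope sketch, treating the cohort count as an assumption:

```ts
// Rough entropy arithmetic: how much identifying information a cohort
// ID contributes to a browser fingerprint. The 33,000-cohort figure is
// the one reported for the origin trial; treat it as an assumption.
const cohortCount = 33_000;
const cohortBits = Math.log2(cohortCount); // ~15 bits

// Uniquely identifying one person among ~4.7 billion internet users
// requires about log2(4.7e9), i.e. ~32 bits in total.
const bitsNeeded = Math.log2(4.7e9);

console.log(`cohort ID supplies ~${cohortBits.toFixed(1)} bits`);
console.log(`remaining for uniqueness: ~${(bitsNeeded - cohortBits).toFixed(1)} bits`);
```

On these assumptions, the cohort ID hands a tracker nearly half the bits needed to single out one person on the entire internet, before any conventional fingerprinting signal is even considered.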

Thus, Google FLoC undermines privacy-by-design, sharing with advertisers and websites information they would not have had access to using third-party cookies. FLoC would make Chrome reveal browsing-derived information to sites, something no existing browser does. FLoC is meant to provide the right amount of data to advertisers without revealing too much about any individual; in practice, it raises more privacy concerns by sharing more user information than is required, and it shifts the approach from a contextual to a behavioral one. Hence, it does not protect user privacy. Moreover, with its first-party analytics and advertising cookies, Google has access to much more data than it had with third-party cookies. The proposal does not mention how the data is processed and is silent on procedural transparency.

One of the most pressing questions that remains concerns FLoC's effectiveness. Google's ads team has validated this privacy-first solution by running simulations based on the concepts described in Chrome's FLoC proposal. The findings indicate that FLoC can effectively replace third-party cookies for creating interest-based audiences. Google claims, "tests of FLoC to reach in-market and affinity Google Audiences show that advertisers can expect to see at least 95% of the conversions per dollar spent when compared to cookie-based advertising." FLoC's power will ultimately depend on the output of its cohort-forming algorithm and the target audience. Google has not published any hard statistics on "how private" FLoCs are, or anything about privacy measurements in general.

Another legal issue that comes to light is accountability. Apart from the accountability of publishers, who ask for user data and process it for targeted advertising, what would be the accountability of the browser that actually processes the FLoC data (the browsing history)? Google's standard should specifically address the browser's accountability, as the browser is the sole controller of FLoC data and the onus of legitimate processing rests on it. For similar reasons, Google announced that it would not proceed with FLoC testing in Europe and other countries covered by the GDPR and the ePrivacy Directive, citing the lack of clarity regarding which entities serve as the data controller and the data processor when creating cohorts.

When it comes to user data collection, consent serves as the primary legal basis for the lawful processing of personal data. It is unlawful for browsers to process browsing history without consent; even if the company claims not to share any profiles, it is bound to ask for specific, informed consent. Under the GDPR, personal data may be processed only on specified lawful grounds, such as the data subject's consent to processing for a particular purpose, which further strengthens consent as a legal basis. In India, similar to the GDPR, the Personal Data Protection Bill, 2019 (PDP) has been laid on the bedrock of consent. Under Clause 11 of the Bill, consent must be "free," "specific," and "informed," clear in scope, and capable of being withdrawn. Data processing should therefore be allowed only when the individual permits it. The PDP Bill further requires that data fiduciaries offer adequate information to data principals about data processing, to keep it transparent and accountable in the event of a data breach.

Therefore, consent is vital for transparency in processing; in its absence, the data cannot be collected or shared. Thus, introducing FLoC falls foul of privacy laws such as the GDPR and the Indian PDP Bill. Citing the lack of consent and other privacy concerns, privacy-centric Chromium-based browsers like Brave and Vivaldi have already disabled Google FLoC.

Google's gambit to reorient the ad-tech ecosystem under the garb of privacy ends up undermining it. Urgent regulation and advocacy across jurisdictions are needed to ensure that the risks are mitigated and that Google does not unduly benefit from this ecosystem at the expense of online individuals and communities.

 

Acknowledgements: The author would like to thank Ali Jawed, Arindrajit Basu, and Gurshabad Grover for their feedback and editorial suggestions.

Vipul Kharbanada and Pallavi Bedi served as blind peer-reviewers for this piece.

 

The author graduated from the Faculty of Law, Aligarh Muslim University, in 2019 and holds an LL.M (Constitutional and Administrative Law) from Symbiosis Law School, Pune. She has a keen interest in Digital Rights & Tech Policy. 

email: [email protected]

(Disclosure: The Centre for Internet & Society has received funds from Google)

The views and opinions expressed on this page are those of their individual authors. Unless the opposite is explicitly stated, or unless the opposite may be reasonably inferred, CIS does not subscribe to these views and opinions which belong to their individual authors. CIS does not accept any responsibility, legal or otherwise, for the views and opinions of these individual authors. For an official statement from CIS on a particular issue, please contact us directly.