This post covers the next unique element in the browser that has been adapted for the Google Privacy Sandbox: worklets. Worklets were introduced in Chrome 61 (2017) specifically for performance-critical tasks related to audio processing, video manipulation, and animation. They allow for multi-threaded execution off the main JavaScript thread, were designed for tight integration with browser APIs, and have restricted capabilities to ensure security and minimize attack vectors. The main driver for their development was the need to handle highly specialized tasks within the browser engine with strong security measures for sensitive operations.
Worklets have been adapted by the Google Privacy Sandbox for four specific uses:

1. Running on-device ad auctions (auction worklets in the Protected Audience API)
2. Generating bids for those auctions (bidding worklets)
3. Reporting on auction outcomes (reporting worklets)
4. Running private operations on cross-site data (shared storage worklets)
In this post, we will only deal with the actual use cases at a very high level. The main purpose of this post is to set the stage for the later posts where we delve into the various worklet functionalities in greater detail. What this post should help us understand is what worklets are and why they were chosen as the best technology for implementing those use cases.
Having said that, there is another browser element, the web worker, which would seem to be related to worklets. Given the naming, aren't worklets just smaller or limited-function versions of a web worker?
In fact, they are.
So I am going to start this post by discussing web workers briefly, in order to clarify what they are, how they differ from worklets, and why they were not used for the Privacy Sandbox.
In order to understand web workers, it is important to go back in time to the early 2000s. At this time, websites were relatively simple and ran a modest amount of JavaScript that could be processed in the main thread without unduly impacting the rendering speed of the browser. But as the years went on, developers wanted to run more computationally expensive applications in the browser, for example large-scale image processing. The result was an obvious need for some mechanism that allowed these computationally expensive elements to run in a way that reduced their performance impact on the main JavaScript thread, keeping rendering speed at a level acceptable to end users. The Web Worker API was the solution, developed by the Web Hypertext Application Technology Working Group (WHATWG) in 2009 as part of HTML5 to deal with these issues. Web workers are now part of the main HTML specification.
Web workers are designed to perform computationally intensive or long-running tasks in a separate thread, improving the responsiveness of the main thread. They were intended for long-running scripts with high startup costs and high per-instance memory costs that should not be interrupted by scripts responding to user-generated interactions. This allows long-running tasks to execute without having to yield computational priority to keep a web page responsive. Workers were always considered relatively heavyweight and are meant to be used sparingly in any given application.
Figure 1 - A Simple Example of a Web Worker.
You define the worker first, then send or post messages to it as it runs in parallel with the main thread.
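A minimal sketch of this pattern in JavaScript (file names and payloads are illustrative):

```js
// main.js - runs on the main thread
const worker = new Worker('worker.js');

// Send work to the worker; the payload is serialized via structured clone
worker.postMessage({ pixels: [12, 34, 56] });

// Results come back asynchronously, without blocking rendering
worker.onmessage = (event) => {
  console.log('processed:', event.data);
};
```

```js
// worker.js - runs on a separate thread
onmessage = (event) => {
  // Computationally expensive work happens here, off the main thread
  const processed = event.data.pixels.map((p) => p * 2);
  postMessage(processed);
};
```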
Web workers are general purpose and can handle a wide range of functionality. They cannot touch the DOM directly, but they can interact with network resources, for example fetching data or making AJAX requests. Communication with the main thread is primarily through the postMessage call in JavaScript, which requires data to be serialized, limiting the amount of data that can be transferred without impacting performance. Because any effect a worker has on the page must pass through postMessage, the risk of a worker manipulating the main page content is reduced.
Besides limitations on DOM access, web workers have other security restrictions that help reduce certain attack vectors: worker scripts are subject to the same-origin policy, so a page can only spawn workers from its own origin; workers run in an isolated global scope with no access to the window or document objects; and workers cannot read cookies or localStorage.
This relatively limited set of security restrictions is a major reason why web workers are not adequate for use in the Google Privacy Sandbox.
As mentioned in a prior post, worklets are a newer concept that was introduced as part of the CSS Houdini specification and first released in Chrome 61 in 2017. Worklets are a lightweight version of web workers that allow developers to extend the CSS rendering engine to handle custom CSS properties, functions, and animations. They are similar to web workers in that at least some types of worklets, specifically audio and animation worklets, can run scripts independently of the main JavaScript execution environment.
Worklets were specifically designed to give developers more control over how browsers render pages. They allow developers to extend beyond the limitations of CSS and write code that directly controls how the browser creates individual pixels on the page. Instead of using declarative rules to render a specific element, a worklet lets the developer write code that produces the actual pixels for that element.
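To make this concrete, here is a minimal sketch of a CSS paint worklet, the canonical Houdini example (the worklet name checkerboard and the file name are illustrative):

```js
// In the main page: load the module into the browser's paint worklet
CSS.paintWorklet.addModule('paint-worklet.js');
```

```js
// paint-worklet.js: imperative code that produces the element's pixels
registerPaint('checkerboard', class {
  paint(ctx, size, properties) {
    // ctx is a restricted canvas-like drawing context
    for (let y = 0; y < size.height; y += 20) {
      for (let x = 0; x < size.width; x += 20) {
        if (((x + y) / 20) % 2 === 0) {
          ctx.fillStyle = '#000';
          ctx.fillRect(x, y, 20, 20);
        }
      }
    }
  }
});
```

Any element styled with background-image: paint(checkerboard) is then drawn by this code rather than by declarative CSS rules.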
Before we get into more detail about worklets, you are probably wondering how something designed for managing UI and content elements applies to backend processing functionality like auctions, bidding, and reporting. This is where things get a bit hazy. Nowhere online can I find a discussion of how, when, and why worklets began being used for use cases other than the original rendering ones. Yet at some point, developers figured out that the enhanced security and isolation provided by worklets, as well as some of their other features, made them the best choice for running processes unrelated to rendering. You might call this an "off-specification use."
The best guess I have comes from the Chromium documentation and the Mozilla main documentation pages on worklets. The Chromium page identifies four types of worklets broken into two classes: main-thread worklets (paint worklets and layout worklets), which run on the page's main thread, and threaded worklets (audio worklets and animation worklets), which run off the main thread.
The Mozilla main documentation page on worklets, on the other hand, has a table (Table 1) that identifies the following types of worklets (and what they do):
Table 1 - Types of Worklets in Mozilla Worklets Documentation Page

| Worklet | What it does |
| --- | --- |
| AudioWorklet | Audio processing with custom audio nodes |
| AnimationWorklet | High-performance, scroll-linked procedural animations |
| LayoutWorklet | Defining the positioning and dimensions of custom elements |
| SharedStorageWorklet | Running private operations on cross-site data, without the risk of data leakage |
Notice the last row of the table - for Shared Storage worklets. These are part of the Shared Storage API, one of the storage types specifically used by the Google Privacy Sandbox, which we will deep dive into in a later post covering the storage elements of the Privacy Sandbox. This is a new API, currently still in draft, that was developed as a complement to the storage partitioning approach of the W3C Privacy Community Group, which was described in our last post.
Storage partitioning was designed to reduce the likelihood of cross-site tracking. The problem with partitioned storage is that there are many legitimate use cases for ad tech that require some form of shared storage to implement. The Shared Storage API (shown as a storage service in our services architecture diagram in a prior post) is used for two very specific purposes in the Google Privacy Sandbox:

1. Selecting which URL (for example, which ad creative) to display in a fenced frame based on data written across sites, covering use cases like frequency capping, A/B testing, and creative rotation
2. Generating aggregated, noised reports on cross-site data via the Private Aggregation API
The fundamental notion of the Shared Storage API is to intentionally not partition its storage by top-frame site, although embedded elements like iframes and fenced frames are still partitioned by origin. How, then, to prevent cross-site re-identification of users? Basically, the designers require that data in shared storage can only be read in a restricted environment with carefully constructed output gates through which data can leave.
Thus was born the notion of the shared storage worklet. The fundamental design of worklets provides the perfect way to allow shared storage while reducing the surface for potential cross-site re-identification of users to a minimum.
Shared storage worklets were first introduced in Chrome 86 as an experimental feature and still remain experimental according to Mozilla. The API allows developers to run private operations on cross-site data without the risk of data leakage. This is particularly useful for scenarios like fenced frames, where isolation and privacy are crucial. It is not yet part of an official W3C specification. This means it has limited documentation (the W3C draft Shared Storage API specification and the Shared Storage API explainer in the GitHub repository), and its availability and functionality might differ across browsers and could change in the future.
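A hedged sketch of the basic pattern, adapted from the explainer (module, key, and operation names are illustrative, and this experimental API surface may still change):

```js
// In the embedding page
await window.sharedStorage.worklet.addModule('ss-ops.js');

// Anyone can write to shared storage; reads happen only inside the worklet
await window.sharedStorage.set('campaign-123-count', '0', { ignoreIfPresent: true });

// Invoke a named operation defined in the module
await window.sharedStorage.run('count-impression', {
  data: { campaign: 'campaign-123' },
});
```

```js
// ss-ops.js - runs inside the shared storage worklet
class CountImpression {
  async run(data) {
    const key = `${data.campaign}-count`;
    const current = Number(await sharedStorage.get(key)) || 0;
    // The raw value never leaves the worklet; it can only influence
    // the API's restricted output gates
    await sharedStorage.set(key, String(current + 1));
  }
}
register('count-impression', CountImpression);
```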
The shared storage worklet provides an opportunity to examine why worklets are used for auctions and bidding. If we can understand why they were chosen for shared storage operations such as reporting, then we can infer why they were the best element to use for auctions and bidding.
There are some core differences between web workers and worklets that make the latter the best platform for background processes in the Privacy Sandbox.
Table 2 - API Restrictions in Worklets vs. Web Workers

| Capability | Web Workers | Worklets |
| --- | --- | --- |
| DOM access | Indirect only, via postMessage | None |
| Network APIs (fetch, XHR, WebSockets) | Available | Generally unavailable |
| Messaging model | Event-based (postMessage / onmessage) | Class-based, implementation-defined API |
| Threading | Always a separate thread | Main thread or a separate thread, at the browser's discretion |
| Multiple instances sharing a global scope | No | Yes |
The fact that a worklet may run on the main thread or a separate thread at the browser's discretion is important from a performance perspective: the browser can leverage the main thread's existing resources for less intensive worklets, potentially improving overall responsiveness.
As we will see in a later post, the ability to create multiple instances that share a global scope is critically important for auctions and bidding, as it could, for example, allow a bidder to bid on multiple auctions on a single page without having to create separate worklets and incur the computational and memory overhead they represent.
The lack of a general event-listener API is also important to the Privacy Sandbox because registering and managing numerous event listeners, potentially across different objects, could allow poorly designed or malicious code to register for events it shouldn't or handle them incorrectly, potentially providing a side channel for information leakage.
A class-based API, on the other hand, has a well-defined set of methods exposed to the user agent. This reduces the attack surface, as attackers have fewer entry points through which to exploit vulnerabilities. In the context of the Google Privacy Sandbox, the implementation might leverage the class structure to define specific classes and methods for the use cases that are allowed within the worklet, excluding everything else. This enables fine-grained control over the functionality available to the worklet, further restricting unauthorized code execution and enhancing security.
This implementation-defined nature can be leveraged by the Privacy Sandbox in specific ways: the browser, not the developer, defines which methods and globals exist inside each worklet type (a bidding worklet, for example, exposes its bidding entry point and little else); the browser controls when worklet instances are created, reused, and destroyed; and the browser can withhold capabilities such as network access, storage, and timers without the developer being able to reintroduce them.
Now let's turn back to the shared storage worklet. It has two unique features which will also be important when discussing auction and bidding worklets:

1. A calling site must first enroll with the Privacy Sandbox and publish attestations before it can load code into the worklet.
2. The worklet is kept alive after its single code module is loaded, so multiple operations can run against the same module without reloading it.
Here is an English version of the core privacy attestation from the attestation GitHub repository:
The attesting entity states that it will not use the Privacy Sandbox APIs or services for the purpose of learning that you are the same user across different sites or apps, and that it will not otherwise circumvent the privacy protections of the Privacy Sandbox.
Developers who submit an enrollment form are then sent a file that contains the attestations for the APIs they requested to use. The file is stored in a specific directory on their website (e.g. https://www.example.com/.well-known/privacy-sandbox-attestations.json) and checked regularly by Google to ensure it has not been tampered with.
We will discuss attestation at length in a later post, but for now it is enough to know that if the calling site has not included the Shared Storage API in a Privacy Sandbox enrollment process and its attestations, calls to sharedStorageWorklet.addModule() will be rejected.
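In code, that failure mode looks something like this (a sketch; the exact error surfaced is implementation-defined):

```js
try {
  await window.sharedStorage.worklet.addModule('ss-ops.js');
} catch (err) {
  // addModule() rejects when the calling site has not enrolled and
  // attested for the Shared Storage API
  console.error('Shared Storage unavailable:', err);
}
```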
This keepalive would seem to violate one of the core reasons why worklets were chosen for Privacy Sandbox operations: their tightly scoped lifetimes. The reason for the variance has to do with performance. Running multiple operations sequentially on the same module avoids the overhead of reloading the entire module for each operation. This can improve performance and resource utilization, especially in scenarios involving frequent communication between the website and the worklet. In the case of the Shared Storage API, multiple calls may need to be made to shared storage to provide the right URL for a specific ad placement. An example would be when there is a frequency cap for that specific ad in a browser: if the frequency cap has been reached, another call has to be made to shared storage. Without the keepalive, this would require reloading the entire worklet and risk missing the 25 ms response window that most programmatic bids are required to meet.
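A sketch of that frequency-capping pattern using the URL-selection output gate (the operation name, URLs, and cap value are illustrative):

```js
// ss-ops.js - URL selection operation inside the shared storage worklet
class SelectAd {
  async run(urls, data) {
    const seen = Number(await sharedStorage.get(`seen-${data.campaign}`)) || 0;
    if (seen >= data.cap) {
      return 1; // cap reached: choose the fallback URL
    }
    await sharedStorage.set(`seen-${data.campaign}`, String(seen + 1));
    return 0; // under the cap: choose the campaign ad
  }
}
register('select-ad', SelectAd);
```

```js
// In the page: both calls run against the already-loaded module
const config = await window.sharedStorage.selectURL(
  'select-ad',
  [
    { url: 'https://ads.example/campaign.html' },
    { url: 'https://ads.example/fallback.html' },
  ],
  { data: { campaign: 'campaign-123', cap: 3 }, resolveToConfig: true }
);
// config is an opaque FencedFrameConfig handed to a fenced frame for rendering
```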
So now let's turn to the core worklets of the Protected Audience API - auction worklets and bidding worklets. We spoke about reporting worklets above, so we'll consider those covered for now. But as with all the Privacy Sandbox functions running in worklets, we will be covering them in a great deal of detail later.
Why are worklets the vehicle of choice for auction and bidding functionality? There are three main areas of concern for the Privacy Sandbox for which worklets provide an excellent platform:

1. Performance at the scale demanded by real-time bidding
2. Security of the code running in the worklet
3. Isolation of user data between ad tech players
We'll examine each of these in order.
As any person familiar with real-time bidding is aware, there can be multiple auctions on a page, and for each auction there are multiple bidders. Since the Google Privacy Sandbox moves the ad server into the browser, we now have a significant set of performance issues: browsers were never designed to handle this kind of real-time processing, and definitely not at scale, with potentially tens of bidders or more for each auction. Worklets provide the ability to run multiple activities in parallel, with worklets being created and closed on different timelines, without impacting the main JavaScript thread. Each auction has its own worklet, as does each bidder whose interest groups qualify for the auction (as an aside, qualifying bidders with desirable interest groups for a specific auction is one of the functions performed by the auction worklet). Web workers were never designed to handle this type of dynamic workload. Moreover, worklets, as previously mentioned, allow for the creation of multiple instances with the same global scope, enabling parallelism within a single worklet instance; a bidder can, for example, bid on multiple auctions on a single page without having to create separate worklets and the computational and memory overhead they represent.
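For orientation, here is a heavily simplified sketch of the buyer-side function that runs inside a bidding worklet; the signature comes from the Protected Audience API explainer, while the bidding logic itself is purely illustrative:

```js
// Loaded from the buyer's biddingLogicURL into a bidding worklet
function generateBid(interestGroup, auctionSignals, perBuyerSignals,
                     trustedBiddingSignals, browserSignals) {
  // Illustrative logic: bid a flat price on the interest group's first ad
  return {
    ad: { campaign: 'campaign-123' }, // metadata passed to the seller's scoreAd()
    bid: 1.0,                         // bid price in the auction's units
    render: interestGroup.ads[0].renderURL,
  };
}
```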
Much of the work in the early TurtleDove experiments, and now in FOT #1, is centered on optimizing the performance of the auction and bidding worklets. There is still a very large question mark around how well worklets will scale once we move beyond the 1% of Chrome traffic being tested in Q3 2024. It is one of the reasons so much work and testing is going into server-side auction and bidding functionality in a Trusted Execution Environment, and over time I do not doubt we will see innovation that pushes more of the browser-side functionality to the server side without compromising the privacy standards the Sandbox is designed to maintain.
Lastly, worklets also allow for consistency of behavior within the browser when multiple worklets need to run the same functionality. An example of this was discussed in a particular issue in the FLEDGE GitHub repository. Certain functions, like the private aggregation functions, were initially able to run in the main JavaScript thread (top-level script context) of a worklet. But in the case where this top-level script ran once across all available worklets for different players in the auction, the effect of the top-level call on subsequent worklets was undefined and inconsistent. Moving these functions into the worklet body provided both better performance and consistency of execution.
Similar considerations apply to the auction and bidding worklets as to the shared storage worklet when it comes to security features. Anyone who wishes to instantiate an auction or bidding worklet must have enrolled with the Protected Audience API and made its attestations; otherwise the request to add a worklet of this type will be rejected. Worklets thereby ensure that attestation has occurred.
The limitation to a single code module also provides similar protection against sloppy coding or the insertion of additional code modules by malicious actors without the knowledge of the worklet's owner.
Isolation of user data between ad tech players, to prevent reconstruction of a browser's identity through cross-site data collection, is always at the heart of anything to do with the Privacy Sandbox. Worklets' tighter isolation (no access to the DOM, a reduced API surface, and restricted access to geolocation and other browser data, as examples) provides a better isolation substrate for Privacy Sandbox functionality.
The fact that worklets can have an explicit lifetime is another critical feature for auctions and bidding. Publishers and SSPs must put time limits on auctions in order to ensure that ads are returned to all available slots within the browser's rendering window.
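A sketch of how a seller might bound those lifetimes when kicking off an on-device auction (origins, URLs, and timeout values are illustrative; the timeout fields come from the Protected Audience auction configuration):

```js
const auctionConfig = {
  seller: 'https://ssp.example',
  decisionLogicURL: 'https://ssp.example/decision-logic.js',
  interestGroupBuyers: ['https://dsp-a.example', 'https://dsp-b.example'],
  perBuyerTimeouts: { '*': 50 }, // cap each bidding worklet's generateBid() (ms)
  sellerTimeout: 50,             // cap the seller worklet's scoreAd() (ms)
};

// Resolves to an opaque value for rendering the winning ad in a fenced frame,
// or null if no bid wins within the configured limits
const result = await navigator.runAdAuction(auctionConfig);
```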
Next up: Storage Services Underlying the Google Privacy Sandbox