Architecting Transcend Consent for performance

David Mattia
September 30th, 2022 · 11 min read

At a glance

  • Airgap.js is a client-side Consent Management Platform library that powers Transcend Consent, letting you manage tracking events generated by third-party trackers on your site.

  • Using Airgap.js, your site can allow some tracking events and quarantine others—replaying them only if and when your user later consents, at an appropriate place in their user journey. Airgap.js can also block unwanted tracking events entirely.

  • As Airgap.js can only regulate network requests after it loads, it’s critical that Airgap.js is loaded synchronously as soon as your site loads. This is a blocking request that must be completed before you can load other regulated resources. This means that Airgap.js performance is of utmost importance.

  • In this post, we’ll cover how Airgap.js was architected for performance at every stage of its lifecycle. In some cases, our research shows that websites using Airgap.js can even perform faster than the same site without Airgap.js.

Section 1: Airgap.js loads quickly

When you include an Airgap.js bundle on your site, you are including:

  • The core Airgap.js library that acts as a smart firewall over outgoing network traffic in the browser

  • Your company-specific data flow and cookie information that tells the Airgap.js firewall which flows/cookies require which user consent preferences

  • Injected region information, where the script will know what region the user is requesting the script from based on their IP address

This is all present in the one script, which is why we call it your “bundle”. In this section we explore the download part of the script itself, along with everything else that happens on the network side before the download can begin.

The Proof

Before we dive into the details of why Airgap loads so quickly, head to the PageSpeed Insights page for our company homepage. Here, we can use transcend.io as an example of a website using our own consent manager, but you can use this tool to test your own site for Airgap’s performance as well. This is especially useful if you are deploying your Airgap.js bundle to a dev/staging environment before launching it in production.

Under the “Avoid chaining critical requests” column you can see the requests that must complete before other content on the page can load. If you ignore the fact that our Marketing team uses a lot of fonts, you’ll notice that the bottom request is our Airgap.js bundle:

This shows us two things:

  • It took the browser 30ms to initiate a request to our CDN, start downloading, and complete downloading the bundle. In practice, we typically see results between 30ms-70ms with this tool.

  • This particular compressed bundle is 37.28 KiB. Remember that this includes both Airgap.js as well as our company configuration, so your bundle size may differ a bit.

If you go to that PageSpeed site, you will likely see slightly different values as each load re-computes the bundle size and load time.

This graph shows a website loading 50 times, each with airgap.js enabled vs. disabled, showing no statistically significant difference between First Contentful Paint metrics.

Downloading a bundle is fast

We built Airgap.js with a number of rules that keep our bundle size small:

  • Zero third-party dependencies: Our production Airgap.js build includes zero third-party dependencies, ensuring that we control every line of code that’s included. This is primarily a security measure, but has the benefit of helping to keep our bundle small.

  • Efficient bundling: We build our bundles with esbuild, which tree-shakes and minifies our code to the smallest size we can make it.

  • Lossless compression: We enable both Brotli and Gzip compression so that our bundles become even smaller. Uncompressed, most bundles end up around 80-90KiB, but Brotli compression brings down the size to the final value shown in the `Network` tab of the browser above.

  • Global caching: We host Airgap bundles on an AWS CloudFront distribution and utilize caching wherever possible. With over 400 edge locations, CloudFront is very likely to have an edge location near your users. Read on to the next section to hear how we keep our cache hit rate high and our cache-misses efficient.

  • Testing: Our Continuous Integration and Deployment pipelines ensure that the core Airgap.js library meets our desired size requirements before any release.
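To get a feel for how lossless compression shrinks a JavaScript bundle, here is a minimal sketch using Node's built-in `zlib` module. The `fakeBundle` string is a made-up stand-in, not a real Airgap.js bundle, so the sizes it prints will not match the figures quoted above.

```typescript
import { brotliCompressSync, gzipSync } from "node:zlib";

// Hypothetical stand-in for a minified bundle: repetitive JavaScript-like text.
const fakeBundle = Buffer.from(
  Array.from({ length: 2000 }, (_, i) => `function f${i}(a){return a+${i}}`).join(";")
);

// Byte sizes of the same payload, uncompressed and under each codec.
const sizes = {
  raw: fakeBundle.length,
  gzip: gzipSync(fakeBundle).length,
  brotli: brotliCompressSync(fakeBundle).length,
};

const toKiB = (bytes: number): string => `${(bytes / 1024).toFixed(2)} KiB`;
console.log(`raw ${toKiB(sizes.raw)}, gzip ${toKiB(sizes.gzip)}, brotli ${toKiB(sizes.brotli)}`);
```

Running the same comparison against your own bundle file is a quick way to check what a browser that supports each encoding would actually download.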

How we efficiently customize the bundle by region

Airgap.js treats users differently based on their location and the privacy laws in effect in that region. The dynamic regime detection that allows us to customize the user experience happens inside our CDN, meaning that the 30-70ms download time shown earlier is possible even with location-specific bundles.

How do we do this efficiently? We first need to understand how AWS CloudFront works.

When a user requests a file from CloudFront, the request first goes to whichever of the 400+ edge locations is closest to the user. The edge location keeps a cache of responses to earlier requests and, if it already has a copy of the file, sends it back immediately without contacting the origin server (the location containing the actual CDN files).

But if the edge location does not have a cached file to send to the user that meets their requirements, it asks the origin to provide such a file. As the edge location passes the file onto the user, it updates its cache.

AWS allows dynamic code to run on any of these four event points (the request from the user to the cache, the request from the cache to the origin, the response of the origin to the cache, and the response from the cache to the user) via Lambda@Edge functions. These functions run inside the edge location itself, meaning the code is executing close to your user’s physical location. 

One of the goals when using Lambda@Edge functions is to minimize the number of times they run, as they incur performance penalties compared to the normal flow of CloudFront where only static files are passed around.

To respect this, we needed to ensure that our functions ran between the cache and the origin so that once our function had run, its result would be cached for other users in the same area. This led us to implement an origin-request function that fetches the bundle from the origin server and prepends location information about the user to it.

With this pattern, we then needed to make sure that two users in different states/regions who talk to the same edge location (such as users in Minnesota and Wisconsin talking to an edge location in Ohio) would not share the same cache key. Otherwise, the `countryRegionName` field in a cached bundle would reflect only the state of whichever user requested it first.

To do so, we updated the cache key on the edge location so that each user’s country and region are included as cache key parameters. This means that we don’t just cache bundles per edge location, but per edge location per region that has requested them.

As most edge locations only receive requests from a small area of the world, we can be sure that there are not too many cache keys per edge location and that our hit-rate should stay quite high.
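The region-aware caching described above can be sketched roughly as follows. This is an illustration under stated assumptions, not Transcend's actual Lambda@Edge code: the header names are CloudFront's viewer geolocation headers, while `cacheKey`, `prependLocation`, and the `__hypotheticalRegion` global are invented for the example. In practice CloudFront derives the cache key from its configuration rather than from code like this; `cacheKey` only models the effect.

```typescript
// Simplified shape of request headers as CloudFront passes them to an edge
// function: lowercase header name -> list of { value } entries.
type CfHeaders = Record<string, { key?: string; value: string }[]>;

interface ViewerLocation {
  country: string;
  countryRegionName: string;
}

// Read the viewer's location from CloudFront's geolocation headers.
function readLocation(headers: CfHeaders): ViewerLocation {
  const first = (name: string): string => headers[name]?.[0]?.value ?? "unknown";
  return {
    country: first("cloudfront-viewer-country"),
    countryRegionName: first("cloudfront-viewer-country-region-name"),
  };
}

// Hypothetical model of the cache key: one entry per path, country, and region.
function cacheKey(path: string, loc: ViewerLocation): string {
  return [path, loc.country, loc.countryRegionName].join("|");
}

// Prepend the location to the bundle as a small script prelude.
// `__hypotheticalRegion` is an invented global name, used only for illustration.
function prependLocation(bundle: string, loc: ViewerLocation): string {
  return `self.__hypotheticalRegion=${JSON.stringify(loc)};\n${bundle}`;
}
```

With this model, a user in Minnesota and a user in Wisconsin hitting the same edge location produce different keys, so each gets a bundle whose prelude names their own state, while a second Minnesota user is served straight from the cache.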

View our global cache rate and latency for cache-misses on a public Datadog dashboard here.

At the time of writing, 99.83% of requests to our CDN are served from the cache. For the 0.17% of requests that miss the cache, an average of 341ms of latency is added while the Lambda runs and computes the region. The more traffic your website receives, the higher your cache hit rate is likely to be.
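As a back-of-the-envelope check on those figures, the expected extra latency per request, averaged over hits and misses, comes out well under a millisecond. The sketch assumes that cache hits add no Lambda latency at all.

```typescript
// Figures quoted in the text above.
const missRate = 0.0017;   // 0.17% of requests miss the cache
const missPenaltyMs = 341; // average extra latency on a cache miss

// Expected extra latency per request, averaged over hits and misses.
const expectedPenaltyMs = missRate * missPenaltyMs;
console.log(expectedPenaltyMs.toFixed(2)); // "0.58"
```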

Section 2: Airgap.js operates efficiently during website loads

After the Airgap.js script loads, it will initialize itself. This can be done asynchronously to prevent our script from affecting initial site load and render times.

After initializing, Airgap.js can regulate all traffic on your website. It does this by looking at the user consent preferences, user region, the hostname/cookie name of the request being made, and the setting selected in our admin dashboard for what purpose each hostname/cookie serves on your site.

With those inputs, it can then allow, reject, or quarantine each request as your site loads and operates. The effect of this operation can be seen per request, and is typically between 35-350μs/request.
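The decision step can be pictured with a small sketch. The types, field names, and rules below are assumptions made for illustration and are not the Airgap.js API: the sketch rejects requests whose purpose is never permitted in the user's region, allows requests whose purposes are all consented to, and quarantines everything else, including hosts it does not recognize.

```typescript
type Decision = "allow" | "reject" | "quarantine";

interface Policy {
  // Purposes assigned to each hostname, e.g. in an admin dashboard.
  purposesByHost: Map<string, string[]>;
  // Hypothetical field: purposes that are never permitted in the user's region.
  forbiddenPurposes: string[];
}

function regulate(url: string, consented: Set<string>, policy: Policy): Decision {
  const host = new URL(url).hostname;
  const purposes = policy.purposesByHost.get(host);

  // Unknown host: this sketch holds the request rather than guessing.
  if (purposes === undefined) return "quarantine";

  // Any forbidden purpose means the request is dropped outright.
  if (purposes.some((p) => policy.forbiddenPurposes.includes(p))) return "reject";

  // Otherwise allow only if every purpose behind the host has consent.
  return purposes.every((p) => consented.has(p)) ? "allow" : "quarantine";
}
```

For example, with `analytics.example` mapped to the `Analytics` purpose, a request to that host is quarantined until the consent set contains `Analytics`, after which the same call returns `"allow"`.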

Transcend supplies a built-in user interface (UI) for users to control their preferences, but you are welcome to use your own UI as well. In either case, the UI can be asynchronously loaded so that it does not block any site functionality.

Unlike the banners of most consent management platforms, the Airgap.js UI often doesn’t need to appear as soon as your website loads, meaning the time to download the UI largely goes unnoticed. The UI is a separate download from the main Airgap.js bundle, but it uses the same CDN with HTTP/2, so your browser should be able to efficiently fetch the UI components after fetching the main bundle.

The Proof

The following performance metrics were measured on an Apple M1 Pro laptop under light to medium load, using the transcend.io bundle configuration:

  • Request processing overhead

    • Average (hot JIT): ~35μs/request

    • Worst case (cold JIT): ~350μs/request

    • This is the entire overhead for 'pure network APIs' (e.g. fetch, XMLHttpRequest, etc.)

  • 'HTML string' DOM mutation processing overhead: ~65μs/request-causing node (in addition to request processing overhead)

    • HTML string consumers include innerHTML, outerHTML, iframe.srcdoc, and document.write

As a stress test, we benchmarked re-processing transcend.io's entire homepage (equivalent to calling document.documentElement.innerHTML = document.documentElement.innerHTML) and measured an average total latency of 24.6ms while processing ~250 request-causing elements per call. Performance overhead scales roughly linearly with the size of your bundle configuration.
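The stress-test result lines up with the per-request figures listed above, as a quick calculation shows. The only assumption is that each of the ~250 elements pays both the average request overhead and the HTML string overhead.

```typescript
// Figures quoted in the list above.
const requestOverheadUs = 35;    // average per-request overhead (hot JIT)
const htmlStringOverheadUs = 65; // extra overhead per request-causing node
const elementCount = 250;        // approximate request-causing elements per call
const measuredMs = 24.6;         // measured average total latency

// Predicted total if every element pays both overheads.
const predictedMs = (elementCount * (requestOverheadUs + htmlStringOverheadUs)) / 1000;
console.log(predictedMs); // 25

// Measured cost per element, in microseconds.
const measuredPerElementUs = (measuredMs * 1000) / elementCount;
console.log(measuredPerElementUs.toFixed(1)); // "98.4"
```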

To test the performance impact for yourself, you can install an Airgap.js bundle onto a site locally (without deploying the bundle to affect all of your users) through something called a UserScript. These instructions show you how to inject a bundle into your site so you can test important functionality and performance impacts before fully deploying the bundle into production.

This graph shows a website loading 50 times, each with airgap.js enabled vs. disabled, showing the Time to Interactive metric performing slightly faster on the website with airgap.js enabled. This may be due to airgap.js blocking some trackers/scripts that would otherwise affect the time for the page to become interactive.

How Airgap.js regulates traffic

There is no official browser API for intercepting outbound HTTP requests. Airgap.js regulates core JavaScript and HTML tools such as `fetch` APIs, `<script>` tags, `<img>` tags, and dozens of other ways that websites can talk to each other.

You can read about our journey to find the most secure and efficient way of intercepting network calls in this blog post, where we cover how we tried using sandboxed iframes, dynamic Content Security Policies, and other ideas before eventually settling on writing “Patchers” that override the global interfaces to inspect request metadata before sending each request through the native capabilities.

This Patcher paradigm is extremely lightweight, leading to the small footprint added for each request.
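To illustrate the general idea of a patcher, here is a minimal sketch that wraps a `fetch` function on a target object. It is not Airgap.js source code: `patchFetch`, the deny list, and the simplified `FetchLike` signature (string URLs only) are assumptions made to keep the example short, and a real patcher would need to handle `Request` objects and many other APIs.

```typescript
// Simplified fetch signature: string URLs only.
type FetchLike = (input: string, init?: unknown) => Promise<unknown>;

// Replace target.fetch with a wrapper that inspects the destination hostname
// before delegating to the native implementation. Returns an undo function.
function patchFetch(
  target: { fetch: FetchLike },
  deniedHosts: Set<string>,
  baseUrl: string
): () => void {
  const nativeFetch = target.fetch;

  target.fetch = (input, init) => {
    // Resolve relative URLs against the page URL to find the real hostname.
    const host = new URL(input, baseUrl).hostname;
    if (deniedHosts.has(host)) {
      return Promise.reject(new TypeError(`Request to ${host} was blocked`));
    }
    // Allowed: pass through to the original function with its original `this`.
    return nativeFetch.call(target, input, init);
  };

  return () => {
    target.fetch = nativeFetch;
  };
}
```

The wrapper's own cost is one URL parse and one set lookup per call, which gives a sense of why an approach like this can stay in the microsecond range per request.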

How Airgap.js communicates with Transcend’s backend

Airgap.js can send telemetry to Transcend’s backend, which allows us to provide you with aggregated, anonymized analytics about your users’ usage of the site. The same telemetry can also be used to auto-classify the hostnames/cookies your site is using, to ensure your settings are always up to date.

Our telemetry system was architected to avoid accidental collection of personal data by only recording encountered request domains and matching data flow and cookie rules, among other privacy-preserving anonymous statistics.

This is accomplished by using the Navigator.sendBeacon Web API when appropriate to asynchronously send analytics data. This API allows the browser to delay telemetry emission until enough resources are free, resulting in better site performance. Even if it sends the message when the page is being closed, it doesn’t block the loading of the next page.
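A minimal sketch of this style of telemetry is shown below. It is an illustration only: the `DomainTelemetry` class, the payload shape, and the endpoint are hypothetical, and the beacon is passed in as a function so the same code can run outside a browser. In a browser the beacon would be `navigator.sendBeacon`, which returns `false` when the data could not be queued.

```typescript
// Same shape as navigator.sendBeacon(url, data) for string payloads.
type Beacon = (url: string, data: string) => boolean;

class DomainTelemetry {
  private counts = new Map<string, number>();

  // Record only the hostname, never the path, query string, or body.
  record(requestUrl: string): void {
    const host = new URL(requestUrl).hostname;
    this.counts.set(host, (this.counts.get(host) ?? 0) + 1);
  }

  // Hand the aggregated counts to the beacon; clear them only if it was queued.
  flush(endpoint: string, beacon: Beacon): boolean {
    if (this.counts.size === 0) return true;
    const payload = JSON.stringify(Array.from(this.counts));
    const queued = beacon(endpoint, payload);
    if (queued) this.counts.clear();
    return queued;
  }
}

// In a browser this might be wired up as follows (endpoint is hypothetical):
//   telemetry.flush("https://telemetry.example/collect",
//                   (url, data) => navigator.sendBeacon(url, data));
```

Because only hostnames and counts leave the page, a URL such as `https://tracker.example/p?user=123` contributes nothing more than one tick against `tracker.example`.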

With this approach, the performance and privacy impact of telemetry data is minimal.

Section 3: Airgap.js can speed up web experiences

So far, we’ve discussed that Airgap.js can be downloaded quickly, initialized quickly, and is efficient while regulating traffic. But our claim isn’t just that Airgap.js is often quick to use—it's that it often can improve site performance. How?

The answer depends on the specifics of the site being regulated. Essentially, because Airgap.js can block requests, or even entire scripts, until a user consents to the purposes behind those resources (if they consent at all), your main thread avoids being blocked by tools your user has not consented to.
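One way to picture holding work back until consent is a small quarantine queue, sketched below. The `Quarantine` class and its method names are hypothetical and do not describe Airgap.js internals; each held item is simply a callback that performs the deferred request or script load.

```typescript
type Replay = () => void;

class Quarantine {
  private held = new Map<string, Replay[]>();

  // Hold a deferred action until the given purpose is consented to.
  hold(purpose: string, replay: Replay): void {
    const queue = this.held.get(purpose) ?? [];
    queue.push(replay);
    this.held.set(purpose, queue);
  }

  // The user consented: run the held actions in their original order.
  release(purpose: string): number {
    const queue = this.held.get(purpose) ?? [];
    this.held.delete(purpose);
    queue.forEach((replay) => replay());
    return queue.length;
  }

  // The user declined: drop the held actions without ever running them.
  discard(purpose: string): number {
    const dropped = this.held.get(purpose)?.length ?? 0;
    this.held.delete(purpose);
    return dropped;
  }
}
```

Until `release` is called, none of the held callbacks run, so the work they represent never touches the main thread for users who do not consent.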

Figure: PageSpeed Insights results from a sample website, showing the impact of third-party code on load performance.

In the example above, we can see this on a website that uses a number of common tools like Adobe Tag Manager, Google Tag Manager, Facebook, Google Analytics, Adroll, and more. Many of these scripts can block the main thread for hundreds of milliseconds or even a few seconds.

When exploring commonly used websites on https://pagespeed.web.dev/, it’s not at all uncommon to see these blocking third-party requests take up multiple seconds of time on the main thread.

When Airgap.js initializes, it regulates each of these scripts, along with any requests the scripts send out, and may block many of them. What gets blocked depends on the user’s previous consent settings (if you save their preferences to localStorage or your backend, which is configurable) or, failing that, on the default consent settings for the region the user made the request from.

In summary, if the scripts Airgap.js blocks would have occupied the main thread for longer than it takes to download and initialize Airgap.js, then congratulations: respecting your user’s privacy actually made their web experience more performant. And we think that’s a pretty cool change compared to the dark patterns and nearly-unusable web many users have had to deal with for the past few years.

Section 4: Let Transcend performance test your website

We have a workflow that we can run against any URL with Airgap.js enabled. It runs hundreds of performance tests using Google Lighthouse and looks for statistically significant differences in page performance. Our team will gladly run it for you as you set up your bundle.

The overall methodology is described on this documentation page and a sample of what a report looks like can be found here.
