Building the Browser Extension
LLM Fine-tuning for Anti-Tracking in Web Browsers
Welcome to the fifth and final session of our course: LLM Fine-tuning for Anti-Tracking in Web Browsers.
In this series, we have explored how to leverage Large Language Models (LLMs) to improve online privacy by detecting and blocking trackers embedded in websites.
We began by understanding the limitations of traditional anti-tracking methods, moved into generating and preparing a rich dataset, fine-tuned a powerful yet efficient machine learning model, and built an API server to serve predictions in real time.
Now, we reach the final engineering milestone: building the browser extension that will act as the user-facing layer of our project.
Check out the video for a quick, hands-on demo. For more theory and background, read on.
1. Why Browser Extensions?
Browser extensions offer a unique ability: they can interact directly with the web pages users visit, monitor network requests, modify page behavior, and enhance user experience — all without requiring changes to the websites themselves.
For a privacy tool like ours, browser extensions are an ideal deployment method because:
They can intercept network requests before they are sent.
They operate locally on the user’s device, enhancing trust and minimizing external dependencies.
They provide a natural user interface via popups, notifications, or icons.
They can be easily distributed through browser extension stores (Chrome Web Store, Firefox Add-ons, etc.).
Thus, instead of creating a standalone app, integrating our LLM-powered anti-tracking model through a browser extension gives users instant and invisible protection as they browse the internet.
2. Anatomy of a Browser Extension
Before diving into code, it’s important to understand the components that together form a functioning browser extension.
Our extension will consist of the following key files:
manifest.json:
This is the configuration file that defines metadata about the extension, the permissions it needs, the background scripts it runs, and the user interface components it loads.
background.js:
A service worker that runs in the background and listens for web requests. It is responsible for intercepting outgoing HTTP/HTTPS requests and analyzing them for tracking behavior.
popup.html and popup.js:
These files create a simple popup interface that appears when the user clicks on the extension icon. It displays the number of trackers detected and provides other useful information.
popup-ui.js:
A supporting script that handles UI updates inside the popup window, such as displaying statistics or detailed logs.
images/:
A folder containing icons used for the extension badge and browser toolbar button.
Each of these files plays a specific and essential role, working together to create a seamless user experience.
3. Setting Up the Extension’s Manifest
The manifest.json file is the blueprint that the browser reads first when loading the extension.
In this file, we specify crucial details such as:
The extension’s name, version, and description.
The permissions it needs. For our project, we need:
webRequest and webRequestBlocking permissions to intercept network requests.
activeTab permission to access information about the current tab.
Access to all URLs (<all_urls>) because trackers can exist on any site.
The background script that should run continuously.
The popup interface that should appear when the extension icon is clicked.
Browsers enforce strict permission policies, so only explicitly declared permissions are allowed.
It’s important to request only what we need — too many permissions can raise security and privacy concerns for users.
Once the manifest is correctly written, the browser understands what our extension wants to do and how it behaves.
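The fields listed above can be sketched as follows. This is a minimal illustration, not the course's actual manifest: it assumes Manifest V3 (consistent with the background service worker described earlier), and the name, version, description and icon file names are placeholders. It is written as a small Node script that prints the JSON, so the output can be redirected into manifest.json. The sketch covers detection only and leaves out webRequestBlocking, because Chrome restricts that permission under Manifest V3.

```javascript
// build-manifest.js (sketch): prints a manifest; run `node build-manifest.js > manifest.json`.
// Assumes Manifest V3. Name, version, description and icon paths are placeholders.
const manifest = {
  manifest_version: 3,
  name: "LLM Tracker Detector",
  version: "0.1.0",
  description: "Flags tracking requests using a fine-tuned language model.",
  // webRequest lets the background script observe requests;
  // activeTab gives access to the current tab when the user opens the popup.
  permissions: ["webRequest", "activeTab"],
  // In Manifest V3, URL match patterns are declared separately from API permissions.
  host_permissions: ["<all_urls>"],
  // The background script is registered as a service worker.
  background: { service_worker: "background.js" },
  // The popup shown when the toolbar icon is clicked.
  action: {
    default_popup: "popup.html",
    default_icon: {
      "16": "images/icon16.png",
      "48": "images/icon48.png",
      "128": "images/icon128.png",
    },
  },
};

// manifest.json must be plain JSON, so serialize the object with two-space indentation.
const manifestJson = JSON.stringify(manifest, null, 2);
console.log(manifestJson);
```

Keeping the permission list this short follows the advice above: each extra entry is something the user has to trust.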
4. Intercepting Web Requests: The Role of Background.js
The heart of our real-time tracker detection lies in background.js.
This script runs invisibly in the background while the user browses, observing all outgoing HTTP and HTTPS requests.
Here’s the process flow:
Intercept the Request:
Using the chrome.webRequest.onBeforeRequest API, the background script listens for any network request initiated by the browser.
Extract the URL:
When a request is detected, the script extracts the destination URL from the request metadata.
Send URL to API Server:
The background script then makes a fetch request to our local Flask API server (http://localhost:5000/predict), sending the URL inside a JSON body.
Receive Prediction:
The API server responds with a prediction: whether the URL is classified as a tracker, and the confidence score.
Handle the Response:
If the response indicates the URL is a tracker, the background script records this event internally. It can increment a counter, log details for debugging, or trigger visual changes in the extension icon.
Optionally Block the Request:
While in our basic implementation we primarily detect trackers, it’s possible to extend this logic to block tracker requests altogether using the blocking capability of the webRequest API. Note that under Manifest V3, Chrome limits blocking webRequest listeners to policy-installed extensions; other extensions block requests through the declarativeNetRequest API instead.
By offloading the heavy prediction work to the API server, the background script remains lightweight and responsive, merely orchestrating the flow of data and decisions.
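The detection flow above can be sketched roughly as follows. This is an illustrative outline, not the course's actual background.js. The request body `{ url }` and the response fields `is_tracker` and `confidence` are assumptions about the API's JSON format, and `trackersByTab`, `classifyUrl` and `handleRequest` are hypothetical names. The listener is registered only when the `chrome` API is present, so the helper functions can also be exercised outside the browser.

```javascript
// background.js (sketch). Field and function names are assumed for illustration.
const API_URL = "http://localhost:5000/predict"; // endpoint named in the text

// Detections recorded per tab: tabId -> array of tracker URLs.
const trackersByTab = new Map();

// Ask the API server whether a URL is a tracker.
// Assumed request body: { url }. Assumed response: { is_tracker, confidence }.
async function classifyUrl(url, fetchImpl = fetch) {
  const response = await fetchImpl(API_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ url }),
  });
  if (!response.ok) {
    throw new Error(`Prediction request failed with status ${response.status}`);
  }
  return response.json();
}

// Handle one intercepted request: classify it and record it if it is a tracker.
async function handleRequest(details, fetchImpl = fetch) {
  // Skip our own calls to the API server so a prediction never triggers another one.
  if (details.url.startsWith(API_URL)) return null;
  try {
    const prediction = await classifyUrl(details.url, fetchImpl);
    if (prediction.is_tracker) {
      const found = trackersByTab.get(details.tabId) ?? [];
      found.push(details.url);
      trackersByTab.set(details.tabId, found);
    }
    return prediction;
  } catch (error) {
    // If the API server is unreachable, log and carry on rather than disturb browsing.
    console.warn("Tracker prediction unavailable:", error.message);
    return null;
  }
}

// Register the (non-blocking) listener only when running inside an extension.
if (typeof chrome !== "undefined" && chrome.webRequest) {
  chrome.webRequest.onBeforeRequest.addListener(
    (details) => {
      handleRequest(details);
    },
    { urls: ["<all_urls>"] }
  );
}
```

One design caveat: a service worker can be suspended when idle, so an in-memory Map like this may be reset between events. Persisting counts (for example with chrome.storage) would be more durable, at the cost of an extra permission.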
5. Building the User Interface: Popup.html and Popup.js
While background scripts handle invisible logic, users need a way to see what is happening. This is where the popup interface comes in.
When the user clicks on the extension’s toolbar icon, a small popup window opens, showing:
The number of trackers detected on the current page.
A list of detected tracker domains (optional for detailed mode).
A confidence level or other statistics (optional).
The popup.html file defines the static structure of this popup:
Titles, counters, list containers, etc.
The popup.js script (with the supporting popup-ui.js) handles dynamic behaviors:
Updating the tracker counter in real time.
Populating a list with tracker URLs detected during the current browsing session.
Resetting counters when navigating to a new page.
This simple but informative UI allows users to stay aware of their browsing environment without intruding on their experience.
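A rough sketch of the popup logic described above is shown here. It is an assumption-laden illustration rather than the course's popup.js: the element ids `tracker-count` and `tracker-list`, the `getTrackers` message type, and the idea that the background script replies with an array of tracker URLs are all hypothetical.

```javascript
// popup.js (sketch). Message type and element ids are assumed for illustration.

// Reduce a list of tracker URLs to a detection count and a sorted list of unique domains.
function summarizeTrackers(urls) {
  const domains = new Set();
  for (const url of urls) {
    try {
      domains.add(new URL(url).hostname);
    } catch {
      // Ignore entries that are not valid absolute URLs.
    }
  }
  return { count: urls.length, domains: [...domains].sort() };
}

// Write the summary into the popup.
// Expects elements with the ids "tracker-count" and "tracker-list" in popup.html.
function renderSummary(doc, summary) {
  doc.getElementById("tracker-count").textContent = String(summary.count);
  const list = doc.getElementById("tracker-list");
  list.replaceChildren(
    ...summary.domains.map((domain) => {
      const item = doc.createElement("li");
      item.textContent = domain;
      return item;
    })
  );
}

// Inside the extension popup: ask the background script for the active tab's trackers.
if (typeof chrome !== "undefined" && chrome.runtime && typeof document !== "undefined") {
  document.addEventListener("DOMContentLoaded", async () => {
    const [tab] = await chrome.tabs.query({ active: true, currentWindow: true });
    const urls = await chrome.runtime.sendMessage({ type: "getTrackers", tabId: tab.id });
    renderSummary(document, summarizeTrackers(urls ?? []));
  });
}
```

Separating the summary step from the rendering step keeps the DOM code trivial and lets the counting logic be checked without opening the browser.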
6. Loading and Testing the Extension
Once all the files are written (you can find them in the project’s GitHub repository), the extension must be loaded into the browser manually during development.
In Chrome, this involves:
Opening chrome://extensions/ in the address bar.
Enabling Developer Mode.
Clicking Load unpacked and selecting the directory containing our extension files.
Once loaded:
Navigate to different websites (e.g., Wikipedia, CNN).
Observe how the extension monitors network requests.
Watch the popup as it updates when trackers are detected.
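It is also important to open the extension’s developer tools (via “Inspect popup” for the popup and, for the background script, “Inspect background page” or the “service worker” link on chrome://extensions/, depending on the manifest version) to view debug logs, errors, or internal tracker statistics during development.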
By visiting known tracker-heavy websites and checking detection rates, we can validate that the end-to-end system — from interception to prediction to UI update — works correctly.
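Checking detection rates can also be done outside the browser. The sketch below is one possible approach under stated assumptions: it sends a list of known tracker URLs to the prediction endpoint and reports the fraction flagged. The `is_tracker` response field and the names `PREDICT_URL` and `detectionRate` are hypothetical, and the URLs in the usage comment are placeholders.

```javascript
// check-detection.js (sketch): measure how many known tracker URLs the API flags.
// Assumed response shape: { is_tracker: boolean }.
const PREDICT_URL = "http://localhost:5000/predict"; // endpoint named in the text

// Returns the fraction (0..1) of the given URLs that the API classifies as trackers.
async function detectionRate(knownTrackerUrls, fetchImpl = fetch) {
  let flagged = 0;
  for (const url of knownTrackerUrls) {
    const response = await fetchImpl(PREDICT_URL, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ url }),
    });
    const prediction = await response.json();
    if (prediction.is_tracker) flagged += 1;
  }
  // Avoid dividing by zero when the list is empty.
  return knownTrackerUrls.length === 0 ? 0 : flagged / knownTrackerUrls.length;
}

// Example usage with the API server running (placeholder URLs):
// detectionRate(["https://tracker.example/pixel.gif", "https://ads.example/collect"])
//   .then((rate) => console.log(`Detection rate: ${(rate * 100).toFixed(1)}%`));
```

A low rate on URLs you know to be trackers points at the model or the API, while a good rate combined with an empty popup points at the extension's interception or messaging.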
7. Packaging and Deployment Considerations
When the extension is ready for production, it must be packaged and optionally submitted to browser extension marketplaces.
Packaging typically involves:
Removing development logs.
Compressing the extension directory into a .zip file.
Submitting it to stores like the Chrome Web Store or Mozilla Add-ons, along with descriptions, screenshots, and compliance documentation.
If hosting the API server externally (instead of locally), it is important to:
Ensure HTTPS communication between extension and server.
Implement authentication or rate-limiting if needed.
Monitor server load and latency to maintain real-time UX.
Security and privacy are paramount, especially since our extension deals with user browsing data. Following best practices and transparent documentation will build user trust.
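For an externally hosted API, the HTTPS and authentication points above could be handled along these lines. This is a sketch under assumptions: the function names, the bearer-token scheme and the example host are illustrative and not part of the course code.

```javascript
// api-client.js (sketch). Function names and the bearer-token scheme are assumptions.

// Accept only HTTPS endpoints, with an exception for the local development server.
function assertSecureEndpoint(endpoint) {
  const parsed = new URL(endpoint);
  const isLocal = parsed.hostname === "localhost" || parsed.hostname === "127.0.0.1";
  if (parsed.protocol !== "https:" && !isLocal) {
    throw new Error(`Refusing to send browsing data to a non-HTTPS endpoint: ${parsed.href}`);
  }
  return parsed.href;
}

// Build the fetch options for one prediction request, including an auth header.
function buildPredictRequest(url, apiToken) {
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiToken}`,
    },
    body: JSON.stringify({ url }),
  };
}

// Example usage (placeholder host and token):
// const endpoint = assertSecureEndpoint("https://api.example.com/predict");
// fetch(endpoint, buildPredictRequest("https://tracker.example/pixel.gif", "my-token"));
```

Keep in mind that anything bundled inside an extension can be read by its users, so a single shared token offers little protection. Per-user tokens combined with server-side rate limiting are a more robust choice.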
8. Summary
In this session, we translated all our backend efforts into a tangible user-facing product: the browser extension that provides real-time, AI-powered tracker detection. We learned:
- the anatomy of an extension,
- the orchestration of network interception and model querying,
- the creation of a lightweight user interface,
- the practical steps for testing and deployment.
By now, you have built an end-to-end system that leverages modern machine learning to improve user privacy — from data generation to real-world protection.
Looking Back and Moving Forward
Congratulations on reaching the end of this comprehensive tutorial series!
Take a moment to reflect and congratulate yourself on what you’ve learned so far:
- You learned how to generate synthetic, labeled datasets tailored to real-world problems.
- You learned to fine-tune powerful Transformer models, leveraging pretraining and adapting general-purpose LLMs to specialized domains with efficiency and precision.
- You gained hands-on experience in serving machine learning models through lightweight, performant APIs using Flask.
- You learned the structure and functioning of browser extensions.
All of this resulted in the creation of a fully functional, real-time, anti-tracking extension that is served by a fine-tuned Large Language Model hosted on your own server.
Join the Leaderboard!
Did you beat our model’s performance in the model training and testing session? Join our community and see if you can reach the top of the Leaderboard. Simply send your scores to webmail@digitalmunich.com.