Functions of a Web Browser
To understand the privacy implications of a Web browser, it is first necessary to explore its basic functions. Modern browsers perform many complex tasks, and that complexity gives rise to multiple privacy vulnerabilities.
HTML Rendering and Layout
One of the main functions of a Web browser is the processing, layout, and rendering of Web content. The typical Web page consists of a document in the Hypertext Markup Language (HTML). This HTML document either embeds or links to the other content that makes up the page, including style rules in Cascading Style Sheets (CSS), scripts, images, videos, audio files, and so forth. To process a Web page, the browser must start with the HTML content.
Processing HTML follows the same general steps as processing the text of any other markup or programming language. First, lexical analysis breaks the HTML document into tokens. As part of this process, the browser code determines where each HTML tag starts and stops, among other things. Parsing then assembles those tokens into an internal representation of the document structure, typically a tree of elements, within the browser.
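The two steps described above can be sketched in a few lines of Python. This is only an illustration, not how any real browser engine is written: the standard library's html.parser module plays the role of the lexical analyzer, and the hypothetical Node and TreeBuilder classes (names invented for this example) assemble its tokens into a tree. Error recovery, which real browsers handle in great detail, is reduced here to ignoring unmatched closing tags.

```python
from html.parser import HTMLParser

# Elements that never take a closing tag (partial list, enough for the sketch).
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class Node:
    """One element or text node in a toy document tree."""

    def __init__(self, tag, attrs=None, text=None):
        self.tag = tag                    # element name, or "#text"
        self.attrs = dict(attrs or [])    # attribute name -> value
        self.text = text
        self.children = []


class TreeBuilder(HTMLParser):
    """Receives tokens from HTMLParser and assembles them into a tree."""

    def __init__(self):
        super().__init__()
        self.root = Node("#document")
        self.stack = [self.root]          # elements that are currently open

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(Node("#text", text=data.strip()))


def parse_html(source):
    """Tokenize and parse an HTML string, returning the root of the tree."""
    builder = TreeBuilder()
    builder.feed(source)
    builder.close()
    return builder.root


def outline(node, depth=0):
    """Return the tree as a list of indented lines, one per node."""
    label = node.text if node.tag == "#text" else node.tag
    lines = ["  " * depth + label]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    page = "<html><body><h1>Hi</h1><p>One<br>two</p></body></html>"
    print("\n".join(outline(parse_html(page))))
```

Printing the outline makes the nesting visible: the text nodes sit beneath the h1 and p elements that contain them, which is the structure the later layout stage works from.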
Once the browser has the internal structure of the document available in memory, it can perform layout and styling, deciding where each displayable element should be placed on the page. Visual adjustments, such as the application of colors, sizes, fonts, transformations, and other effects, are determined at this stage. Once the layout process is complete, the browser can render the result by drawing it to the screen. Rendering can be sped up through hardware acceleration, in which the graphics card performs part of the drawing work.
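To give a rough sense of what "deciding where each element should be placed" means, the following toy sketch stacks block-level boxes from top to bottom and records a position for each. Everything in it is an assumption made for illustration: the Box class and layout_blocks function are invented names, heights are supplied by hand rather than measured from text and fonts, and real CSS rules such as margin collapsing are ignored.

```python
from dataclasses import dataclass


@dataclass
class Box:
    name: str
    height: int       # content height in pixels (given, not measured)
    margin: int = 0   # vertical margin applied above and below the box


def layout_blocks(boxes, page_width):
    """Stack boxes top to bottom; return (name, x, y, width, height) tuples."""
    placed = []
    y = 0
    for box in boxes:
        y += box.margin                    # space above the box
        placed.append((box.name, 0, y, page_width, box.height))
        y += box.height + box.margin       # move past the box and its lower margin
    return placed


if __name__ == "__main__":
    page = [Box("h1", height=40, margin=10), Box("p", height=20, margin=5)]
    for rect in layout_blocks(page, page_width=800):
        print(rect)
```

The output of a stage like this is a list of rectangles, which is what a renderer needs in order to draw each element at the right place.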
Browser Engines
In reality, browsers are generally divided into several internal components. The part that processes HTML and performs the layout and rendering is called the browser engine. It is the central component of the browser, responsible for producing the visual output that we can see as a Web page. While a number of browser engines have been created over the years, three of them are widely used today.
The Blink[1] engine is a part of the Chromium project. It is the browser engine that powers Chromium, Google Chrome, Microsoft Edge, Brave, and a number of other browsers. Meanwhile, the Gecko[2] engine is the browser engine that powers Mozilla Firefox and its derivative browsers. Apple Safari makes use of the WebKit[3] engine, which was originally forked from the KHTML engine of the Konqueror browser from the KDE desktop project. Blink is, in turn, a fork of WebKit.
Connection Management
Apart from processing and rendering the Web content, the browser engine is responsible for initiating HTTP client requests to Web servers and receiving and handling HTTP responses. Connections from the browser engine can be initiated by the user as a result of explicitly opening a URL. Each time an HTML page is processed, additional requests are made by the browser engine to retrieve any content included in the Web page, such as images, scripts, and media files. JavaScript programs running in a page can also cause the browser engine to make new connections to obtain resources or transmit data.
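The follow-up requests described above come from references inside the HTML itself. As a minimal sketch, the hypothetical find_subresources function below scans a page for a few tag and attribute pairs and resolves each reference against the page's own URL, producing the list of addresses a browser would go on to request. The tag table is deliberately partial, the example URLs are placeholders, and the sketch treats every link element as a resource, whereas a real browser also checks what kind of link it is.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Tag -> attribute that names an external resource (partial list for the sketch).
RESOURCE_ATTRS = {
    "img": "src",
    "script": "src",
    "link": "href",
    "video": "src",
    "audio": "src",
    "source": "src",
}


class ResourceFinder(HTMLParser):
    """Collects absolute URLs of resources referenced by a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.urls = []

    def handle_starttag(self, tag, attrs):
        wanted = RESOURCE_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                # Relative references are resolved against the page URL.
                self.urls.append(urljoin(self.base_url, value))


def find_subresources(html, base_url):
    """Return the resource URLs a page at base_url would cause to be fetched."""
    finder = ResourceFinder(base_url)
    finder.feed(html)
    finder.close()
    return finder.urls


if __name__ == "__main__":
    page = (
        '<html><head><link rel="stylesheet" href="/css/site.css">'
        '<script src="app.js"></script></head>'
        '<body><img src="https://cdn.example.org/logo.png"></body></html>'
    )
    for url in find_subresources(page, "https://www.example.com/docs/index.html"):
        print(url)
```

Note that the image in the example lives on a different host from the page. Every such reference results in a request to that host, made without any further action by the user.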
As part of managing requests, the browser engine also needs to perform DNS lookups. The Domain Name System (DNS) is a global directory that maps human-readable names to IP addresses.[4] It is this system that allows us to remember a friendly name for a website instead of needing to remember an unfriendly numeric address. For example, www.coastal.edu is easier to remember than 199.120.21.79.
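A name lookup of this kind can be tried from Python, which hands the question to the operating system's resolver through socket.getaddrinfo. The resolve helper below is a hypothetical wrapper written for this example. Browsers may use their own resolver logic, so this shows the idea of a lookup rather than what any particular browser does.

```python
import socket


def resolve(hostname, port=443):
    """Return the distinct IP addresses the system resolver reports for hostname."""
    results = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    addresses = []
    for family, socktype, proto, canonname, sockaddr in results:
        ip = sockaddr[0]            # sockaddr starts with the address string
        if ip not in addresses:
            addresses.append(ip)
    return addresses


if __name__ == "__main__":
    # Requires network access; the answer depends on when and where it is run.
    print(resolve("www.coastal.edu"))
```

The addresses returned for a real hostname can change over time and may differ from the example address given in the text.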
JavaScript
Another function of the browser is to process and execute script instructions. Most of these instructions are provided in the form of JavaScript,[5] although a newer WebAssembly standard also exists.[6] Handling script code begins with the same general kinds of steps that are used for HTML: lexical analysis and parsing. Once the browser has an internal representation of the script in memory, it can interpret the script, carrying out each instruction as an action on the computer. Script performance can be improved with a Just-In-Time (JIT) compiler, which translates the JavaScript (or WebAssembly) instructions into native instructions for the computer’s processor while the script is running.
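The sequence of lexical analysis, parsing, and interpretation can be illustrated with a language far smaller than JavaScript. The sketch below handles only integer arithmetic with parentheses, and the function names tokenize, parse, and evaluate are chosen for this example. It is meant to show the shape of the three stages, not the design of a real JavaScript engine, and it does not attempt JIT compilation.

```python
import re

# A token is either a run of digits or a single non-space character.
TOKEN_RE = re.compile(r"\s*(?:(\d+)|(\S))")


def tokenize(source):
    """Lexical analysis: split source text into (kind, value) tokens."""
    tokens = []
    for number, symbol in TOKEN_RE.findall(source):
        if number:
            tokens.append(("num", int(number)))
        elif symbol in "+-*/()":
            tokens.append(("op", symbol))
        else:
            raise SyntaxError(f"unexpected character {symbol!r}")
    return tokens


def parse(tokens):
    """Parsing: build a nested-tuple syntax tree from the token list."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else ("end", None)

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expression():   # expression := term (("+" | "-") term)*
        node = term()
        while peek() in (("op", "+"), ("op", "-")):
            op = advance()[1]
            node = (op, node, term())
        return node

    def term():         # term := factor (("*" | "/") factor)*
        node = factor()
        while peek() in (("op", "*"), ("op", "/")):
            op = advance()[1]
            node = (op, node, factor())
        return node

    def factor():       # factor := number | "(" expression ")"
        kind, value = advance()
        if kind == "num":
            return value
        if (kind, value) == ("op", "("):
            node = expression()
            if advance() != ("op", ")"):
                raise SyntaxError("expected ')'")
            return node
        raise SyntaxError(f"unexpected token {value!r}")

    tree = expression()
    if peek() != ("end", None):
        raise SyntaxError("unexpected trailing input")
    return tree


def evaluate(tree):
    """Interpretation: walk the syntax tree and compute its value."""
    if isinstance(tree, int):
        return tree
    op, left, right = tree
    a, b = evaluate(left), evaluate(right)
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == "*":
        return a * b
    return a / b


if __name__ == "__main__":
    tree = parse(tokenize("2 + 3 * (4 - 1)"))
    print(tree, "=", evaluate(tree))
```

The syntax tree returned by parse plays the same role as the internal representation of a script: once it exists, the interpreter never needs to look at the original text again.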
As is the case with the browser engine, the code required to handle scripts properly is substantial and is split out into its own modular piece within the browser. This piece is called the JavaScript engine. There are three major JavaScript engines, each of which is paired with a corresponding major browser engine. The V8 JavaScript engine is developed by Google and is used with the Blink browser engine in Chromium, Chrome, Edge, Brave, and other browsers.[7] SpiderMonkey is the JavaScript engine used in the Firefox browser alongside the Gecko browser engine.[8] The WebKit browser engine in Apple Safari makes use of the JavaScriptCore component as its JavaScript engine.[9]
The Document Object Model
Since the browser is using two separate engines – one for handling the layout and one for processing JavaScript code – there needs to be a component to join these two pieces. The Document Object Model (DOM) is this component.[10] It connects the browser engine to the JavaScript engine, allowing JavaScript code to interact with HTML elements.
The DOM also exposes various Application Programming Interfaces (APIs) to JavaScript code. These APIs make different browser features available to websites.
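The bridging role of the DOM can be mimicked with a toy model in which two pieces of code share one tree of objects. In the sketch below, render_text stands in for the browser engine and script stands in for code run by the JavaScript engine. All class and function names are invented for the example. The methods get_element_by_id and text_content are loosely modeled on the real DOM's getElementById and textContent, but this is Python, not the actual DOM API.

```python
class Element:
    """A toy stand-in for a DOM element."""

    def __init__(self, tag, element_id=None, text=""):
        self.tag = tag
        self.id = element_id
        self.text_content = text
        self.children = []

    def append_child(self, child):
        self.children.append(child)
        return child


class Document:
    """The shared object through which script code reaches the page."""

    def __init__(self, root):
        self.root = root

    def get_element_by_id(self, element_id):
        """Depth-first search for the first element with the given id."""
        stack = [self.root]
        while stack:
            node = stack.pop()
            if node.id == element_id:
                return node
            stack.extend(reversed(node.children))
        return None


def render_text(node):
    """Stand-in for the rendering side: collect the text a user would see."""
    parts = [node.text_content] if node.text_content else []
    for child in node.children:
        parts.extend(render_text(child))
    return parts


def script(document):
    """Stand-in for page script code that changes the page via the shared tree."""
    document.get_element_by_id("greeting").text_content = "Hello, visitor"


if __name__ == "__main__":
    body = Element("body")
    body.append_child(Element("h1", "greeting", "Hello"))
    body.append_child(Element("p", text="Welcome."))
    doc = Document(body)
    print(render_text(doc.root))   # before the script runs
    script(doc)
    print(render_text(doc.root))   # after the script runs
```

Because both sides work on the same tree, a change made by the script shows up the next time the page is rendered, with no separate copying step between the two engines.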
Storing Data
Another common browser capability is the provision of local storage for Web-related content. Websites can ask the browser to store various pieces of information in the form of cookies and HTML Web Storage.[11] In addition to storage requested by websites, the user can also store information to disk through different browser functions. Files can be downloaded, for example. Most browsers also provide a way for users to save bookmarks (also called favorites) for pages found on the Web.
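A website asks the browser to store a cookie by sending a Set-Cookie header in its HTTP response. The sketch below uses Python's standard http.cookies module to pull such a header apart into the pieces a browser would keep. The parse_set_cookie helper and the example header value are invented for illustration, and SimpleCookie is a general-purpose parser rather than a model of any browser's cookie handling.

```python
from http.cookies import SimpleCookie


def parse_set_cookie(header_value):
    """Parse one Set-Cookie header value into name, value, and attributes."""
    jar = SimpleCookie()
    jar.load(header_value)
    parsed = {}
    for name, morsel in jar.items():
        # Keep only the attributes that were actually present in the header.
        attributes = {key: val for key, val in morsel.items() if val}
        parsed[name] = {"value": morsel.value, "attributes": attributes}
    return parsed


if __name__ == "__main__":
    header = "session_id=abc123; Path=/; Max-Age=3600; Secure; HttpOnly"
    print(parse_set_cookie(header))
```

The attributes control how long the browser keeps the value and when it is sent back. Here, Max-Age asks for the cookie to be kept for an hour, so it stays on the user's machine for that time unless it is cleared.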
Browsers also store data automatically, without the website or the user making a storage request. One common way such storage occurs is when the browser maintains a cache (pronounced the same way as “cash”) of the data it has received in HTTP responses. Cached data includes copies of HTML pages, images, media files, and other content that the browser has previously loaded. The idea behind the cache is to avoid downloading the same content again if the user visits the same site more than once, or if a site contains multiple pages that include some of the same files.
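The basic idea of a cache fits in a few lines. In the sketch below, the hypothetical SimpleCache class remembers each response by URL and counts how many times it had to go to the network. The fetch function is passed in so the example runs without a connection. Real browser caches are far more involved: they write to disk and follow expiry and validation rules set by HTTP headers, none of which is modeled here.

```python
class SimpleCache:
    """A toy in-memory cache keyed by URL."""

    def __init__(self, fetch):
        self._fetch = fetch          # callable: url -> response body
        self._entries = {}           # url -> stored response body
        self.network_requests = 0

    def get(self, url):
        """Return the body for url, downloading it only on first use."""
        if url not in self._entries:
            self.network_requests += 1
            self._entries[url] = self._fetch(url)
        return self._entries[url]

    def cached_urls(self):
        """List every URL for which a copy is stored."""
        return list(self._entries)


if __name__ == "__main__":
    def fake_fetch(url):
        # Stand-in for a real HTTP request.
        return f"<contents of {url}>"

    cache = SimpleCache(fake_fetch)
    cache.get("https://www.example.com/logo.png")
    cache.get("https://www.example.com/logo.png")   # served from the cache
    print(cache.network_requests, cache.cached_urls())
```

The cached_urls method also hints at the privacy cost: whatever the cache holds is, in effect, a list of content the user has loaded.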
Another way the browser stores data automatically is in the form of history. By default, most browsers maintain a record of all visited pages. A user can then look through the browser history to revisit a page that they have visited previously. While the history and cache are designed to improve the browsing experience, they are potential threats to user privacy. Cache or history data stored on a computer can potentially be recovered using forensic tools, exposing the user’s browsing activity.
References and Further Reading
[1] The Chromium Projects. “Blink (Rendering Engine).”
[2] MDN Web Docs Glossary. “Definitions of Web-related terms: Gecko.”
[4] Cloudflare. “What is DNS? | How DNS works.”
[5] Ecma International. “ECMA-262: ECMAScript 2022 language specification.” June 2022.
[6] WebAssembly Community Group. “WebAssembly Specification.”
[9] Apple Developer Documentation. “JavaScriptCore.”
[10] MDN Web Docs. “Document Object Model (DOM).”
[11] W3Schools. “HTML Web Storage API.”