Importance of the Web Browser
We use one or more Web browsers daily for accessing Web pages and applications. However, the browser is also a major component of some desktop applications, introducing potential privacy vulnerabilities in unexpected places.
The Browser is Unavoidable
Modern life depends on both websites and Web applications. We utilize Web-based services for our communications, banking, finance, shopping, and commerce. A connection to the Web is required for many work functions in a modern office, and then we use the same Web at home to unwind after a busy day. The Web browser is the program that we run on our computers and other devices to implement the client sides of websites and Web applications.
Web applications (and even some websites) have grown to imitate traditional computer programs. I’m old enough to remember using a computer before the days of the commercial Internet. Software applications were available to do many of the tasks that we would use a Web-based application to do today. For example, I had a program (purchased from a computer or office supply store) that could create driving directions from one address to another. It used a locally stored highway database to find a route, and then the directions could be sent to the printer for someone to use when driving the route. Today, we would use a Web-based service to perform the same steps, using a mobile phone with voice direction in the car instead of trying to look at printed paper.
Since 1995, Internet and Web technologies have advanced to the point where it is now possible to use a Web application instead of installing a program on the desktop. Thanks to client-side features like JavaScript, it is possible to enter content into a Web page and have it saved on the server immediately, without having to press a submit button. Parts of the page can be updated in real time without having to refresh the entire page. Such rich applications are made possible by the technologies that are built into our browsers.
As a consequence of the availability of these technologies, companies are increasingly offering Web-based products and services in lieu of traditional, offline applications. In the “before times,” one could purchase a piece of software from a computer store, take it home and install it, and then use that software indefinitely. Today, companies would rather have an ongoing source of revenue and sell a recurring subscription to use the software instead of only making money once. In most cases, the software runs remotely on the company’s own servers, which also store the files and records created as a result of using the software. If a user cancels the software subscription, they also lose access to their data.
This subscription-oriented approach is called Software as a Service (SaaS) and is a type of cloud computing. The user subscribes to an SaaS offering instead of purchasing a perpetual license for software outright. Some companies even offer SaaS products free of a dollar cost; Web-based email services, video sharing services, and even some “free” financial services are examples. Of course, these SaaS offerings are not actually free. The software has to run somewhere, and the servers on which it runs cost money to buy, maintain, and power. The developers, system administrators, support staff, and other employees expect to be paid in money for their work. Consequently, these “free” offerings are actually paid SaaS subscriptions. The payment is in the form of the personal information of the users.
Feature Feedback Loop
A consequence of rich SaaS applications is that users expect them to work properly, regardless of which browser they happen to be using. Every browser maker therefore has pressure to add support for features that enable rich content, such as JavaScript and other new technologies. Since JavaScript is available in all major browsers, website designers can count on its availability regardless of the richness of the site. Even pages containing only static content can depend on JavaScript to render properly. Since JavaScript works for this purpose, browser makers and standards organizations have little motivation to create declarative alternatives – things akin to HTML and CSS – that would enable rich interactions without the privacy risks of giving website designers an unrestricted programming language.
The result of this situation is that browsers have a feature feedback loop. As soon as a new piece of technology becomes available in a popular browser, website designers will start using that technology for better or for worse. User demands for site compatibility will spur the developers of competing browsers to support the same technology. The rapid pace of innovation in these areas means that relatively few of the developers (and more importantly, the product managers) of this software are taking the time to think about potential privacy issues ahead of time.
Each browser feature that is added to support new kinds of rich applications adds to the privacy attack surface of the entire browser. It is easy to see how this situation developed with JavaScript. While JavaScript is the language that enables rich content in the first place, it does so at the cost of user privacy. JavaScript code on a page can read and set cookies. It can trigger Web beacons based on how the user interacts with the page. JavaScript is the enabler of most browser fingerprinting tools and libraries. It can even interact with the Document Object Model to implement cross-device tracking using network and audio interfaces exposed by the browser. Moreover, debugging JavaScript performance problems is a major technical driver for adding telemetry functionality to browsers.
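To make the fingerprinting risk concrete, here is a minimal Python sketch of the general idea: a script gathers attributes that the browser exposes and hashes them into an identifier that stays the same across visits, without storing anything on the device. The attribute names and values are hypothetical, and real fingerprinting libraries run as JavaScript inside the browser and collect far more signals than this.

```python
import hashlib
import json
from typing import Dict


def fingerprint(attributes: Dict[str, str]) -> str:
    """Derive a stable identifier from browser-exposed attributes.

    The attribute names used below are illustrative assumptions, not the
    inputs of any particular fingerprinting library.
    """
    # Serialize with sorted keys so the same attributes always hash the same way.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical values a script could read from one browser.
visitor = {
    "user_agent": "ExampleBrowser/1.0",
    "screen": "1920x1080",
    "timezone": "America/New_York",
    "language": "en-US",
}

first_visit = fingerprint(visitor)
# No cookie is stored, yet a later visit produces the same identifier.
later_visit = fingerprint(dict(visitor))
# A browser with a different configuration gets a different identifier.
other_browser = fingerprint({**visitor, "screen": "1366x768"})
```

Because the identifier is recomputed from the browser's configuration on each visit, clearing cookies alone does not change it.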
Browser Settings
In an effort to provide a common default user experience, the major Web browsers permit most tracking code to run by default. While different browsers will block some tracking code out of the box, it is up to the user to enable stricter tracking prevention. Similarly, major browsers enable local data storage by default. Websites can set cookies and store other information, and the browser saves cache files and history data. As a consequence, the user can be tracked across browser sessions using the cookies, stored information, and even the cache data. In addition, the stored local data are forensically recoverable from the user’s computer or device.
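Tracking across sessions with a stored cookie can be sketched from the server's side in a few lines of Python. This is a simplified illustration, not the code of any real tracker: the cookie name `visitor_id` and the function `handle_request` are hypothetical, and the one-year lifetime is an arbitrary assumption.

```python
import secrets
from http.cookies import SimpleCookie
from typing import Optional, Tuple

COOKIE_NAME = "visitor_id"  # hypothetical cookie name


def handle_request(cookie_header: str) -> Tuple[str, Optional[str]]:
    """Return (visitor id, Set-Cookie value or None) for one request."""
    jar = SimpleCookie()
    jar.load(cookie_header)
    if COOKIE_NAME in jar:
        # Returning visitor: the browser sent back the stored identifier.
        return jar[COOKIE_NAME].value, None

    # New visitor: issue a random identifier that outlives the session.
    visitor_id = secrets.token_hex(8)
    reply = SimpleCookie()
    reply[COOKIE_NAME] = visitor_id
    reply[COOKIE_NAME]["max-age"] = 365 * 24 * 60 * 60  # keep for one year
    return visitor_id, reply[COOKIE_NAME].OutputString()


# First session: no cookie yet, so the server sets one.
first_id, set_cookie = handle_request("")
# Later session: the browser replays the stored cookie and is recognized.
second_id, nothing_new = handle_request(f"{COOKIE_NAME}={first_id}")
```

The second request is recognized only because the browser kept the cookie between sessions, which is the default behavior described above.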
As of early April 2023, all the major Web browsers have telemetry enabled out of the box, which lets the browser makers snoop on users’ activities. Additionally, many of the most popular browsers also support some kind of synchronization tool that lets the user share browsing information across different devices. While users may find it convenient to have access to the same browser history and bookmarks on their phones and computers, these synchronization services are typically backed by some sort of cloud system and could provide a way to implement cross-device tracking.
Out of the box, most browsers are also configured to use a search engine that collects considerable personal information about its users. Search suggestions are normally enabled by default, so anything the user types into the address bar is being sent to the search company, even if the user already knows where they want to go. The decision to default to a given search provider may be the result of a marketing agreement between the browser vendor and the search provider; it may not be a carefully considered decision to do what is in the user’s best privacy interests.
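The effect of search suggestions can be illustrated with a short Python sketch that lists the requests generated while a user types an address. The endpoint `https://search.example/suggest` and the `q` parameter are hypothetical, and the sketch assumes one request per keystroke; actual browsers may batch or delay these requests.

```python
from typing import List
from urllib.parse import urlencode

# Hypothetical suggestion endpoint; real providers use their own URLs.
SUGGEST_URL = "https://search.example/suggest"


def suggestion_requests(typed: str) -> List[str]:
    """List the request URLs produced as the text is typed one character at a time."""
    urls = []
    for n in range(1, len(typed) + 1):
        # Each prefix of the typed text becomes its own query.
        query = urlencode({"q": typed[:n]})
        urls.append(f"{SUGGEST_URL}?{query}")
    return urls


# The user types a known address, yet every prefix is still transmitted.
requests_sent = suggestion_requests("mybank.example")
```

Under these assumptions, the search provider receives the complete address in the final request even though the user never ran a search.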
Economics of Browser Development
Modern Web browsers are complex pieces of software with a number of moving parts. There are at least two separate engines within the browser: one for processing HTML, layout, and rendering; and the other for executing JavaScript and other kinds of programming. Each browser adds its own additional features on top of the engines, adding to the complexity of the software. This complexity, in turn, increases the costs of developing, testing, and supporting the browser.
We need only consider the most popular browser engine of today (April 2023) to see how development economics lead to features that might be questionable for privacy. The Blink engine is developed by Google as part of the Chromium project. Google is an advertising technology company that generates most of its revenue from selling and targeting online advertisements. It is perhaps therefore not surprising that the combination of the Blink rendering engine and the V8 JavaScript engine supports a variety of Application Programming Interfaces (APIs) that the Mozilla Foundation considers harmful for privacy and security.1
While it is true that Mozilla’s own browser (Firefox) and engines (Gecko and SpiderMonkey) do not implement some of these concerning APIs, the development of the Firefox browser is nevertheless tied to advertising. In particular, it has been reported that the Mozilla Foundation has received around $450,000,000 per year from Google to keep Google Search as the default search engine for Firefox.2 Since even Firefox, which Mozilla has positioned as a browser with greater privacy than the other mainstream choices, depends on this revenue, it is not unreasonable to conclude that the entire browser development industry is ultimately driven by advertising and tracking interests.
With the exception of truly open source, non-commercial, niche browsers, it is in the browser vendor’s financial interest to make the browser capable of tracking the user. Even for browsers that have privacy features, the default settings as shipped normally disable or limit these features, ostensibly to avoid breaking poorly designed websites. However, the browser vendors often have a financial motivation as well.
Desktop Applications
It might be tempting to believe that we could somehow turn back the clock and replace a Web-based SaaS application with an offline desktop application as a way to improve privacy. Unfortunately, this is not the case, since even a special-purpose desktop application can be a browser in disguise. For example, the Electron framework enables desktop applications to be written in HTML, CSS, and JavaScript.3 Since Electron embeds Chromium (in other words, the Blink and V8 engines), applications built this way are simply browsers that are designed to visit a restricted set of pages. As the application vendor has complete control over the resulting application, there might be few to no privacy controls available to the user compared to those available in a general-purpose browser.
Even if a Web framework isn’t used to create a desktop application, some of the bad development habits from the browser space can make their way into other programs. For example, desktop applications can include libraries that enable telemetry. Applications can directly make Web requests and perhaps support some Web tracking features (such as cookies) even if no browser engine is embedded. In some ways, choosing Web applications that can be contained within a configurable browser might be a better choice for user privacy compared to using a desktop application that lacks privacy controls.
References and Further Reading
- Alexander Maxhem. “Google Is Paying Mozilla $450M Per Year To Be The Default Search Engine On Firefox.” Android Headlines. August 14, 2020.