Artificial Intelligence
Artificial Intelligence (AI) has the potential to enhance surveillance and compound privacy threats in ways that are not yet fully understood. AI algorithms can draw inferences from indicators far more quickly than a human analyst, enabling faster and more thorough abuses of personal data. At the same time, AI algorithms are prone to hallucinating incorrect information, which can lead to harmful and incorrect conclusions. In this lesson, we’ll look at some of the issues surrounding AI and its related emerging threats.
Surveillance and AI
Surveillance companies collect a lot of data. For example, Team Cymru collects traffic data covering about 90% of the Internet, which it then sells to private companies and government agencies, including the United States military[1] and the FBI.[2] Another company, HYAS, buys location data collected from mobile phone apps to track individuals in the name of cybersecurity.[3] These data sets are immense, and processing them is difficult.
Enter Artificial Intelligence. Using machine learning algorithms, it is possible to take a data set like the one collected by Team Cymru and perform traffic analysis on it. Although the data flows themselves might be encrypted, the sizes and timings of the flows can still be determined. Based on these points of communications metadata (data about the traffic rather than its contents), it is possible for an adversary to perform website fingerprinting to determine what sites and pages a user is accessing, even if those accesses occur over an encrypted communications channel like a Virtual Private Network (VPN) or The Onion Router (Tor).[4] Unless countermeasures are developed against this type of analysis, surveillance economy participants may be able to use AI to identify even privacy-conscious users based solely upon their data flows.[5]
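To make the threat concrete, here is a minimal sketch of a website-fingerprinting classifier. The flow features and site labels below are synthetic stand-ins invented for illustration, far simpler than what real attacks or the cited research use. The point is that a standard classifier can separate destinations from metadata alone, without decrypting a single byte.

```python
# Sketch: website fingerprinting from traffic metadata alone.
# Synthetic data stands in for captured flows; the features and
# site labels are illustrative, not from any cited study.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

def synthetic_flow(site_id: int) -> np.ndarray:
    """Simulate per-flow metadata for one page load of a given site.

    Even with payloads encrypted, an on-path observer can still
    measure packet counts, bytes per direction, and timing gaps.
    """
    base = (site_id + 1) * 10
    return np.array([
        rng.normal(base * 50, 100),    # total bytes sent
        rng.normal(base * 400, 800),   # total bytes received
        rng.normal(base, 5),           # number of packets
        rng.normal(base * 0.01, 0.05), # mean inter-packet gap (s)
    ])

# Build a labeled dataset: 200 simulated visits to each of 5 sites.
X = np.array([synthetic_flow(s) for s in range(5) for _ in range(200)])
y = np.repeat(np.arange(5), 200)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
clf = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# High accuracy here, despite never seeing any plaintext.
print(f"Fingerprinting accuracy: {clf.score(X_test, y_test):.0%}")
```

Defenses such as those explored in the Maybenot framework aim to make exactly these size and timing features less informative to an observer.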
AI-Powered Inferences
A second major threat posed by Artificial Intelligence is that AI algorithms can be used to make inferences about people based upon their past or current behavior. Worse, a machine learning system can improve these inferences using aggregated data that it infers is about the same person (or about similar people).[6] These inferences are not always correct, as illustrated when AI-powered facial recognition software identifies the wrong suspect and leads to the arrest of an innocent person.[7]
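As a toy illustration of how such an inference works, the sketch below trains a model on two hypothetical, seemingly innocuous signals that happen to correlate with a sensitive attribute. All names, numbers, and correlations here are invented; the takeaway is that the sensitive attribute never has to be collected directly for a model to recover it.

```python
# Sketch: inferring a sensitive attribute from innocuous signals.
# All data is synthetic and all feature names are hypothetical;
# the point is only that correlated behavior leaks information.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 5000

# Hypothetical "harmless" signals a data aggregator might hold.
late_night_activity = rng.normal(0.3, 0.2, n)
pharmacy_visits = rng.poisson(1, n)

# Unobserved sensitive attribute, correlated with the signals above.
sensitive = (late_night_activity + 0.2 * pharmacy_visits
             + rng.normal(0, 0.2, n)) > 0.6

X = np.column_stack([late_night_activity, pharmacy_visits])
model = LogisticRegression().fit(X, sensitive)

# The model recovers the attribute far better than chance, even
# though that attribute was never collected directly.
print(f"Inference accuracy: {model.score(X, sensitive):.0%}")
print(f"Base rate: {max(sensitive.mean(), 1 - sensitive.mean()):.0%}")
```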
Another risk posed by AI-powered inferences is that they can be at least as biased as, if not more biased than, a human looking at the same data. For example, facial recognition algorithms have been found to misidentify Black and Asian people 10 to 100 times more often than white people, likely due to the use of biased training sets in the machine learning process. These biases have real-world impacts: police departments relying on these AI tools have arrested the wrong people.[8]
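The disparity itself is straightforward to measure, which is part of why it is well documented. The short sketch below computes per-group false match rates from invented counts (not figures from any real evaluation) to show how a system can look accurate in aggregate while concentrating its errors on one group.

```python
# Sketch: measuring demographic disparity in face-matching errors.
# All counts below are invented solely to illustrate the
# calculation; they are not drawn from any actual evaluation.
match_results = {
    # group: (false matches, total non-matching comparisons)
    "group_a": (4, 10_000),
    "group_b": (310, 10_000),
}

rates = {group: fm / total for group, (fm, total) in match_results.items()}
for group, rate in rates.items():
    print(f"{group}: false match rate = {rate:.2%}")

# The overall error rate looks small, yet one group bears nearly
# all of the false matches, and thus nearly all the wrongful stops.
overall = sum(fm for fm, _ in match_results.values()) / sum(
    total for _, total in match_results.values())
print(f"overall false match rate = {overall:.2%}")
print(f"disparity: {rates['group_b'] / rates['group_a']:.0f}x")
```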
Notes and References
1. Joseph Cox. “Revealed: US Military Bought Mass Monitoring Tool That Includes Internet Browsing, Email Data.” Vice. September 21, 2022.
2. Joseph Cox. “Here is the FBI’s Contract to Buy Mass Internet Data.” Vice. March 27, 2023.
3. Joseph Cox. “Private Intel Firm Buys Location Data to Track People to their ‘Doorstep’.” Vice. September 2, 2020.
4. Tobias Pulls and Ethan Witwer. “Maybenot: A Framework for Traffic Analysis Defenses.” Proceedings of the 22nd Workshop on Privacy in the Electronic Society (WPES ’23). Copenhagen, Denmark: November 26, 2023.
5. Mullvad. “Introducing Defense against AI-guided Traffic Analysis (DAITA).” May 7, 2024.
6. Alicia Solow-Niederman. “Information Privacy and the Inference Economy.” Northwestern University Law Review 117(2): 357-424.
7. Kashmir Hill. “Your Face Is Not Your Own.” The New York Times Magazine. March 18, 2021.
8. Christina Swarns. “When Artificial Intelligence Gets It Wrong.” The Innocence Project. September 19, 2023.