The WIRED Guide to Data Breaches

Everything you need to know about the past, present, and future of data security—from Equifax to Yahoo—and the problem with Social Security numbers.

It seems like every week there’s another massive corporate security breach that exposes your personal data. Names, email addresses, passwords, Social Security numbers, dates of birth, credit card numbers, banking data, passport numbers, phone numbers, home addresses, driver’s license numbers, and medical records all get swept up by shadowy, amorphous hackers for fraud, identity theft, nation-state surveillance, and more. Sometimes the affected company will send you an email or letter suggesting that you change a password or credit card number, but for the most part, these incidents are invisible, until they aren’t.

Think of consumer data breaches as coming in two flavors: breaches of institutions that people choose to entrust with their data—like retailers and banks—and breaches of entities that acquired user data secondarily—like credit bureaus and marketing firms. Unfortunately, you can’t keep your information perfectly safe: It is often impossible to avoid sharing data, especially with organizations like governments and health insurers. Furthermore, in cases where a company or institution gives your information to an additional party, you’ve often agreed to share more data than you realize by clicking "I accept" on a dense user agreement.

Many of these incidents don’t necessarily even involve hackers. Data “exposures” occur when information that should have been locked down was accessible, but it’s unclear if anyone actually stole it.

Even after a data breach has occurred, though, and an unauthorized actor definitely has your data, you won’t necessarily see an immediate negative impact. Hackers who steal a trove of login credentials, for example, may quietly use them for under-the-radar crime sprees instead of selling or publishing the data. As a result, the repercussions of a breach can be very delayed, sometimes not fully manifesting for years.

Attackers tend to capitalize on certain types of data right away, namely financial information like credit card numbers. But other troves of data disappear into the criminal ecosystem and become a sort of ticking time bomb as personal details are combined and recombined with other stolen information. Victims of identity theft know the consequences of data breaches intimately and painfully. They may have their credit wrecked by thieves, lose all their money, or be dogged for years by a shadow hand meddling in their affairs and opening digital accounts in their name.

The problem is so abstract and far-reaching that you would be forgiven for feeling that it’s not worth grappling with at all. Unfortunately for victims, there is no such thing as perfect security, and no way to eliminate all data breaches. But massive institutional breaches don’t need to happen as often as they do. Many occur not because of complex and sophisticated hacking but because organizations have made basic and potentially avoidable mistakes in implementing their security schemes. These exposures are low-hanging fruit for hackers to pluck.

Yes, it’s a difficult, never-ending process for a large organization to secure its inevitably sprawling networks, but for decades many institutions just haven’t really tried. They’ve gone through some of the motions without actually making digital security a spending priority. Over the past 10 years, however, as corporate and government data breaches have ramped up—impacting the data of billions of people—institutional leaders and the general public alike have finally begun to understand the urgency and necessity of putting security first. Additionally, as ransomware attacks have evolved beyond encrypting a target's systems and demanding a ransom to include data theft and extortion, institutions have had additional incentives to bolster their digital defenses. This increased focus is beginning to translate into some concrete data protections and security improvements. But collective inaction for decades has created a security deficit that will take significant time and money to make up. And the reality that robust digital security requires never-ending investment is difficult for institutions to accept.

The History of Data Breaches

Data breaches have been increasingly common and harmful for decades. A few stand out, though, as instructive examples of how breaches have evolved, how attackers are able to orchestrate these attacks, what can be stolen, and what happens to data once a breach has occurred.

Digital data breaches started long before widespread use of the internet, yet they were similar in many respects to the leaks we see today. One early landmark incident occurred in 1984, when the credit reporting agency TRW Information Systems (now Experian) realized that one of its database files had been breached. The trove was protected by a numeric passcode that someone lifted from an administrative note at a Sears store and posted on an “electronic bulletin board,” a sort of rudimentary Google Doc that people could access and alter using their landline phone connection. From there, anyone who knew how to view the bulletin board could have used the password to access the data stored in the TRW file: personal data and credit histories of 90 million Americans. The password was exposed for a month. At the time, TRW said that it changed the database password as soon as it found out about the situation. Though the incident is dwarfed by the 2017 breach of the credit reporting agency Equifax (discussed below), the TRW lapse was a warning to data firms everywhere, one that many clearly didn’t heed.

Large-scale breaches like the TRW incident occurred sporadically as years went by and the internet matured. By the early 2010s, as mobile devices and the Internet of Things greatly expanded interconnectivity, the problem of data breaches became especially urgent. Stealing username/password pairs or credit card numbers—even breaching a trove of data aggregated from already public sources—could give attackers the keys to someone’s entire online life. And certain breaches in particular helped fuel a growing dark web economy of stolen user data.

One of these incidents was a breach of LinkedIn in 2012 that initially seemed to expose 6.5 million passwords. The data was hashed, or cryptographically scrambled, as a protection to make it unintelligible and therefore difficult to reuse, but hackers quickly started “cracking” the hashes to expose LinkedIn users’ actual passwords. Though LinkedIn itself took precautions to reset impacted account passwords, attackers still got plenty of mileage out of them by finding other accounts around the web where users had reused the same password. That all too common lax password hygiene means a single breach can haunt users for years.
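
To make the "cracking" step concrete, here is a minimal Python sketch of a dictionary attack against fast, unsalted hashes, set beside a salted and deliberately slow alternative. The word list, the sample passwords, the function names, and the iteration count are all illustrative assumptions; nothing here describes LinkedIn's actual systems or the attackers' real tooling.

```python
import hashlib
import hmac
import os
from typing import Dict, Iterable, Optional, Set, Tuple

def unsalted_sha1(password: str) -> str:
    """Fast, unsalted hash: the same password always yields the same digest."""
    return hashlib.sha1(password.encode("utf-8")).hexdigest()

def crack(leaked_hashes: Set[str], wordlist: Iterable[str]) -> Dict[str, str]:
    """Dictionary attack: hash each guess once and look it up in the leaked set."""
    found = {}
    for guess in wordlist:
        digest = unsalted_sha1(guess)
        if digest in leaked_hashes:
            found[digest] = guess
    return found

def slow_salted_hash(password: str, salt: Optional[bytes] = None,
                     iterations: int = 200_000) -> Tuple[bytes, bytes]:
    """Random per-user salt plus PBKDF2, which makes every guess costly."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest

def verify(password: str, salt: bytes, expected: bytes,
           iterations: int = 200_000) -> bool:
    """Recompute the salted hash and compare in constant time."""
    _, digest = slow_salted_hash(password, salt, iterations)
    return hmac.compare_digest(digest, expected)

# Hypothetical leak containing one weak and one strong password.
leaked = {unsalted_sha1("linkedin123"), unsalted_sha1("correct horse battery staple")}
recovered = crack(leaked, ["password", "123456", "linkedin123"])
```

In this sketch only the weak password falls to the short word list. With a per-user salt, an attacker can no longer hash each guess once and match it against every account at the same time, and the high iteration count slows each individual guess.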

The LinkedIn hack also turned out to be even worse than it first appeared. In 2016 a hacker known as “Peace” started selling account information, particularly email addresses and passwords, from 117 million LinkedIn users. Data stolen from the LinkedIn breach has been repurposed and re-sold by criminals ever since, and attackers still have some success exploiting the data to this day, since so many people reuse the same passwords across numerous accounts for years.

Data breaches didn’t truly become dinner table fodder, though, until the end of 2013 and 2014, when major retailers Target, Neiman Marcus, and Home Depot suffered massive breaches one after the other. The Target hack, first publicly disclosed in December 2013, impacted the personal information (like names, addresses, phone numbers, and email addresses) of 70 million Americans and compromised 40 million credit card numbers. Just a few weeks later, in January 2014, Neiman Marcus admitted that its point-of-sale systems had been hit by the same malware that infected Target, exposing the information of about 110 million Neiman Marcus customers, along with 1.1 million credit and debit card numbers. Then, after months of fallout from those two breaches, Home Depot announced in September 2014 that hackers had stolen 56 million credit and debit card numbers from its systems by installing malware on the company’s payment terminals.

An even more devastating and sinister attack was taking place at the same time, though. The Office of Personnel Management is the administrative and HR department for US government employees. The department manages security clearances, conducts background checks, and keeps records on every past and present federal employee. If you want to know what’s going on inside the US government, this is the department to hack. So China did.

Hackers linked to the Chinese government infiltrated OPM’s network twice, first stealing the technical blueprints for the network in 2013, then initiating a second attack shortly thereafter in which they gained control of the administrative server that managed the authentication for all other server logins. In other words, by the time OPM fully realized what had happened and acted to remove the intruders in 2015, the hackers had been able to steal tens of millions of detailed records about every aspect of federal employees’ lives, including 21.5 million Social Security numbers and 5.6 million fingerprint records. In some cases, victims weren’t even federal employees, but were simply connected in some way to government workers who had undergone background checks. (Those checks include all sorts of extremely specific information, like maps of a subject’s family, friends, associates, and children.)

Pilfered OPM data never circulated online or showed up on the black market, likely because it was stolen for its intelligence value rather than its street value. Reports indicated that Chinese operatives may have used the information to supplement a database cataloging US citizens and government activity.

Today, data breaches are so common that the cybersecurity industry even has a phrase—“breach fatigue”—to describe the indifference that can come from such an overwhelming and seemingly hopeless string of events. And while tech companies, not to mention regulators, are starting to take data protection more seriously, the industry has yet to turn the corner. In fact, some of the most disheartening breaches yet have been disclosed in recent years.

Yahoo lodged repeated contenders for the distinction of all-time biggest data breach when it made an extraordinary series of announcements beginning in September 2016. First, the company disclosed that an intrusion in 2014 compromised personal information from 500 million user accounts. Then, two months later, Yahoo added that it had suffered a separate breach in August 2013 that exposed a billion accounts. Sounds like a pretty unassailable lead in the race to the data-breach bottom, right? And yet! In October 2017, the company said that after further investigation it was revising its estimate of 1 billion accounts to 3 billion—or every Yahoo account that existed in August 2013.

Incredibly, Yahoo isn't the only tech giant that's dealt with an absurd string of data breaches. In January 2023, for example, the mobile telecom giant T-Mobile said that it had suffered a data breach beginning in November 2022 that impacted 37 million current customers, exposing information like names, email addresses, phone numbers, billing addresses, dates of birth, account numbers, and service plan details. The breach was sizable but might not have been so noteworthy if T-Mobile hadn't developed a reputation over the past decade for data breaches. The company had a mega breach in 2021, two breaches in 2020, one in 2019, and another in 2018. T-Mobile is one of the largest mobile carriers in the US and is estimated to have more than 100 million customers.

The severity and impact of these incidents is related not just to how frequent they are and how many people they affect, but to the nature of the stolen data. For example, the credit monitoring firm Equifax is notorious for disclosing a massive breach in September 2017 that exposed personal information for 147.9 million people, most of whom were living in the United States. The data included birth dates, addresses, some driver’s license numbers, about 209,000 credit card numbers, and Social Security numbers—meaning that almost half the US population potentially had their crucial secret identifier exposed. Because the information stolen from Equifax was so sensitive, it's widely considered the worst corporate data breach ever. At least for now. Equifax also completely mishandled its public disclosure and response in the aftermath.

There have since been numerous indications that Equifax had a dangerously lax security culture and lack of response procedures in place. Former Equifax CEO Richard Smith told Congress in October 2017 that he usually only met with security and IT representatives once a quarter to review the company's security posture. And hackers got into Equifax's systems for the breach through a known web framework vulnerability for which a patch had been available for months. A digital platform used by Equifax employees in Argentina was even protected by the ultra-guessable credentials "admin, admin"—a truly rookie mistake.
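
Both lapses described above, a default login and a missed patch, are the kind of thing a routine automated audit can flag. The Python sketch below is a rough illustration under stated assumptions: the credential list, the function names, and the version numbers are hypothetical and do not reflect Equifax's systems or any real product's release history.

```python
# Hypothetical sample of well-known default logins; real audits use far longer lists.
DEFAULT_CREDENTIALS = {("admin", "admin"), ("admin", "password"), ("root", "root")}

def is_default_login(username: str, password: str) -> bool:
    """Flag a username/password pair that matches a known factory default."""
    return (username.lower(), password) in DEFAULT_CREDENTIALS

def parse_version(version: str) -> tuple:
    """Turn '2.3.10' into (2, 3, 10) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))

def needs_patch(installed: str, first_fixed: str) -> bool:
    """True when the installed version predates the first patched release."""
    return parse_version(installed) < parse_version(first_fixed)

# Made-up version numbers: a plain string comparison would wrongly rank
# "2.3.5" above "2.3.10", which is why the parts are compared as integers.
audit = {
    "default_login": is_default_login("Admin", "admin"),
    "unpatched": needs_patch("2.3.5", "2.3.10"),
}
```

A check like this proves little on its own, but running it regularly against an inventory of systems is one inexpensive way to catch the basic mistakes before an attacker does.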

In India, the government identification database Aadhaar stores personal information, biometrics, and a 12-digit identification number for more than 1 billion Indian citizens, and it is incorporated by both the government and private companies into a range of foundational digital services. These interconnections, though, have led to numerous major breaches of Aadhaar data from both third parties and the Indian government itself. In fact, researchers estimate that all Aadhaar numbers and other corresponding data have been breached at some point, and the stolen data is actively abused for criminal scamming.

If any good has come from these massive, critical breaches, it's that the sheer severity has increasingly been a wake-up call for institutions around the world. On the other hand, the frequency of successful attacks doesn’t seem to have abated. And data aggregators like Equifax, which pull in an enormous amount of public and private information from myriad sources, have become single points of failure of the digital age. More and more often, attackers target data analytics companies or digital services that are incorporated into other products and networks as a one-stop shop for valuable information.

In short, the security breach train rolls on. The global hotel chain Marriott is another example of a repeat target. The company notably suffered a massive breach in 2014 that compromised data on nearly 340 million customers globally, including hundreds of millions of passport numbers. The company didn't discover the incident until 2018, though. Ultimately, Marriott was forced to pay fines over the incident, including one for about $24 million from the United Kingdom's Information Commissioner’s Office. As if that wasn't enough, though, the hotelier had a breach in January 2020 that impacted data of more than 5 million customers and another one in June 2022 in which hackers claimed to have stolen 20 gigabytes of Marriott data, including customer credit card details.

The Future of Data Breaches

Attackers are able to perpetrate most current data breaches relatively easily by exploiting an institution’s basic security oversights. If businesses and other institutions learned from these organizations’ mistakes, there could be a real reduction in the number of data breaches that occur overall. But improvement doesn’t come from making breaches impossible. The best mitigations come from accepting the possibility of a breach and significantly raising the barrier to entry or the resources required to carry one off. This way unskilled hackers or those who are just idly poking around won't be able to find as many blatant vulnerabilities to easily exploit.

An important concept in security, though, is the idea of the cat and mouse game. For determined, motivated, and well-resourced attackers, improved defenses spur malicious innovation. This is why security is an endless expense that institutions often try to minimize, cap, or avoid altogether—defenders need to think of everything, while attackers only need to find one small mistake. An unpatched web server or an employee falling for a phishing scam can be all it takes.

That’s also why some of the most groundbreaking examples of next-generation hacking come from targeted attacks to surveil high-profile individuals and groups—often political candidates, dissidents, activists, or spies attempting to infiltrate each others’ organizations. Hackers working to carry out these types of high-priority attacks will develop or pay large sums of money for so-called zero-day exploits to weaponize against their targets. These consist of two parts: information about an undisclosed vulnerability in a system, and software that is programmed to take advantage of that flaw to give some type of increased system access or control to whoever deploys the exploit. A software developer can’t defend a vulnerability they don’t know about, so zero-day exploits push the limits of what’s possible for attackers by giving them a secret path into a network or database.

More attackers may be forced to use zero-day exploits to carry out future breaches—increasing the resources required—if businesses, governments, and other institutions succeed in substantially improving their baseline cybersecurity postures through initiatives like consistent patching and network access control. But for now, enough easy targets remain that criminal attackers often don’t need to work very hard or spend a lot of money to perpetrate massive data breaches. Even just using publicly available internet scanning tools can reveal unprotected devices and databases where valuable information is tantalizingly exposed.
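
The scanning idea works for defenders too. An organization can probe its own machines for services that answer from the outside when they shouldn't. The Python sketch below is a minimal illustration, not a substitute for a real scanner. The function names and the example address are assumptions, the port numbers are the usual defaults for those databases, and it should only ever be pointed at hosts you own or are authorized to test.

```python
import socket
from typing import Dict, List

def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0

def exposed_services(host: str, ports: Dict[int, str],
                     timeout: float = 1.0) -> List[str]:
    """List the named services on a host that accept outside connections."""
    return [name for port, name in ports.items()
            if is_port_open(host, port, timeout)]

# Default ports for a few databases that often turn up in exposure reports.
COMMON_DATABASE_PORTS = {27017: "MongoDB", 9200: "Elasticsearch", 6379: "Redis"}

# Example usage against a machine you control (address is a placeholder):
# print(exposed_services("192.0.2.10", COMMON_DATABASE_PORTS))
```

An open port is not a breach in itself, but a database that accepts connections from the public internet without authentication is exactly the kind of easy target described above.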

As one protection for their citizens, governments should offer purpose-built universal identity schemes that incorporate numerous, diverse authenticators. That way, even if hackers compromise one piece of information, people can still regain control of their identities through other avenues that they still control.
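
One way to read "numerous, diverse authenticators" is as a threshold rule: no single factor proves identity, and any factor known to be stolen is simply ignored. The Python sketch below illustrates that reading only. The class, the function, the factor names, and the two-factor threshold are hypothetical and are not drawn from any real identity scheme.

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass(frozen=True)
class Authenticator:
    name: str                  # e.g. "passkey" or "in-person ID check"
    compromised: bool = False  # set once the factor is known to be stolen

def can_recover_identity(presented: Iterable[Authenticator],
                         required: int = 2) -> bool:
    """Allow recovery only if enough distinct, uncompromised factors are presented."""
    usable = {a.name for a in presented if not a.compromised}
    return len(usable) >= required

# A stolen Social Security number no longer counts, but two other factors suffice.
factors = [
    Authenticator("social security number", compromised=True),
    Authenticator("passkey"),
    Authenticator("in-person ID check"),
]
recoverable = can_recover_identity(factors)
```

Under a rule like this, a leaked identifier becomes an inconvenience rather than a master key, because it neither unlocks an account by itself nor blocks the legitimate owner from getting back in.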

Ideally, companies and other institutions that hold data would commit to invest forever in rigorously locking their systems down. But organizations always vacillate between factoring in cost, ease of use, and risk. There’s no simple way to reconcile the three. And even if there were, no security scheme is ever perfect. The best way to minimize the impact of a mega-breach, then, is not just to reduce the number of incidents, but to better manage the inevitable fallout.

Learn More
  • Inside the Cyberattack That Shocked the US Government
    WIRED’s dramatic account of the massive Office of Personnel Management hack. It’s truly the breach that had it all, compromising everything from basic information and Social Security numbers to government background-check data and even fingerprints for tens of millions of people. Plus, Chinese hackers orchestrated an epic heist.
  • Yes, It’s Time to Ditch LastPass
    After a string of past security incidents, attackers stole users' password vaults from the popular password manager in a 2022 breach. Password managers are a vital piece of the personal security ecosystem, and the incident cast serious doubt on LastPass's ability to secure its most valuable assets.
  • Yahoo Breach Compromises 3 Billion Accounts
    The most accounts ever compromised in one breach. Good times.
  • The Equifax Breach Was Entirely Preventable
    The Equifax debacle was a turning point in the history of corporate data breaches, because it exposed very sensitive data and put victims at a high risk of identity theft and other invasive attacks, all because of grossly inadequate corporate security protections. WIRED walked through how the company could have prevented the disaster.
  • Equifax’s Security Overhaul, a Year After Its Epic Breach
    A year after Equifax discovered its breach, WIRED checked in with the company on what it was doing internally to turn things around and prevent another digital security lapse. And while the overhaul sounded positive, experts were still skeptical about whether Equifax can ever be fully trusted again.
  • Marketing Firm Exactis Leaks Database With 340 Million Personal Records
    A massive data exposure at the targeted-marketing firm Exactis could have compromised hundreds of millions of records. Though no one knows if the data was actually stolen, it was easily accessible on the public internet, and anyone trawling for easy targets could have accessed it. The information would have been particularly valuable to an attacker because it contained detailed profiles on millions of Americans’ basic information, preferences, and habits.
  • Startup Breach Exposed Billions of Data Points
    The Apollo breach exposed billions of records and is a good example of how enticing “aggregated” data troves are to hackers. When an organization, like the sales intelligence firms Apollo or Exactis, collects data from numerous sources into a single repository, it essentially does criminals’ work for them. Everything is in one place, the data is organized for ease of use, and it’s generally searchable. Often much of the data in these types of breaches was already publicly accessible, but the crucial benefit to attackers is the one-stop shop.
  • Facebook’s First Full Data Breach Impacts Up to 90 Million Accounts
    Facebook is no stranger to controversies over data mishandling at this point. The data breach it disclosed in September 2018, though, was particularly notable because it was the first known example of an attacker exploiting flaws in Facebook’s architecture to actually break into users’ accounts and steal their data. Unlike the company’s other missteps—which were, of course, problematic in their own ways—this was a true data breach.

Last updated February 17, 2023.
