Records of 114 Million U.S. Citizens and Companies Exposed Online

By: Kayla Matthews

HackenProof, an Estonia-based cybersecurity firm, reported recently that it discovered a 73-gigabyte database containing 114 million records of approximately 83 million U.S. citizens and companies sitting unprotected online.

The data included information such as first and last name, employers, job title, address, phone number, email, and IP address.

“With an estimated number of affected citizens to be almost 83 million, it appears the hackers struck a gold mine,” said Tom Garrubba, senior director of Shared Assessments, a membership organization focused on third-party risk assurance. “The only hope left here is that there are some iron pyrite — or ‘fool’s-gold’ — records mixed in with the gold of actual current individual records.”

HackenProof reportedly found the leaking data during a routine security audit of publicly accessible servers indexed by the Shodan search engine.

The company discovered at least three IPs with identical Elasticsearch clusters that were misconfigured to be publicly accessible. Shodan indexed the first IP on Nov. 14, but it could have been exposed for an unknown amount of time before that date.

One index exposed the personal information of approximately 57 million U.S. citizens, while another index within the same database held more than 25 million records.

The error has since been corrected, and the data is no longer publicly available.

The Source of the Leak

According to HackenProof, the primary cause of the data breach was a misconfiguration of the Elasticsearch instances. This misconfiguration allowed the public to access the data without authentication.
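As a rough illustration of the kind of misconfiguration described above, the following Python sketch flags a settings file that binds a node beyond the local machine while leaving security switched off. It is not taken from HackenProof's report. The setting names `network.host` and `xpack.security.enabled` are real Elasticsearch settings, but the helper names, the simplified line parser, and the assumption that a missing security setting means "disabled" are all illustrative choices; actual defaults depend on the Elasticsearch version.

```python
# Hypothetical lint helper: flags Elasticsearch settings that would let
# other machines reach a node without authentication.
# Assumption: settings are written as flat "key: value" lines.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1", "_local_"}


def parse_flat_settings(text):
    """Parse simple 'key: value' lines, skipping comments and blank lines."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line or ":" not in line:
            continue
        key, value = line.split(":", 1)
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def allows_unauthenticated_remote_access(settings, security_default=False):
    """Return True if the node listens beyond loopback with security off.

    security_default is an assumption about what an absent
    xpack.security.enabled setting means; it varies by version.
    """
    host = settings.get("network.host", "_local_")
    security = settings.get("xpack.security.enabled")
    if security is None:
        security_on = security_default
    else:
        security_on = security.lower() == "true"
    bound_beyond_loopback = host not in LOOPBACK_HOSTS
    return bound_beyond_loopback and not security_on


example = parse_flat_settings(
    "network.host: 0.0.0.0\nxpack.security.enabled: false\n"
)
print(allows_unauthenticated_remote_access(example))
```

A check like this only inspects configuration. A firewall or private network may still block outside access, so it complements rather than replaces testing from the outside.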

The cybersecurity firm said it could not verify the owner of the leaked data, but noted that the structure of the “source” field in the records resembled one used by Data & Leads Inc., a data management company. HackenProof could not get in touch with Data & Leads, and the data management company’s website went offline shortly after the incident.

Potential Impacts

Because of the volume and nature of the data, it could be quite useful to any bad actors who copied or downloaded the data while it was exposed. They could potentially use the information to run scams such as spear-phishing email campaigns. They might also be able to use the information to impersonate real people and gain access to their accounts directly.

“This is a vast sum of data to be available online in an unprotected format, and is yet another example of organizations not taking data protection in any way seriously,” said Ryan Wilk, vice president of customer success for NuData Security.

“The information available is a hacker’s dream, with more than enough information to pull off a social engineering campaign which could compromise a wide range of accounts, ranging from consumer accounts with retailers to bank accounts or sensitive documents. Programs of passive biometrics and two-factor authentication are needed across the board if we are to differentiate between legitimate and bad users following breaches such as this.”

“Hackers may be able to use the data from the Elasticsearch breach in combination with information obtained from other breaches to make their efforts even more convincing,” said Michael Magrath, director of global regulations and standards at cybersecurity company OneSpan, Inc.

“The treasure trove of personally identifiable data on the legitimate web and the dark web just continues to grow, enabling fraudsters to steal identities or create new, synthetic identities using a combination of real and made-up information, or entirely fictitious information,” Magrath continued. “For example, the personal information obtained in one breach could be cross-referenced with data obtained from another breach and other widely publicized private-sector breaches. Having the databases in the same place makes things even easier for the bad guys.”

Sometimes, cybercriminals also exploit these types of mistakes by downloading or copying the data and then deleting it from the database. They then demand a ransom to return the information. Hackers may also use the vulnerability to plant malware that allows them to remotely connect to and control the server.

How to Prevent Data Leaks

Data leaks are nothing new. They’ve been happening for years, and efforts to stop them have been ongoing. However, companies continue to fall victim to them. How can businesses protect themselves from these kinds of incidents?

Watching for errors such as the misconfiguration that caused the Elasticsearch leak can help. Running tests like those HackenProof performed can help detect errors that previously slipped through the cracks. Using a virtual directory server can help organizations assign the necessary security.
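One simple form such a test could take is sketched below in Python: send a single request with no credentials to a server you operate and see whether it answers as an open Elasticsearch node. This is an assumption about how a basic check might look, not a description of HackenProof's audit. The function names and result labels are hypothetical; the sketch relies only on the fact that an Elasticsearch root endpoint replies with JSON that includes a `cluster_name` field.

```python
import json
import urllib.error
import urllib.request


def classify_unauthenticated_response(status, body):
    """Label the reply to a credential-free request (labels are illustrative)."""
    if status in (401, 403):
        return "auth-required"
    if status == 200:
        try:
            info = json.loads(body)
        except ValueError:
            return "unknown"
        # An Elasticsearch root endpoint reports its cluster name in JSON.
        if isinstance(info, dict) and "cluster_name" in info:
            return "open"
    return "unknown"


def probe(base_url, timeout=5.0):
    """Send one GET without credentials to base_url and classify the reply."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace")
            return classify_unauthenticated_response(resp.status, body)
    except urllib.error.HTTPError as err:
        # Raised for 4xx/5xx replies, including 401 and 403.
        return classify_unauthenticated_response(err.code, "")
    except (urllib.error.URLError, OSError):
        return "unreachable"
```

For example, `probe("http://10.0.0.5:9200/")` would check a node at a hypothetical internal address on port 9200, the default Elasticsearch HTTP port. A result of "open" means the server answered without asking for credentials. Run checks like this only against systems you are authorized to test.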

“This is, of course, a major data breach and, at the root of it, appears to have been a user error — i.e., misconfiguration of the Elasticsearch instances allowing public access to the data without authentication,” Garrubba said. “We cannot stress enough the importance of established checks and balances, segregation of duties, etc., to be defined in procedures and followed with appropriate sign-offs by management.”

The bottom line is that organizations need to be proactive in preventing and checking for data leaks. There’s always a risk one will occur, but being careful and vigilant can reduce both that risk and the damage caused by leaks that do happen.

“Cyberattacks will continue, and it is imperative that public- and private-sector organizations not only deploy the latest in authentication and risk-based fraud detection technologies in their organizations, but also making sure all third-party partners have equal cybersecurity measures in place,” Magrath said.


Kayla Matthews writes about cybersecurity and technology for publications like Malwarebytes, Security Boulevard, InformationWeek and CloudTweaks. To read more from Kayla, visit her blog: