Centre for Internet & Society

India’s government leaks data like a sieve, and it’s putting people at risk.

The article by Aria Thaker was published in Quartz India on April 4, 2019. Karan Saini was quoted.


The latest instance came to light on April 1, when technology website ZDNet reported that an Indian government agency had left sensitive medical records of 12.5 million pregnant women online, in a database that wasn’t even password protected.

This is only the latest in a slew of data leaks and cybersecurity lapses, dozens in all, that have plagued the Indian government over the past few years. These have occurred despite, or in part because of, prime minister Narendra Modi’s aggressive push towards “Digital India.”

Sloppy digitisation efforts and a lack of capacity in cybersecurity issues have caused so many leaks that they seem almost routine by now. Even the one reported by ZDNet, large and grievous as it is, has received little coverage in the mainstream media so far.

“This issue has to be an election priority—the focus on privacy and protection of citizen data,” Raman Chima, policy director at digital-rights organisation Access Now, told Quartz. But as India’s general election approaches, the near silence from politicians on these issues is deafening.

Endangering pregnant women and children

ZDNet reported that a database of medical records of 12.5 million pregnant women was left available online by the department of medical, health and family welfare of a northern Indian state. The report does not name the state because the server is still online, though the medical records have finally been removed, almost a month after security researcher Bob Diachenko contacted the department.

The information in this database was extremely sensitive, including patients’ names, contact details, disease information, pregnancy status and complications, and procedures, such as abortions, that they have undergone.

“This could lead to significant bodily harms to a woman in a context where abortions, especially for unmarried women, are heavily stigmatised,” said Ambika Tandon, a policy officer who researches gender and tech issues for the think tank Centre for Internet and Society (CIS). Women who seek abortions or sensitive procedures “may resort to unsafe abortions at facilities that are not registered for fear of their personal information or physical and informational privacy being compromised,” Tandon said.

Yet this wasn’t the first such incident involving the government—nor even the first involving pregnant women.

Aadhaar and beyond

Aadhaar, the 12-digit personal identification number from India’s controversial, biometrics-backed database, has often been at the centre of previous such leaks in India. Much like the social security number in the US, it is a sensitive piece of information because it helps identify individuals and is often linked to the other government and financial services a person uses.

Here’s a look at just a handful of the most prominent leaks, breaches, and vulnerabilities involving Aadhaar:

  • May 2017: Around 130 million individuals have their Aadhaar numbers, banking details, and more leaked on four government websites
  • January 2018: A reporter of The Tribune newspaper pays Rs500 ($7) to access a portal with demographic data from every Aadhaar holder in the country
  • March 2018: State-owned gas company Indane leaks private data of its customers and all Aadhaar holders
  • April 2018: Andhra Pradesh leaks medical records of over 2 million pregnant women, as well as their Aadhaar numbers and contact details
  • June 2018: An unsecured Aadhaar API on over 70 subdomains of a government website allows anyone to access demographic-authentication services
  • July 2018: Data of 250,000 students taking a government medical entrance exam is leaked and sold online
  • January 2019: The State Bank of India, the country’s largest bank, leaks financial data of millions of customers
  • February 2019: Indane strikes again. Aadhaar data of nearly 6.7 million dealers and distributors of the state-owned gas company is exposed on its dealers portal

This list is far from comprehensive. Aadhaar-related leaks alone comprise at least 37 such cases, which can be found listed on Indian tech site Medianama. These include instances of many state and central departments publishing Aadhaar numbers next to banking details, or even instances of colour photocopies of Aadhaar cards being published online.

Government entities, especially the Unique Identification Authority of India (UIDAI), which administers Aadhaar, have been known to take a long time—sometimes even months—to respond to leaks. Worse, they have often hounded journalists and whistleblowers raising awareness about these incidents. For instance, the UIDAI filed a police case against The Tribune’s reporter and, weeks later, the newspaper’s editor stepped down. CIS, which published the report about 130 million Aadhaar records being exposed, received a legal notice from the UIDAI.

Why do leaks happen?

India’s government agencies are undergoing rapid digitisation.

“There’s been a particular focus over the past few years to aggregate (data) from different databases that might normally exist in one area, or one scheme, and bring them together to one master spreadsheet,” said Chima of Access Now. “As a result of this, a lot of data is being collected in a few places and (the government) is not doing enough thinking…as to how to control and think about cybersecurity.”

Besides, government departments face major personnel and training issues in matters of cybersecurity.

“Most of the officials who collect information do not know about security practices, and they don’t really understand the challenges involved,” said Srinivas Kodali, a cybersecurity researcher who has uncovered many government leaks. “They think it’s normal data—they don’t understand the privacy implications of it.”

Beyond this, a lack of resources holds government bodies back from protecting user data. “Indian government departments require an increase in skill and resources to deal with information security,” said Karan Saini, a security researcher and policy officer at CIS, who has also reported leaks and vulnerabilities.

A lack of comprehensive privacy regulation exacerbates the problem. India does not currently have a data-protection law. A draft bill was put forth by the electronics and IT ministry last year, but it has been criticised by many for being too lenient on government institutions that handle citizen data.

How leaks and lapses endanger democracy

The issue of data leaks is deeply tied to one of the core threats to democracy today—voter microtargeting, which often takes the form of parties conveying conflicting messages to different social groups, in their attempts to get elected. Such targeting has been under the global spotlight since 2016, after secretive firm Cambridge Analytica reportedly used it in Donald Trump’s presidential campaign.

In India, “there’s a lot of these datasets of sensitive data about citizens available, which may have electoral implications in terms of how it affects people’s desire to vote, and the ability of parties to influence or micro-target them,” Chima said.

Parties have already demonstrated an appetite for citizen data. Modi’s Bharatiya Janata Party has been known to target voters based on data such as electricity bills, which are thought to reveal a voter’s socioeconomic standing.

Now Aadhaar has come into the picture as well, with the government attempting, in a failed project, to link the biometrics-backed ID number with citizens’ voter ID numbers. The project was discontinued in 2015, but concerns remain about the way the data was collected, with ground reports suggesting that reams of documents are still floating in government offices and private homes. It has also been suggested that the lapsed linking project resulted in the disenfranchisement of millions, due to an opaque algorithm it used.

Worries abound that existing state-held data could be abused by politicians, especially those currently in power, to target or profile voters. State resident data hubs (SRDH), which use Aadhaar to log full profiles of individuals, including the government schemes they avail, have come under the scanner for being particularly dangerous for profiling.

Reports already suggest that such data has been used for political gain. An app used by a leading party in Andhra Pradesh has been accused of using stolen state data to profile voters based on caste, though the party denied this.

A government or political actor’s access to state repositories of data, which might be aided by data leaks, could be a truly sinister political tool that sways an election.
