Centre for Internet & Society

On 5th October, MediaNama held a #NAMAprivacy conference in Bangalore focused on Privacy in the context of Artificial Intelligence, Internet of Things (IoT) and the issue of consent, supported by Google, Amazon, Mozilla, ISOC, E2E Networks and Info Edge, with community partners HasGeek and Takshashila Institution. Part 1 of the notes from the discussion on IoT:

Link to the original, published by MediaNama on October 18, here


The second session of #NAMAprivacy in Bangalore dealt with data privacy in the Internet of Things (IoT) framework. All three panelists for the session – Kiran Jonnalagadda from HasGeek, Vinayak Hegde, a big data consultant working with ZoomCar, and Rohini Lakshane, a policy researcher from CIS – said that they were scared about the spread of IoT at the moment. This led to a discussion on the standards which will apply to IoT, still nascent at this stage, and how they could include privacy as well.

Hegde, a volunteer with the Internet Engineering Task Force (IETF), which was instrumental in developing internet protocols and standards such as DNS, TCP/IP and HTTP, said that the IETF recently took a political stand when it came to privacy. “One of the discussions in the IETF was whether security is really important. For a long time, the pendulum swung the other way: yes, it is important, but the trade-off is not big enough. Then the bomb dropped with the Snowden revelations. The IETF has always avoided taking any political stance, but for the first time it did take a political position and published a request for comments (RFC 7258) which said that pervasive monitoring is an attack on the Internet, and that has become a guiding principle for developing standards,” he explained.

He added that this led to the development of new standards which take privacy into consideration by default.

“The repercussions have been pervasive across all the layers of the stack, whether it is DNS, with the development of DNSSEC, or HTTP. The next version of HTTP does not actually mandate encryption, but if you look at the implementations on the browser side, all of them without exception have incorporated encryption,” he added.

Rohini added that discussions around the upcoming 5G standard, on which large-scale IoT will be deployed, also include an increased emphasis on privacy. “It is essentially a lot of devices connected to the Internet, talking to each other and to the user. The standards for security and privacy in 5G are being built, and some of them are in the process of discussion. Different standard-setting bodies have been working on them, and there is a race of sorts among stakeholders, technology companies, etc. to get their tech into the standard,” she said.

“The good thing about those is that they will have time to get security and privacy right. Here, I would like to mention RERUM, a loose acronym for Reliable, Resilient and Secure IoT for smart cities, being piloted in the EU. It essentially believes that security should include reliability and privacy by design. The pilot project was designed to let IoT applications consider security and privacy mechanisms early in the design, so that they could be balanced with reliability. Because once a standard or a mechanism is out, and you implement something as large as a smart city, it is very difficult to retrofit these considerations,” she explained.

Privacy issues in home automation and IoT

Rohini pointed to a report that illustrates the staggering amount of data home automation will generate. “I was looking for figures, and I found an FTC report published in 2015 where one IoT company revealed in a workshop that it provides home automation to fewer than 10,000 households, but all of them put together account for 150 million data points per day. That is one data point every six seconds per household. And this is just IoT for home automation; there is also IoT for health and fitness, medical devices, IoT for personal safety, public transport, environment, connected cars, etc.”
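The arithmetic is easy to verify; a quick sketch in Python, taking the report's round figures of 10,000 households and 150 million data points a day as given:

```python
# Back-of-the-envelope check of the FTC workshop figure (assumed round numbers).
households = 10_000             # "fewer than 10,000 households"
points_per_day = 150_000_000    # 150 million data points per day
seconds_per_day = 24 * 60 * 60  # 86,400

points_per_household = points_per_day / households          # 15,000 per day
seconds_per_point = seconds_per_day / points_per_household  # ~5.8 seconds

print(f"One data point every {seconds_per_point:.1f} seconds per household")
# -> One data point every 5.8 seconds per household, i.e. roughly every six seconds
```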

In this sort of situation, the data collected could be used for harms that users did not account for.

“I received some data a couple of years back from a water flowmeter. It was fitted to a villa in Hoskote, and the idea was simple: you could measure and track the water consumption in the villa. When I received the data, I figured out that by just looking at the water consumption, you can see how many people are in the house, when they get up at night, when they go out, when they are out of station. All of this data can be misused. The data is collected specifically to measure water consumption and find if there are any leakages in the house, but it could be used for other purposes,” Arvind P from Devopedia said.
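To illustrate how little data such an inference needs (a hypothetical sketch, not Arvind's actual analysis; the readings below are invented), even hourly flow totals are enough to flag night-time activity or an empty house:

```python
# Hypothetical hourly water-flow readings (litres) for one day, index = hour of day.
hourly_flow = [0, 0, 3, 0, 0, 45, 60, 30, 5, 0, 0, 0,
               0, 0, 0, 0, 0, 20, 40, 55, 25, 10, 2, 0]

# Any flow between midnight and 5 am suggests someone got up at night.
night_use = [h for h in range(0, 5) if hourly_flow[h] > 0]

# A full day of zero flow would suggest the house is empty.
away = all(f == 0 for f in hourly_flow)

print("Night-time activity at hours:", night_use)  # -> [2]
print("Household away all day:", away)             # -> False
```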

Pranesh Prakash, policy director at the Centre for Internet and Society (CIS), offered the example of a Twitter handle called “should I be robbed now”, which correlates a user’s vacation pictures to point out that they could be robbed. “What we need to remember is that a lot of correlation analysis is not just about the analysis but also about the use and misuse of it. A lot of that use and misuse is non-transparent. Not a single company tells you how they use your data, but they do take rights over your data,” he added.

Vinayak Hegde also added that governments are using similar data-tracking methods, drawing on smart meter readings to catch bitcoin miners in China and Venezuela.

“In China, there are all these bitcoin miners. And I was reading this story about Venezuela, where bitcoin mining is outlawed. The way they’re catching these bitcoin miners is by looking at their electricity consumption, because bitcoin mining uses a huge amount of power and computing capacity. People have come up with ingenious ways of getting around it: they will draw power from their neighbours or maybe from an industrial setting. This is a good example of a privacy-infringing activity.”
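The detection Hegde describes is, at its core, outlier analysis on meter readings. A minimal sketch of the idea, with invented consumption figures and an arbitrary threshold:

```python
from statistics import median

# Hypothetical monthly electricity consumption per household (kWh).
consumption = {"house_a": 310, "house_b": 280, "house_c": 4200, "house_d": 350}

# Continuous mining rigs draw power around the clock, so they stand out
# by an order of magnitude; flag anything far above the neighbourhood median.
baseline = median(consumption.values())
suspects = {h: kwh for h, kwh in consumption.items() if kwh > 5 * baseline}

print(suspects)  # -> {'house_c': 4200}
```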

Pseudonymization

Srinivas P, head of security at Infosys, pointed out that pseudonymity could be a way to provide privacy in home automation systems. Pseudonymization is a procedure by which the most identifying fields within a data record are replaced by one or more artificial identifiers, or pseudonyms.

“There are a number of home automation systems similar to Nest, which is extensively used in Silicon Valley homes, that connect to various systems. For example, when you are approaching home, it will know when to switch on your heating system or AC based on the weather. It also has information on who stays in the house, in what room, and what time they sleep. And in the car, it gives a full real-time profile of the situation at home. It can be a threat if it is hacked. This is a very common threat that is being talked about, and so is how to introduce pseudonymity. When we use these identifiers, and when the connectivity happens, how do we ensure that the name and the user are not there? Pseudonymity can be introduced so that it becomes difficult for the hacker to decipher who this guy is,” Srinivas added.
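One common way to implement this, sketched below with Python's standard hmac module, is to replace a direct identifier with a keyed hash; the key and identifiers here are placeholders, and a real deployment would need key management on top of this:

```python
import hashlib
import hmac

# The key must be stored separately from the data it pseudonymizes;
# this value is a placeholder for illustration only.
SECRET_KEY = b"replace-with-a-securely-stored-key"

def pseudonymize(identifier: str) -> str:
    """Map a direct identifier to a stable pseudonym via keyed HMAC-SHA256."""
    return hmac.new(SECRET_KEY, identifier.encode(), hashlib.sha256).hexdigest()

# The same user always maps to the same pseudonym, so systems can still
# correlate events, but without the key the mapping cannot be reversed
# or tested against a dictionary of likely identifiers.
print(pseudonymize("alice@example.com")[:16])
```

Unlike a plain hash, the keyed construction resists guessing attacks as long as the key stays secret, and rotating the key breaks linkability across datasets.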

Ambient data collection

With IoT, it has never been easier to capture ambient data. Ambient data is information that lies in areas not generally accessible to the user. An example of this is how users get traffic data from Internet companies. Kiran Jonnalagadda explained how this works:

“When you look at traffic data on a street map, where is that data coming from? It’s not coming from an app on the phone constantly transmitting data. It’s coming from the fact that cell phone towers record who is connecting to them, and if a cell phone tower is facing the road and has so many connections on it, you know that traffic is at a certain level in that area. Now, as a user of the map, you are talking to the company which produces the map, and that is not a telecom company. Someone who is using a phone is only dealing with a telecom company. So how does this data transfer happen, and how much user data is being passed on to the last-mile user who is actually holding the phone?”
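A toy version of the inference he describes might look like the following; the towers, counts and thresholds are all invented for illustration:

```python
# Hypothetical counts of active connections on towers facing a road segment.
tower_connections = {"tower_1": 35, "tower_2": 120, "tower_3": 310}

def traffic_level(connections: int) -> str:
    """Map a raw connection count to a coarse congestion level."""
    if connections < 50:
        return "green"   # free-flowing
    if connections < 200:
        return "amber"   # moderate
    return "red"         # heavy

print({t: traffic_level(c) for t, c in tower_connections.items()})
# -> {'tower_1': 'green', 'tower_2': 'amber', 'tower_3': 'red'}
```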

Jonnalagadda stressed the need for people to ask who is aggregating this ambient data.

“Now obviously, when you look at the map, you don’t get to see who is around you; that would be a clear privacy violation. You only get to see that traffic is at a certain level of density on the streets around you. But at what point does the aggregation happen, from an individually identifiable phone to just a red line or a green line indicating the traffic in an area? We also need to ask who is doing this aggregation. Is it happening at the telecom level? Is it happening at the map company level? And what kind of algorithms are required to decide that a particular phone on a cell network represents a moving vehicle or a pedestrian? Can a cell phone company do that, or does a map company do that? If you start digging into at what point your data is being anonymized, who is responsible for anonymizing it, and whether that is the entity that is supposed to be doing it, we start realizing that it is a lot more complicated and a lot more pervasive than we thought it would be,” he said.
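One safeguard at that aggregation step, wherever it sits, is to suppress any road segment with too few devices before publishing, so that a single phone never shows up as an identifiable red line. A minimal sketch of that idea, with invented counts and an arbitrary minimum group size:

```python
# Hypothetical per-segment device counts reported upstream.
segment_counts = {"MG Road": 240, "quiet lane": 2, "Ring Road": 95}

K = 10  # minimum group size before a segment may be published (k-anonymity style)

published = {seg: count for seg, count in segment_counts.items() if count >= K}

print(published)  # -> {'MG Road': 240, 'Ring Road': 95}
# "quiet lane" is withheld: with only 2 devices on it, a density reading
# could single out an individual.
```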

#NAMAprivacy Bangalore:

  • Will Artificial Intelligence and Machine Learning kill privacy? [read]
  • Regulating Artificial Intelligence algorithms [read]
  • Data standards for IoT and home automation systems [read]
  • The economics and business models of IoT and other issues [read]

#NAMAprivacy Delhi:

  • Blockchains and the role of differential privacy [read]
  • Setting up purpose limitation for data collected by companies [read]
  • The role of app ecosystems and nature of permissions in data collection [read]
  • Rights-based approach vs rules-based approach to data collection [read]
  • Data colonisation and regulating cross border data flows [read]
  • Challenges with consent; the Right to Privacy judgment [read]
  • Consent and the need for a data protection regulator [read]
  • Making consent work in India [read]