Centre for Internet & Society

"Critical Data Studies (CDS) is a growing field of research that focuses on the unique theoretical, ethical, and epistemological challenges posed by 'Big Data.' Rather than treat Big Data as a scientifically empirical, and therefore largely neutral phenomena, CDS advocates the view that data should be seen as always-already constituted within wider data assemblages." The Big Data and Society journal has provisionally accepted a paper abstract of mine for its upcoming special issue on Critical Data Studies.



Over the last decade, the Government of India has given shape to a digital identification infrastructure, developed and operated by the Unique Identification Authority of India (UIDAI). The infrastructure combines the task of assigning unique identification numbers, called Aadhaar numbers, to individuals submitting their biometric and demographic details, and the task of authenticating their identity when provided with an Aadhaar number and associated data (biometric data, One Time PIN sent to the pre-declared mobile number, etc.). The aim of UIDAI is to provide universal authentication-as-a-service for all residents of India who approach any public or private agency for any kind of service or transaction. Simultaneously, the Aadhaar numbers will function as unique identifiers for joining up databases of different government agencies, and hence allow the Indian government to undertake big data analytics at a governmental scale, and not only at a departmental one.

In this paper, I am primarily motivated by the challenge of finding points and objects to enter into a critical study of such an in-progress data infrastructure. As I proceed with an understanding that data is produced within its specific social and material context, the question then is to read through the data to reflect on its possible social and material context. This is complicated when approaching a big data infrastructure that is meant to produce data for explicitly intra-governmental consumption and circulation. The problem then is not one of reading through available big data, but one of reading through the assemblage and imaginaries of big data to reflect on the kind of data it will give rise to, and thus on the politics of the data assemblage and the database state it enables.


Logic of the Database State

The application of data to inform governmental acts has taken place at least since government has been understood as responsible for the welfare of the population and the territory. The measurement of the population and the territory – the number of people, their demographic features, amounts and locations of natural resources, and so on – has always been integral to the functioning of the modern nation-state. Database state is used in this paper to identify a particular mode of mobilisation of data within governmental acts, which is fundamentally shaped by the possibilities of big data extraction, appropriation, and analytics pioneered by a range of companies since the late 1990s. The reason for using database state rather than big data state is that big data refers to a body of technologies emerging in response to a set of data management and analysis challenges situated in a certain moment of development of information technologies, whereas database refers to a symbolic form (Manovich 1999): a form in which not only is the population made visible to the government (as a collection of visual, textual, numeric, and other forms of records), but the acts of government are also made visible to the population (as a collection of performance indicators, budget allocation and utilisation tables, and other data visualised through dashboards, analog and digital).

The data production and management logic of this database state is specifically inspired by the notion of platform introduced by the so-called Web 2.0 companies: providing a common service layer upon which various other applications may also run, but under specific arrangements (including distribution of generated user data) with the original common layer provider. Data assemblages of the database state are expected to enable the government to function as a platform, as an intensely data-driven layer that widely gathers data about individuals in the population and feeds it back selectively to various providers of public and private services. This transforms the data assemblage from one vertical of governmental activities into a horizontal critical infrastructure for the modularisation of governmental activities.


Studying the Emerging Database State in India

The Government of India is presently debating the legal and technical validity of the digital identity infrastructure programme in the Supreme Court, while simultaneously carrying out the enrollment drive for the same, linking up assignment of unique identity numbers with a national drive for population registration, and rolling out citizen-facing services and applications that implement the Aadhaar number as a necessary key to access them. With the enrollment process going on and the integration with various governmental processes (termed seeding by Aadhaar policy literature) just beginning, I enter this study through two key sets of objects reflecting the imaginaries and the technical specifications of the emerging database state in India. The first entry point is through the various official documents of vision, intentions, plans, and reconsiderations, and the second entry point is through the Application Programming Interface (API) documentation published by UIDAI to specify how its identity authentication platform will collaborate with various public and private services.

The first section of the paper provides a brief survey of pre-UIDAI attempts by the Government of India to deploy unique identification numbers and Smart Cards for specific population groups, so as to understand the initial conceptualisation of this data assemblage of a digital identification platform. The second section foregrounds how this platform undertakes a transformation of the components and relations of the pre-existing data assemblage of the Government of India, as articulated in various official documents of promised utility and proposed collaborations. The third section studies the API documentation to track how such imaginaries are materially interpreted and operationalised through the design of protocols of data interactions with various public and private agencies offering services utilising the identity authentication platform.


Notes for Critical Data Studies

Expanding the early agenda note on Critical Data Studies by Craig Dalton and Jim Thatcher (2014), Rob Kitchin and Tracey P. Lauriault have taken steps towards emphasising the responsibility of this nebulous research strategy to chart and unpack data assemblages (2014). This is exactly what I propose to do in this paper. While Kitchin and Lauriault provide a detailed list of the components of the apparatus of a data assemblage (2014: 7), I find the concepts of infrastructural components and infrastructural relations very useful in thinking through the emerging infrastructure of authentication. Thus, my approach to these tasks of charting and unpacking is focused on the infrastructural relations that the digital identity infrastructure re-configures, instead of the infrastructural components it mobilises (Bowker et al. 2010). This tactical choice of focusing on the infrastructural relations is also necessitated by the practical difficulty of gaining comprehensive access to the individual components of the data assemblage concerned. Addressing questions of causality and quality becomes difficult when studying the assemblage sans the produced data, and rigorously analysing concerns of security and uncertainty presupposes an actually existing data assemblage, with a public interface for investigating its leakages, breakages, and internal functioning. In the absence of such points of entry into the data assemblage, which I fear may not be an exceptional case, I attempt an inverted reading. Turning the data infrastructure inside out, in this paper I describe how the digital identity platform is critically reshaping the basis of governmental acts in India, through a specific model of production, extraction and application of big data.



Bowker, Geoffrey C., Karen Baker, Florence Millerand, & David Ribes. 2010. Toward Information Infrastructure Studies: Ways of Knowing in a Networked Environment. Jeremy Hunsinger, Lisbeth Klastrup, & Matthew Allen (Eds.) International Handbook of Internet Research. Springer Dordrecht Heidelberg London New York. Pp. 97-117.

Dalton, Craig, & Jim Thatcher. 2014. What does a Critical Data Studies Look Like, and Why do We Care? Seven Points for a Critical Approach to ‘Big Data.’ Society and Space. May 19. Accessed on July 08, 2015, from http://societyandspace.com/material/commentaries/craig-dalton-and-jim-thatcher-what-does-a-critical-data-studies-look-like-and-why-do-we-care-seven-points-for-a-critical-approach-to-big-data/.

Kitchin, Rob, & Tracey P. Lauriault. 2014. Towards Critical Data Studies: Charting and Unpacking Data Assemblages and their Work. The Programmable City Working Paper 2. July 29. National University of Ireland Maynooth, Ireland. Accessed on July 08, 2015 from http://papers.ssrn.com/sol3/papers.cfm?abstract_id=2474112.

Manovich, Lev. 1999. Database as Symbolic Form. Convergence. Volume 5, Number 2. Pp. 80-99.


Note: The Call for Papers for the special issue can be found here: http://bigdatasoc.blogspot.in/2015/06/call-for-proposals-special-theme-on.html.

