
Introduction

In 2008 Chris Anderson infamously proclaimed the 'end of theory'. Writing in Wired magazine, Anderson predicted that the coming age of Big Data would create a 'deluge of data' so large that the scientific methods of hypothesis, sampling and testing would be rendered 'obsolete' [1]. For him and others, the hidden patterns and correlations revealed through Big Data analytics enable us to produce objective and actionable knowledge about complex phenomena in ways not previously possible using traditional methodologies. As Anderson himself put it, 'there is now a better way. Petabytes allow us to say: "Correlation is enough." We can stop looking for models. We can analyze the data without hypotheses about what it might show. We can throw the numbers into the biggest computing clusters the world has ever seen and let statistical algorithms find patterns where science cannot' [2].

In spite of harsh criticism of Anderson's article from across the academy, his uniquely (dis)utopian vision of the scientific utility of Big Data has since become increasingly mainstream, with regular interventions from politicians and business leaders evangelising about Big Data's potentially revolutionary applications. Nowhere is this bout of data-philia more apparent than in India, where the government recently announced the launch of 'Digital India', a multi-million dollar project which aims to harness the power of public data to increase the efficiency and accessibility of public services [3]. In spite of the ambitious promises associated with Big Data, however, many theorists remain sceptical about its practical benefits and express concern about its potential implications for conventional scientific epistemologies. For them, the increased prominence of Big Data analytics in science does not signal a paradigmatic transition to a more enlightened data-driven age, but a hollowing out of the scientific method and an abandonment of causal knowledge in favour of shallow correlative analysis. In response, they emphasise the continued importance of theory and specialist knowledge to science, and warn against what they see as the uncritical adoption of Big Data in public policy-making [4]. In this article I will examine the challenges posed by Big Data technologies to established scientific epistemologies, as well as the possible implications of these changes for public policy-making. Beginning with an exploration of some of the ways in which Big Data is changing our understanding of scientific research and knowledge, I will argue that claims that Big Data represents a new paradigm of scientific inquiry are predicated upon a number of implicit assumptions about the nature of knowledge. Through a critique of these assumptions I will highlight some of the potential risks that an over-reliance on Big Data analytics poses for public policy-making, before finally making the case for a more nuanced approach to Big Data which emphasises the continued importance of theory to scientific research.

Big Data: The Fourth Paradigm?

"Revolutions in science have often been preceded by revolutions in measurement".

In his book The Structure of Scientific Revolutions, Thomas Kuhn describes scientific paradigms as 'universally recognized scientific achievements that, for a time, provide model problems and solutions for a community of researchers'[5]. Paradigms as such designate a field of intelligibility within a given discipline, defining what kinds of empirical phenomena are to be observed and scrutinized, the types of questions which can be asked of those phenomena, how those questions are to be structured, as well as the theoretical frameworks within which the results can be analysed and interpreted. In short, they 'constitute an accepted way of interrogating the world and synthesizing knowledge common to a substantial proportion of researchers in a discipline at any one moment in time'[6]. Periodically, however, Kuhn argues, these paradigms can become destabilised by the development of new theories or the discovery of anomalies that cannot be explained through reference to the dominant paradigm. In such instances, Kuhn claims, the scientific discipline is thrown into a period of 'crisis', during which new ideas and theories are proposed and tested, until a new paradigm is established and gains acceptance from the community.

More recently, the computer scientist Jim Gray adopted and developed Kuhn's concept of the paradigm shift, charting the history of science through the evolution of four broad paradigms: experimental science, theoretical science, computational science and exploratory science [7]. Unlike Kuhn, however, who proposed that paradigm shifts occur as the result of anomalous empirical observations which scientists are unable to account for within the existing paradigm, Gray suggested that transitions in scientific practice are in fact primarily driven by advances and innovations in methods of data collection and analysis. The emergence of the experimental paradigm, according to Gray, can therefore be traced back to ancient Greece and China, when philosophers began to describe their empirical observations using natural rather than spiritual explanations. Likewise, the transition to the theoretical paradigm of science can be located in the 17th century, during which time scientists began to build theories and models which made generalizations based upon their empirical observations. Thirdly, Gray identifies the emergence of a computational paradigm in the latter part of the 20th century, in which advanced techniques of simulation and computational modelling were developed to help solve equations and explore fields of inquiry, such as climate modelling, which would have been impossible using experimental or theoretical methods. Finally, Gray proposed that we are today witnessing a transition to a 'fourth paradigm of science', which he termed the exploratory paradigm. Although it also utilises advanced computational methods, unlike the previous computational paradigm, which developed programs based upon established rules and theories, Gray suggested that within this new paradigm scientists begin with the data itself, designing programs to mine enormous databases in search of correlations and patterns; in effect using the data to discover the rules [8].

The implications of this shift are potentially significant for the nature of knowledge production, and are already beginning to be seen across a wide range of sectors. In the retail sector, for example, data mining and algorithmic analysis are already being used to help predict items that a customer may wish to purchase based upon previous shopping habits[9]. Here, unlike with traditional research methodologies, the analysis does not presuppose or hypothesise a relationship between items which it then attempts to prove through a process of experimentation; instead, the relationships are identified inductively through the processing and reprocessing of vast quantities of data alone. By starting with the data itself, Big Data analysts circumvent the need for predictions or hypotheses about what one is likely to find; as Dyche observes, 'mining Big Data reveals relationships and patterns that we didn't even know to look for'[10]. Similarly, by focussing primarily on the search for correlations and patterns as opposed to causation, Big Data analysts also reject the need for interpretive theory to frame the results; instead, researchers claim, the outcomes are inherently meaningful and interpretable by anyone without the need for domain-specific or contextual knowledge. For example, Joh observes how Big Data is being used in policing and law enforcement to help make better decisions about the allocation of police resources. By looking for patterns in the crime data, police departments are able to make accurate predictions about the localities and times in which crimes are most likely to occur and dispatch their officers accordingly[11]. Such analysis, according to Big Data proponents, requires no knowledge of the cause of the crime, nor the social or cultural context within which it is being perpetrated; instead, predictions and assessments are made purely on the basis of patterns and correlations identified within the historical data by statistical modelling.
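
To make the inductive workflow concrete, consider the following minimal sketch of co-occurrence mining over a transaction log. The baskets and the support threshold are hypothetical, chosen purely for illustration; the point is that no hypothesis about which items are related is supplied in advance.

```python
from collections import Counter
from itertools import combinations

# Hypothetical transaction log: each entry is one customer's basket.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"nappies", "beer"},
    {"bread", "milk"},
    {"nappies", "beer", "milk"},
]

def frequent_pairs(baskets, min_support=0.3):
    """Count how often each pair of items is bought together and keep
    pairs whose support clears the threshold. No prior hypothesis about
    which items belong together is supplied."""
    pair_counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            pair_counts[pair] += 1
    n = len(baskets)
    return {pair: count / n for pair, count in pair_counts.items()
            if count / n >= min_support}

print(frequent_pairs(transactions))
# {('bread', 'butter'): 0.4, ('bread', 'milk'): 0.4, ('beer', 'nappies'): 0.4}
```

The correlations surface from the data alone; but note that deciding which of them are meaningful or actionable, and which coincidental, still falls to a human analyst. This is precisely the gap the sceptics discussed below point to.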

In summary, then, Gray's exploratory paradigm represents a radical inversion of the deductive scientific method, allowing researchers to derive insights directly from the data itself without the use of hypotheses or theory. Thus, it is claimed, by enabling the collection and analysis of datasets of unprecedented scale and variety, Big Data allows analysts to 'let the data speak for itself'[12], providing exhaustive coverage of social phenomena and revealing correlations that are inherently meaningful and interpretable by anyone, without the need for specialised subject knowledge or theoretical frameworks.

For Gray and others, this new paradigm is made possible only by the recent exponential increase in the generation and collection of data, as well as the emergence of new forms of data science, known collectively as "Big Data". For them, the 'deluge of data' produced by the growing number of internet-enabled devices, as well as the nascent development of the internet of things, presents scientists and researchers with unprecedented opportunities to utilise data in new and innovative ways to develop insights across a wide range of sectors, many of which would have been unimaginable even 10 years ago. Furthermore, advances in computational and statistical methods, as well as innovations in data visualization and methods of linking datasets, mean that scientists can now utilise the data available to its full potential; or, as Professor Gary King quipped, 'Big Data is nothing compared to a big algorithm'[13].

These developments in statistical and computational analysis, combined with the velocity, variety and quantity of data available to analysts, have therefore allowed scientists to pursue new types of research, generating new forms of knowledge and facilitating a radical shift in how we think about "science" itself. As Boyd and Crawford note, 'Big Data [creates] a profound change at the levels of epistemology and ethics. Big Data reframes key questions about the constitution of knowledge, the processes of research, how we should engage with information, and the nature and the categorization of reality . . . [and] stakes out new terrains of objects, methods of knowing, and definitions of social life'[14]. For many, these changes in the nature of knowledge production provide opportunities to improve decision-making, increase efficiency and encourage innovation across a broad range of sectors, from healthcare and policing to transport and international development[15]. For others, however, many of the claims of Big Data are premised upon some questionable methodological and epistemological assumptions, some of which threaten to impoverish the scientific method and undermine scientific rigour [16].

Assumptions of Big Data

Given its bold claims, the allure of Big Data in both the public and private sectors is perhaps understandable. However, despite the radical and rapid changes to research practice and methodology, there has nevertheless seemingly been a lack of reflexive and critical reflection concerning the epistemological implications of the research practices used in Big Data analytics. And yet implicit within this vision of the future of scientific inquiry lie a number of important and arguably problematic epistemological and ontological assumptions, most notably:

- Big Data can provide comprehensive coverage of a phenomenon, capturing all relevant information.

- Big Data does not require hypotheses, a priori theory, or models to direct the data collection or research questions.

- Big Data analytics do not require theoretical framing in order to be interpretable. The data is inherently meaningful, transcending domain-specific knowledge, and can be understood by anyone.

- Correlative knowledge is sufficient to make accurate predictions and guide policy decisions.

For many, these assumptions are highly problematic and call into question the claims that Big Data makes about itself. I will now look at each one in turn, before proposing their possible implications for the use of Big Data in policy-making.

Firstly, whilst Big Data may appear to be exhaustive in its scope, it can only be considered to be so in the context of the particular ontological and methodological framework chosen by the researcher. No data set, however large, can capture all information relevant to a given phenomenon. Indeed, even if it were somehow possible to capture all relevant quantifiable data within a specific domain, Big Data analytics would still be unable to fully account for the multifarious variables which are unquantifiable or undatafiable. As such, Big Data does not provide an omnipresent 'god's-eye view'; instead, much like any other scientific sample, it must be seen to provide the researcher with a singular and limited perspective from which he or she can observe a phenomenon and draw conclusions. It is important to recognise that this vantage point provides only one of many possible perspectives, and is shaped by the technologies and tools used to collect the data, as well as the ontological assumptions of the researchers. Furthermore, as with any other scientific sample, it is also subject to sampling bias and is dependent upon the researcher to make subjective judgements about which variables are relevant to the phenomena being studied and which can be safely ignored.
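
The point that scale does not cure bias can be made concrete with a toy simulation, sketched below under entirely hypothetical assumptions: device owners differ systematically from non-owners on the trait being measured, so a sample collected only through devices remains biased no matter how large it grows.

```python
import random
import statistics

random.seed(7)

# Hypothetical population of 1,000,000 people: 60% own a data-generating
# device, and owners differ systematically on the trait being measured.
population = []
for _ in range(1_000_000):
    owns_device = random.random() < 0.6
    trait = random.gauss(70 if owns_device else 40, 10)
    population.append((owns_device, trait))

true_mean = statistics.mean(trait for _, trait in population)

# A "big" sample of half a million records, collected only from owners.
device_sample = [trait for owns, trait in population if owns][:500_000]

print(round(true_mean, 1))                       # ~58.0: the real answer
print(round(statistics.mean(device_sample), 1))  # ~70.0: big, but biased
```

However many records the device channel yields, the estimate converges on the owners' mean rather than the population's; only an account of who is missing from the data, not more of the same data, can correct for this.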

Secondly, claims by Big Data analysts to be able to generate insights directly from the data signal a worrying divergence from the deductive scientific methods which have been hegemonic within the natural sciences for centuries. For Big Data enthusiasts such as Prensky, 'scientists no longer have to make educated guesses, construct hypotheses and models, and test them with data-based experiments and examples. Instead, they can mine the complete set of data for patterns that reveal effects, producing scientific conclusions without further experimentation'[17]. Whereas deductive reasoning begins with general statements or hypotheses and then proceeds to observe relevant data equipped with certain assumptions about what should be observed if the theory is to be proven valid, inductive reasoning conversely begins with empirical observations of specific examples from which it attempts to draw general conclusions. The more data collected, the greater the probability that the general conclusions generated will be accurate; however, regardless of the quantity of observations, no amount of data can ever conclusively prove causality between two variables, since it is always possible that the conclusions may in future be falsified by an anomalous observation. For example, a researcher who had only ever observed the existence of white swans might reasonably draw the conclusion that 'all swans are white'; whilst they would be justified in making such a claim, it would nevertheless be comprehensively disproven the day a black swan was discovered. This is what David Hume called the 'problem of induction'[18], and it strikes at the foundation of Big Data's claims to be able to provide explanatory and predictive analysis of complex phenomena, since any projections made are reliant upon the 'principle of uniformity of nature', that is, the assumption that a sequence of events will always occur as it has in the past. As a result, although Big Data may be well suited to providing detailed descriptive accounts of social phenomena, without theoretical grounding it nevertheless remains unable to prove causal links between variables, and is therefore limited in its ability to provide robust explanatory conclusions or give accurate predictions about future events.
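
The same point can be put numerically: a model that extrapolates purely from historical regularities is only as reliable as the assumption that the future resembles the past. The series and the structural break in the sketch below are hypothetical, standing in for Hume's anomalous observation.

```python
import statistics

# Hypothetical historical series: steady growth of 2 units per period.
history = [2 * t for t in range(100)]

# "Let the data speak": induce the growth rate from the data alone.
steps = [b - a for a, b in zip(history, history[1:])]
growth = statistics.mean(steps)      # 2.0, with zero variance in the record

forecast = history[-1] + growth      # predict period 100 -> 200.0

# The uniformity of nature fails: an unmodelled structural break occurs,
# e.g. a policy change or external shock the data contained no trace of.
actual = 50
print(forecast, actual)              # 200.0 vs 50: confident, and wrong
```

Every observation in the record supported the induced rule, and none of them warned of the break; only a causal account of why the series behaved as it did could have done so.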

Finally, just as Big Data enthusiasts claim that theory and hypotheses are not needed to guide data collection, so too do they insist that human interpretation or framing is no longer required for the processing and analysis of the data. Within this new paradigm, therefore, 'the data speaks for itself' [19], and specialised knowledge is not needed to interpret the results, which are now supposedly rendered comprehensible to anyone with even a rudimentary grasp of statistics. Furthermore, the results, we are told, are inherently meaningful, transcending culture, history or social context and providing pure, objective facts uninhibited by philosophical or ideological commitments.

Initially inherited from the natural sciences, this radical form of empiricism thus presupposes the existence of an objective social reality occupied by static and immutable entities whose properties are directly determinable through empirical investigation. In this way, Big Data reduces the role of social science to the perfunctory calculation and analysis of the mechanical processes of pre-formed subjects, in much the same way as one might calculate the movement of the planets or the interaction of balls on a billiard table. Whilst proponents of Big Data claim that such an approach allows them to produce objective knowledge by cleansing the data of any kind of philosophical or ideological commitment, it nevertheless has the effect of restricting both the scope and character of social scientific inquiry: it projects onto the field of social research meta-theoretical commitments that have long been implicit in the positivist method, whilst marginalising those projects which do not meet the required levels of scientificity or erudition.

This commitment to an empiricist epistemology and methodological monism is particularly problematic in the context of studies of human behaviour, where actions cannot be calculated and anticipated using quantifiable data alone. In such instances, a certain degree of qualitative analysis of social, historical and cultural variables may be required in order to make the data meaningful by embedding it within a broader body of knowledge. The abstract and intangible nature of these variables means that a great deal of expert knowledge and interpretive skill is required to comprehend them. It is therefore vital that the knowledge of domain-specific experts is properly utilized to help 'evaluate the inputs, guide the process, and evaluate the end products within the context of value and validity'[20].

Despite these criticisms, Big Data is, perhaps unsurprisingly, becoming increasingly popular within the business community, which is lured by the promise of cheap and actionable scientific knowledge capable of making operations more efficient, reducing overheads and producing better, more competitive services. Perhaps most alarming from the perspective of Big Data's epistemological and methodological implications, however, is the increasingly prominent role Big Data is playing in public policy-making. As I will now demonstrate, whilst Big Data can offer useful inputs into public policy-making processes, the assumptions implicit within Big Data methodologies pose a number of risks to the effectiveness, as well as the democratic legitimacy, of public policy-making. Following an examination of these risks I will argue for a more reflexive and critical approach to Big Data in the public sector.

Big Data and Policy-Making: Opportunities and Risks

In recent years Big Data has begun to play an increasingly important role in public policy-making. Across the globe, government-funded projects designed to harvest and utilise vast quantities of public data are being developed to help improve the efficiency and performance of public services, as well as to better inform policy-making processes. At first glance, Big Data would appear to be the holy grail for policy-makers, enabling truly evidence-based policy-making founded upon pure and objective facts, undistorted by political ideology or expedience. Furthermore, in an era of government debt and diminishing budgets, Big Data promises not only to produce more effective policy, but also to deliver on the seemingly impossible task of doing more with less: improving public services whilst simultaneously reducing expenditure.

In the Indian context, the government's recently announced 'Digital India' project promises to harness the power of public data to help modernise India's digital infrastructure and increase access to public services. The use of Big Data is seen as central to the project's success. However, despite the commendable aspirations of Digital India, many commentators remain sceptical about the extent to which Big Data can truly deliver on its promises of better, more efficient public services, whilst others have warned of the risk to public policy of an uncritical and hasty adoption of Big Data analytics [21]. Here I argue that the epistemological and methodological assumptions implicit within the discourse around Big Data threaten to undermine the goal of evidence-based policy-making, and in the process widen already substantial digital divides.

It has long been recognised that science and politics are deeply entwined. For many social scientists, the results of social research can never be entirely neutral, but are conditioned by the particular perspective of the researcher. As Sheila Jasanoff observed, 'Most thoughtful advisers have rejected the facile notion that giving scientific advice is simply a matter of speaking truth to power. It is well recognized that in thorny areas of public policy, where certain knowledge is difficult to come by, science advisers can offer at best educated guesses and reasoned judgments, not unvarnished truth' [22]. Nevertheless, 'unvarnished truth' is precisely what Big Data enthusiasts claim to be able to provide. For them, the capacity of Big Data to derive results and insights directly from the data, without any need for human framing, allows policy-makers to incorporate scientific knowledge directly into their decision-making processes without worrying about the 'philosophical baggage' usually associated with social scientific research.

However, in order to be meaningful, all data requires a certain level of interpretative framing. As such, far from cleansing science of politics, Big Data simply acts to shift responsibility for the interpretation and contextualisation of results away from domain experts, who possess the requisite knowledge to make informed judgements regarding the significance of correlations, towards bureaucrats and policy-makers, who are more likely to emphasise those results and correlations which support their own political agenda. Thus, whilst the discourse around Big Data may promote the notion of evidence-based policy-making, in reality the vast quantities of correlations generated by Big Data analytics act simply to broaden the range of 'evidence' from which politicians can choose to support their arguments, giving new meaning to Mark Twain's witticism that there are 'lies, damned lies, and statistics'.
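
This broadened range of 'evidence' has a simple statistical root: screen enough unrelated variables against one another and some pairs will correlate strongly by chance alone. The sketch below, run on purely random data, illustrates the multiple-comparisons effect; all figures and thresholds are hypothetical (statistics.correlation requires Python 3.10+).

```python
import random
import statistics

random.seed(1)

# Hypothetical mining run: 200 indicators of pure noise, 30 observations each.
data = [[random.gauss(0, 1) for _ in range(30)] for _ in range(200)]

# Screen every pair of indicators for "relationships", exactly as an
# atheoretical Big Data pass would.
strong = [
    (i, j, r)
    for i in range(len(data))
    for j in range(i + 1, len(data))
    if abs(r := statistics.correlation(data[i], data[j])) > 0.5
]

# With 30 observations, a noise pair clears |r| > 0.5 well under 1% of the
# time -- but 19,900 pairs are tested, so dozens of "findings" emerge.
print(f"{len(strong)} spurious correlations among {200 * 199 // 2} pairs")
```

None of these correlations reflects any real relationship, yet each arrives dressed as a quantitative 'finding'; without domain knowledge to filter them, they stand ready to support whichever argument a decision-maker prefers.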

Similarly, for many, an over-reliance on Big Data analytics for policy-making risks producing public policy which is blind to the unquantifiable and intangible. As discussed above, Big Data's neglect of theory and contextual knowledge in favour of strict empiricism marginalises qualitative studies which emphasise the importance of traditional social scientific categories such as race, gender and religion, in favour of a purely quantitative analysis of relational data. For many, however, consideration of issues such as gender, race and religious sensitivity can be just as important to good public policy-making as quantitative data, helping to contextualise the insights revealed in the data and provide more explanatory accounts of social relations. They warn that neglect of such considerations as part of policy-making processes can have significant implications for the quality of the policies produced[23]. Firstly, although Big Data can provide unrivalled accounts of "what" people do, without a broader understanding of the social context in which they act it fundamentally fails to deliver robust explanations of "why" people do it. This problem is especially acute in the case of public policy-making, since without any indication of the motivations of individuals, policy-makers can have no basis upon which to intervene to incentivise more positive outcomes. Secondly, whilst Big Data analytics can help decision-makers to design more cost-effective policy, by for example ensuring better use of scarce resources, efficiency and cost-effectiveness are not the only metrics by which good policy can be judged. Public policy, regardless of the sector, must consider and balance a broad range of issues during the policy process, including matters such as race, gender and community relations. Normative and qualitative considerations of this kind are not amenable to a simplistic 1-0 quantification, but instead require a great deal of contextual knowledge and insight to navigate successfully.

Finally, to the extent that policy-makers are today attempting to harvest and utilise individual citizens' personal data as direct inputs to the policy-making process, Big Data-driven policy can, in a very narrow sense, be considered to offer a rudimentary form of direct democracy. At first glance this would appear to help democratise political participation, allowing public services to become automatically optimised to better meet the needs and preferences of citizens without the need for direct political participation. In societies such as India, however, where there are high levels of inequality in access to information and communication technologies, there remain large discrepancies in the quantities of data produced by individuals. In a Big Data world in which every byte of data is collected, analysed and interpreted in order to make important decisions about public services, those who produce the greatest amounts of data are best placed to have their voices heard, whilst those who lack access to the means to produce data risk becoming disenfranchised, as policy-making processes become configured to accommodate the needs and interests of a privileged minority. Similarly, using user-generated data as the basis for policy decisions also leaves systems vulnerable to coercive manipulation: once it has become apparent that a system has been automated on the basis of user inputs, groups or individuals may change their behaviour in order to achieve a certain outcome. Given these problems, it is essential that in seeking to utilise new data resources for policy-making we avoid an uncritical adoption of Big Data techniques, and instead, as I argue below, encourage a more balanced and nuanced approach to Big Data.

Data-Driven Science: A More Nuanced Approach?

Although an uncritical embrace of Big Data analytics is clearly problematic, it is not immediately obvious that a stubborn commitment to traditional knowledge-driven deductive methodologies would necessarily be preferable. Whilst deductive methods have formed the basis of scientific inquiry for centuries, the particular utility of this approach is largely derived from its ability to produce accurate and reliable results in situations where the quantities of data available are limited. In an era of ubiquitous data collection however, an unwillingness to embrace new methodologies and forms of analysis which maximise the potential value of the volumes of data available would seem unwise.

For Kitchin and others, however, it is possible to reap the benefits of Big Data without compromising scientific rigour or the pursuit of causal explanations. Challenging the 'either/or' propositions which favour either scientific modelling and hypothesis or data correlations, Kitchin instead proposes a hybrid approach which utilises the combined advantages of inductive, deductive and so-called 'abductive' reasoning to develop theories and hypotheses directly from the data[24]. As Patrick W. Gross commented, 'In practice, the theory and the data reinforce each other. It's not a question of data correlations versus theory. The use of data for correlations allows one to test theories and refine them' [25].

Like the radical empiricism of Big Data, 'data-driven science', as Kitchin terms it, introduces an element of inductivism into the research design, seeking to develop hypotheses and insights 'born from the data' rather than 'born from theory'. Unlike the empiricist approach, however, the identification of patterns and correlations is not considered the ultimate goal of the research process. Instead, these correlations simply form the basis for new types of hypothesis generation, before more traditional deductive testing is used to assess the validity of the results. Put simply, therefore, rather than interpreting the data deluge as the 'end of theory', data-driven science instead attempts to harness its insights to develop new theories using alternative, data-intensive methods of theory generation.
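
A minimal sketch of this two-stage workflow is given below, under entirely hypothetical data and thresholds (again using statistics.correlation, Python 3.10+): patterns are first screened inductively on one portion of the data and promoted to candidate hypotheses, which are then tested deductively against data held out from the screening step.

```python
import random
import statistics

random.seed(42)

# Hypothetical dataset: 50 candidate indicators plus one outcome, where
# only indicator 0 genuinely drives the outcome.
n = 400
features = [[random.gauss(0, 1) for _ in range(n)] for _ in range(50)]
outcome = [0.8 * features[0][i] + random.gauss(0, 1) for i in range(n)]

half = n // 2  # first half screens for patterns, second half tests them

# Stage 1 (inductive): mine the screening half for candidate correlations.
candidates = [
    k for k in range(50)
    if abs(statistics.correlation(features[k][:half], outcome[:half])) > 0.15
]

# Stage 2 (deductive): each candidate is now a hypothesis, confronted with
# data it has never seen; chance correlations should fail to replicate.
confirmed = [
    k for k in candidates
    if abs(statistics.correlation(features[k][half:], outcome[half:])) > 0.15
]
print("screened:", candidates, "-> confirmed:", confirmed)
```

The genuine relationship survives both stages, while patterns thrown up by chance in the screening pass are discarded; in Kitchin's terms, the data suggests the hypotheses, but it does not get the final word on them.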

Furthermore, unlike the new empiricism, data is not collected indiscriminately from every available source in the hope that the sheer size of the dataset will unveil some hidden pattern or insight. Instead, in keeping with more conventional scientific methods, various sampling techniques are utilised, 'underpinned by theoretical and practical knowledge and experience as to whether technologies and their configurations will capture or produce appropriate and useful research material'[26]. Similarly, analysis of the data once collected does not take place within a theoretical vacuum, nor are all relationships deemed to be inherently meaningful; instead, existing theoretical frameworks and domain-specific knowledge are used to help contextualise and refine the results, identifying those patterns that can be dismissed as well as those that require closer attention.

Thus, for many, data-driven science provides a more nuanced approach to Big Data, allowing researchers to harness the power of new sources of data whilst also maintaining the pursuit of explanatory knowledge. In doing so, it can help to avoid the risks of an uncritical adoption of Big Data analytics for policy-making, providing new insights but also retaining the 'regulating force of philosophy'.

Conclusion

Since the publication of The Structure of Scientific Revolutions, Kuhn's notion of the paradigm has been widely criticised for producing a homogenous and overly smooth account of scientific progress, one which ignores the clunky and often accidental nature of scientific discovery and innovation. Indeed, the notion of the 'paradigm shift' is in many ways typical of a self-indulgent and somewhat egotistical tendency amongst many historians and theorists to interpret events contemporaneous to themselves as being of great historical significance. Historians throughout the ages have perceived themselves as living through periods of great upheaval and transition. In actual fact, as many have noted, history, and the history of science in particular, rarely advances in a linear or predictable way, nor can progress, when it does occur, be so easily attributed to specific technological innovations or theoretical developments. As such, we should remain very sceptical of claims that Big Data represents a historic and paradigmatic shift in scientific practice. Such claims exhibit more than a hint of technological determinism and often ignore the substantial limitations of Big Data analytics. In contrast to these claims, it is important to note that technological advances alone do not drive scientific revolutions; the impact of Big Data will ultimately depend on how we decide to use it, as well as the types of questions we ask of it.

Big Data holds the potential to augment and support existing scientific practices, creating new insights and helping to better inform public policy-making processes. However, contrary to the hyperbole surrounding its development, Big Data does not represent a silver bullet for intractable social problems; if adopted uncritically and without consideration of its consequences, Big Data risks not only diminishing scientific knowledge but also jeopardising our privacy and creating new digital divides. It is critical, therefore, that we see through the hyperbole and headlines to reflect critically on the epistemological consequences of Big Data, as well as its implications for policy-making: a task which, unfortunately, in spite of the pace of technological change, is only just beginning.

Bibliography

Anderson C (2008) The end of theory: The data deluge makes the scientific method obsolete. Wired, 23 June 2008. Available at: http://www.wired.com/science/discoveries/magazine/16-07/pb_theory (accessed 31 October 2015).

Bollier D (2010) The Promise and Peril of Big Data. The Aspen Institute. Available at: http://www.aspeninstitute.org/sites/default/files/content/docs/pubs/The_Promise_and_Peril_of_Big_Data.pdf (accessed 19 October 2015).

Bowker, G., (2014) The Theory-Data Thing, International Journal of Communication 8, 1795-1799

Boyd D and Crawford K (2012) Critical questions for big data. Information, Communication and Society 15(5): 662-679.

Cukier K (2010) Data, data everywhere. The Economist, 25 February (accessed 5 November 2015).

Department of Electronics and Information Technology (2015) Digital India, [ONLINE] Available at: http://www.digitalindia.gov.in/. [Accessed 13 December 2015].

Dyche J (2012) Big data 'Eurekas!' don't just happen, Harvard Business Review Blog. 20 November. Available at: http://blogs.hbr.org/cs/2012/11/eureka_doesnt_just_happen.html

Hey, T., Tansley, S., and Tolle, K (eds)., (2009) The Fourth Paradigm: Data-Intensive Scientific Discovery, Redmond: Microsoft Research, pp. xvii-xxxi.

Hilbert, M. Big Data for Development: From Information- to Knowledge Societies (2013). Available at SSRN: http://ssrn.com/abstract=2205145

Hume, D., (1748), Philosophical Essays Concerning Human Understanding (1 ed.). London: A. Millar.

Jasanoff, S., (2013) Watching the Watchers: Lessons from the Science of Science Advice, Guardian 8 April 2013, available at: http://www.theguardian.com/science/political-science/2013/apr/08/lessons-science-advice

Joh, E (2014) 'Policing by Numbers: Big Data and the Fourth Amendment', Washington Law Review, Vol. 89: 35, https://digital.law.washington.edu/dspace-law/bitstream/handle/1773.1/1319/89WLR0035.pdf?sequence=1

Kitchin, R (2014) Big Data, new epistemologies and paradigm shifts, Big Data & Society, April-June 2014: 1-12

Kuhn T (1962) The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

Mayer-Schonberger V and Cukier K (2013) Big Data: A Revolution that Will Change How We Live, Work and Think. London: John Murray

McCue, C., Data Mining and Predictive Analysis: Intelligence Gathering and Crime Analysis, Butterworth-Heinemann, (2014)

Morris, D. Big data could improve supply chain efficiency-if companies would let it, Fortune, August 5 2015, http://fortune.com/2015/08/05/big-data-supply-chain/

Prensky M (2009) H. sapiens digital: From digital immigrants and digital natives to digital wisdom. Innovate 5(3), Available at: http://www.innovateonline.info/index.php?view=article&id=705

Raghupathi, W., & Raghupathi, V. Big data analytics in healthcare: promise and potential. Health Information Science and Systems, (2014)

Shaw, J., (2014) Why Big Data is a Big Deal, Harvard Magazine March-April 2014, available at: http://harvardmagazine.com/2014/03/why-big-data-is-a-big-deal



[1] Anderson, C (2008) "The End of Theory: The Data Deluge Makes the Scientific Method Obsolete", WIRED, June 23 2008, www.wired.com/2008/06/pb-theory/

[2] Ibid.

[3] Department of Electronics and Information Technology (2015) Digital India, [ONLINE] Available at: http://www.digitalindia.gov.in/. [Accessed 13 December 2015].

[4] Boyd D and Crawford K (2012) Critical questions for big data. Information, Communication and Society 15(5): 662-679; Kitchin, R (2014) Big Data, new epistemologies and paradigm shifts, Big Data & Society, April-June 2014: 1-12

[5] Kuhn T (1962) The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

[6] Ibid.

[7] Hey, T., Tansley, S., and Tolle, K (eds)., (2009) The Fourth Paradigm: Data-Intensive Scientific Discovery, Redmond: Microsoft Research, pp. xvii-xxxi.

[8] Ibid.

[9] Dyche J (2012) Big data 'Eurekas!' don't just happen, Harvard Business Review Blog. 20 November. Available at: http://blogs.hbr.org/cs/2012/11/eureka_doesnt_just_happen.html

[10] Ibid.

[11] Joh, E (2014) 'Policing by Numbers: Big Data and the Fourth Amendment', Washington Law Review, Vol. 89: 35, https://digital.law.washington.edu/dspace-law/bitstream/handle/1773.1/1319/89WLR0035.pdf?sequence=1

[12] Mayer-Schonberger V and Cukier K (2013) Big Data: A Revolution that Will Change How We Live, Work and Think. London: John Murray

[13] King quoted in Shaw, J., (2014) Why Big Data is a Big Deal, Harvard Magazine March-April 2014, available at: http://harvardmagazine.com/2014/03/why-big-data-is-a-big-deal

[14] Boyd D and Crawford K (2012) Critical questions for big data. Information, Communication and Society 15(5): 662-679.

[15] Joh, E (2014) 'Policing by Numbers: Big Data and the Fourth Amendment', Washington Law Review, Vol. 89: 35, https://digital.law.washington.edu/dspace-law/bitstream/handle/1773.1/1319/89WLR0035.pdf?sequence=1; Raghupathi, W., & Raghupathi, V. (2014) Big data analytics in healthcare: promise and potential. Health Information Science and Systems; Morris, D. Big data could improve supply chain efficiency-if companies would let it, Fortune, August 5 2015, http://fortune.com/2015/08/05/big-data-supply-chain/; Hilbert, M. Big Data for Development: From Information- to Knowledge Societies (2013). Available at SSRN: http://ssrn.com/abstract=2205145

[16] Boyd D and Crawford K (2012) Critical questions for big data. Information, Communication and Society 15(5): 662-679; Kitchin, R (2014) Big Data, new epistemologies and paradigm shifts, Big Data & Society, April-June 2014: 1-12

[17] Prensky M (2009) H. sapiens digital: From digital immigrants and digital natives to digital wisdom. Innovate 5(3), Available at: http://www.innovateonline.info/index.php?view=article&id=705

[18] Hume, D., (1748), Philosophical Essays Concerning Human Understanding (1 ed.). London: A. Millar.

[19] Mayer-Schonberger V and Cukier K (2013) Big Data: A Revolution that Will Change How We Live, Work and Think. London: John Murray

[20] McCue, C., Data Mining and Predictive Analysis: Intelligence Gathering and Crime Analysis, Butterworth-Heinemann, (2014)

[21] Kitchin, R (2014) Big Data, new epistemologies and paradigm shifts, Big Data & Society, April-June 2014: 1-12

[22] Jasanoff, S., (2013) Watching the Watchers: Lessons from the Science of Science Advice, Guardian 8 April 2013, available at: http://www.theguardian.com/science/political-science/2013/apr/08/lessons-science-advice

[23] Bowker, G., (2014) The Theory-Data Thing, International Journal of Communication 8, 1795-1799

[24] Kitchin, R (2014) Big Data, new epistemologies and paradigm shifts, Big Data & Society, April-June 2014: 1-12

[25] Gross quoted in ibid.

[26] Ibid.
