
In his much-acclaimed book, The Filter Bubble, Eli Pariser explains how personalisation of services on the web works and laments that it creates individual bubbles for each user, which run counter to the idea of the Internet as an inherently open place. While Pariser's book looks at the practices of various large companies providing online services, he briefly touches upon the role of new media such as search engines and social media portals in news curation. Building upon Pariser's unexplored argument, this article looks at the impact of algorithmic decision-making and Big Data in the context of news reporting and curation.


Everything which bars freedom and fullness of communication sets up barriers that divide human beings into sets and cliques, into antagonistic sects and factions, and thereby undermines the democratic way of life. —John Dewey

Eli Pariser, in his book The Filter Bubble,[1] refers to the scholarship of Walter Lippmann and John Dewey as integral to the evolution of our understanding of the democratic and ethical duties of the Fourth Estate. Lippmann was disillusioned by the role of newspapers in propaganda for the First World War. He responded with three books in quick succession — Liberty and the News,[2] Public Opinion[3] and The Phantom Public.[4] Lippmann brought attention to the fact that the process of news reporting was conducted according to privately determined and unexamined standards. The failure of the Fourth Estate to perform its democratic functions was, in Lippmann's opinion, one of the prime factors responsible for the public not being an informed and rational entity. John Dewey, while rejecting Lippmann's argument that matters of public policy can only be determined by inside experts with training and education, did acknowledge his critique of the media.

Pariser points to the creation of a wall between editorial decision-making and advertiser interests as the eventual result of the Lippmann and Dewey debate. While accepting that this division between the financial and reporting sides of media houses has not always been observed, Pariser emphasises that the fact that the standard exists is important.[5] Unlike traditional media, the new media that relies on algorithmic decision-making for personalisation is not subject to the same standards which try to mitigate the influence of commercial interests on editorial decisions, even though it performs many of the same functions as the traditional media.[6]

How personalisation algorithms work

Kevin Slavin, in his famous talk at the TEDGlobal conference, characterised algorithms as "maths that computers use to decide stuff" and observed that they are infiltrating every aspect of our lives.[7] In Slavin's view, algorithms can be seen as control technologies that constantly shape our world through media and information systems, dynamically modifying content and function through programmed routines. Search engines and social media platforms perpetually rank user-generated content through algorithms.[8]

Personalisation technologies have various advantages. They translate into more relevant content, which for service providers means more clicks and revenue, and for consumers, less time spent finding content.[9] However, they also lead to compromised privacy, lack of control and reduced individual capability.[10] Search engines like Google use the famous PageRank algorithm, which, combined with geographical location and previous searches, yields the most relevant search results.[11] The ranking process uses various real-time variables dependent on both voluntary and involuntary user inputs, including the number of clicks, the number of occurrences of key terms and the number of references by other credible pages. This data in turn determines the order of pages in search results and influences the way we perceive, understand and analyse information.[12] Maps showing real-time traffic information retrieve data from laser and infrared sensors alongside the road and from users' devices. Once this real-time data is combined with historical trends, these maps recommend routes to every user, thereby influencing traffic patterns.[13]
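
To make this intuition concrete, here is a minimal, purely illustrative sketch of the core PageRank idea, namely that pages referenced by other well-referenced pages score higher, using the textbook power-iteration formulation. The link graph, damping factor and scores below are invented for the example; Google's actual ranking layers many more signals (clicks, location, query history, freshness) on top of this.

```python
# A minimal sketch of the PageRank intuition: pages referenced by other
# well-referenced pages score higher. This is the textbook power-iteration
# formulation, not Google's production ranking, which combines many
# additional signals (clicks, location, search history, etc.).

# Hypothetical link graph: page -> pages it links to.
links = {
    "news-site": ["blog", "aggregator"],
    "blog": ["news-site"],
    "aggregator": ["news-site", "blog"],
    "obscure-page": ["news-site"],
}

def pagerank(links, damping=0.85, iterations=50):
    pages = list(links)
    n = len(pages)
    ranks = {p: 1.0 / n for p in pages}  # start with a uniform score
    for _ in range(iterations):
        new_ranks = {}
        for page in pages:
            # Sum the rank "votes" flowing in from pages that link here.
            inbound = sum(
                ranks[src] / len(targets)
                for src, targets in links.items()
                if page in targets
            )
            # Damping models a reader who occasionally jumps to a random page.
            new_ranks[page] = (1 - damping) / n + damping * inbound
        ranks = new_ranks
    return ranks

if __name__ == "__main__":
    for page, score in sorted(pagerank(links).items(), key=lambda x: -x[1]):
        print(f"{page:15s} {score:.3f}")
```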

Even though this phenomenon of personalisation may appear to be new, it has been prevalent in society for ages.[14] The history of mass media culture clearly shows that personalisation has always been a method to increase market reach and customer satisfaction.[15] Newspapers have sections dedicated to special topics; radio and TV have channels dedicated to different interest groups, age groups and consumers.[16] These personalised sections in a newspaper and personalised channels on radio and television do not just provide greater satisfaction to readers, listeners and viewers; they also provide targeted advertising space for advertisers and content developers. However, digital footprints and the mass collection of data have made this phenomenon much more granular and detailed. An individual's geographical location can tell a lot about their community, their culture and other important traits local to that community.[17] This data further assists in personalisation. Current developments in technology not only help in better collection of data about personal preferences but also help in better personalisation.

Pariser mentions three ways in which the personalisation technologies of this day differ from those of the past. First, for the very first time, individuals are alone in the filter bubble. While in traditional forms of personalisation there were various individuals who shared the same frame of reference, now there is a separate set of filters governing the dissemination of content to each individual.[18] Second, the personalisation technologies are now entirely invisible, and there is little that consumers can do to control or modify them.[19] Third, the decision to be subject to these personalisation technologies is often not an informed choice. A good example of this is an individual's geographical location.[20]

The neutrality of New Media?

More and more, we have noticed personalisation technologies having an impact on how we consume news on the Internet. Google News, Facebook's News Feed, which tries to put together a dynamic feed of both personal and global stories, and Twitter's trending hashtags have established these services as key drivers of an emerging news ecosystem. Initially, this new media was hailed as a natural consequence of the Internet which would enable greater public participation, allow journalists to find more stories and let them engage with readers directly. An illustration of this could be seen in the way Internet-based news media and social networking websites behaved in the aftermath of Israel's attack on a United Nations-run school in the Gaza Strip. While much of the international Internet media covered the story, Israel's domestic media did not, the only exception being the liberal Israeli news website Ha'aretz.[21] Network graphs of Twitter activity for a few days immediately after the incident clearly show the social media manifestation of the event in personalised cyberspace. They make it clearly visible that while most of the world was re-tweeting news of this heinous act, Israelis hardly re-tweeted it; they were instead busy re-tweeting news of rocket attacks on Israel.[22]

The use of social media in newsmaking was hailed by many scholars as symptomatic of the decentralisation characteristic of the Internet. It has been seen as a movement towards greater grassroots participation, negating the 'gatekeeping' role traditionally played by editors. Thomas Poell and José van Dijck punch holes in the theory that social media and other online technologies are mere facilitators of user participation and translators of user preferences through Big Data analytics.[23] They cite T. Gillespie's work on the narrative of these online services as platforms offering "open, neutral, egalitarian and progressive support for activity."[24]

Pedro Domingos calls the overwhelming number of choices the defining problem of the information age, and machine learning and data analytics the largest part of its solution.[25] The primary function of algorithmic decision-making in the context of content consumption is to narrow down those choices. Domingos is more optimistic about the impact of these technologies; as he puts it, "last step of the decision is usually still for humans to make, but learners intelligently reduce the choices to something a human can manage."[26] Pariser, on the other hand, is more circumspect about the coercive effect of machine learning algorithms. Whichever way we lean, we have to accept that a large part of what personalisation algorithms do is to select and prioritise content by categorising it on the basis of relevance and popularity.
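
As a hedged illustration of this narrowing down, the short sketch below scores each story by a weighted mix of relevance (overlap with a user's stated interests) and popularity (clicks), and surfaces only the top results. The stories, weights and interest profile are all hypothetical; real personalisation systems draw on far richer signals and learned models.

```python
# Illustrative sketch of relevance-plus-popularity ranking: the "learner"
# reduces a large pool of stories to the few a reader actually sees.
# All data, weights and the interest profile below are invented.

stories = [
    {"title": "Ceasefire talks resume",   "topics": {"politics", "world"},   "clicks": 1200},
    {"title": "Local team wins derby",    "topics": {"sports"},              "clicks": 5400},
    {"title": "New budget analysis",      "topics": {"politics", "economy"}, "clicks": 300},
    {"title": "Celebrity gossip roundup", "topics": {"entertainment"},       "clicks": 9800},
]

user_interests = {"politics", "economy"}

def score(story, interests, w_relevance=0.7, w_popularity=0.3, max_clicks=10000):
    # Relevance: share of the story's topics that match the user's interests.
    relevance = len(story["topics"] & interests) / len(story["topics"])
    # Popularity: normalised click count.
    popularity = story["clicks"] / max_clicks
    return w_relevance * relevance + w_popularity * popularity

# Surface only the top two stories for this (hypothetical) user.
top = sorted(stories, key=lambda s: score(s, user_interests), reverse=True)[:2]
for s in top:
    print(s["title"])
```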

Poell and van Dijck call this a new knowledge logic which in effect replaces human judgement (as earlier exercised by editors) with a kind of proxy decision-making based on data. Their main thesis is that there is little evidence to suggest that the latter is more democratic than the former, and that it creates new problems of its own. They go on to compare the practices of various services, including Facebook's News Feed and Twitter's trending topics, and conclude that these prioritise breaking news stories over other kinds of content.[27] For instance, the trending topics algorithm depends not on the volume but on the velocity of tweets carrying a hashtag or term. It could be argued that, given this predilection, such algorithms will rarely prefer complex content. If we go by Lippmann and Dewey's idea that the role of the Fourth Estate is to inform public debate and hold those in positions of power accountable, this aspect of Big Data algorithms does not sit well with that role.
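
To illustrate the volume-versus-velocity distinction, the following sketch flags a topic as "trending" only when its tweet rate accelerates sharply relative to its recent baseline, even if its overall volume stays modest. The counts, window size and spike threshold are invented for the example; Twitter's actual algorithm is proprietary and far more elaborate.

```python
# Illustrative sketch of a velocity-based trending signal: a topic "trends"
# when its tweet rate jumps sharply relative to its recent baseline, even if
# its total volume stays modest. Counts and threshold are invented; Twitter's
# real algorithm is proprietary.

# Hypothetical tweet counts per hashtag over successive 10-minute windows.
tweet_counts = {
    "#SteadyPolicyDebate": [900, 905, 910, 907, 912, 915],  # high volume, flat rate
    "#BreakingIncident":   [5, 8, 12, 60, 240, 800],        # low volume, sharp spike
}

def is_trending(counts, baseline_windows=3, spike_factor=4.0):
    """Trend if the latest window is several times the recent average rate."""
    baseline = sum(counts[-baseline_windows - 1:-1]) / baseline_windows
    latest = counts[-1]
    return latest >= spike_factor * max(baseline, 1)

for tag, counts in tweet_counts.items():
    label = "TRENDING" if is_trending(counts) else "not trending"
    print(f"{tag:22s} total={sum(counts):5d}  {label}")
```

The steady, high-volume topic never trends, while the low-volume but rapidly accelerating one does, which is the predilection towards breaking news described above.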

Quantified Audience

Another aspect of the use of Big Data and algorithms in New Media that requires attention is that the networked infrastructure enables a quantified audience. C.W. Anderson, who has studied newsroom practices in the US, looked at the role played by audience quantification and rationalisation in shifting newswork practices. He concluded that journalists are becoming less autonomous in their news decisions and increasingly reliant on audience metrics as a supplement to news judgment.[28] Poell and van Dijck review the practices of some leading publications such as the New York Times, the L.A. Times and the Huffington Post, and the degree to which audience metrics dictate editorial decisions. While the New York Times seems to prioritise content on its social media portals based on expected spikes in user traffic, the L.A. Times goes one step further by developing content specifically aimed at promoting greater social participation. Neither of these practices, though, compares to the reliance on SEO and SMO strategies of web-born news providers like the Huffington Post, which employs traffic editors who trawl the Internet for trending topics and popular search terms, and whose feedback dictates content creation.[29]

Conclusion

The above factors demonstrate that the idea of New Media enabling the Fourth Estate to perform its democratic functions does not take into account actual practices. This idea is based on the erroneous assumption that technology in general, and algorithms in particular, are neutral. While the emergence of New Media might have reduced the gatekeeping role played by editors, its strong prioritisation of content that will be popular reduces the validity of arguments that it leads to more informed public discussion. As Pariser notes, traditional media scores over New Media inasmuch as a standard of division between editorial decision-making and advertiser interests exists. While this standard is flouted by media houses all the time, it remains a metric to aspire to and to measure service providers against. The New Media performs many of the same functions, and maybe it is time to evolve principles and ethical standards that take into account the need for it to perform these democratic functions.

Endnotes 

[1] Eli Pariser, The Filter Bubble: What the Internet is hiding from you (The Penguin Press, New York, 2011) 

[2] Walter Lippmann, Liberty and the News (Harcourt, Brace and Howe, New York, 1920) available at https://archive.org/details/libertyandnews01lippgoog

[3] Walter Lippmann, Public Opinion (Harcourt, Brace and Company, New York, 1922) available at http://xroads.virginia.edu/~Hyper2/CDFinal/Lippman/cover.html

[4] Walter Lippmann, The Phantom Public (Transaction Publishers, New York, 1925)

[5] Supra Note 1 at 35.

[6] Supra Note 1 at 36.

[7] Kevin Slavin, "How Algorithms Shape Our World", talk at the TEDGlobal conference, transcript available at https://www.ted.com/talks/kevin_slavin_how_algorithms_shape_our_world/transcript?language=en

[8] Fenwick McKelvey, “Algorithmic Media Need Democratic Methods: Why Publics Matter”, available at http://www.fenwickmckelvey.com/wp-content/uploads/2014/11/2746-9231-1-PB.pdf.

[9] http://mashable.com/2011/06/03/filters-eli-pariser/#9tIHrpa_9Eq1

[10] Helen Ashman, Tim Brailsford, Alexandra Cristea, Quan Z Sheng, Craig Stewart, Elaine Torns and Vincent Wade, “The ethical and social implications of personalization technologies for e-learning” available at http://www.sciencedirect.com/science/article/pii/S0378720614000524.

[11] Sergey Brin and Lawrence Page, “The Anatomy of a Large-Scale Hypertextual Web Search Engine” available at http://infolab.stanford.edu/pub/papers/google.pdf.

[12] Ian Rogers, “The Google Pagerank Algorithm and How It Works” available at http://www.cs.princeton.edu/~chazelle/courses/BIB/pagerank.htm.

[13] Trygve Olson and Terry Nelson, “The Internet’s Impact on Political Parties and Campaigns”, available at http://www.kas.de/wf/doc/kas_19706-544-2-30.pdf?100526130942.

[14] Ian Witten, "Bias, privacy and personalisation on the web", available at http://www.cs.waikato.ac.nz/~ihw/papers/07-IHW-Bias,privacyonweb.pdf.

[15] Supra Note 1 at 10.

[16] https://www.americanpressinstitute.org/publications/reports/survey-research/social-demographic-differences-news-habits-attitudes/

[17] Charles Heatwole, “Culture: A Geographical Perspective” available at http://www.p12.nysed.gov/ciai/socst/grade3/geograph.html.

[18] Supra Note 1 at 10.

[19] Id.

[20] Supra Note 1 at 11.

[21] Paul Mason, “Why Israel is losing the social media war over Gaza?” available at http://blogs.channel4.com/paul-mason-blog/impact-social-media-israelgaza-conflict/1182.

[22] Gilad Lotan, "Israel, Gaza, War & Data: Social Networks and the Art of Personalizing Propaganda", available at www.huffingtonpost.com/entry/israel-gaza-war-social-networks-data_b_5658557.html

[23] Thomas Poell and José van Dijck, “Social Media and Journalistic Independence” in Media Independence: Working with Freedom or Working for Free?, edited by James Bennett & Niki Strange. (Routledge, London, 2015)

[24] T. Gillespie, "The Politics of 'Platforms'", New Media & Society (Volume 12, Issue 3).

[25] Pedro Domingos, The Master Algorithm: How the quest for the ultimate learning machine will re-make the world (Basic Books, New York, 2015) at 38.

[26] Ibid at 40.

[27] Supra Note 23.

[28] C.W. Anderson, "Between Creative and Quantified Audiences: Web Metrics and Changing Patterns of Newswork in Local US Newsrooms", available at https://www.academia.edu/10937194/Between_Creative_And_Quantified_Audiences_Web_Metrics_and_Changing_Patterns_of_Newswork_in_Local_U.S._Newsrooms

[29] Supra Note 23.
