Centre for Internet & Society

This is a short study on the nature of content creation related to Punjab on Eastern Punjabi Wikipedia, its challenges and opportunities, and observations and potential strategies to address the same. The report has been authored by Satpal Singh, with editorial oversight and support by Puthiya Purayil Sneha, and external review by Sumandro Chattapadhyay. This is part of a series of short-term studies undertaken by the CIS-A2K team in 2019–2020.


Introduction 

The objective of this study is to understand the challenges of creating content related to Punjab on Eastern Punjabi Wikipedia. There are articles about Punjabi language and culture on Punjabi Wikipedia, but there is a need for a better understanding of the nature of this content from the perspective of readers’ interests, coverage of topics, and quality. A large community of interested Punjabi Wikimedians has been actively working over several years to introduce Wikimedia and related projects to people across the world, including those from their own community. An important part of achieving this goal is to contribute to and build more diverse and better quality content about Punjab on Wikipedia. This short study is therefore an attempt to analyse the nature of existing content, challenges in content creation/curation and outreach, and some observations and strategies to address the same.


Eastern Punjabi Wikipedia

Wikipedia is a multilingual online encyclopedia, available in more than 290 languages. There are two Punjabi Wikipedia editions: Eastern Punjabi Wikipedia, in Gurmukhi script, and Western Punjabi Wikipedia, in Shahmukhi script. This study focuses on the Eastern Punjabi Wikipedia. The Eastern edition domain came into existence on 3 June 2002, but the first three articles were only written in August 2004. Not much contribution was made during the next six years. By July 2012, it had reached 2,400 articles. Then a group of people, largely students from Punjabi University, Patiala, started contributing actively, as a result of which it became an active Wikipedia in 2013 and has stayed so to date. There are currently 35,351 articles on the Gurmukhi Punjabi Wikipedia and 36,348 registered users. [1]

One group that has been proactively involved in Punjabi Wikipedia for a long time is the Punjabi Wikimedians User Group. Apart from this, a number of people from different parts of the world also contribute to Wikipedia. Punjabi Wikimedians received affiliation as a User Group from the Wikimedia Foundation in November 2015. It was the first affiliated user group from India, and has been involved in several activities and initiatives for content creation. The group organized WikiConference India in 2016 at Chandigarh, and its members have participated in various events and conferences. It has also collaborated with other institutions to encourage content creation on Punjabi Wikipedia; one example is the collaboration with Punjabi Sahitya Academy. Apart from Wikipedia, this user group is also active on Punjabi Wikisource, Punjabi Wiktionary, Wikidata and Wikimedia Commons. The first meeting of the Punjabi Wiki community was organized in Patiala on 1 February 2015. After that, the community conducted monthly meetups in different parts of Punjab. People from the community also joined various training programs and events in different parts of India and participated in conferences in other countries.


Research Objectives and Method

This study analyses various aspects of how content related to Punjab is created on Eastern Punjabi Wikipedia. This analysis would help in understanding the gap between the content that presently exists and the content that is needed, from the perspective of Punjabi language contributors and users. The objective is to understand how much content related to Punjab exists on this Wikipedia at present, what the nature of this content is, what the challenges for content creation are, and what strategies could address them. There is a broader understanding that while content is being created proactively, there is still a need to analyse its quality and any prevalent gaps, which would encourage more contributors and readers to actively engage with Punjabi Wikipedia.

The method for this study consisted of an analysis of the existing content on Eastern Punjabi Wikipedia, and conversations with selected Wikimedians on their assessment of content on topics related to Punjab. The main topics for this study were articles related to the Punjab region, including culture, literature, and politics. To understand whether there are specific challenges to the creation of content on these topics, interviews were conducted with a few selected long-time contributors and administrators, with an emphasis on aspects such as sourcing Punjabi language material, finding references, digitization, and tagging.

The objective of this study was also to understand where these conversations (undertaken as part of the study) may offer strategies to address knowledge gaps in specific areas of work. Questions were prepared and a total of five interviews with Punjabi Wiki community members were conducted. The people interviewed were chosen on the basis of their involvement and experience of working in the community.

 

Observations and Analysis 

There are an estimated 33 million Eastern Punjabi speakers in the world,[2] and it is a widely spoken language in India, especially in Punjab state. Over 70% of people in Punjab access the internet on the phone.[3] The main objective of this study was to understand the nature of existing content on Punjabi Wikipedia, and various challenges in content creation, coverage of topics and quality. The following were some of the main observations and learnings from the study:


Challenges with Lack of Existing Content

The conversations with selected Wikimedia contributors and users offered an insight into what Punjabi readers and online contributors think about the content available on the internet in the Punjabi language, and how Punjabi Wikipedia is impactful in this scenario, especially in addressing any gaps in this area. It was found during discussions with interviewees that there are few Punjabi language websites in the fields of language, literature, politics, and general knowledge. Most of the websites are in English. PunjabiPedia and Punjabi Wikipedia are encyclopedic websites that provide knowledge in Gurmukhi script. Apart from these, websites like SikhiWiki provide knowledge in the Roman script, and Punjabi newspaper websites provide their news updates in Gurmukhi script. Punjabi Wikipedia is therefore one of the few available sites that offer information on a variety of topics in the local language. As a result, it may have a good viewership, but at the same time, there is the additional problem of not having good or reliable online sources or references.

Another important point mentioned by the people interviewed was that while there are 30,000+ articles on Punjabi Wikipedia, categorised across different topics, there is a lack of content about Punjab itself. It was suggested that this should therefore be an area of priority for the community to work on. Even the most viewed articles on Punjabi Wikipedia do not meet the good article criteria of Wikipedia. For example, the article on Harmandir Sahib on Punjabi Wikipedia has not been written according to the good article criteria, as it has no category and lacks proper sections.[4] Most of the important articles do not have many references. Another example is the article on the tenth Sikh Guru, Guru Gobind Singh, which has only 2 references.[5] Apart from this, articles about cities and villages of Punjab are mostly stubs. There are about twelve thousand villages in Punjab, and articles about a good number of them are available on Punjabi Wikipedia, but they are too short and need to be expanded. There are more than 7,000 articles in the stub category. Such articles need more work and improvement in terms of quality.


Methods of Creating New Content

Most of the content on Punjabi Wikipedia is about countries or regions other than Punjab or India. One reason for this is that most editors translate existing content from English or other regional language Wikipedias into Punjabi Wikipedia. To fill the content gap about Punjab, content should be created specifically on topics related to the state. Content translation tools, while helpful, have also contributed to a preference for translation, and editors use Google Translate within the content translation tool. The tool itself is accurate and works fine with Punjabi, but the issue is that most people only translate, as it is the easiest way to contribute. The interviewees also noted that editathons about other countries or cultures, while useful, are not beneficial in the immediate context. Nitesh Gill [6] observes that there should be more discussion among community members about upcoming events or editathons. She says: “Sometimes two or more events are going on in the same time period and the same contributors take part in those activities. It should be better that with cooperation we can have one event at one time. We will grow in a better way if we do something about this.”

Articles related to countries or cultures outside India, such as the article on Constantine Peter Cavafy, have lower viewership. Meanwhile, articles of importance from the perspective of region and culture go unedited for long periods, such as the article about Punjabi storywriter Maninder Kang, which is smaller than the article on Constantine Peter Cavafy. Stalinjeet Brar [7] notes that: “The article of former chief minister of Punjab, Prakash Singh Badal on Punjabi Wikipedia is a relevant article. He is a remarkable personality of Punjab in the history of politics. He left his position in 2017 but the article still shows that he is chief minister of Punjab. This is our major mistake. We have to work on this aspect.” [8] He also added that the statistics of cricketers like Virat Kohli are not updated. In this regard, Wikidata could be integrated with Wikipedia articles so that one change on Wikidata automatically updates the data shown in Punjabi articles.

He also added that an assessment of existing events and initiatives, such as Project Tiger, would be useful to understand the challenges and opportunities for content creation on Indian language Wikipedias. The first Project Tiger editathon ran from 1 March to 31 May 2018, and the second, named “Project Tiger 2.0”, was organised from 10 October 2019 to 11 January 2020. Project Tiger coordinators can assess the number of views of the articles created during the event and thereby arrive at a potential strategy for our target audience. It is therefore useful to undertake such an analysis and evaluation at the end of big events.
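As a rough illustration of how such a view-count assessment might be done, the sketch below builds request URLs for the public Wikimedia pageviews REST endpoint and totals the daily views in a response that has already been downloaded and parsed. It is only a sketch under stated assumptions: the function names are hypothetical, the choice of the "all-access" and "user" filters is an assumption, and the actual HTTP request is left out.

```python
from urllib.parse import quote

# Base of the Wikimedia REST API per-article pageviews endpoint.
PAGEVIEWS_BASE = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(title, start, end, project="pa.wikipedia.org"):
    """Build the daily pageviews URL for one article (hypothetical helper).

    start and end are dates written as YYYYMMDD strings.
    """
    # Titles use underscores instead of spaces and must be percent-encoded,
    # which also covers Gurmukhi titles.
    encoded = quote(title.replace(" ", "_"), safe="")
    # Assumption: count all access methods, human ("user") traffic only.
    return f"{PAGEVIEWS_BASE}/{project}/all-access/user/{encoded}/daily/{start}/{end}"


def total_views(response):
    """Sum the daily view counts in a parsed JSON response from that endpoint."""
    return sum(item["views"] for item in response.get("items", []))


def rank_by_views(views_by_title):
    """Return (title, views) pairs ordered from most to least viewed."""
    return sorted(views_by_title.items(), key=lambda pair: pair[1], reverse=True)
```

Coordinators could, for instance, total the views of each event article for the month after Project Tiger 2.0 closed and pass the resulting dictionary to rank_by_views to see which topics readers actually visited.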


Strategies for New Content Creation

It was observed that to fill this content gap about Punjab and its culture on Punjabi Wikipedia, the community needs to approach the topic deliberately. Every interviewee suggested organising an editathon to edit the top viewed articles. To maintain continuity in this approach, community members should cooperate with each other. Charan Gill [9] noted that we should engage students and professors in different subjects to collaborate and work on these areas with editors. This will be helpful in producing good quality articles. The community needs to engage experts from different areas or fields. Manavpreet Kaur [10] suggests that we should focus on good article criteria when we go to a college or institution to teach students how to edit Wikipedia. For example, she notes that there were very few articles about forensic science when she joined Punjabi Wikipedia, and the existing content consisted of two or three line articles. She tried to fill this gap and, as a professor of forensic science, engaged her students to edit Wikipedia content related to the subject. We can take this kind of approach to fill the content gap on Punjabi Wikipedia. We should encourage colleges and teachers in Punjab to participate in this free knowledge movement. Wikimedia projects are platforms for the Punjabi language community to provide knowledge in their own language. Manavpreet and Nitesh also note that there are fewer women participants in this movement. To fill the gender gap we should also focus on engaging women contributors. Manavpreet found the Wikidata Game interesting, and noted that such games and editing techniques are especially valuable for engaging new and younger editors with this movement.

Hardarshan [11] shared his view that to engage the new generation with Punjabi Wikipedia or the broader free knowledge movement, we should also work on basic articles related to the technological world, for example articles on computers and other devices, mobile games, etc. He also notes that while Punjabi Wikipedia has articles on advanced topics, most of them are translated rather than new content, and it does not have basic articles of good quality that offer an appropriate understanding of a topic. There is also the problem of translating technical vocabulary into the Punjabi language, which can be addressed by engaging experts and scholars as part of editathons and other initiatives for content creation. This way, the energy and interests of volunteers may also be harnessed with the right methods. Key members of the community can assign articles to newcomers, which will help in further content creation. Nitesh shared her observation that “I was a beginner type volunteer at one time and later on with experience I have organised various events within my community. So, we should encourage our team members or volunteers. They can be good organisers or leaders of this movement.”

To engage students, there should be more syllabus-oriented content on Punjabi Wikipedia. To bring a change in the structure, Stalinjeet suggested that we should not blindly follow the policies of other languages like English and French, or the policies of the Wikimedia Foundation in India. We need to rethink these policies in the context of the needs of the local languages. Prioritization of editathons is also necessary, including coordinating and working collaboratively on when to participate in which event.

In addition to the above, it has been noticed that it is difficult for communities with a small number of members to contribute collaboratively, so it is imperative to slowly increase the number of contributors as well. There are various Wikimedia projects that volunteers can join, and they can contribute to any project according to their interest. They are not limited to Punjabi Wikipedia. A good number of people are also active on Punjabi Wikisource. The need of the hour, therefore, is to engage new people with Wikipedia or with this movement to make good changes to the modes of access to knowledge in Indian languages. Experienced Wikimedians should share their learnings, apart from the training that is required for advanced editing.

 

Conclusion

As illustrated by the observations above, content creation on (Eastern) Punjabi Wikipedia faces a specific set of challenges. The people interviewed as part of this study have offered various suggestions on what can be done to address these limitations and improve the quality of content. Punjabi Wikipedia is an important source of information and knowledge for Punjabi internet readers due to the lack of websites providing content in the language. Its responsibility to provide such content reliably and in a sustainable manner is therefore even greater. Work on creating more content on these platforms needs to be undertaken after understanding the response and ways of engagement of the readers. People today want to read less and learn more. Punjabi Wikipedia articles therefore need to be informative and include as many references as possible. A crucial gap here is also the lack of information on how to contribute to Punjabi Wikipedia in a productive and easy way. Good documentation in help pages and more frequent training would help in addressing this shortcoming as well.

In conclusion, the main strategy to address these knowledge gaps, as illustrated by the learnings from this study, is that we should update existing articles on Punjabi Wikipedia as a priority, with a focus on expanding top viewed stub articles. A focus on quality of content is therefore more important than quantity. In addition to this, knowledge-sharing by experienced Wikipedians, diverse modes of training and engaging new contributors, and working on strategies for sustainable content creation would go a long way in addressing the content gaps on Punjabi Wikipedia.

Read this report on Wikimedia Meta-Wiki here.


References:

[1] Statistics as of 09 February 2020, 05:32 PM

[2] “Punjabi, Eastern,” Ethnologue, accessed February 9, 2021, https://www.ethnologue.com/language/pan.

[3] Roy, C. Vijay. “In Punjab, over 70% people access the internet on the phone.” The Tribune India, May 2, 2019. https://www.tribuneindia.com/news/archive/business/in-punjab-over-70-people-access-internet-on-phone-766809. Accessed February 9, 2021.

[4] As of 04 March 2021, 09:59 AM

[5] As of 04 March 2021, 09:59 AM

[6] Nitesh Gill is a research scholar from Moga, Punjab, pursuing her Ph.D. in Punjabi literature at the University of Delhi. She has been contributing to Punjabi Wikipedia since 2015. Her remarkable contribution to Punjabi Wikipedia is that she has completed the 1000WikiDays challenge, which means one article every day. See: https://pa.wikipedia.org/wiki/ਵਰਤੋਂਕਾਰ:Nitesh_Gill

[7] Stalinjeet Brar is from Faridkot and has been contributing to Wikimedia projects since August 2014. He is doing his Ph.D. in Punjabi language, and his research topic is a comparative study of Punjabi Wikipedia and PunjabiPedia. See: https://pa.wikipedia.org/wiki/ਵਰਤੋਂਕਾਰ:Stalinjeet_Brar
 
[8]  As of 04 March 2021, 09:59 AM
 
[9] Charan Gill is an experienced volunteer, aged 76. He has been contributing to Wikimedia projects since 2008. He is the top contributor from the Punjabi Wiki community, with more than 59,000 edits on Eastern Punjabi Wikipedia. See: https://pa.wikipedia.org/wiki/ਵਰਤੋਂਕਾਰ:Charan_Gill
 
[10] Manavpreet Kaur is from the field of forensic science and completed her Ph.D. in the same subject. She began engaging with Punjabi Wikipedia in 2014. As a volunteer, she has completed the 100WikiDays challenge and has done various forms of outreach for the community. See: https://pa.wikipedia.org/wiki/ਵਰਤੋਂਕਾਰ:Manavpreet_Kaur
 
[11] A secondary school student, Benipal hardarshan is one of the youngest Wikimedians from the Punjabi Wiki community. He made his first edit in 2014 but has been actively contributing since 2016. He is an active administrator on Punjabi Wikisource. See: https://pa.wikipedia.org/wiki/ਵਰਤੋਂਕਾਰ:Benipal_hardarshan