Centre for Internet & Society
Localisation experiences in Kannada and Tulu Wikipedias

Dr Pavanaja at FUEL GILT Conference 2016, New Delhi

It has become fashionable to talk about the digital divide in India. These discussions have been going on for more than two decades now. Nevertheless, there is no doubt about the need to take the benefits of Information and Communication Technology (ICT) to the common man.

Originally published on my blog on October 4, 2016


FUELling Indic computing

The statement “ICT for common man has to be in his language” is always relevant. Many state governments and the central government of India are spending a lot of money on e-governance. Of late, many private vendors have also realised the importance of localising their applications into Indian languages. Be it a web-based or a mobile-app-based solution, providing it in an Indian language will substantially increase the user base.

This brings us to the issue of standardising the terminology used when localising ICT applications into Indic languages. How does one write frequently used terms in an Indic language? Is there any standard? For example, how should “Exit” be written in Kannada? Should I use “ನಿರ್ಗಮನ” or “ಹೊರಗೆ”? Enter the FUEL Project. This project aims at standardising the Frequently Used Entries in Localisation, so that terms are used consistently across all applications, apps and websites. The FUEL Project is run by volunteers. Its corpus of translated terminology is developed through contributions from enthusiastic people, in true open-source spirit.
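The kind of inconsistency FUEL tries to remove can be spotted mechanically. Below is a minimal Python sketch, not part of the FUEL Project's own tooling: it groups translations by their English source term and reports any term rendered in more than one way. The application names, the sample data layout and the function name `find_inconsistent_terms` are assumptions made for illustration; only the two Kannada renderings of “Exit” come from the text above.

```python
from collections import defaultdict

# Hypothetical sample data: (application, English source term, Kannada translation).
# The application names are invented. The two renderings of "Exit" are the ones
# quoted above; the "File" row is only an illustrative example of a consistent term.
SAMPLE_STRINGS = [
    ("app_one", "Exit", "ನಿರ್ಗಮನ"),
    ("app_two", "Exit", "ಹೊರಗೆ"),
    ("app_one", "File", "ಕಡತ"),
    ("app_two", "File", "ಕಡತ"),
]


def find_inconsistent_terms(strings):
    """Return {source term: sorted translations} for terms with more than one rendering."""
    renderings = defaultdict(set)
    for _app, source, translation in strings:
        renderings[source].add(translation)
    # Keep only the terms that were translated in more than one way.
    return {
        source: sorted(translations)
        for source, translations in renderings.items()
        if len(translations) > 1
    }


if __name__ == "__main__":
    for source, translations in find_inconsistent_terms(SAMPLE_STRINGS).items():
        print(source, "->", ", ".join(translations))
```

A report like this only flags candidates. Deciding which rendering becomes the standard one remains a job for the community of translators.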

The FUEL Project has been organising annual conferences on Globalisation, Internationalisation, Localisation and Translation, known in short as the GILT Conference. This year’s GILT Conference was held in New Delhi, India, on September 24 and 25, 2016. Apart from the techies, this year’s conference had general users, media persons and writers, who are the ordinary consumers of the output of FUEL. There was a track dedicated exclusively to Mozilla localisation. The detailed agenda and speakers list is available here.

Localisation experiences in Kannada and Tulu Wikipedias

I have been a Wikipedian for more than a decade. I have contributed to Kannada Wikipedia ever since it was started back in 2004. Tulu Wikipedia, which had been in incubation since 2008, went live recently. I have been contributing to Tulu Wikipedia as well. I presented a paper at the FUEL GILT Conference titled “Localisation experiences in Kannada and Tulu Wikipedias”.

Kannada and Tulu belong to the Dravidian family of languages. Geographically and politically they both belong to Karnataka. The two languages share a lot, including many common words. At the same time, Kannada and Tulu have their own distinct identities. One striking feature is that Tulu has fewer Sanskrit words than Kannada. Tulu had its own script and grammar, but over the years the use of the Tulu script has been discontinued. Nowadays most Tulu people use the Kannada script. This has led to many Kannada words being used in place of the original Tulu words.

Kannada has an established history of science and technology writing as well as of localised software applications. Naturally, a well-developed glossary is available. It was therefore somewhat easy to localise into Kannada the strings used in MediaWiki, the platform used to host Wikipedias; these strings very much fall under the FUEL category. In Kannada, we have been using these formulas while creating a technical glossary:

  1. Create altogether new term
  2. Translate the English term
  3. Transliterate the English term, i.e., write the English term in Kannada script

Depending on the situation, one of these three paths is followed. The same philosophy is extended to localising the strings used in MediaWiki. In many situations, either a partial translation or a similar example was already available in Kannada, which made the localisation of MediaWiki not so difficult.

Tulu Wikipedia had been in incubation since 2008. It went live on August 5, 2016. One of the main requirements for making it live was to have the major strings translated into Tulu. Most of the commonly used strings have been translated now. But the case of Tulu, which has no history of science and technology writing, is not so simple. It was quite difficult to localise most of the strings. In many places Kannada strings have been used. Many more strings are yet to be localised. In the case of Tulu, we have been using these formulas for MediaWiki translations:

  1. Create altogether new term in Tulu
  2. Use more of Kannada words in the localised string
  3. Transliterate the English term, i.e., write the English term in Kannada script. Since Tulu uses the Kannada script, such strings will be the same for Kannada and Tulu.
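The three formulas above can be pictured as a simple fallback lookup. The sketch below is only an illustration in Python, not how MediaWiki or its translation tooling actually resolves messages. It assumes the paths are tried in the order listed, which is my assumption here, since in practice the translator picks a path case by case. The function names and glossary contents are hypothetical. The Tulu entry is a plain placeholder rather than a real Tulu word, and the transliteration step is a stub.

```python
# Hypothetical glossaries. The Tulu value is a placeholder, not a real Tulu term.
# The Kannada rendering of "Exit" is one of the candidates quoted earlier.
TULU_GLOSSARY = {"Search": "TULU:Search"}
KANNADA_GLOSSARY = {"Exit": "ನಿರ್ಗಮನ"}


def transliterate_placeholder(term):
    """Stand-in for writing the English term in Kannada script (not implemented)."""
    return f"[transliterate: {term}]"


def localise(term, tulu_glossary, kannada_glossary):
    """Return (localised string, path used) for an English term.

    Assumed order: Tulu term, then Kannada term, then transliteration.
    """
    if term in tulu_glossary:
        return tulu_glossary[term], "tulu"
    if term in kannada_glossary:
        return kannada_glossary[term], "kannada"
    return transliterate_placeholder(term), "transliteration"


if __name__ == "__main__":
    for english in ("Search", "Exit", "Upload"):
        text, path = localise(english, TULU_GLOSSARY, KANNADA_GLOSSARY)
        print(f"{english}: {text} ({path})")
```

Recording which path produced each string makes it easy to list, later on, the strings that still rely on Kannada words or on transliteration and could be revisited once a Tulu term is coined.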

The question-and-answer session just after the presentation brought some very interesting questions from Wikipedians of other languages as well as from people in other walks of life who work in localisation. I benefited from interacting with others and gained insights into some areas I had not explored. Interactions with other participants, who also included some Wikipedians, continued after the presentation. Overall, it was two days well spent. I am looking forward to the next event like this.

