Centre for Internet & Society
Indian language localization community meets in New Delhi

Image by: opensource.com

Localization is one of the less glamorous aspects of computing. Although less than 6% of the world speaks English, a majority of projects don't feel inclined to accommodate the rest of the population. Two of the primary reasons for sticking to English are the steep learning curve and the lack of standardization in various aspects of the localization process.

The post by Mayank Sharma was published by Opensource.com on October 3, 2016. Dr. U.B. Pavanaja was quoted.

The FUEL Project organized the GILT conference in New Delhi, India, on September 24-25 to highlight and address these issues. The annual event showcases the efforts of language technology organizations and volunteer communities, but this year's edition also gave non-technical users a platform to voice their concerns. The Indic computing developers were joined by academics, reporters, language researchers, publishers, and entrepreneurs who rely on localization tools to connect and interact with audiences in the various regional languages in India. The brainstorming between the two groups, both on and off the stage, was one of the highlights of the conference.

Mozilla ran a two-day hackathon alongside the conference that was attended by teams from India, Nepal and Germany. Photo by Rajesh Ranjan. All Rights Reserved.

Focus on standardization

A recurring theme discussed in detail at the conference was the need for standardization. The FUEL Project spearheads standardization efforts with its terminology management system, which preserves consistency across translations. The project has also created translation style guides for various languages, including Spanish, German, French, Scottish Gaelic, and several Indian languages. In addition to these guides, the project is working on a couple of tools to help maintain the accuracy of translations. One that caught the attention of the translators at the conference is the Unicode Text Rendering Reference System (UTRRS). It's a web app that lets you enter a character, word, or phrase and then compares it to a reference image generated by a text rendering engine.
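To make the idea of terminology management concrete, here is a minimal sketch in Python of the kind of consistency check that an agreed glossary makes possible. It is not the FUEL Project's implementation: the glossary entries, the `find_inconsistent` function, and the sample strings are all hypothetical and chosen only for illustration.

```python
# Hypothetical glossary: English source term -> the agreed Hindi translation.
GLOSSARY = {
    "open": "खोलें",
    "save": "सहेजें",
}


def find_inconsistent(pairs, glossary=GLOSSARY):
    """Return (source, term) for each pair whose translation lacks the agreed term.

    pairs is an iterable of (source_text, translated_text) tuples.
    """
    problems = []
    for source, target in pairs:
        words = source.lower().split()
        for term, expected in glossary.items():
            # Flag the pair only if the source uses the term as a whole word
            # and the translation does not contain the agreed rendering.
            if term in words and expected not in target:
                problems.append((source, term))
    return problems


sample = [
    ("Open", "खोलें"),                # uses the agreed term
    ("Save", "संचित करें"),           # a different word for "save"
    ("Save as", "इस रूप में सहेजें"),  # uses the agreed term
]
print(find_inconsistent(sample))  # [('Save', 'save')]
```

A real terminology system would also have to handle inflection, Unicode normalization, and multi-word terms, which this sketch deliberately ignores.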

The current state of localization

The conference began with inaugural addresses by the keynote speakers. Rajesh Ranjan, who heads the FUEL Project and is currently the open source community manager at the Indian Government's National eGovernance Division (NeGD), kicked things off by talking about the evolution of the 8-year-old project. There was also an enlightening address by Jeff Beatty, who heads localization efforts at Mozilla. He talked about the role of his alma mater, the University of Limerick, in the initiation and growth of multilingual computing. Later, Vinay Thakur, director of project development at NeGD, discussed the Indian Government's increased interest in localization and listed the various initiatives currently underway.

This interest was reiterated by Mahesh Kulkarni, assistant director at CDAC's GIST research labs. He talked about the scale of the government's plan to make all its official websites available in all 22 officially recognized Indian languages.

Addressing problems

Kulkarni also chaired a panel discussion later in the day. The panel members talked about the issues plaguing the localization community and what it would take to solve them. Sudhanwa Jogalekar, a well-respected contributor to Indic computing, suggested that translators should get ISO certified as a first step toward standardization. Jogalekar pointed to the ISO 17100:2015 standard, which sets out requirements for translation services. Another panel member, Prabhat Ranjan, executive director of the technology think tank TIFAC, talked about the stress on translation in the Vision 2035 document recently released by the Indian Prime Minister Narendra Modi. Ranjan's team found English-to-Hindi translation easier when documents are first translated into another Indian language. Based on this experience, Ranjan floated the idea of agreeing on a meta language to ease the translation process.

A chat with the Document Foundation

The conference also had a video conference session by the Document Foundation's Italo Vignoli about LibreOffice. While the talk was a fairly high-level overview, the Q&A generated some valuable suggestions that Vignoli promised to take up with the LibreOffice developers. One of the concerns raised by Pavanaja U.B. was that localizing the office suite was a cumbersome process, as it involved recompiling the entire application. Pavanaja, who is well-known in the localization community for creating the Kannada version of the Logo programming language, requested that Vignoli ask LibreOffice developers to brainstorm a less tedious process for the localizers. Later in the day, Pavanaja also talked about his experience localizing Wikipedia in the Kannada and Tulu languages.


Karunakar G demos an in-development spell checker for the Hindi language. Photo by Mayank Sharma. CC-BY 3.0.

The second day began with a session on the evolution and current status of the Unicode standard. It was delivered by Karunakar G, one of the stalwarts of the Indic localization community. A longtime localization developer, Karunakar also demoed the support for Indian languages in LibreOffice. He highlighted a few missing features, such as an Indic thesaurus and autocorrect functionality.

Sailfish OS

Karunakar was followed by Raju Vindane, who introduced the audience to Sailfish OS. He also demoed the only Sailfish OS phone available in the Indian market, the Intex Aqua Fish, which retails for about $90. Vindane mentioned that while the community is encouraged to contribute to and improve the Indic translations in the Sailfish OS project, these wouldn't be included in the Indian phone, as Intex does its translations in-house.

Other highlights

Ryan Northey asks the community to explore the use of XLIFF (XML Localization Interchange File Format). Photo by Mayank Sharma. CC-BY 3.0.

The day also had introductory presentations by Ryan Northey, lead developer at Translate House, and Satdeep Gill from the WikiTongues project. Northey mentioned that there's been a disconnect between software development and localization, and that, going forward, localization should become a part of the software development cycle.
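One way localization can sit inside the development cycle is an automated check on translation files, run alongside a project's other tests. The following is a minimal sketch in Python, assuming translations are stored as XLIFF 1.2; it is not something Northey presented, and the sample document and the `untranslated_ids` function are hypothetical.

```python
import xml.etree.ElementTree as ET

# XLIFF 1.2 namespace, as defined in the OASIS specification.
NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical sample document; a real project would read a file instead.
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="app" source-language="en" target-language="hi" datatype="plaintext">
    <body>
      <trans-unit id="open"><source>Open</source><target>खोलें</target></trans-unit>
      <trans-unit id="save"><source>Save</source><target></target></trans-unit>
      <trans-unit id="quit"><source>Quit</source></trans-unit>
    </body>
  </file>
</xliff>
"""


def untranslated_ids(xliff_text):
    """Return ids of trans-units whose <target> is missing or empty."""
    root = ET.fromstring(xliff_text)
    missing = []
    for unit in root.iterfind(".//x:trans-unit", NS):
        target = unit.find("x:target", NS)
        # Treat an absent target and a whitespace-only target the same way.
        if target is None or not (target.text or "").strip():
            missing.append(unit.get("id"))
    return missing


print(untranslated_ids(SAMPLE))  # ['save', 'quit']
```

A build could fail, or merely warn, when this list is non-empty, so that untranslated strings are noticed before a release and not after it.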

In addition to the scheduled sessions, there were several fruitful discussions during lunch and tea breaks. The presentation-free exchange of gray matter between the stalwarts and the young padawans was a delight to witness. The 2016 edition of the GILT conference helped bring together longtime developers and experts from the government with niche communities and individuals working on different aspects of localization in various parts of the country. The conference ended with the participants hoping that the Government's increased focus on localization would translate into a considerable leap in the quality and quantity of localized content and localization tools.