Centre for Internet & Society

Santali, an aboriginal South Asian language, has a brand new freely licensed font and set of cross-platform open source input tools on the way.

The article was published by Opensource.com on July 8, 2016.


More than 6.2 million people in four South Asian countries (India, Bangladesh, Nepal, and Bhutan) speak Santali. In India, it is one of the 22 languages listed in the Eighth Schedule of the Indian Constitution. However, Santali is not the official language in the regions where it is most widely spoken, nor is it widely taught in schools. Many native speakers are also socially and economically disadvantaged, which makes matters worse.

When it comes to mainstream media and the Internet, use of the native Santali alphabet, Ol Chiki, is limited. Right now there is not a single fully Unicode-compliant website with Santali content. The Indian government's Ministry of Tribal Affairs, which was set up for the development of many aboriginal groups in the country, does not offer its web portal in Santali or any other indigenous language. However, the government announced last year that it would make native Indian language input mandatory on mobile phones.
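To make "Unicode-compliant" a little more concrete: Ol Chiki has its own Unicode block, U+1C50 to U+1C7F, so text that is really encoded in the script consists of code points in that range rather than Latin characters displayed through a custom font. The following is only a minimal sketch of how one might check that for a piece of text; the function names are made up for this illustration and are not part of any tool mentioned in this article.

```python
# Ol Chiki occupies the Unicode block U+1C50..U+1C7F.
OL_CHIKI_START = 0x1C50
OL_CHIKI_END = 0x1C7F


def is_ol_chiki(ch: str) -> bool:
    """Return True if the single character lies in the Ol Chiki block."""
    return OL_CHIKI_START <= ord(ch) <= OL_CHIKI_END


def ol_chiki_ratio(text: str) -> float:
    """Fraction of non-whitespace characters that are Ol Chiki code points.

    Hypothetical helper: a ratio near 0 for text that is supposed to be
    Santali in Ol Chiki suggests it is not stored as Unicode Ol Chiki.
    """
    chars = [ch for ch in text if not ch.isspace()]
    if not chars:
        return 0.0
    return sum(is_ol_chiki(ch) for ch in chars) / len(chars)
```

A check like this says nothing about whether the text is correct Santali; it only shows whether the characters are stored as Ol Chiki code points.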

The need for a typeface, especially one built on a universal encoding standard like Unicode, became apparent during a three-month digitization project on Odia Wikisource, an Odia-language online library and sister project of Wikipedia. Many of the students who took part in the digitization project were native Santali speakers. They shared that they could not opt for education in their own language, which has affected their knowledge and understanding of the written language.

The question of whether digital activism can help revive indigenous languages was discussed at the 2015 Global Voices Citizen Media Summit in Cebu City, Philippines. After the event, a pilot project was started within the Centre for Internet and Society's Access to Knowledge program to create a freely licensed font and input methods so that anyone can easily type in their native language.

The typeface family was designed by type designer Pooja Saxena and went through several rounds of review by language experts. However, the typeface is still one step away from reality. Because of this, two input methods will be made available along with the typeface: Sarjom Baha, a phonetic input method that lets ordinary users type words the way they pronounce them, and InScript, a keyboard layout standard for Indian scripts. Although the original plan was to build an editor community to contribute to the Santali Wikipedia and bring it live from the Incubator, for now the outputs will simply be distributed for people to use.

The input methods will also be available in MediaWiki, so they can be used on Wikipedia and all its sister projects. Hopefully, in the future, a group of contributors will use the tools, contribute, and bring the Santali Wikipedia live!
