Google Maps Improves Search Transliteration for 10 Indian Languages



Google Maps has introduced automatic transliteration for 10 Indian languages: Bangla, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Tamil, and Telugu. Transliteration is different from translation: it means writing the same words in a different script, which enables better search results in these native languages. Google has built an ensemble of learned models to transliterate the names of points of interest (POIs) in the country from Latin script into the scripts of these languages, enabling more accurate search results. The new technology is aimed at users in India who don’t speak English.

On its blog, Google announced the addition of automatic transliteration for these 10 Indian languages in Google Maps. This means native-language users will now get more accurate results for POIs in India than before. Earlier, transliteration into native scripts was not driven by the newly introduced models, so search was not as accurate as it needed to be. With the new learned models, Google says Hindi transliteration has seen a 3.2x coverage improvement and a 1.8x quality improvement. Bengali transliteration has seen a 19x coverage improvement and a 3.3x quality improvement, and Odia transliteration has seen a 960x coverage improvement.

Google says the new ensemble of learned models, various transliteration dictionaries, and a module for acronyms have together increased quality and coverage by nearly twenty-fold in some languages. To explain how the improved automatic transliteration makes Google Maps search better, the tech giant says, “Common English words are frequently used in names of places in India, even when written in the native script. How the name is written in these scripts is largely driven by its pronunciation. For example, एनआईटी from the acronym NIT is pronounced ‘en-aye-tee’, not as the English word ‘nit.’ Therefore, by understanding that NIT is a common acronym from the region, Maps can derive the correct transliteration. In the past when Maps could not understand the context of एनआईटी, it would instead show a related entity that might be farther away from the user. With this development, we can find the desired result from the local language query. Additionally, users can see the POI names in their local language, even when they do not originally have that information.”
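The acronym case in the quote can be illustrated with a minimal Python sketch. It is not Google's implementation: the letter-name table, the acronym list, and the function name are all hypothetical, and the table contains only the three letters attested in the NIT example above. The idea is simply that a known acronym is spelled out letter by letter according to how each letter is pronounced, rather than being read as an English word.

```python
# Hypothetical letter-name table (Latin letter -> Devanagari spelling of its name).
# Only the letters attested in the quoted NIT example are included.
LETTER_NAMES_HI = {"N": "एन", "I": "आई", "T": "टी"}

# Hypothetical list of acronyms known to be common in the region.
KNOWN_ACRONYMS = {"NIT"}


def transliterate_acronym(token):
    """Spell a known acronym letter by letter; return None otherwise.

    Returning None signals that the token should be handled by some
    other transliteration path (not modelled in this sketch).
    """
    if token not in KNOWN_ACRONYMS:
        return None
    if any(ch not in LETTER_NAMES_HI for ch in token):
        return None
    return "".join(LETTER_NAMES_HI[ch] for ch in token)


print(transliterate_acronym("NIT"))  # spelled out as en-aye-tee
print(transliterate_acronym("nit"))  # not treated as an acronym
```

Here the lowercase word "nit" is deliberately not matched, mirroring the distinction the quote draws between the acronym and the English word.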

Google Maps plans to extend the ensemble to the transliteration of other classes of entities and to other languages and scripts, including Perso-Arabic scripts.
