The Greenlandic language is no stranger to online tools. Oqaasileriffik, the national language secretariat, has a number of publicly available language aids. Yet, with the exception of a downloadable spell-checker for word-processors and an online Greenlandic-Danish dictionary, what the organization has on offer appeals more to the linguist than the language user.
That is due to change with the development of a machine-translation program of the sort popularized by Google.
Naalakkersuisut, the elected government, has announced it will spend 10 million kroner ($1.5 million) over the next five years to develop a machine-translation program of its own for Greenlandic. Doing so will, in part, help reduce its budget for translation and interpretation services, and allow language staff to concentrate on more complicated tasks.
To start with, the project will focus on translating between the Kalaallisut dialect of Greenlandic (the official language) and Danish (the former official language). This is mostly because that is where the need is greatest, according to Per Langgård, Oqaasileriffik’s language-technology lead, but it is also where there is an existing body of easily comparable texts.
The second point is important because, to prime the pump of machine-translation tools, those making them need to do more than just link individual words across languages. They also need to program the tools to understand how those words fit together to make sentences and paragraphs, often with different meanings depending on the context.
To do this, previous generations of translation tools have used a technology known as statistical machine translation, in which a program compares huge quantities of text in both languages and produces the version in the target language that is statistically the most likely match.
Typically, this renders translations that give users the gist of what the original version was about, but little else. The method is adequate for languages that are close linguistically, or where there are large volumes of text that can be compared. Because Greenlandic is vastly different from other types of languages, the method would render translations that were mostly gibberish.
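The statistical idea can be illustrated with a deliberately tiny sketch. It is not how Oqaasileriffik's tools or any production system work: real statistical translators use far larger corpora and more elaborate alignment and language models. The three Danish-English sentence pairs, the function names and the scoring rule (a Dice coefficient over sentence-level co-occurrence counts) are all assumptions chosen for illustration.

```python
from collections import Counter
from itertools import product

# Invented toy parallel corpus (Danish, English); not real project data.
PARALLEL = [
    ("stort hus", "big house"),
    ("stort skib", "big ship"),
    ("gammelt hus", "old house"),
]

def train(pairs):
    """Count in how many sentence pairs each word, and each word pair, occurs."""
    src_count, tgt_count, joint = Counter(), Counter(), Counter()
    for src, tgt in pairs:
        s_words, t_words = set(src.split()), set(tgt.split())
        src_count.update(s_words)
        tgt_count.update(t_words)
        joint.update(product(s_words, t_words))
    return src_count, tgt_count, joint

def best_match(word, model):
    """Pick the target word that co-occurs most consistently with `word`."""
    src_count, tgt_count, joint = model

    def dice(t):
        # Dice coefficient: high when the two words appear together
        # and rarely appear apart.
        return 2 * joint[(word, t)] / (src_count[word] + tgt_count[t])

    return max(tgt_count, key=dice)

def translate(sentence, model):
    """Translate word by word; no reordering and no grammar."""
    return " ".join(best_match(w, model) for w in sentence.split())

model = train(PARALLEL)
print(translate("gammelt skib", model))  # prints "old ship"
```

The sketch translates a phrase it never saw as a whole, but only because each Danish word lines up with exactly one English word. That assumption is precisely what breaks down when a single word in one language carries the content of a whole sentence in the other.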
“It is a technology that would be totally impossible to apply to a polysynthetic language,” Langgård says. By “polysynthetic,” he means a language in which single words are composed of multiple elements to make up a complex thought.
This results in words that, even for native speakers, are dauntingly long. The longest word in Greenlandic, for example, is reputed to be nalunaarasuartaatilioqateeraliorfinnialikkersaatiginialikkersaatilillaranatagoorunarsuarrooq, meaning ‘once again they tried to build a radio station, but apparently it is still only on the drawing board’.
To get around this, the program will use artificial intelligence, which relies on finesse rather than brute force to look for patterns between two languages. Generally, this results in translations that appear more natural to native speakers.
The process of building the machine-translation program will have the added benefit of tidying up databases and other source texts in Greenlandic. The knock-on benefit of that, according to Langgård, will be that when a decision is made to add new languages, English being the most likely candidate, the process will be far faster.
Another benefit of using an artificial-intelligence-based program is that, like human intelligence, it improves the more it is used.
“Our translation tool is going to be intelligent,” Langgård says. “It will literally learn to understand Greenlandic.”
Even with this capacity, Oqaasileriffik is keeping its expectations low. The service will probably be most useful for short, work-related messages. Legal documents and news articles will still be too complicated, but where the line between clear and garbled falls will depend in large part on funding, Langgård reckons.
For the time being at least, making machine translations completely intelligible will still require the human touch.