CHC project aims to increase communication and collaboration between native speakers of Greenlandic and Danish

Despite being a part of the linguistic landscape of modern Denmark, Greenlandic is still comparatively understudied in terms of contemporary language technology. Ross Deans Kristensen-McLachlan, Machine Learning Team Lead at CHC, has recently taken up this challenge and secured seed funding from the Interacting Minds Center at Aarhus University for a project titled “Automatic neural machine translation for Greenlandic”. Also involved in the project are Johanne Sofie Krog Nedegård, who has expertise in Greenlandic aphasia, and Kenneth Enevoldsen, who leads the project at CHC to create Danish Foundation Models.

The goal of this project is twofold: first, to create a usable machine translation tool to increase communication and collaboration between native speakers of Greenlandic and Danish; and second, to reinvigorate Greenlandic language research in Denmark by drawing on new, state-of-the-art approaches to language technology for low-resource languages.

Project abstract: 

What is a word? This seems like a simple question, but it continues to stump linguists and philosophers who spill millions of words trying to explain what they’re spilling. It seems, too, like it should be a concern for people who create language technology. How can we teach computers to use words if we don’t even know what they are?

Contemporary natural language processing (NLP) makes assumptions about words which, by and large, are based on how major Indo-European languages behave. Those same linguists and philosophers might baulk, but the engineer can reply with empirical results demonstrating the efficacy of their systems on goal-oriented language tasks. If it works, it works. But does it actually work?

This project tests these assumptions by applying modern language technology to a lesser-studied part of Denmark’s linguistic landscape – Greenlandic. This fascinating language of some 57,000 speakers exhibits many rare linguistic phenomena such as ergative alignment and polysynthetic morphology. Our goal is to train an automatic machine translation model for Greenlandic to Danish and back again. In doing so, we’ll empirically evaluate how well the assumptions of NLP hold up when applied to an extremely low-resource and morphologically complex language like Greenlandic.

Source: IMC

If you are interested in finding out more about the project, please contact us.