An automated system that reconstructs ancient languages could help recover the sound of words not spoken for thousands of years.
Like living things, languages evolve. Words mutate, sounds shift, and new tongues arise from old. Charting this landscape is usually done through manual research. But now a computer has been taught to reconstruct lost languages using the sounds uttered by those who speak their modern successors.
Alexandre Bouchard-Côté at the University of British Columbia in Vancouver, Canada, and colleagues have developed a machine-learning algorithm that uses rules about how the sounds of words can vary to infer the most likely phonetic changes behind a language's divergence.
For example, in a recent change known as the Canadian Shift, many Canadians now say "aboot" instead of "about". "It happens in all words with a similar sound," says Bouchard-Côté.
The team applied the technique to thousands of word pairings used across 637 Austronesian languages – the family that includes Fijian, Hawaiian and Tongan.
The system was able to suggest how ancestor languages might have sounded and to identify which sounds were most likely to change. When the team compared the results with work done by human specialists, they found that over 85 per cent of the system's suggestions were within a single character of the specialists' reconstructions.
For example, the modern word for "wind" in Fijian is cagi. Using this and the same word in other modern Austronesian languages, the automatic system reconstructed the ancestor word beliu, while the human experts reconstructed bali.
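One way to read the "within a single character" comparison is as an edit distance: the smallest number of single-character insertions, deletions or substitutions needed to turn one word into another. The sketch below is only an illustration of that idea, not the team's code. It assumes the standard Levenshtein distance is the measure intended, treats each letter as one sound, and uses hypothetical function names.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two words (assumed metric, see lead-in)."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into an empty word takes i deletions
        for j, char_b in enumerate(b, start=1):
            substitution = 0 if char_a == char_b else 1
            curr.append(min(
                prev[j] + 1,                 # delete char_a
                curr[j - 1] + 1,             # insert char_b
                prev[j - 1] + substitution,  # substitute, or keep a match
            ))
        prev = curr
    return prev[-1]


def within_one_character(system_form: str, expert_form: str) -> bool:
    """Hypothetical helper: True if the two forms differ by at most one edit."""
    return edit_distance(system_form, expert_form) <= 1


# The "wind" example from the article: system form versus expert form.
print(edit_distance("beliu", "bali"))         # prints 2
print(within_one_character("beliu", "bali"))  # prints False
```

Under this reading, beliu and bali are two edits apart (one substitution and one deletion), so the pair would count among the minority of reconstructions that differ from the experts' by more than a single character.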
Reconstructing ancient languages can reveal details of our ancient history. Looking at when the word for "wheel" diverges in the family tree of European languages helps us date the human settlement of different parts of the continent, for instance.
The technique could improve machine translation of phonetically similar languages, such as Portuguese and French.
Endangered languages could also be preserved if they are phonetically related to more widely spoken tongues, says Bouchard-Côté. He is now working on an online version of the tool for linguists to use.