AI can help bridge Southeast Asia’s 1,000 languages—but the work ‘has to be done by Southeast Asians’

The full extent of Southeast Asia's cultural diversity risks being ignored by AI models built on English and Mandarin Chinese.

BY DAVID AUSTIN
July 30, 2024 at 2:53 PM GMT+1
“AI models are built on data…and the region is not well represented in the digital space,” said Leslie Teo, lead on the Southeast Asian Languages in One Network (Sea-Lion) project.
GRAHAM UDEN FOR FORTUNE
With over 1,000 languages, Southeast Asia is one of the most linguistically diverse regions in the world—and that’s a challenge for businesses trying to operate with talent and customers right next door.
“The language barrier can be a huge issue,” said Kisson Lin, co-founder and chief operating officer of Singaporean AI startup Mindverse AI, at the Fortune Brainstorm AI Singapore conference on Tuesday. “We have different colleagues from different regions speaking different languages. It’s not only that you [find] it hard to collaborate, but also hard to bond with each other.”
But can AI bridge the linguistic divide, without eradicating the cultural nuances within a diverse population of 600 million?
Answering this question could unlock new markets for global businesses. Lin pointed out that Alibaba’s sales revenue spiked once it started using AI to translate product information.
AI might even help India’s prolific, multilingual entertainment industry “propagate to the whole world,” says Sambit Sahu, senior vice president of silicon design for Ola Krutrim, an Indian AI startup.
Yet Leslie Teo, lead of the Southeast Asian Languages in One Network (Sea-Lion) project, said that hundreds of Southeast Asian languages present a unique challenge to developers. “AI models are built on data…and the region is not well represented in the digital space.” That means the richness of the area’s food, history, and culture—particularly from smaller language groups like Khmer and Lao—risks being left out.
The benchmarks for judging AI’s performance are also largely driven by English and Mandarin Chinese...