Scooped by
Charles Tiayon
January 25, 10:03 AM
"Peter Burke once noted that translators, like historians, are “serving two masters and attempting to reconcile fidelity to the original with intelligibility to their readers” (2007). It is particularly interesting to examine how translation functioned during the emergence of a new political-economic reality and its language in the formation of a well-ordered police state. Among the texts on political economy translated from German into Russian, the most important were those on cameral and police sciences. The history of their translation is embedded in the broader history of cultural transfer (Espagne). Manuscript translations of Wilhelm von Schröder (1640–1688) on the prince’s treasury from the early eighteenth century, the numerous multi-volume book and journal translations of Johann H. G. von Justi (1717–1771) on manufactures, welfare, and the science of state governance, and the influential translation of Joseph F. von Sonnenfels’s (1732–1817) book on politics and finance form the basis for reflections on the translation of the concepts of state, society, welfare, happiness (Glückseligkeit), and police (Policey). The paper is mainly focused on two questions: what was translated, and in what manner (with what intentions) were the translations carried out? At times, information about the translators and the intended audience provides additional context for understanding the adaptation, transfer, and reception of these ideas. In my research, I proceed from the non-neutrality of translation (not merely a linguistic transfer), but instead consider it as a process of profound transformation, appropriation, and semantic interaction. The political, aesthetic, and intellectual act of translation and transfer invites reflection on a broader “bracket” under the name of the Baroque that brings these translations together in the age of the Enlightenment. Danila Raskov is a researcher at the Department of Economic History, Uppsala U... 
He is the author of the books "The Economic Institutions of Old Believers" (2012) and "The Rhetoric of Institutional Economics" (2023), both published in Russian. He is currently working on the book Cameralism in the Building of the Russian Empire: Administrative and Intellectual Discourses on the Well-Ordered State."
Date: 3 February 2026, 15:15–17:00
Location: IRES Library, Gamla torget 3, 3rd Floor
Type: Lecture, Seminar
Organiser: Institute for Russian and Eurasian Studies (IRES)
IRES higher seminar
Last modified: 2026-01-23
Contact: +46 18 471 00 00 (switchboard)
https://www.uu.se/en/department/russian-and-eurasian-studies/events/archive/2026-02-03-translations-of-cameralists-in-the-eighteenth-century-russian-empire
#Metaglossia #metaglossia_mundus #métaglossie
Researchers across Africa, Asia and the Middle East are building their own language models designed for local tongues, cultural nuance and digital independence
"In a high-stakes artificial intelligence race between the United States and China, an equally transformative movement is taking shape elsewhere. From Cape Town to Bangalore, from Cairo to Riyadh, researchers, engineers and public institutions are building homegrown AI systems, models that speak not just in local languages, but with regional insight and cultural depth.
The dominant narrative in AI, particularly since the early 2020s, has focused on a handful of US-based companies: OpenAI with GPT, Google with Gemini, Meta with LLaMa, and Anthropic with Claude. They vie to build ever larger and more capable models. Earlier in 2025, China’s DeepSeek, a Hangzhou-based startup, added a new twist by releasing large language models (LLMs) that rival their American counterparts with a smaller computational demand. But increasingly, researchers across the Global South are challenging the notion that technological leadership in AI is the exclusive domain of these two superpowers.
Instead, scientists and institutions in countries like India, South Africa, Egypt and Saudi Arabia are rethinking the very premise of generative AI. Their focus is not on scaling up, but on scaling right, building models that work for local users, in their languages, and within their social and economic realities.
“How do we make sure that the entire planet benefits from AI?” asks Benjamin Rosman, a professor at the University of the Witwatersrand and a lead developer of InkubaLM, a generative model trained on five African languages. “I want more and more voices to be in the conversation”.
Beyond English, beyond Silicon Valley
Large language models work by training on massive troves of online text. While the latest versions of GPT, Gemini or LLaMa boast multilingual capabilities, the overwhelming presence of English-language material and Western cultural contexts in these datasets skews their outputs. For speakers of Hindi, Arabic, Swahili, Xhosa and countless other languages, that means AI systems may not only stumble over grammar and syntax, they can also miss the point entirely.
“In Indian languages, large models trained on English data just don’t perform well,” says Janki Nawale, a linguist at AI4Bharat, a lab at the Indian Institute of Technology Madras. “There are cultural nuances, dialectal variations, and even non-standard scripts that make translation and understanding difficult.” Nawale’s team builds supervised datasets and evaluation benchmarks for what specialists call “low resource” languages, those that lack robust digital corpora for machine learning.
It’s not just a question of grammar or vocabulary. “The meaning often lies in the implication,” says Vukosi Marivate, a professor of computer science at the University of Pretoria, in South Africa. “In isiXhosa, the words are one thing but what’s being implied is what really matters.” Marivate co-leads Masakhane NLP, a pan-African collective of AI researchers that recently developed AFROBENCH, a rigorous benchmark for evaluating how well large language models perform on 64 African languages across 15 tasks. The results, published in a preprint in March, revealed major gaps in performance between English and nearly all African languages, especially with open-source models.
Similar concerns arise in the Arabic-speaking world. “If English dominates the training process, the answers will be filtered through a Western lens rather than an Arab one,” says Mekki Habib, a robotics professor at the American University in Cairo. A 2024 preprint from the Tunisian AI firm Clusterlab finds that many multilingual models fail to capture Arabic’s syntactic complexity or cultural frames of reference, particularly in dialect-rich contexts.
Governments step in
For many countries in the Global South, the stakes are geopolitical as well as linguistic. Dependence on Western or Chinese AI infrastructure could mean diminished sovereignty over information, technology, and even national narratives. In response, governments are pouring resources into creating their own models.
Saudi Arabia’s national AI authority, SDAIA, has built ‘ALLaM,’ an Arabic-first model based on Meta’s LLaMa-2, enriched with more than 540 billion Arabic tokens. The United Arab Emirates has backed several initiatives, including ‘Jais,’ an open-source Arabic-English model built by MBZUAI in collaboration with US chipmaker Cerebras Systems and the Abu Dhabi firm Inception. Another UAE-backed project, Noor, focuses on educational and Islamic applications.
In Qatar, researchers at Hamad Bin Khalifa University, and the Qatar Computing Research Institute, have developed the Fanar platform and its LLMs Fanar Star and Fanar Prime. Trained on a trillion tokens of Arabic, English, and code, Fanar’s tokenization approach is specifically engineered to reflect Arabic’s rich morphology and syntax.
India has emerged as a major hub for AI localization. In 2024, the government launched BharatGen, a public-private initiative funded with ₹235 crore (€26 million) and aimed at building foundation models attuned to India’s vast linguistic and cultural diversity. The project is led by the Indian Institute of Technology in Bombay and also involves its sister organizations in Hyderabad, Mandi, Kanpur, Indore, and Madras. The programme’s first product, e-vikrAI, can generate product descriptions and pricing suggestions from images in various Indic languages. Startups like Ola-backed Krutrim and CoRover’s BharatGPT have jumped in, while Google’s Indian lab unveiled MuRIL, a language model trained exclusively on Indian languages. The Indian government’s AI Mission has received more than 180 proposals from local researchers and startups to build national-scale AI infrastructure and large language models, and the Bengaluru-based company Sarvam AI has been selected to build India’s first ‘sovereign’ LLM, expected to be fluent in various Indian languages.
In Africa, much of the energy comes from the ground up. Masakhane NLP and Deep Learning Indaba, a pan-African academic movement, have created a decentralized research culture across the continent. One notable offshoot, Johannesburg-based Lelapa AI, launched InkubaLM in September 2024. It’s a ‘small language model’ (SLM) focused on five African languages with broad reach: Swahili, Hausa, Yoruba, isiZulu and isiXhosa.
“With only 0.4 billion parameters, it performs comparably to much larger models,” says Rosman. The model’s compact size and efficiency are designed to meet Africa’s infrastructure constraints while serving real-world applications. Another African model is UlizaLlama, a 7-billion parameter model developed by the Kenyan foundation Jacaranda Health, to support new and expectant mothers with AI-driven support in Swahili, Hausa, Yoruba, Xhosa, and Zulu.
India’s research scene is similarly vibrant. The AI4Bharat laboratory at IIT Madras has just released IndicTrans2, which supports translation across all 22 scheduled Indian languages. Sarvam AI, another startup, released its first LLM last year to support 10 major Indian languages. And KissanAI, co-founded by Pratik Desai, develops generative AI tools to deliver agricultural advice to farmers in their native languages.
The data dilemma
Yet building LLMs for underrepresented languages poses enormous challenges. Chief among them is data scarcity. “Even Hindi datasets are tiny compared to English,” says Tapas Kumar Mishra, a professor at the National Institute of Technology, Rourkela in eastern India. “So, training models from scratch is unlikely to match English-based models in performance.”
Rosman agrees. “The big-data paradigm doesn’t work for African languages. We simply don’t have the volume.” His team is pioneering alternative approaches like the Esethu Framework, a protocol for ethically collecting speech datasets from native speakers and redistributing revenue back to further development of AI tools for under-resourced languages. The project’s pilot used read speech from isiXhosa speakers, complete with metadata, to build voice-based applications.
In Arab nations, similar work is underway. Clusterlab’s 101 Billion Arabic Words Dataset is the largest of its kind, meticulously extracted and cleaned from the web to support Arabic-first model training.
The cost of staying local
But for all the innovation, practical obstacles remain. “The return on investment is low,” says KissanAI’s Desai. “The market for regional language models is big, but those with purchasing power still work in English.” And while Western tech companies attract the best minds globally, including many Indian and African scientists, researchers at home often face limited funding, patchy computing infrastructure, and unclear legal frameworks around data and privacy.
“There’s still a lack of sustainable funding, a shortage of specialists, and insufficient integration with educational or public systems,” warns Habib, the Cairo-based professor. “All of this has to change.”
A different vision for AI
Despite the hurdles, what’s emerging is a distinct vision for AI in the Global South – one that favours practical impact over prestige, and community ownership over corporate secrecy.
“There’s more emphasis here on solving real problems for real people,” says Nawale of AI4Bharat. Rather than chasing benchmark scores, researchers are aiming for relevance: tools for farmers, students, and small business owners.
And openness matters. “Some companies claim to be open-source, but they only release the model weights, not the data,” Marivate says. “With InkubaLM, we release both. We want others to build on what we’ve done, to do it better.”
In a global contest often measured in teraflops and tokens, these efforts may seem modest. But for the billions who speak the world’s less-resourced languages, they represent a future in which AI doesn’t just speak to them, but with them."
Sibusiso Biyela, Amr Rageh and Shakoor Rather
20 May 2025
https://www.natureasia.com/en/nmiddleeast/article/10.1038/nmiddleeast.2025.65
#metaglossia_mundus
Arunava Sinha recounts how his career in translation began with ‘Chowringhee’

"Celebrated translator Arunava Sinha on Sunday talked about his 100th published translation and the evolving craft of carrying Bengali literature to a global readership. In conversation with theatre director and cultural commentator Sujoy Prasad Chatterjee at the Kolkata Literary Meet, Sinha said much of his career began “almost by accident”.
Sinha traced the unlikely beginnings of his journey to Chowringhee. “I translated Chowringhee in 1992 — 30 years after the novel came out in Bangla,” he said. “And then it lay in cold storage for 14 years.” The manuscript resurfaced only when a Penguin editor asked the author Shankar for it. “He had forgotten who’d done it,” Sinha recalled. “Fortunately, I’d put my name on it. And that was how it all began”.
Asked about translating living authors, Sinha said the process rarely changes. “I translate any author the same way,” he said, whether they are alive or not. What differs, he said, is the occasional luxury of clarification — being able to call an author to ask what a word or phrase truly intends. That said, he added, such moments are rare.
The discussion turned to the difference between translating prose and poetry. “With poetry, it’s elliptical,” he said. “It’s probably a failed endeavour to try and arrive at a singular meaning.” A translator, he explained, must pay attention to “sound, rhythm, music, imagery,” often choosing to foreground only one or two elements. “That’s why two translators will never converge in poetry the way they sometimes do in prose.”
Asked why he chose translation over writing original fiction, Sinha said it wasn’t a conscious choice. “It’s not even a career that earns me a living,” he said, “It was just something I tried. I enjoyed it. Other people enjoyed it too.”
Timing, he added, mattered. “English-language publishing in India was just starting out. They were desperately looking for books — and not all of them could be written in English.”
Sinha acknowledged that a translator’s personal politics can clash with the worldview embedded in a text. Referring to a historical novel that portrays outsiders as enemies, he said, “My politics is opposed to the latent politics of the text.”
Still, he chose to translate it. “I didn’t want to take the easy option of not translating,” he said. Instead, he added a note: “Don’t shoot the messenger. The translator is only telling you what the text says in another language.”"
Agnivo Niyogi
Published 25.01.26
https://www.telegraphindia.com/my-kolkata/people/arunava-sinha-recounts-how-his-career-in-translation-began-with-chowringhee-at-kolkata-literary-meet/cid/2144324
"In an astonishing parallel to how human language evolves, communities in Mozambique use different ‘dialects’ to coordinate cooperation with wild birds.
In sub-Saharan Africa, some communities work together with wild birds to find honey. Using distinctive calls, honey-hunting people coordinate with greater honeyguides (Indicator indicator), who lead them to bee hives. A new study demonstrates that the honey-hunters’ calls vary between communities, and evolve in much the same way as human dialects.
Honey-hunters use two call types to communicate with honeyguides: recruitment calls to invite the wild birds to the hunt, and coordination calls to stay in contact with the bird as it leads them to a hive.
Once a hive has been located, honey-hunters use smoke to subdue the bees and harvest their honey, while the honeyguides help themselves to beeswax and tasty grubs.
Both species therefore benefit from this interaction: honeyguides get food that they couldn’t otherwise access, and honey-hunters are guided to the hive. Indeed, previous research shows that honey-hunters are over three times more likely to find a hive if they have a honeyguide guiding them.
Both types of call that the honey-hunters use – recruitment and coordination – vary across cultures, in the same way that human language varies. But even neighbouring villages have slightly different calls, a bit like regional dialects.
To find out if these ‘dialects’ evolved because of cultural processes, like human language dialects do, researchers recorded calls from 131 honey-hunters across 13 villages in Mozambique’s Niassa Special Reserve.
They found that neighbouring villages tended to have similar calls, while villages that were further apart had more dissimilar calls. Meanwhile, environmental factors had no effect on call similarity. This means that call variation is driven by cultural rather than environmental processes.
“This highlights just how powerfully culture shapes us as a species, even in our interactions with wild, untrained animals,” explains Dr Jessica Van Der Wal, lead author of the study.
While honey-hunters learn recruitment and coordination calls from their fathers, scientists are unsure how honeyguides learn to cooperate with humans.
“They can’t learn from their parents, because they are brood parasites: like cuckoos, they lay their eggs in other birds’ nests,” says Dr Van Der Wal. “But they may learn by observing other honeyguides interacting with humans.”
Senior author Professor Claire Spottiswoode, of the University of Cape Town’s FitzPatrick Institute of African Ornithology (who leads the Honeyguide Research Project), added: “Humans learn and maintain the local signals needed to cooperate with honeyguides, and honeyguides are in turn probably learning and so helping to reinforce these local human dialects – much as they learn larger-scale variation in human signals across Africa, more akin to different human languages.”
Figuring out how honeyguides learn to cooperate with humans is one of the team’s next research questions."
Beki Hooper
January 26, 202
https://www.discoverwildlife.com/animal-facts/birds/greater-honeyguide-human-dialects
#Metaglossia #metaglossia_mundus #métaglossie
"March 3, 2026 | 6:00 - 7:00 pm | Enarson Classroom Building

Memes are more than jokes; they are cultural snapshots. This session explores how memes reflect humor, politics, identity, and social norms across different countries. Students will compare global meme trends and learn how culture shapes what goes viral and why.
The Office of International Affairs invites international and domestic students to join us for the weekly Global Engagement Night, where you can interact with one another and have cross-cultural discussions about different regions of the world and general topics affecting all college students.
All are welcome to join the conversation every Tuesday from 6-7 p.m. in 160 Enarson Classroom Building." https://oia.osu.edu/events/global-engagement-night-laughing-translation-global-meme-culture #Metaglossia #metaglossia_mundus #métaglossie
"House Democrats Propose Bill to Mandate Multilingual Federal Resources
The bill is in direct response to the Trump administration’s efforts to make English the official language of the United States.
A group of Democratic House lawmakers want to require the federal government to offer information and written materials in multiple languages — a direct response to the Trump administration’s efforts to cut resources for non-English speakers.
Reps. Grace Meng, Judy Chu, Juan Vargas and Dan Goldman — members of the Congressional Asian Pacific American Caucus — on Friday will introduce legislation that would codify a Clinton-era executive order mandating that federal agencies offer resources to non-English speakers, according to bill text viewed by NOTUS.
“Every American deserves equal access to federal services and programs in a language they can understand,” Meng, who chairs the caucus, said in a statement. “We will continue to fight against the Trump administration’s attacks on immigrants and the essential services that our communities rely on and deserve.”
The effort comes after President Donald Trump’s March executive order designated English the official language of the United States. Trump’s directive rescinded the Clinton-era order and all accompanying policy guidance on how federal agencies could best serve non-English speakers.
The administration subsequently canceled some translation and interpretation contracts across multiple federal agencies.
Multiple federal resources for non-English speakers were discontinued or paused after Trump’s executive order. NOTUS reported in July that the Department of Justice “temporarily suspended” LEP.gov, the main site for non-English speakers to access multilingual materials including translated information from federal agencies. The site is still suspended more than six months later, though multilingual resources required by law are still available through other channels.
The Democrats’ bill would require federal agencies to ensure that “individuals with [limited English proficiency] can meaningfully access the Federally-conducted programs and activities,” including by translating documents into commonly spoken languages, offering interpretation services to non-English speakers and employing bilingual agency staff.
The legislation proposed Friday mandates that the federal government reopen access to LEP.gov.
The bill is unlikely to see widespread Republican support, with many GOP lawmakers cheering Trump’s executive order. But some Republicans who represent districts with high non-English-speaking populations said last year that they do not agree with Trump’s English-only efforts.
“I don’t understand why you’d make an issue out of it,” Rep. Carlos Gimenez, a Republican from Florida, told NOTUS in July. “I understand what the president’s trying to do with one language. But, if somebody feels more comfortable, or may speak English but reads Spanish, I don’t have a problem either way.”
The bill would also require the attorney general to create a system for people to submit complaints about barriers to language access in the federal government. LEP.gov previously aggregated links for each federal agency’s processes for filing discrimination complaints under the Civil Rights Act. Users can still submit those complaints, typically through an agency’s civil rights office, but there is no single portal or resource to do so.
Chu called the Trump administration’s actions “an attack on our immigrant communities.”
“In my district, translation services are essential for parents applying for a home loan, seniors accessing Medicare, immigrants starting a small business, and disaster survivors accessing FEMA’s resources,” she said in a statement, adding that the proposed legislation would “ensure no one is denied health care, housing, or disaster assistance because English is not their first language.”"
Shifra Dayak
January 23, 2026
https://www.notus.org/congress/house-democrats-bill-mandate-multilingual-federal-resources
#Metaglossia
#metaglossia_mundus
#métaglossie
"Red Cross and Capgemini implemented BabelSpeak to ensure that anyone with the need will be able to access translation services without worrying about the expense.

Client challenge: Confidential conversations in which language is a barrier, and which may involve sensitive, personal issues, leave vulnerable groups – such as refugees – entirely dependent on interpreters to be understood.
Solution: Red Cross engaged Capgemini to build BabelSpeak, a tool for real-time translation and transcription in approximately 100 languages that simultaneously secures sensitive data and information.
Benefits:
- Interpretation services made more accessible and efficient
- Cost of interpretation reduced by up to 90%
- Every conversation fully secured and kept private
- Digital interpreter helps vulnerable individuals be understood in encounters with authorities

In collaboration with the Norwegian Red Cross, Capgemini has conducted extensive testing and improvement of the AI-based interpretation service BabelSpeak. By doing so, the partners have given a voice to marginalized groups in society and broken down linguistic barriers.
Endless need for translation

The need for interpretation services is vast. While the Norwegian government spends around NOK 1.4 billion annually on interpretation services, this is still not enough to meet demand.
The Red Cross in particular has an almost endless demand for the translation of sensitive conversations. These interpretation needs often involve deeply personal content – and in many cases, the conversations can be about sensitive issues that have major consequences for the people involved. However, as a non-profit organization, the Red Cross must work with a limited budget while professional translation often proves expensive.
This meant that it needed another option that came with a more affordable budget. To that end, the Red Cross partnered with Capgemini, which had been involved with the development of BabelSpeak, an AI-driven interpretation solution.
AI makes translation more accessible

BabelSpeak translates between 100 different languages in four ways: speech to speech, text to text, speech to text, and text to speech. Early versions of the digital interpreter were so promising that Capgemini decided to further develop the concept. As the solution was continually refined, Telenor’s recently established AI Factory became a key partner in ensuring the entire solution was under Norwegian control.
Because BabelSpeak and the information it generates are entirely processed and operated in Norway, it can deliver services to, among others, the police, military, healthcare services, and municipalities. These sectors require strict compliance with security, data processing, and storage regulations, requirements that are shared by many other organizations and companies.
“Capgemini is committed to ‘AI for good’ – using artificial intelligence for socially beneficial purposes,” says Thordur Arnason, Vice President at Capgemini Invent and project owner for BabelSpeak. “Being understood is a human right, and BabelSpeak helps ensure safe and effective communication for vulnerable groups in society. It saves society significant costs – and contributes to avoiding misunderstandings with potentially high risk.”
Trustworthy interpreter

Such a complex field as interpretation comes with natural challenges that affected the development of BabelSpeak. For instance, there are significant differences in data availability for various languages. Some Arabic dialects did not perform as well as initially expected, an issue that was clarified via extensive testing with a language expert.
“We hypothesized that formal and technical language – such as legal or bureaucratic – would be a major challenge,” explains Arnason. “However, BabelSpeak performed very well on vocabulary-related content. Instead, variations, nuances, and dialects in sentence structure were the most difficult to get right.”
Throughout the development, we ensured that a native speaker was always on hand to review translations for quality and precision. The project team then used the results of this oversight to further refine and improve the system.
From locked in to locked out

Since the introduction of BabelSpeak at the Red Cross, the non-profit has been able to provide meaningful translation services to vulnerable groups. Primary among them have been refugees and formerly incarcerated individuals.
“BabelSpeak gives a voice to those sitting at the bottom of the table. Language barriers are especially challenging for refugees with experiences that make it difficult to trust authorities and interpreters. For them, BabelSpeak feels safer,” says Jeanette Steig Lid, Key Account Manager at the Norwegian Red Cross.
The Red Cross runs the “Network After Imprisonment” program, which aims to help those who want to leave a criminal lifestyle. Convicted criminals often experience the reality that there is a short path from being locked up (prison) to becoming locked out (of society) upon their release. At the same time, personal finances can be challenging.
Many also have a great need to manage their finances, build a network, and build a normal life in society. The network offers a range of activities to support this transition and effective communication is essential to success in this area.
BabelSpeak provides these groups with a new resource that offers critical services, ensuring that nobody has to face life-changing events with a crippling linguistic barrier adding to their challenges.
The collaboration with the Red Cross gave the BabelSpeak team the opportunity for extensive testing, giving them access to valuable experience and insights that would otherwise have been inaccessible. Both parties look forward to continuing the collaboration on a larger scale." https://www.capgemini.com/news/client-stories/supporting-marginalized-communities-with-a-digital-interpreter/ #Metaglossia #metaglossia_mundus #métaglossie
"On January 23, 2026, the Department of Anti-Technology Crime launched an investigation and crackdown on a money fraud case in which the perpetrator posed as a translator for patients going abroad for medical treatment and deceived the victims into transferring money to her bank accounts.
With the coordination of the Phnom Penh Municipal Court Prosecutor, the department arrested one suspect, a Cambodian woman identified as J.L.N.
Through investigation, authorities determined that the suspect used her bank account to defraud the victim.
The suspect has currently been referred by the Case Development Department to the Phnom Penh Municipal Court for legal proceedings.
The Department would like to remind the public to be careful when going for medical treatment and needing translation services at hospitals abroad.
Before making payment for medical treatment, please confirm and check clearly to avoid scammers who forge hospital documents to transfer money to bank accounts from unknown sources, which can eventually lead to loss of property."
https://www.khmertimeskh.com/501833209/fraudulent-medical-translator-arrested/
#Metaglossia
#metaglossia_mundus
#métaglossie
"Upcoming Translation Events (Virtual & In-Person): February 2026
Tuesday, February 10:
The International Library Presents Cristina Rivera Garza on Autobiography of Cotton with Rita Indiana | Pulitzer Prize-winning author Cristina Rivera Garza discusses her new novel, Autobiography of Cotton, translated by Christina MacSweeney. In-person. Hosted by The Center for the Art of Translation...
Starts at 7:00 p.m. (ET).
Thursday, February 12:
The 2026 Albertine Translation Prize Ceremony | On February 12, the annual Albertine Translation Prize Ceremony returns, honoring translators and American publishers of English translations of contemporary French works. This year Villa Albertine is partnering with the Colloquy series at World Poetry Books to celebrate the art of translation in a new and exciting way. Bringing together translators and readers, the evening will feature translation jousts: two translators and a moderator will engage in a lively discussion of their different renderings of the same French texts, one a work of fiction, the other a poem. In-person. Hosted by Albertine. Starts at 6:00 p.m. (ET).
Wednesday, February 18:
Colloquy #20: Translating Theory, Translators in Conversation | An event series presented by World Poetry Books in collaboration with Montez Press Radio and partnering New York City institutions and bookstores. Translators engage with live audiences in an exploration of the art of translation. For Colloquy #20, Patrick Lyons and Paul Reitter will be in conversation. In-person. Hosted by World Poetry Books. More info here. Starts at 7:00 p.m. (ET).
Thursday, February 26:
The Mushroom Gatherer: A Conversation with Author & Translator | Book launch event for The Mushroom Gatherer with readings and a moderated conversation with the author, Viktorie Hanišová, and translator, Véronique Firkusny, followed by a reception with refreshments and book signing. In-person. Hosted by Czech Center New York. More info forthcoming. Starts at 6:30 p.m. (ET).
Sanderling: An Evening with Anne Weber, Neil Blackadder and Tess Lewis | Deutsches Haus at NYU and Goethe Institut New York presents a reading by Anne Weber from her book Sanderling (Indigo Press, 2025, translated by Neil Blackadder), followed by a conversation between her and the book's translator Neil Blackadder, which will be moderated by the author and literary translator Tess Lewis. The conversation will focus on Anne Weber’s deeply personal reckoning with what it means to be German, how the past continues to manifest in the present, how this “Zeitreisetagebuch” (Diary of Time Travel) came to be, and finally, how the translation from German to English was facilitated by the author and translator. In-person. Hosted by Deutsches Haus at NYU. More info here. Starts at 6:00 p.m. (ET).
If you have an upcoming literary translation event and you'd like us to feature it on our website, please fill out this form.
https://arts.columbia.edu/content/upcoming-translation-events-virtual-person-february-2026
"Bible society wants indigenous languages as medium of instruction in schools
The Bible Society of Nigeria (BSN) has called on the federal government to reconsider the abolition of indigenous languages as a medium of instruction in schools.
The general secretary/chief executive officer of BSN, Samuel Sanusi, made the call in Lagos on Saturday during a news conference to announce activities for the society’s 60th anniversary on February 8, 2026.
Mr Sanusi said the abolition of indigenous languages as a medium of instruction in schools could undermine cultural identity.
“This issue is of concern to the BSN because it has spent six decades making the word of God available and affordable to Nigerians in their preferred languages and formats,” Mr Sanusi said.
On the anniversary celebration, Mr Sanusi said the theme of the event would be “Celebrating Impact and Building a Legacy of Hope.”
According to Mr Sanusi, BSN is a member of the United Bible Societies, a global fellowship of 155 national Bible societies operating in over 200 countries and territories.
He disclosed that BSN was currently working on 11 Bible translations and revision projects at different stages of completion across Nigeria.
The BSN general secretary said the anniversary programme would begin with a Bible exhibition on February 2, 2026, at the Lagos Bible Guest House, Palmgrove, Ilupeju.
Mr Sanusi said, “There will be a Bible walk on February 3 from the National Stadium, Surulere, to Obanikoro.
“Partners’ appreciation dinner and the dedication of a second studio for the deaf Bible translation project in Ibadan would take place on February 4.
“Foreign guests will arrive on February 5 for a CEOs’ conference scheduled to coincide with the anniversary celebration.”
He said the Founders’ Day Lecture and Awards Ceremony would be held on February 6, to be chaired by former President Goodluck Jonathan, with Pastor Poju Oyemade as the guest speaker.
Mr Sanusi added that a thanksgiving service would be held on February 8, 2026, at The Covenant Nation, Lagos.
He also announced that BSN had produced a 13-episode documentary on its activities, to be aired on DOVE TV from late January or early February.
The BSN boss said a commemorative book titled “Six Decades of Impact: Transmitting the WORD, Transforming Lives” would also be publicly presented.
He disclosed that between 2014 and 2023, BSN translated and produced 222 chronological Bible stories in Nigerian sign language for the deaf community.
According to him, BSN spent over N105.3 million on the Sign Language Bible Project between 2023 and 2024.
Mr Sanusi said BSN distributed 7,870,296 copies of assorted Bibles in the last five years.
“According to the United Bible Societies’ 2024 Scripture Distribution Report, BSN accounts for 1.15 million out of the 5.45 million full Bibles distributed across Africa,” the BSN boss explained.
Mr Sanusi disclosed that the Macedonian Call project, launched in 2018, had benefited over 50,000 people in more than 20 IDP camps nationwide.
He said BSN planned to spend over N306 million on the project in 2026, subject to donor support.
(NAN)"
News Agency of Nigeria • January 24, 2026
https://gazettengr.com/bible-society-wants-indigenous-languages-as-medium-of-instruction-in-schools/
"How to Build a Neural Machine Translation System for a Low-Resource Language
An introduction to neural machine translation
Kaixuan Chen
Jan 24, 2026
In the wake of the AI boom, the pace of technological iteration has reached an unprecedented level. Previous obstacles now seem to have viable solutions. This article serves as an “NMT 101” guide. While introducing our project, it also walks readers step by step through the process of fine-tuning an existing translation model to support a low-resource language that is not included in mainstream multilingual models.
Background: Dongxiang as a Low-Resource Language
Dongxiang is a minority language spoken in China’s Gansu Province and is classified as vulnerable by the UNESCO Atlas of the World’s Languages in Danger. Despite being widely spoken in local communities, Dongxiang lacks the institutional and digital support enjoyed by high-resource languages. Before diving into the training pipeline, it helps to briefly understand the language itself. Dongxiang, as its name suggests, is the mother tongue of the Dongxiang people. Descended from Central Asian groups who migrated to Gansu during the Yuan dynasty, the Dongxiang community has linguistic roots closely tied to Middle Mongol. From a writing-system perspective, Dongxiang has undergone a relatively recent standardization. Since the 1990s, with governmental promotion, the language has gradually adopted an official Latin-based orthography, using the 26 letters of the English alphabet and delimiting words by whitespace.
Although Dongxiang is classified within the Mongolic language family, prolonged coexistence with Mandarin-speaking communities has left it with a trove of lexical borrowings from Chinese (Mandarin). Dongxiang exhibits no overt tense inflection or grammatical gender, which may be an advantage that simplifies model training.
Based on the Dongxiang dictionary, approximately 33.8% of Dongxiang vocabulary items are of Chinese origin.
Further background on the Dongxiang language and its speakers can be found on our website, which hosts an official English-language introduction released by the Chinese government.
Our Model: How to Use the Translation System
We build our translation system on top of NLLB-200-distilled-600M, a multilingual neural machine translation model released by Meta as part of the No Language Left Behind (NLLB) project. We were inspired by the work of David Dale. However, ongoing updates to the Transformers library have made the original approach difficult to apply. In our own trials, rolling back to earlier versions (e.g., transformers ≤ 4.33) often triggered conflicts with other dependencies. In light of these constraints, we provide a full list of libraries in our project’s GitHub requirements.txt for your reference.
Our model was fine-tuned on 42,868 Dongxiang–Chinese bilingual sentence pairs. The training corpus combines publicly available materials with internally curated resources provided by local government partners, all of which were processed and cleaned in advance. Training was conducted using Adafactor, a memory-efficient optimizer well suited to large transformer models. With the distilled architecture, the full fine-tuning process can be completed in under 12 hours on a single NVIDIA A100 GPU. All training configurations, hyperparameters, and experimental settings are documented across two training Jupyter notebooks. Rather than relying on a single bidirectional model, we trained two direction-specific models to support Dongxiang–Chinese and Chinese–Dongxiang translation. Since NLLB is already pretrained on Chinese, joint training under data-imbalanced conditions tends to favor the easier or more dominant direction. As a result, performance gains on the low-resource side (Dongxiang) are often limited. However, NLLB does support bidirectional translation in a single model, and a straightforward approach is to alternate translation directions at the batch level.
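For readers who prefer the single-model route just mentioned, the batch-level alternation can be sketched in a few lines. This is a minimal illustration, not our actual training code; the batch names and direction labels are placeholders.

```python
# Minimal sketch of batch-level direction alternation for one bidirectional
# model. Each yielded pair tells the training loop which src_lang/tgt_lang
# to set on the tokenizer before computing the loss for that batch.

def alternate_directions(zh2dx_batches, dx2zh_batches):
    """Yield (direction, batch) pairs, alternating the two directions."""
    for zh2dx, dx2zh in zip(zh2dx_batches, dx2zh_batches):
        yield ("zho_Hans->sce_Latn", zh2dx)
        yield ("sce_Latn->zho_Hans", dx2zh)

# Placeholder batches; in practice these would be tokenized sentence pairs.
schedule = list(alternate_directions(["zh_batch_1", "zh_batch_2"],
                                     ["dx_batch_1", "dx_batch_2"]))
```

Alternating at the batch level ensures the model sees both directions equally often, counteracting the tendency of joint training to favor the dominant direction.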
Here are the links to our repository and website.
GitHub Repository
GitHub-hosted website
The model is also publicly available on Hugging Face.
Chinese → Dongxiang
Dongxiang → Chinese
Model Training: Step-by-Step Reproducible Pipeline
Before following this pipeline to build the model, we assume that the reader has a basic understanding of Python and fundamental concepts in natural language processing. For readers less familiar with these topics, Andrew Ng’s courses are a highly recommended gateway; I began my own journey into this field through his coursework.
Step 1: Bilingual Dataset Processing
The first stage of model training focuses on constructing a bilingual dataset. While parallel corpora for major languages can often be obtained by leveraging existing web-scraped resources, Dongxiang–Chinese data remains difficult to acquire. To support transparency and reproducibility, and with consent from the relevant data custodians, we have released both the raw corpus and a normalized version in our GitHub repository. The normalized dataset is produced through a straightforward preprocessing pipeline that removes excessive whitespace, standardizes punctuation, and ensures a clear separation between scripts. Dongxiang text is restricted to Latin characters, while Chinese text contains only Chinese characters.
Below is the code used for preprocessing:
import re
import pandas as pd

def split_lines(s: str):
    # Handle text where newlines arrive as literal "\n" escape sequences.
    if "\\n" in s and "\n" not in s:
        lines = s.split("\\n")
    else:
        lines = s.splitlines()
    lines = [ln.strip().strip("'").strip() for ln in lines if ln.strip()]
    return lines

def clean_dxg(s: str) -> str:
    # Keep only Latin letters, whitespace, and basic punctuation.
    s = re.sub(r"[^A-Za-z\s,\.?]", " ", s)
    s = re.sub(r"\s+", " ", s).strip()
    s = re.sub(r"[,.?]+$", "", s)
    return s

def clean_zh(s: str) -> str:
    # Keep only Chinese characters and sentence punctuation.
    s = re.sub(r"[^\u4e00-\u9fff,。?]", "", s)
    s = re.sub(r"[,。?]+$", "", s)
    return s

def make_pairs(raw: str) -> pd.DataFrame:
    # Lines alternate: a Dongxiang sentence, then its Chinese counterpart.
    lines = split_lines(raw)
    pairs = []
    for i in range(0, len(lines) - 1, 2):
        dxg = clean_dxg(lines[i])
        zh = clean_zh(lines[i + 1])
        if dxg or zh:
            pairs.append({"Dongxiang": dxg, "Chinese": zh})
    return pd.DataFrame(pairs, columns=["Dongxiang", "Chinese"])
In practice, bilingual sentence-level pairs are preferred over word-level entries, and excessively long sentences are split into shorter segments. This facilitates more reliable cross-lingual alignment and leads to more stable and efficient model training. Isolated dictionary entries should not be inserted into training inputs. Without surrounding context, the model cannot infer syntactic roles, or learn how words interact with surrounding tokens.
Bilingual dataset (by Author)
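As one way to implement the segmentation step described above, overlong sentences can be broken at clause-level punctuation before alignment. This is a hedged sketch for demonstration; the word threshold and delimiters are assumptions, not our exact preprocessing.

```python
import re

def split_long(sentence, max_words=30):
    """Split a sentence at clause punctuation if it exceeds max_words."""
    if len(sentence.split()) <= max_words:
        return [sentence]
    # Break at clause-level punctuation, keeping each chunk non-empty.
    chunks = [c.strip() for c in re.split(r"[,;]", sentence) if c.strip()]
    return chunks or [sentence]
```

Each resulting chunk is then paired with the corresponding chunk on the other side of the corpus, which keeps segments short enough for reliable alignment.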
When parallel data is limited, a common alternative is to generate synthetic source sentences from monolingual target-language data and pair them with the originals to form pseudo-parallel corpora. This idea was popularized by Rico Sennrich, whose work on back-translation laid the groundwork for many NMT pipelines. LLM-generated synthetic data is another viable approach. Prior work has shown that LLM-generated synthetic data is effective in building translation systems for Purépecha, an Indigenous language spoken in Mexico.
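The back-translation idea can be sketched in a few lines. Here `reverse_translate` is a hypothetical stand-in for a trained target-to-source model; only the pairing logic is shown.

```python
# Sketch of back-translation (after Sennrich et al.): synthesize source-side
# sentences from monolingual target-language text to form pseudo-parallel data.

def back_translate(monolingual_targets, reverse_translate):
    """Build (synthetic_source, real_target) pseudo-parallel pairs."""
    pairs = []
    for target in monolingual_targets:
        synthetic_source = reverse_translate(target)  # machine-generated side
        pairs.append((synthetic_source, target))      # target side stays human-written
    return pairs

# Toy stand-in for demonstration; a real pipeline would call an NMT model.
pairs = back_translate(["目标语言句子"], lambda t: f"<synthetic for: {t}>")
```

Because the human-written text sits on the target side, the decoder still learns from clean output even though the source side is synthetic.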
Step 2: Tokenizer Preparation
Before text can be digested by a neural machine translation model, it must be converted into tokens. Tokens are discrete units, typically at the subword level, that serve as the basic input symbols for neural networks. Using entire words as atomic units is impractical, as it leads to excessively large vocabularies and rapid growth in model dimensionality. Moreover, word-level representations struggle to generalize to unseen or rare words, whereas subword tokenization enables models to compose representations for novel word forms.
The official NLLB documentation already provides standard examples demonstrating how tokenization is handled. Owing to NLLB’s strong multilingual capacity, most widely used writing systems can be tokenized in a reasonable and stable manner. In our case, adopting the default NLLB multilingual tokenizer (Unigram-based) was sufficient to process Dongxiang text.
Summary statistics of tokenized Dongxiang sentences (by Author)
Whether the tokenizer should be retrained is best determined by two criteria. The first is coverage: frequent occurrences of unknown tokens (<unk>) indicate insufficient vocabulary or character handling. In our sample of 300 Dongxiang sentences, the <unk> rate is zero, suggesting full coverage under the current preprocessing. The second criterion is subword fertility, defined as the average number of subword tokens generated per whitespace-delimited word. Across the 300 samples, sentences average 6.86 words and 13.48 tokens, corresponding to a fertility of approximately 1.97. This pattern remains consistent across the distribution, with no evidence of excessive fragmentation in longer sentences.
Overall, NLLB demonstrates robust behavior even on previously unseen languages. As a result, tokenizer retraining is generally unnecessary unless the target language employs a highly unconventional writing system or even lacks Unicode support. Retraining a SentencePiece tokenizer also has implications for the embedding layer. New tokens start without pretrained embeddings and must be initialized using random values or simple averaging.
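The two criteria above can be computed with a few generic helpers. In this sketch, `tokenize` stands for any callable that maps a sentence to subword tokens (such as the NLLB tokenizer's `tokenize` method); the toy tokenizer at the end is only for illustration.

```python
def unk_rate(sentences, tokenize, unk_token="<unk>"):
    """Fraction of produced tokens that are the unknown token."""
    tokens = [tok for s in sentences for tok in tokenize(s)]
    return sum(tok == unk_token for tok in tokens) / max(len(tokens), 1)

def subword_fertility(sentences, tokenize):
    """Average number of subword tokens per whitespace-delimited word."""
    n_words = sum(len(s.split()) for s in sentences)
    n_tokens = sum(len(tokenize(s)) for s in sentences)
    return n_tokens / max(n_words, 1)

# Toy "tokenizer" that splits each word in half, purely for illustration.
toy = lambda s: [p for w in s.split()
                 for p in (w[: len(w) // 2], w[len(w) // 2 :]) if p]
```

On our 300-sentence sample these diagnostics gave an `<unk>` rate of zero and a fertility near 2, which is what justified keeping the default tokenizer.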
Step 3: Language ID Registration
In practical machine translation systems such as Google Translate, the source and target languages must be explicitly specified. NLLB adopts the same assumption. Translation is governed by explicit language tag, referred to as src_lang and tgt_lang, determining how text is encoded and generated within the model. When a language falls outside NLLB’s predefined scope, it must first be explicitly registered, along with a corresponding expansion of the model’s embedding layer. The embedding layer maps discrete tokens into continuous vector representations, allowing the neural network to process and learn linguistic patterns in a numerical form.
In our implementation, a custom language tag is added to the tokenizer as an additional special token, which assigns it a unique token ID. The model’s token embedding matrix is then resized to accommodate the expanded vocabulary. The embedding vector associated with the new language tag is initialized from a zero-centered normal distribution with a small variance, scaled by 0.02. If the newly introduced language is closely related to an existing supported language, its embedding can often be trained on top of the existing representation space. However, linguistic similarity alone does not guarantee effective transfer learning. Differences in writing systems can affect tokenization. A well-known example is Moldovan, which is linguistically identical to Romanian but is written in the Latin script, while it is written in Cyrillic in the so-called Pridnestrovian Moldavian Republic. Despite the close linguistic relationship, the difference in script introduces distinct tokenization patterns.
The code used to register a new language is presented here.
def fix_tokenizer(tokenizer, new_lang: str):
    # Register new_lang as an additional special token if it is not already present.
    old = list(tokenizer.additional_special_tokens)
    if new_lang not in old:
        tokenizer.add_special_tokens(
            {"additional_special_tokens": old + [new_lang]})
    return tokenizer.convert_tokens_to_ids(new_lang)

# We register Dongxiang as sce_Latn; the new tag is appended at the end of the vocabulary.
fix_tokenizer(tokenizer, "sce_Latn")
# output: 256204

print(tokenizer.convert_ids_to_tokens([256100, 256204]))
print(tokenizer.convert_tokens_to_ids(['lao_Laoo', 'sce_Latn']))
# output:
# ['lao_Laoo', 'sce_Latn']
# [256100, 256204]

model = AutoModelForSeq2SeqLM.from_pretrained("facebook/nllb-200-distilled-600M")
model.resize_token_embeddings(len(tokenizer))

new_id = fix_tokenizer(tokenizer, "sce_Latn")
embed_dim = model.model.shared.weight.size(1)
# Initialize the new language tag's embedding from N(0, 0.02^2).
model.model.shared.weight.data[new_id] = torch.randn(embed_dim) * 0.02
Step 4: Model Training
We fine-tuned the translation model using the Adafactor optimizer, a memory-efficient optimization algorithm designed for large-scale sequence-to-sequence models. The training schedule begins with 500 warmup steps, during which the learning rate is gradually increased up to 1e-4 to stabilize early optimization and avoid sudden gradient spikes. The model is then trained for a total of 8,000 optimization steps, with 64 sentence pairs per optimization step (batch). The maximum sequence length is set to 128 tokens, and gradient clipping is applied with a threshold of 1.0.
We initially planned to adopt early stopping. However, due to the limited size of the bilingual corpus, nearly all available bilingual data was used for training, leaving only a dozen-plus sentence pairs reserved for testing. Under these conditions, a validation set of sufficient size was not available. Therefore, although our GitHub codebase includes placeholders for early stopping, this mechanism was not actively used in practice.
Below is a snapshot of the key hyperparameters used in training.
optimizer = Adafactor(
    [p for p in model.parameters() if p.requires_grad],
    scale_parameter=False,
    relative_step=False,
    lr=1e-4,
    clip_threshold=1.0,
    weight_decay=1e-3,
)

batch_size = 64
max_length = 128
training_steps = 8000
warmup_steps = 500
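The warmup schedule described above corresponds to a constant-with-warmup learning-rate curve. The sketch below assumes a linear ramp over the first 500 steps (the shape of the ramp is an assumption; the Transformers library provides `get_constant_schedule_with_warmup` for exactly this pattern).

```python
def lr_at_step(step, peak_lr=1e-4, warmup_steps=500):
    """Learning rate at a given optimization step under linear warmup."""
    if step < warmup_steps:
        # Climb linearly from near zero to the peak over the warmup phase.
        return peak_lr * (step + 1) / warmup_steps
    # After warmup, hold the peak learning rate constant.
    return peak_lr
```

The gradual ramp avoids the sudden gradient spikes mentioned above, which matter most in the first few hundred steps when the new language-tag embedding is still randomly initialized.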
It is also worth noting that, in the design of the loss function, we adopt a computationally efficient training strategy. The model receives tokenized source sentences as input and generates the target sequence incrementally. At each step, the predicted token is compared against the corresponding reference token in the target sentence, and the training objective is computed using token-level cross-entropy loss.
loss = model(**x, labels=y.input_ids).loss

# Pseudocode below illustrates the underlying mechanism of the loss function
for each batch:
    x = tokenize(source_sentences)        # input: source language tokens
    y = tokenize(target_sentences)        # target: reference translation tokens
    predictions = model.forward(x)        # predict next-token distributions
    loss = cross_entropy(predictions, y)  # compare with reference tokens
    backpropagate(loss)
    update_model_parameters()
This formulation carries an implicit assumption: that the reference translation represents the single correct answer and that the model’s output must align with it token by token. Under this assumption, any deviation from the reference is treated as an error, even when a prediction conveys the same idea using different wording, synonyms, or an altered sentence structure.
The mismatch between token-level supervision and meaning-level correctness is particularly problematic in low-resource and morphologically flexible languages. At the training stage, this issue can be alleviated by relaxing strict token-level alignment and treating multiple paraphrased target sentences as equally valid references. At the inference stage, instead of selecting the highest-probability output, a set of candidate translations can be generated and re-ranked using semantically informed criteria (e.g., chrF).
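A minimal sketch of the multi-reference relaxation: score the model against each paraphrased reference and keep only the lowest loss, so a valid alternative phrasing is not penalized. In a real loop each entry would come from `model(**x, labels=y_i.input_ids).loss`; here plain numbers stand in.

```python
def min_reference_loss(losses_per_reference):
    """Use the closest (lowest-loss) reference as the supervision signal."""
    return min(losses_per_reference)

# Three paraphrases of one target sentence yield three candidate losses;
# only the best-matching reference drives the gradient step.
best = min_reference_loss([2.1, 1.4, 3.0])
```

This keeps token-level cross-entropy as the objective while acknowledging that several surface forms can express the same meaning.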
Step 5: Model Evaluation
Once the model is built, the next step is to examine how well it translates. Translation quality is shaped not only by the model itself, but also by how the translation process is configured at inference time. Under the NLLB framework, the target language must be explicitly specified during generation. This is done through the forced_bos_token_id parameter, which anchors the output to the intended language. Output length is controlled through two parameters. The first is the minimum output allowance (a), which guarantees a baseline number of tokens that the model is allowed to generate. The second is a scaling factor (b), which determines how the maximum output length grows in proportion to the input length. The maximum number of generated tokens is set as a linear function of the input length, computed as a + b × input_length. In addition, max_input_length limits how many input tokens the model reads.
This function powers the Chinese → Dongxiang translation.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

MODEL_DIR3 = "/content/drive/MyDrive/my_nllb_CD_model"
tokenizer3 = AutoTokenizer.from_pretrained(MODEL_DIR3)
model3 = AutoModelForSeq2SeqLM.from_pretrained(MODEL_DIR3).to(device)
model3.eval()

def translate3(text, src_lang="zho_Hans", tgt_lang="sce_Latn",
               a=16, b=1.5, max_input_length=1024, **kwargs):
    tokenizer3.src_lang = src_lang
    inputs = tokenizer3(text, return_tensors="pt", padding=True,
                        truncation=True, max_length=max_input_length).to(model3.device)
    result = model3.generate(
        **inputs,
        # Anchor the output to the intended target language.
        forced_bos_token_id=tokenizer3.convert_tokens_to_ids(tgt_lang),
        # Length budget: a + b * input_length.
        max_new_tokens=int(a + b * inputs.input_ids.shape[1]),
        **kwargs,
    )
    outputs = tokenizer3.batch_decode(result, skip_special_tokens=True)
    return outputs
Model quality is then assessed using a combination of automatic evaluation metrics and human judgment. On the quantitative side, we report standard machine translation metrics such as BLEU and ChrF++. BLEU scores were computed using standard BLEU-4, which measures word-level n-gram overlap from unigrams to four-grams and combines them using a geometric mean with brevity penalty. ChrF++ was calculated over character-level n-grams and reported as an F-score. It should be noted that the current evaluation is preliminary. Due to limited data availability at this early stage, BLEU and ChrF++ scores were computed on only a few dozen held-out sentence pairs. Our model achieved the following results:
Dongxiang → Chinese (DX→ZH)
BLEU-4: 44.00
ChrF++: 34.3
Chinese → Dongxiang (ZH→DX)
BLEU-4: 46.23
ChrF++: 59.80
BLEU-4 scores above 40 are generally regarded as strong in low-resource settings, indicating that the model captures sentence structure and key lexical choices with reasonable accuracy. The lower chrF++ score in the Dongxiang → Chinese direction is expected and does not necessarily indicate poor translation quality, as Chinese permits substantial surface-level variation in word choice and sentence structure, which reduces character-level overlap with a single reference translation.
In parallel, bilingual evaluators fluent in both languages reported that the model performs reliably on simple sentences, such as those following basic subject–verb–object structures. Performance degrades on longer and more complex sentences. While these results are encouraging, they also indicate that further improvement is still required.
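Scores like those above are typically computed with standard tooling such as sacrebleu (`corpus_bleu`, and `corpus_chrf` with `word_order=2` for chrF++). As a self-contained illustration of why chrF is sensitive to surface variation, the sketch below implements character n-gram matching at a single order; the real metric averages over n-gram orders 1 to 6 and uses a recall-weighted F-score with beta = 2.

```python
from collections import Counter

def char_ngram_f1(hypothesis, reference, n=2):
    """Balanced F-score over character n-grams (simplified chrF-style match)."""
    hyp = Counter(hypothesis[i:i + n] for i in range(len(hypothesis) - n + 1))
    ref = Counter(reference[i:i + n] for i in range(len(reference) - n + 1))
    if not hyp or not ref:
        return 0.0
    overlap = sum((hyp & ref).values())   # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

A valid Chinese translation that rephrases the reference shares fewer character n-grams with it, which is exactly the effect that depresses the Dongxiang → Chinese chrF++ score reported above.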
Step 6: Deployment
At the current stage, we deploy the project through a lightweight setup by hosting the documentation and demo interface on GitHub Pages, while releasing the trained models on Hugging Face. This approach enables public access and community engagement without incurring additional infrastructure costs. Details regarding GitHub-based deployment and Hugging Face model hosting follow the official documentation provided by GitHub Pages and the Hugging Face Hub, respectively.
This script uploads a locally trained Hugging Face–compatible model.
import os
from huggingface_hub import HfApi, HfFolder

# Load the Hugging Face access token
token = os.environ.get("HF_TOKEN")
HfFolder.save_token(token)

# Path to the local directory containing the trained model artifacts
local_dir = "/path/to/your/local_model_directory"

# Target Hugging Face Hub repository ID in the format: username/repo_name
repo_id = "your_username/your_model_name"

# Upload the entire model directory to the Hugging Face Model Hub
api = HfApi()
api.upload_folder(
    folder_path=local_dir,
    repo_id=repo_id,
    repo_type="model",
)
Following model release, a Gradio-based interface is deployed as a Hugging Face Space and embedded into the project’s GitHub Pages site. Compared to Docker-based self-deployment, using Hugging Face Spaces with Gradio avoids the cost of maintaining dedicated cloud infrastructure.
Reflection
Throughout the project, data preparation, not model training, dominated the overall workload. The time spent cleaning, validating, and aligning Dongxiang–Chinese data far exceeded the time required to fine-tune the model itself. Without local government involvement and the support of native and bilingual speakers, completing this work would not have been possible. From a technical perspective, this imbalance highlights a broader issue of representation in multilingual NLP. Low-resource languages such as Dongxiang are underrepresented not due to inherent linguistic complexity, but because the data required to support them is expensive to obtain and relies heavily on human expertise.
At its core, this project digitizes a printed bilingual dictionary and constructs a basic translation system. For a community of fewer than one million people, these incremental steps play an outsized role in ensuring that the language is not excluded from modern language technologies. Finally, let’s take a moment to appreciate the breathtaking scenery of Dongxiang Autonomous County!
River gorge in Dongxiang Autonomous County (by Author)
Contact
This article was jointly written by Kaixuan Chen and Bo Ma, who were classmates in the Department of Statistics at the University of North Carolina at Chapel Hill. Kaixuan Chen is currently pursuing a master’s degree at Northwestern University, while Bo Ma is pursuing a master’s degree at the University of California, San Diego. Both authors are open to professional opportunities.
If you are interested in our work or would like to connect, feel free to reach out:
Project GitHub: https://github.com/dongxiangtranslationproject
Kaixuan Chen: chenkaixuann77@gmail.com
Bo Ma: brian0517m@gmail.com
Written By
Kaixuan Chen"
https://towardsdatascience.com/how-to-build-a-neural-machine-translation-system-for-a-low-resource-language/
"As a rare Irish-language translator, Timothy McKeon enjoyed steady work for European Union institutions for years. But the rise of artificial intelligence tools that can translate text and, increasingly, speech nearly instantly has upended his livelihood and that of many others in his field.
He says he lost about 70% of his income when the EU translation work dried up. Now, available work consists of polishing machine-generated translations, jobs he refuses “on principle” because they help train the software taking work away from human translators. When the edited text is fed back into the translation software, “it learns from your work.”
“The more it learns, the more obsolete you become,” he said. “You’re essentially expected to dig your own professional grave.”
While workers worldwide ponder how AI might affect their livelihoods – a topic on the agenda at the World Economic Forum in Davos this week – that question is no longer hypothetical in the translation industry. Apps like Google Translate already reduced the need for human translators, and increased adoption of generative AI has only accelerated that trend.
A 2024 survey of writing professionals by the United Kingdom’s Society of Authors showed that more than a third of translators had lost work due to generative AI, which can create sophisticated text, as well as images and audio, from users’ prompts. And 43% of translators said their income had dropped because of the technology.
In the United States, data from 2010-23 analyzed by Carl Frey and Pedro Llanos-Paredes at Oxford University showed that regions where Google Translate was in greater use saw slower growth in the number of translator jobs. Originally powered by statistical translation, Google Translate shifted to a technique called neural translation in 2016, resulting in more natural-sounding text and bringing it closer to today’s AI tools.
“Our best baseline estimate is that roughly 28,000 more jobs for translators would’ve been added in the absence of machine translation,” Frey told CNN.
“It’s not a story of mass displacement but I think that’s very likely to follow.”
Timothy McKeon refuses to edit machine translations, saying it's like digging "your own professional grave." (Courtesy Timothy McKeon)
The story is similar globally, suggests McKeon: He is part of the Guerrilla Media Collective, an international group of translators and communications professionals, and says everyone in the collective supplements their income with other work due to the impact of AI.
‘The entire US is looking at Wisconsin’
Christina Green is president of Green Linguistics, a provider of language services, and a court interpreter in Wisconsin.
She worries her court role could soon vanish because of a bill that would allow courts to use AI or other machine translation in civil or criminal proceedings, and in certain other cases.
Green and other language professionals have been fighting the proposal since it was introduced in May. “The entire US is looking at Wisconsin” as a precedent, Green said, noting that the bill’s opponents had so far succeeded in stalling it.
While Green still has her court job, her company recently lost a major Fortune 10 corporate client, which she said opted to use a company offering AI translation instead. The client accounted for such an outsized share of her company’s business that she had to make layoffs.
Christina Green has had to let staff go because her translation company has lost a large amount of work to AI. (Courtesy Alvin Connor/Havone Studios)
“People and companies think they’re saving money with AI, but they have absolutely no clue what it is, how privacy is affected and what the ramifications are,” Green said.
‘Governments are not doing enough’
Fardous Bahbouh, based in London, is an Arabic-language translator and interpreter for international media organizations, including CNN. She has seen a considerable reduction in written work in recent years, which she attributes to technological developments and the financial pressures facing media outlets.
Bahbouh is also studying for a PhD focusing on the translation industry. Her research shows that technology, including AI, is “hugely impacting” translators and interpreters.
Governments should be doing more to protect foreign-language professionals from the threat posed by AI, according to Fardous Bahbouh. (Courtesy Fardous Bahbouh)
“I worry a great deal that governments are not doing enough to help them transition into other work, which could lead to greater inequality, in-work poverty and child poverty,” she told CNN.
Many translators are indeed looking to retrain “because translation isn’t generating the income it previously did,” according to Ian Giles, a translator and chair of the Translators Association at the UK’s Society of Authors. The picture is similar in the United States: Many translators are leaving the profession, Andy Benzo, president of the American Translators Association, told CNN.
And Kristalina Georgieva, the head of the International Monetary Fund, said in Davos Thursday that the number of translators and interpreters at the fund had gone down to 50 from 200 due to greater use of technology.
Governments should also do more for those remaining in the translation industry, by introducing stronger labor protections, Bahbouh argued.
Human professionals still needed Despite advances in machine translation and interpretation, technology can’t replace human language workers entirely just yet.
While using AI tools for everyday tasks like finding directions is “low-risk,” human translators will likely need to be involved for the foreseeable future in diplomatic, legal, financial and medical contexts where the risks are “humongous,” according to Benzo.
“I’m a translator and a lawyer and in both professions the nuance of each word is very specific and the (large language models powering AI tools) aren’t there yet, by far,” she said...
Giles, who translates commercial fiction from Scandinavian languages into English, used to supplement his income with translation work from companies, but that has now disappeared. Meanwhile, literary commissions have continued to come in, he said.
There’s also one key element of communication that AI can’t replace, according to Oxford University’s Frey: Human connection.
“The fact that machine translation is pervasive doesn’t mean you can build a relationship with somebody in France without speaking a word of French,” he said." Lianne Kolirin https://edition.cnn.com/2026/01/23/tech/translation-language-jobs-ai-automation-intl #Metaglossia #metaglossia_mundus #métaglossie
"Iowa State University researchers received unexpected answers to the question of how often people humanize artificial intelligence in news writing.
ISU English professor Jo Mackiewicz and Jeanine Aune, an English teaching professor and director of the university’s advanced communication program, recently published a study on how prevalent the use of anthropomorphizing, or humanizing, language is when it comes to artificial intelligence programs.
They were surprised to find that AI was not typically described in human terms. Even the use of humanizing verbs like “learns,” when considered in context, did not always treat the technology like a person.
“It’s the exact opposite of what I was expecting,” Aune said.
Aune said the pair got the idea for this research at a conference they both attended. The discussion included the suggestion that educators emphasize that AI is a tool that cannot replace communication principles and practices, and that they avoid anthropomorphizing the technology when talking about it.
Iowa State University alumni were also involved in the study, including current Brigham Young University associate professor of linguistics Matthew Baker and University of Northern Colorado assistant professor of English Jordan Smith.
Heading into their research, Aune and Mackiewicz said they both had their assumptions as to what they’d find — that humanizing the technology would be widespread. Prior research has shown people anthropomorphize when working with robotics, and Aune said they’ve read articles written about how people use and perceive AI that have suggested humanization is happening.
The research team used the News on the Web (NOW) corpus to study language relating to AI, Mackiewicz said, as it includes the most recent data aggregated from news articles originating in 20 different countries. The database has topped 20 billion words, spanning some of the earliest news articles about AI through recent coverage of the technology.
“When we started to do an analysis of the individual pairings, that’s where the findings became most interesting, because the overall finding was that, oh, they don’t really pair as frequently as prior research or opinion pieces would have you think,” Mackiewicz said.
Words the team was on the lookout for focused on “verbs that reflect cognition,” Mackiewicz said, also known as “mental verbs” — needs, learns, means, and understands are just a few examples. They also narrowed the search to references to AI or ChatGPT, as it was one of the first AI tools to become publicly available and known.
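The kind of corpus query the researchers describe, counting how often "AI" or "ChatGPT" appears paired with a mental verb, can be sketched in a few lines of Python. This is an illustrative toy, not the study's actual pipeline: the sentences, the verb list, and the naive adjacency rule are assumptions for demonstration, and real corpus work would add parsing, lemmatization, and the contextual judgment the researchers stress.

```python
import re
from collections import Counter

# Illustrative stand-ins: the study used the NOW corpus and its own
# verb inventory; these small lists are examples only.
MENTAL_VERBS = {"needs", "learns", "means", "understands", "knows", "thinks"}
SUBJECTS = {"AI", "ChatGPT"}

def count_pairings(sentences):
    """Count subject + mental-verb pairings like 'AI learns' or 'ChatGPT knows'.

    A naive matcher: a subject term immediately followed by a mental verb.
    It deliberately ignores context, which is exactly the nuance the
    researchers argue a human analyst must supply.
    """
    pattern = re.compile(
        r"\b(" + "|".join(SUBJECTS) + r")\s+(" + "|".join(MENTAL_VERBS) + r")\b"
    )
    counts = Counter()
    for sentence in sentences:
        for subject, verb in pattern.findall(sentence):
            counts[(subject, verb)] += 1
    return counts

sample = [
    "AI learns from massive text corpora.",
    "ChatGPT knows the answer, or seems to.",
    "The company uses AI to cut costs.",  # no mental-verb pairing here
]
print(count_pairings(sample))  # each of the two pairings counted once
```

A count like this is cheap to compute; as Mackiewicz notes, the hard part is deciding whether "AI learns" in a given sentence is genuinely anthropomorphizing or just technical shorthand, and that judgment does not come out of the frequency table.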
What they found is that these mental verbs are not often paired with the identified terms, and there was nuance to the instances where they were used together.
“AI means, or AI needs, or ChatGPT knows — you put those together, just on the surface, they seem anthropomorphizing, but they’re not necessarily,” Mackiewicz said.
Anthropomorphization “exists on a spectrum,” Mackiewicz said, where a word like “means” could apply to different senses and meanings depending on how it is being used. Other words have different meanings in their common usage or usage by different disciplines, such as “learns.” Mackiewicz said the terms “AI” and forms of “learn” were paired together frequently in the data, which to some would seem to say that the technology is learning like a human could, but to others would mean something completely different.
This nuance, she said, shows why strict guidelines on which words can and cannot be used in relation to AI are flawed. The advice the researchers offer instead is that writers know their audience when writing about AI and think about how their words could be interpreted.
Mackiewicz said that in a way, this work showed the value of human beings. It’s easy for a computer to count how many times words are paired together, but it’s harder to analyze it within its context and determine the language’s actual meaning.
While it would be a bigger lift than this study, the researchers said they would be interested in studying what language is used to refer to AI in other areas of writing, like social media. This would require them to create their own database and scrape social media for the necessary information, Mackiewicz said, but it could yield interesting results.
While journalists utilize style guides created by the Associated Press or other organizations when deciding how to write on certain topics and receive editing on their writing, Mackiewicz said, social media generally doesn’t follow such rules.
“We now even had some students refer to ChatGPT with a male pronoun, not talk about it as ‘it,’” Aune said. “So I personally really advocate being intentional with the language, so talking about output text and it uptakes the prompt, you could really kind of emphasize that it is a tool that’s really helpful, but it’s not replacing our brains.”"
https://iowacapitaldispatch.com/2026/01/23/iowa-state-university-researchers-dive-into-language-that-humanizes-ai-systems/
#Metaglossia
#metaglossia_mundus
#métaglossie
"The Language Flagship Program at Brigham Young University was discontinued... because of government funding cuts.
The program, funded by the Department of Defense, was an undergraduate program that offered students opportunities to enhance their language skills to a professional level.
According to the flagship center website, students were provided with experiences that focused on global engagement, professional skills, language proficiency and community and service. At BYU, Chinese and Arabic were offered through the program.
Kirk Belnap, a professor in the Department of Asian and Near Eastern Languages, was the director of the flagship program.
Belnap said a loss associated with the cancellation of the program was the departure of Ahmad Karout, a native Arabic speaker from Syria who worked as the academic director and coordinator for the program.
“We don’t have an Arab who is full-time on the faculty, so losing Ahmad Karout was a real loss for us,” Belnap said.
One of the opportunities offered through the program was the chance to travel abroad for intensive language study.
Chinese Flagship students were sent to China before COVID-19 and then to Taiwan post-pandemic. The Arabic study abroad was in Morocco.
Nicholas Heil joined the Arabic flagship program in 2020 and participated in the capstone from 2023 to 2024 before its discontinuation.
“It provided a lot of opportunities that were really cool, and then the capstone was really helpful for getting to an advanced level of Arabic,” Heil said.
“I think we were cut because we are a private university and they knew that they could cut us and we’d be okay,” Belnap said.
Even without this program available, BYU has created an independently funded Master of Arts in Professional Language (MAPL). This program provides similar opportunities for students.
The MAPL program first admitted students learning Chinese in fall 2022. It began teaching Arabic in 2024 and administrators hope to offer French sometime this year. Steve Riep is the director of the MAPL program and has been with the program since it started.
While the flagship program was more flexible, Riep said the MAPL program requires a professional focus.
“We give them an opportunity to develop professional-level fluency in a target language, in a professional domain,” Riep said. “So it would be in whatever their professional area is. That could be microbiology, engineering, business or even diplomacy.”
A study abroad is also offered through MAPL, allowing students to travel to Taiwan and Morocco to take a specified profession class in their intended language. Students also complete an internship abroad as a part of their education.
“The program is growing; it’s begun sort of small, but our hope is that this year we’re going to have a much larger pool of applicants than we’ve had in the past. We’re excited about the future,” Riep said." Sophia Howcroft, January 23, 2026 https://universe.byu.edu/campus/byu-cuts-arabic-chinese-flagship-programs-following-federal-funding-loss #Metaglossia #metaglossia_mundus #métaglossie
"Bill would protect language access for 25 million individuals in the U.S. with limited English proficiency, including 32 percent of Asian Americans and 12 percent of Native Hawaiians and Pacific Islanders.
WASHINGTON, D.C. — Today, Chair of the Congressional Asian Pacific American Caucus (CAPAC) Rep. Grace Meng (NY-06), Chair Emerita Rep. Judy Chu (CA-28), and CAPAC members Rep. Dan Goldman (NY-10) and Rep. Juan Vargas (CA-51) introduced the Language Access for All Act of 2026 to codify language access requirements for federal agencies, including translation and interpretation services under threat from the Trump administration.
In March 2025, President Trump signed Executive Order (EO) 14224 that declared English as the official language of the United States and revoked EO 13166, a 25-year-old mandate that required agencies and recipients of federal funding to provide meaningful language access to individuals with limited English proficiency (LEP). The Trump administration's Department of Justice issued new guidance that minimizes multilingual services and redirects resources towards English language education and assimilation.
These policy changes threaten language access for the over 25 million individuals in the United States (eight percent of the U.S. population) with limited English proficiency. Asian Americans have among the highest language access needs of any racial group, with 32 percent having LEP. Twelve percent of Native Hawaiians and Pacific Islanders also have significant language access needs. And while Spanish speakers make up the majority of those who speak another language in the United States, nearly 40 percent report speaking English “less than very well” in the most recent U.S. Census.
The Language Access for All Act of 2026 modernizes and strengthens the federal government’s approach to language access by codifying EO 13166 and establishing a coordinated, accountable framework to ensure meaningful access for individuals with limited English proficiency. The legislation promotes consistency across agencies, increases transparency and public engagement, and updates federal language access policy to reflect evolving technologies.
“Every American deserves equal access to federal services and programs in a language they can understand. Language access is essential to ensure individuals are able to access small business loans or receive the right medical care,” said Rep. Grace Meng, Chair of the Congressional Asian Pacific American Caucus. “I am proud to introduce the Language Access for All Act alongside my colleagues to safeguard translation services for individuals with limited English proficiency, including millions in the Asian American, Native Hawaiian, and Pacific Islander community. We will continue to fight against the Trump administration's attacks on immigrants and the essential services that our communities rely on and deserve.”
“For more than 25 years, both Democratic and Republican presidents have supported language accessibility across the federal government. Trump’s roll back of these protections is simply wrong. In my district, translation services are essential for parents applying for a home loan, seniors accessing Medicare, immigrants starting a small business, and disaster survivors accessing FEMA’s resources. That is why I’m proud to co-lead the Language Access for All Act to ensure no one is denied health care, housing, or disaster assistance because English is not their first language. Language access is a civil right and rolling back these services is an attack on our immigrant communities,” said Congressmember Judy Chu (CA-28), CAPAC Chair Emerita.
“In my district, in the most linguistically diverse city on earth, language access mandates across the federal government help ensure that everyone has meaningful access to housing loans, health care, workforce programs, and life-saving emergency alerts. Trump's attempt to roll back language access is just one of his many xenophobic attempts to attack immigrant communities, and it is completely unacceptable. I am proud to be introducing legislation to restore common sense requirements and ensure information about basic government services is made available to all,” said Rep. Dan Goldman.
“For decades, federal language access services have helped millions of people file taxes, get emergency alerts, apply for loans, and access health care. Trump’s decision to designate English as our country’s official language and attempt to scrap these critical services is dead wrong,” said Rep. Juan Vargas. “No one should be locked out of federal programs because of the language they speak. This legislation is critical as we fight to push back on Trump’s anti-immigrant agenda and keep in place the services our communities rely on.”
The Language Access for All Act of 2026:
Requires federal agencies to ensure that individuals with LEP can meaningfully access the federally conducted programs and activities of the agency, including through translation and interpretation.
Creates a public complaint system to track complaints regarding barriers to meaningful access at agencies.
Requires agencies to develop and maintain language access plans consistent with EO 13166, with public notice and comment, and to submit plans to Congress and publish them on LEP.gov.
Establishes language access technical standards that allow individuals with LEP to access agency content and applies to all agency communications, including AI and automated language assistance services.
Ensures AI-assisted language services do not replace qualified translators and interpreters, comply with federal privacy requirements, and are continuously tested for bias, discrimination, and errors.
Creates an interagency language access working group to provide guidance, coordination, and technical assistance.
Requires each agency to designate a language access coordinator to lead implementation and serve as a point of contact.
Together, these reforms aim to improve service delivery, reduce barriers to access, and ensure federal agencies are equipped to meet the language needs of the public.
This legislation builds on previous actions from CAPAC lawmakers to protect language access.
In April 2025, Members demanded answers from the 15 federal agencies regarding potential disruptions to services and programs for individuals with limited English proficiency, and demanded answers from President Trump and the Department of Justice about the steps the administration is taking to ensure essential translation services continue uninterrupted.
In July 2025, CAPAC leadership issued a joint statement condemning the Department of Justice’s new memorandum that slashes access to multilingual government services.
The Language Access for All Act of 2026 is endorsed by 53 organizations..."
January 23, 2026
Press Release
https://capac.house.gov/press-release/meng-chu-goldman-and-vargas-introduce-bill-protect-multilingual-services-federal
#Metaglossia
#metaglossia_mundus
#métaglossie
"European researchers have developed a model called EuroLLM-22B. Fully open source, it covers the 24 official languages of the EU plus 11 others.
In the face of the dominance of Anglo-American and Chinese LLMs, the EuroLLM project aims to stand out through the number of languages it covers and its fully open-source character. The model was developed by the Universidade de Lisboa (Instituto Superior Técnico), the University of Edinburgh, Université Paris-Saclay, Sorbonne Université, Naver Labs, Unbabel and the University of Amsterdam. EuroLLM also received support from European programs such as Horizon Europe, which funds research and innovation, and EuroHPC, dedicated to high-performance computing. At the MICS laboratory (Mathematics, Computer Science and Complex Systems) of CentraleSupélec, two computer-science doctoral students, Hippolyte Gisserot-Boukhlef and Nicolas Boizard, developed the EuroLLM-22B model (22.6 billion parameters), trained on the MareNostrum 5 supercomputer at the Barcelona Supercomputing Center.
From its conception, EuroLLM-22B set out to cover the 24 official languages of the European Union. It added 11 more, including Arabic, Catalan, Galician, Norwegian, Russian, Turkish, Ukrainian, Chinese, Hindi, Japanese and Korean. At the outset, the EuroLLM university group released a first, modest model with 1.7 billion parameters, then one with 9 billion. "We progressively scaled up to tackle tasks of increasing complexity: mathematics, code, translation," explains Hippolyte Gisserot-Boukhlef. Today the model has reached 22.6 billion parameters. Early results show good performance in multilingual understanding, translation and text generation.
A completely open-source DNA
The other aspect the co-founder of EuroLLM-22B insists on is the fully open-source character of the LLM. "If you start from an existing model, even an open-weight one, part of the training recipe remains unknown. To claim full open source, everything must be transparent: weights, data and methodology," he stresses. This is a reference to the distinction with providers offering so-called open LLMs such as Llama, Gemma, Qwen or DeepSeek, which do not publish their training data. EuroLLM-22B, trained on a dataset of 4 trillion tokens drawing on resources such as Wikipedia and arXiv (academic research papers), compares itself to models such as OLMo, developed by AI2, and Apertus, built by a consortium of Swiss universities.
Several avenues are being explored for EuroLLM's future development. "We are working on Mixture of Experts architectures, which make it possible to reduce compute costs while maintaining a high level of performance," says Nicolas Boizard. Multimodality (audio and video) is another line of work, relying on the compute capacity of the Jupiter supercomputer from 2026 onward. "The goal is to create models that perform well on several types of data, to combine them and to improve their results overall," the co-founder concludes."
Louise Costa and Pierre Khan
January 22, 2026
https://www.lemondeinformatique.fr/actualites/lire-eurollm-22b-parie-sur-un-modele-open-source-et-multilingue-98963.html
#Metaglossia
#metaglossia_mundus
#métaglossie
"A new translation of Thomas Mann's "La Mort à Venise" ("Death in Venice")
More than a century after the first publication of "Death in Venice," a new French translation restores the force and complexity of a major text whose unsettling power, in a Venice gripped by cholera, remains intact.
With Claire de Oliveira, translator and senior lecturer at Université Paris-Sorbonne. Thomas Mann was born in June 1875, a little over 150 years ago; that round number prompted numerous events, readings and re-readings around the world in 2025. In 2026 his work enters the public domain in many countries, renewing interest in readings and adaptations of his texts. In this context, a new translation of "La Mort à Venise," a text published in 1912, appears in 2026 together with other novellas from Éditions Christian Bourgois. In it we encounter the Italian city, several mysterious male apparitions, homosexual desire, aging, creation and death. Venice is by turns sumptuous, unhealthy and in the grip of a cholera epidemic.
The language of Thomas Mann: Claire de Oliveira finds Mann's language "sumptuous, highly original, and struggling against a fear of incompleteness and silence. For Thomas Mann, one way of escaping muteness is what he calls imposture. The opening of 'La Mort à Venise' is a parody of the style of Aschenbach, who is himself the main character of the novella."
For Claire de Oliveira, the great subject of "La Mort à Venise" is "the forbidden passion of homosexuality. Under cover of words, Thomas Mann speaks in this text of a homosexuality that was penalized and repressed. The infatuation with this adolescent of the same sex is expressed through metaphors that sublimate it, situated beyond decay, rising toward the Platonic contemplation of beauty," in which Thomas Mann "develops a new conception of aesthetics."
Taking care of the reader in translation: Claire de Oliveira looks after Thomas Mann's contemporary readership through a critical apparatus including "explanatory notes" so that "certain cultural allusions" can "be detected by contemporary French-speaking readers." In her translation she was also attentive to "intertexts in the broad sense, because Thomas Mann is a great reader and constantly alludes to authors he adores." There are in particular "extremely encrypted references to Gustave Flaubert."
A question from listener Yulia @yulia_morel for Claire de Oliveira: "As a translator, how do you keep something of the German spirit or identity specific to Thomas Mann in French? Must one necessarily seek to preserve it, or accept that translation transforms our reading?"" Thursday, January 22, 2026. From the podcast Le Book Club https://www.radiofrance.fr/franceculture/podcasts/le-book-club/une-nouvelle-traduction-pour-la-mort-a-venise-de-thomas-mann-2557648 #Metaglossia #metaglossia_mundus #métaglossie
"ChatGNB, an AI used for translation by New Brunswick civil servants
New Brunswick now has its own bilingual artificial intelligence (AI) platform. Created a year ago, ChatGNB is reserved for government employees. In the only officially bilingual province, the tool is notably used for translation from English to French.
The platform is currently in a pilot phase and supports idea generation, drafting, explanations and summaries for work-related needs, explains Mir Hyder, communications officer at the New Brunswick government's Department of Finance and Treasury Board.
The tool is also currently being used to translate documents from English to French internally.
The province notes that the tool was developed to meet privacy and security requirements that were not previously available in commercial services.
ChatGNB falls under the responsibility of the province's Office of the Chief Information Officer and is governed by a strict framework of privacy, security and governance controls, the government says.
When Radio-Canada asked whether ChatGNB could be used for the province's official bilingual communications, or might affect contract translators, the province remained vague.
The AI tool is used to translate documents from English to French, with the government exploring many ways to make our work more efficient and reduce costs. To date, the focus has been on translating internal documents where appropriate, Mir Hyder of the Department of Finance and Treasury Board wrote by email.
A warning from translators
There is no union for translators in New Brunswick.
The province's Translation Bureau referred Radio-Canada to the Department of Finance and Treasury Board...
In a written statement, the Corporation des traducteurs, traductrices, terminologues et interprètes du Nouveau-Brunswick (CTINB) comments that, while AI is a powerful and innovative tool, its use raises significant concerns in the profession about the accuracy of the translations it generates.
This is particularly true for medical, legal or technical documents, or those requiring nuances drawn from cultural context.
In the specific case of ChatGNB, the CTINB notes that the platform seems a good example of a way to reduce time and costs internally, but it nevertheless warns against possible use of the tool for external communications.
The industry consensus is clear: all AI-translated content must be treated as drafts requiring thorough revision by human experts.
— Sergey Petrov, president of the Corporation des traducteurs, traductrices, terminologues et interprètes du Nouveau-Brunswick
Sergey Petrov points out that the rapid rise of AI in society is a reality forcing translators as a whole to adapt professionally.
It does not eliminate the need for human expertise, but redefines it, he argues. The role of the professional translator as primary drafter could evolve toward that of editor, validator and linguistic strategist.
A very slippery slope
Arianne Des Rochers, associate professor in the Department of Translation and Languages at the Université de Moncton, fears for her part the potential excesses of using ChatGNB.
She worries about the errors and biases of a tool that can be used by all provincial government employees without review by trained translators.
It is a concern because it would affect the quality of government communications, texts and services in French, the professor says.
In her view, this could mean a step backward for the francophone linguistic minority.
We would have unequal services, because one language would be served by artificial intelligence while the other remains more under human control.
— Arianne Des Rochers, associate professor in the Department of Translation and Languages at the Université de Moncton
She wonders whether the bilingualism requirement in the public service might be revisited if AI becomes the solution for all government translation. I think it is a very slippery slope, and we must remain very, very cautious.
The SANB views the use of AI favourably
The Société de l'Acadie du Nouveau-Brunswick (SANB), dedicated to defending and promoting the rights and interests of the province's Acadian and francophone community, says it agrees with the possible use of AI for English-to-French translation of official government communications.
With AI there are good things, and we should not set that aside. We must move forward with it, but we must not overlook the fact that AI translation has to be checked by translation experts. That must not be eliminated, says SANB president Nicole Arseneau-Sluyter.
It will certainly reduce translation costs, but there still have to be translators. I am not against AI, but the last word must come from experts, she continues.
Nicole Arseneau-Sluyter adds that it will also be important for the project not to have a negative impact on the francophone presence in the province.
We must not lose our French-language services; we are already losing enough as it is. We must not lose ground on French, adds Nicole Arseneau-Sluyter.
Pascale Savoie-Brideau"
https://ici.radio-canada.ca/nouvelle/2220784/ia-traduction-bilingue-francais-anglais
#Metaglossia
#metaglossia_mundus
#métaglossie
"When the French language invades Spanish. Anglicisms are readily singled out, but French terms have also been slipping into Spanish for centuries. While some of these Gallicisms are assimilated, giving rise to new words, others are imported as they are, sometimes leaving Spanish speakers perplexed and putting their willingness to be inclusive to the test, writes this Madrid daily.
Translated from Spanish. Published January 19, 2026 at 5:00 a.m. A glance at a map is enough to grasp the importance of the relationship that has, of necessity, linked Spain to its northern neighbour for centuries. France is an obligatory passage; without it, there is no communication or link with the rest of Europe. There are, of course, the Pyrenees, whose presence and role have been widely mocked. The sea route also makes it possible, naturally, to reach other lands.
Yet, whatever one thinks, France caps Spain. We have feared it, loved it, hated it; we have by turns cultivated Gallophilia and Gallophobia. We give its inhabitants pejorative nicknames, such as gabacho [from the Occitan gavach, "one who speaks badly"] or franchute. We call afrancesados ("Frenchified") those Spaniards who collaborated with Bonaparte… But the fact is that they are there, France and the French. And on the whole we owe them much, indeed a great deal.
Our language owes much to theirs. From the Middle Ages to the present day, French has enriched Spanish with many words, sometimes obvious (jardín, parterre, bulevar, coqueta) and sometimes less so (the word ideología has Greco-Latin roots, but it would not exist without its French model, "idéologie")..." https://www.courrierinternational.com/article/vu-d-espagne-quand-la-langue-francaise-envahit-l-espagnol_238889 #Metaglossia #metaglossia_mundus #métaglossie
"Capping the CPF: why an amendment seeks to exclude language training. Arnaud, 3 days ago
As the debate over capping the Compte Personnel de Formation (CPF) enters its decisive phase, an amendment tabled in the Assemblée nationale is drawing particular attention from training-sector stakeholders. It proposes explicitly excluding training courses leading to a language certification from the cap mechanism introduced by the Sénat in the 2026 budget bill (projet de loi de finances).
Behind this technical wording lies a substantive issue that is at once social, economic, and strategic for vocational training policy.
What the amendment actually provides. Legally, the amendment is simple in its drafting but structural in its effects. It would supplement a paragraph of the budget text so as to exclude from the cap:
training leading to a certification aimed at attaining or improving a level of proficiency in a language.
In other words, if this amendment were adopted, certifying language courses listed in the Répertoire spécifique would not be subject to the cap on usable CPF credits, unlike other certifications in the same register.
This is a targeted exclusion, not a general challenge to the principle of the cap.
The context: a cap voted by the Sénat. To understand the scope of this amendment, one must return to the text adopted by the Sénat.
As part of the 2026 budget bill, senators voted for a mechanism capping the CPF credits that can be used for training courses sanctioned by certifications in the Répertoire spécifique. The stated objective is twofold:
to make CPF usage more accountable, and to control public spending in a constrained budgetary context. This cap would apply uniformly to all the certifications concerned, with no distinction of nature or purpose.
It is precisely this one-size-fits-all approach that the Assemblée nationale amendment calls into question.
Why language training is a special case. The amendment's explanatory note develops several structural arguments that deserve to be spelled out.
A key lever for people far from employment. Language training plays a central role in professional integration and reintegration pathways. It directly concerns:
long-term jobseekers, people with few or no qualifications, employees in precarious situations, people with migration backgrounds, and those working in regions where language proficiency conditions access to employment. In these situations, language skills are not a mere "plus". They are often a prerequisite for employability, even a decisive factor in securing career paths.
Strictly capping access to this training would, according to the amendment's authors, weaken a major lever for reducing inequalities in access to employment.
A logic of progression over the long term. Another central point of the argument: the very nature of language skills.
Unlike certain Répertoire spécifique certifications that correspond to a one-off authorisation or an immediately acquired skill, language learning follows a dynamic of continuous progression. Maintaining, consolidating, or improving a language level often requires successive courses spread over time.
Applying a uniform cap would therefore risk hindering this natural progression and contradicting the fundamental objective of the CPF: lifelong upskilling.
Transversal skills, not just job-specific ones. The amendment also insists on an important legal and economic point.
Language training cannot be equated with mere training for immediate adaptation to a job, which would fall exclusively under the employer's responsibility. It develops transversal skills, transferable from one job to another, from one sector to another, even from one country to another.
As such, it fits squarely within the CPF's vocation as a tool for workers' professional autonomy, not merely an instrument of in-house corporate training.
Consistency with recent CPF legislation. Finally, the amendment stresses that this exclusion would continue choices already made by the legislature, notably in recent texts on combating CPF fraud.
In those reforms, the specific nature of language certifications was already recognised, precisely to allow their repeated use in a logic of progression rather than single use.
Excluding them from the cap would therefore be consistent with that earlier recognition, without upsetting the overall balance of the scheme.
A targeted amendment, not a wholesale challenge. It is worth emphasising: this amendment contests neither the principle of the cap nor the objective of controlling public spending. It proposes a targeted adjustment, based on the specific nature of language training and on its social and economic impact.
The text moreover explicitly provides a financial compensation mechanism for the State, via the creation of an additional tax, in order to preserve overall budgetary balance.
An outcome still uncertain. At this stage, the amendment has not been adopted. It is pending in the parliamentary debates in the Assemblée nationale. Its fate will depend directly on:
the timetable of the debates, the political trade-offs of the current week, and, where applicable, whether Article 49.3 is invoked on the budget. If the text is forced through, this amendment may never be examined or voted on.
Key takeaways. This amendment is a clear stance on the place of language training in the CPF ecosystem. It highlights a structural tension between two legitimate objectives: controlling public spending and preserving effective access to skills that are key to employment.
Whatever its outcome, it illustrates an essential point: CPF reform is not just a budget cap. It raises fundamental questions about which skills society chooses to support as a priority.
The coming days will show whether this linguistic specificity is recognised in the final text, or whether it will have to adapt to the new common framework taking shape for all certifying training.
Full text of the amendment excluding language training from the CPF cap. At paragraph 5, after the word:
« professionnelles »,
insert the words:
« , ainsi que de celles menant à une certification visant l’atteinte ou l’amélioration d’un niveau de connaissance d’une langue » [", as well as those leading to a certification aimed at attaining or improving a level of knowledge of a language"].
Explanatory note
This amendment seeks to explicitly exclude language training from the scope of the cap introduced by the amendment adopted by the Sénat at Article 81 of the 2026 budget bill, on account of the long duration of such training and its decisive role in the employability of people far from employment.
The amendment adopted by the Sénat provides a mechanism capping the credits usable on the compte personnel de formation (CPF) for training courses sanctioned by certifications registered in the répertoire spécifique (RS). While the objective of making stakeholders accountable and controlling public spending is shared, applying this cap indiscriminately to language training appears ill-suited and counterproductive with regard to the very purposes of the CPF.
Language training is indeed an essential lever of access or return to employment for the most vulnerable groups: long-term jobseekers, low-qualified people, employees in precarious jobs or exposed to economic change, as well as people with migration backgrounds or living in regions where professional opportunities require greater language proficiency. As such, it contributes directly to reducing inequalities in access to employment and to securing career paths.
By nature, language skills follow a logic of continuous, long-term progression. Unlike other répertoire spécifique certifications, they do not correspond to a one-off or permanently acquired authorisation, but to an evolving learning process requiring successive courses to consolidate, maintain, or improve a level of proficiency. Capping them would risk hindering this progression and impeding the upskilling objective pursued by the CPF.
Moreover, language training cannot be equated with training strictly tied to immediate adaptation to a job, which falls under the employer's sole responsibility under Article L. 6321-1 of the Code du travail. It contributes to developing transversal skills, transferable from one job to another, and thus fully answers the CPF's vocation as a tool of professional autonomy and emancipation for workers.
Finally, excluding language training from the cap mechanism continues the choices made by the legislature in other recent texts, notably on combating CPF fraud, where the specific nature of language certifications was recognised so as to allow their repeated use in a logic of skills progression.
This amendment therefore seeks to preserve effective access to language training funded through the CPF, a skill indispensable to professional integration and to workers' adaptation to labour-market changes, while maintaining the overall balance of the cap mechanism for the other répertoire spécifique certifications.
The loss of revenue for the State is offset in equal measure by the creation of an additional tax on the duties mentioned in Articles 575 and 575 A of the Code général des impôts.
Source: https://www.assemblee-nationale.fr/dyn/17/amendements/2247/AN/2483" https://cpformation.com/plafonnement-du-cpf-pourquoi-un-amendement-vise-a-exclure-les-formations-linguistiques/amp/ #Metaglossia #metaglossia_mundus #métaglossie
"For African enterprises navigating global business, the question has shifted from whether to use AI translation to which system to trust when accuracy affects contracts, compliance, and customer relationships.
The global AI translation market is expanding from $1.20 billion in 2024 to $4.50 billion by 2033 at 16.5% CAGR. Despite 70% of global businesses integrating AI translation by 2025, a trust gap persists: advanced AI tools achieve only 60-85% accuracy versus professional human translation’s 95%+ accuracy.
How Do You Trust AI Translation When You Don’t Speak the Target Language?
For African enterprises expanding across the continent’s 54 countries and 2,000+ languages, or engaging with international partners, this challenge is particularly acute. Decision-makers regularly need to approve critical translations, contracts, compliance documents, and product specifications in languages they don’t understand. The traditional approach has been frustratingly inefficient: copy text into Google Translate, then DeepL, then maybe ChatGPT, manually comparing outputs and hoping for the best.
“The biggest issue isn’t that AI makes mistakes, it’s that you can’t easily tell when it’s wrong unless you speak the target language,” noted a user in the r/LanguageTechnology Reddit community, where translation professionals frequently discuss the challenges of trusting single AI engines. This sentiment echoes across enterprise technology discussions throughout 2024 and 2025, as businesses grapple with the practical reality of deploying AI translation at scale.
MachineTranslation.com’s newly launched SMART (consensus translation) feature offers a fundamentally different approach: instead of asking which single AI engine is best, it answers which translation multiple independent engines agree is correct. SMART provides the most trusted translation by comparing the outputs of 22 AI models. It automatically selects the version that the majority of AIs agree on for each sentence. This drastically reduces risk and cuts AI translation errors by 90%. This verification-first methodology represents what industry experts are calling the most significant advancement in machine translation reliability since neural networks became mainstream.
What Is AI Translation Hallucination and Why Does It Matter?
AI hallucinations in translation occur when systems generate fluent, grammatically correct content containing factual inaccuracies or fabricated information.
A 2025 Scientific Reports study analyzing 3 million mobile app reviews found 1.75% of user complaints were about hallucination-like errors. In enterprise deployments, 47% of AI users in 2024 made at least one major decision based on hallucinated content, while 39% of AI-powered customer service bots were reworked due to hallucination errors.
For translation, hallucinations manifest as dropped words, fabricated facts, and terminology drift—especially in low-resource African languages. MIT’s analysis shows hallucinations are particularly prevalent when translating out of English. OpenAI’s latest models demonstrate hallucination rates of 33-79% depending on complexity.
How Does SMART’s 22-Model Consensus Actually Work?
SMART transforms the translation workflow by querying multiple independent AI engines, including Google, DeepL, Claude, Microsoft, and others from its platform of over 22 AI models, and automatically selecting the sentence-level translation that the majority of engines converge on. Crucially, this isn’t about adding a rewriting layer or stylistic polish on top. SMART picks the strongest consensus result without modifying meaning.
“When you see independent AI systems lining up behind the same segments, you get one outcome that’s genuinely dependable,” explained Rachelle Garcia, AI Lead at Tomedes, the company behind MachineTranslation.com. “It turns the old routine of ‘compare every candidate output manually’ into simply ‘scan what actually matters.'”
The consensus model delivers three key advantages that directly address enterprise pain points:
Hallucination Mitigation:
When one engine fabricates details, others typically don’t. SMART follows the majority rather than the outlier, significantly reducing the risk of invented content making it into final deliverables.
Non-Linguist Confidence:
Stakeholders who don’t speak the target language finally see “the translation where most AIs agree,” providing a practical safety net for approval processes.
Review Efficiency:
Editors and reviewers no longer need to scrutinize five separate versions of the same sentence, dramatically accelerating quality assurance workflows.
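The sentence-level majority vote described above can be sketched in a few lines. This is an illustrative approximation only, not MachineTranslation.com's actual implementation; the function name, data shapes, and tie-breaking behaviour are assumptions for the sake of the sketch.

```python
from collections import Counter

def consensus_translation(candidates_per_sentence):
    """Pick, for each sentence, the candidate most engines agree on.

    candidates_per_sentence: a list of lists, where each inner list holds
    one candidate translation of the same source sentence from each engine.
    Sketch of majority-vote consensus; real systems would also normalize
    near-identical candidates before counting.
    """
    chosen = []
    for candidates in candidates_per_sentence:
        # most_common(1) returns the candidate with the highest vote count;
        # ties fall back to first-seen (insertion) order.
        best, _count = Counter(candidates).most_common(1)[0]
        chosen.append(best)
    return " ".join(chosen)
```

On this model, a single engine that hallucinates is simply outvoted: its fabricated sentence appears once, while the agreeing engines' version appears several times and wins the count.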
What Results Has SMART Demonstrated in Real-World Testing?
Internal evaluations on mixed business and legal material revealed that consensus-driven choices reduced visible AI errors and stylistic drift by 18-22% compared to relying on a single engine. The largest gains came from fewer hallucinated facts, tighter terminology consistency, and fewer dropped words, all critical factors for professional content used in contracts, compliance documents, and stakeholder communications.
Even more striking for enterprise decision-makers: in a focused review where professional linguists rated SMART output, 9 out of 10 described it as the safest entry point for stakeholders who don’t speak the target language at all. This directly addresses the fundamental pain point in global business operations, where executives regularly need to approve translations in languages outside their competency.
According to Ofer Tirosh, CEO of Tomedes: “MachineTranslation.com is no longer just a scoring and benchmarking layer for AI outputs; it now builds a single, trustworthy translation from those outputs, end to end. We’ve evolved beyond pure comparison into active composition, and SMART surfaces the most robust translation, not merely the highest-ranked candidate.”
These improvements arrive at a critical moment for African enterprises. As AI adoption accelerates across the continent, with industries from finance to healthcare increasingly relying on automated translation for cost efficiency, the need for verifiable accuracy has never been greater.
Where Does Consensus Translation Provide Maximum Value?
SMART offers benefits across all scenarios, but certain use cases show particularly strong ROI:
Contracts and Legal Policies:
Less scrutiny required; reviewers focus on sensitive clauses, trusting consensus for standard language. Legal AI translation achieves 90% compliance with jurisdiction-specific terminology.
Product Pages and UI Content:
Consistent phrasing across SKUs and interfaces enables faster localization. Critical for African e-commerce – 76% of online buyers prefer products with information in their local language.
Compliance and Regulatory Documents:
Fewer wording slips enable enterprises to align terminology once and distribute confidently across jurisdictions.
Technical and Medical Content:
Healthcare AI localization reduced medical translation errors by 35%, but stakes remain high. Consensus provides a “safety net” when multiple engines converge on medical terminology.
The African context makes these improvements timely. Digital transformation initiatives are driving AI adoption, while Africa’s projected AI market growth to $16.5 billion by 2030 signals increasing enterprise investment.
When Should Enterprises Still Use Human Translation?
SMART’s consensus approach significantly improves AI translation reliability, but it doesn’t eliminate the need for human expertise in all scenarios. The platform explicitly acknowledges this through its optional Human Verification feature for mission-critical content.
Industry data suggests a tiered approach delivers optimal results:
High-Stakes Content (legal documents, marketing campaigns, public-facing materials): Requires human expertise and review. Professional translation delivers 95-100% accuracy versus AI’s 70-85%, and the reputational and legal risks of errors justify the investment. Cost: $0.10-$0.25 per word for common language pairs.
Medium-Stakes Content (help articles, internal policies, onboarding materials): Works well with AI translation plus light human review, especially when using consensus approaches like SMART. Hybrid workflows combining AI draft translation with certified linguist review deliver savings of up to 45% compared to pure human translation while maintaining 97% accuracy. Cost: approximately $0.08 per word including post-editing.
Low-Stakes Content (internal communications, routine emails, preliminary drafts): SMART consensus translation without human review provides a reliable, cost-effective baseline. Free to use for basic implementations, with API options for enterprise integration.
For African SMEs with limited resources, this tiered approach offers a practical pathway. As businesses across the continent seek to transform operations with AI, understanding where to allocate human expertise versus AI automation becomes a crucial competitive advantage.
How Does SMART Fit Into Secure Enterprise Workflows?
Enterprise adoption requires robust data governance. MachineTranslation.com addresses this through:
Secure Mode:
SOC 2-compliant AI processing meeting enterprise security standards
Automatic Anonymization:
Sensitive fields anonymized before processing
Temporary Sharing:
Expiring guest links for controlled collaboration
Format Preservation:
Maintains layouts for PDFs, Word docs, and PowerPoints
No Long-Term Storage:
Content isn’t retained, addressing data sovereignty concerns
These features align with African regulatory frameworks including South Africa’s POPIA, Nigeria’s Data Protection Regulation, and the emerging AU Data Protection Framework.
What Does the Reddit Translation Community Say About This Approach?
The value of comparing multiple AI outputs resonates in translation technology discussions. In r/LanguageTechnology communities, users frequently discuss challenges of trusting single AI engines, with consensus emerging that comparing multiple outputs reduces error risk.
Reddit invested substantially in AI translation, expanding to over 35 countries in 2024. According to Slator, Reddit’s translation cost per language was under $1 million in Q3 2024. CEO Steve Huffman called machine translation “one of the best opportunities we’ve ever seen to rapidly grow the content base outside of English.”
How Should African Enterprises Implement Consensus Translation?
For technology leaders considering SMART:
Identify High-Volume Content:
Start with reliable but not mission-critical translations – product descriptions, support docs, routine correspondence.
Measure Baseline Metrics:
Document current error rates and review time before implementation.
Run Parallel Testing:
Process content through both existing workflow and SMART for 2-4 weeks.
Define Review Triggers:
Establish when consensus translations need human verification.
Scale Gradually:
Begin with one department, validate results, then expand.
This measured approach allows African enterprises embracing AI to validate results before full adoption.
What Are the Broader Implications for African Business?
SMART’s consensus approach signals a shift: rather than seeking the “best” single AI system, orchestrate multiple specialized systems and leverage agreement as reliability proxy.
For Africa’s language diversity, consensus-based approaches offer value for:
Intra-African Trade:
Reliable contract and specification translation as AfCFTA drives cross-border commerce
Regulatory Compliance:
Consistent documentation across multiple African jurisdictions
Digital Public Services:
Verifiable accuracy for e-government initiatives in multiple official languages
Healthcare and Education:
Safety mechanisms where translation errors have direct human impact
The translation market’s projected growth to $27.46 billion by 2030 (from $6.93 billion in 2024) at 25.79% annual growth reflects recognition that language barriers represent genuine economic obstacles.
What Comes Next in Translation Verification?
SMART is live on MachineTranslation.com, free for basic use with enterprise API options. The platform supports 270+ languages via web, Android, iOS, and API.
As AI reshapes enterprise operations, consensus approaches suggest a principle: when single AI systems struggle with reliability, orchestrating multiple systems and extracting consensus offers a practical path forward.
For African technology leaders, the question isn’t which AI translator is best – it’s how organizations verify translation accuracy when decisions depend on it. SMART’s 22-model consensus provides one answer, reducing errors by up to 90% and giving non-linguists a practical safety net.
In a continent where linguistic diversity is simultaneously cultural asset and business challenge, tools making multilingual communication trustworthy aren’t just convenient – they’re an infrastructure for economic integration and growth."
https://www.itnewsafrica.com/2026/01/stop-asking-which-ai-translator-is-best-start-asking-how-translation-gets-verified-inside-smarts-22-model-consensus/ #Metaglossia #metaglossia_mundus #métaglossie
"“AI-powered translations for Reels are starting to roll out in more languages, including Bengali, Tamil, Telugu, Marathi, and Kannada, on Instagram. These new additions build on our existing language support for English, Hindi, Portuguese, and Spanish.”
The addition of more of the languages spoken in India is significant, because India is now the biggest single market for both Facebook and Instagram usage, beating out the U.S. by a significant margin.
As such, the capacity to translate your Reels into more natural language for this audience could give some creators a big boost in their audience reach.
Meta’s AI Translations use the sound and tone of the creator’s own voice, in alignment with lip-syncing, to create a more authentic representation of the original clip.
And thus far, it’s having an impact. Instagram chief Adam Mosseri says that creators are seeing increased reach due to translated content, with more Reels from around the world now making their way into people’s feeds.
It could be a valuable consideration, and having the capacity to connect in more languages could boost your exposure to millions more Reels viewers.
In order to activate Meta’s AI translations, you’ll need to have a Page, or have professional mode turned on, and have at least 1,000 followers. Meta’s AI translations are available in countries where Meta AI is available..." https://www.socialmediatoday.com/news/meta-adds-more-languages-to-ai-translations-for-reels/810019/ #Metaglossia #metaglossia_mundus #métaglossie
"ATTF-BNP Paribas Translation Prize, Association taïwanaise des traducteurs de français
Hsieh Pei-chi wins the translation prize for La vie secrète d’un cimetière
Hsieh Pei-chi wins the 2025 ATTF-BNP Paribas translation prize for her translation of La vie secrète d’un cimetière, by Benoît Gallot (photos ATTF, montage Rti)
The Association taïwanaise des traducteurs de français (ATTF) announced today that Hsieh Pei-Chi (謝珮琪) is the winner of the 2025 translation prize for La vie secrète d’un cimetière, by Benoît Gallot.
Each year the ATTF awards a prize, sponsored by BNP Paribas, to the best translation of a French work published in Taiwan, alternating from one year to the next between the literary category and the humanities and social sciences category, the latter being the 2025 category.
Following the nomination of five works last November, the final jury met in Taipei on January 11 to deliberate, score each translation, and choose the winner. Hsieh Pei-chi was nominated for this prize for the first time. According to jury member Coraline Jortay: “Hsieh Pei-chi gives us an exceptionally fluid translation of Benoît Gallot’s La vie secrète d’un cimetière. It retains all the delicacy and musicality of the French text, while the translator skilfully guides sinophone readers into the little-known corners of Père Lachaise.”
The jury was composed of the author Wei-Yun Lin-Górecka (林蔚昀), the CNRS researcher Coraline Jortay, Dai Lijuan (戴麗娟), researcher at the Institute of History and Philology of Academia Sinica, Shen Ching-Kai (沈清楷), who holds a doctorate in philosophy from the Université de Louvain, and Yeh Hao (葉浩), professor of politics at Cheng Chi University."
21/01/2026 14:51
By: The Editorial Team
https://www.rti.org.tw/fr/news?uid=3&pid=187576 #Metaglossia #metaglossia_mundus #métaglossie
"Atos et Graia s’unissent pour lever les barrières linguistiques au travail grâce à la traduction vocale en temps réel
Paris, France – 20 janvier 2026 – Atos, un leader mondial de la transformation digitale accélérée par l’IA, annonce aujourd’hui un partenariat stratégique avec Graia, une plateforme d’intelligence artificielle qui redéfinit l’expérience client au travers d’interactions intelligentes et empathiques. Ensemble, les deux entreprises ont pour objectif de révolutionner le support multilingue au sein de l‘écosystème Digital Workplace d’Atos en intégrant à ses centres de services et de support la technologie de traduction vocale bidirectionnelle de pointe de Graia.
Cette initiative positionne Atos à l’avant-garde de l’innovation en matière d’environnement de travail augmenté par l’IA, en permettant une traduction vocale fluide et en temps réel. Les utilisateurs et les agents de support s’expriment naturellement dans leur langue maternelle pendant que le système traduit instantanément. L’IA générative de Graia ne se contente pas de transcrire et traduire en temps réel, elle surveille également les expressions clés, suggère aux agents des formulations adaptées à la culture de leurs interlocuteurs et automatise l’évaluation des appels. Atos bénéficie ainsi d’un contrôle qualité homogène et d’un suivi des performances d’une grande précision.
Atos accompagne aujourd’hui plus de 5 millions d’utilisateurs à travers le monde dans plus de 100 langues. Grâce à la technologie de Graia, Atos comblera les imperfections linguistiques d’utilisateurs non anglophones rencontrant des difficultés à s’exprimer avec précision en anglais. Déployé initialement dans le cadre du support multilingue, le service sera ensuite étendu aux parcours d’intégration et de formation inclusifs disponibles au sein de l’offre Digital Workplace d’Atos.
Ce partenariat marque un tournant pour l’accessibilité au travail. La traduction en temps réel accélérera la résolution des incidents et éliminera les blocages linguistiques, offrant une expérience positive, tant aux utilisateurs qu’aux équipes de support. Les clients bénéficieront de services hyper-localisés, adaptés à leur langue, leur culture et leurs préférences, et intégrés à une offre dont l’efficacité est reconnue par le marché.
L’IA générative de Graia transcrit et traduit les conversations en temps réel tout en surveillant les expressions clés et fournit aux agents des énoncés adaptés aux nuances culturelles. L’évaluation automatisée des appels garantit un contrôle qualité constant et des indicateurs de performance fiables à chaque interaction. Cette intégration renforce le leadership d’Atos dans la création d’environnements de travail digitaux inclusifs et évolutifs.
Mike McGarvey, responsable mondial de la stratégie, Digital Workplace, Atos Group, a déclaré : « En intégrant la technologie de traduction vocale de Graia à notre écosystème Digital Workplace, nous donnons à nos équipes le moyen de dialoguer plus naturellement avec nos clients du monde entier. Il s’agit d’un pas décisif vers des expériences digitales véritablement inclusives, intuitives et localisées. Des clients des secteurs de la finance, des services publics, du commerce de détail ou de la santé ont déjà manifesté un vif intérêt pour ce service et nous ont confié des projets en phase pilote. »
Sahil Rekhi, directeur commercial, Graia, a déclaré : « Atos innove sans cesse afin de proposer des environnements de travail fluides. Nous sommes très fiers d’avoir été choisis pour enrichir les interactions multilingues au sein des opérations de support technique d’Atos. Ensemble, nous démontrons comment l’IA générative et agentique peut accroître la satisfaction et la productivité des clients grâce à un support intelligent et adaptatif. »
Cette initiative s’inscrit dans la stratégie d’Atos de proposer un support informatique proactif, comme en témoigne l’Experience Operations Center (XOC). Ce service d’Atos identifie et résout en temps réel les problèmes liés à l’expérience collaborateur, créant ainsi un environnement de travail numérique plus engageant et valorisant.
***
À propos d’Atos Group
Atos Group est un leader international de la transformation digitale avec près de 67 000 collaborateurs et un chiffre d’affaires annuel de près de 10 milliards d’euros. Présent commercialement dans 61 pays, il exerce ses activités sous deux marques : Atos pour les services et Eviden pour les produits. Numéro un européen de la cybersécurité, du cloud et des supercalculateurs, Atos Group s’engage pour un avenir sécurisé et décarboné. Il propose des solutions sur mesure et intégrées, accélérées par l’IA, pour tous les secteurs d’activité. Atos Group est la marque sous laquelle Atos SE (Societas Europaea) exerce ses activités. Atos SE est cotée sur Euronext Paris.
La raison d’être d’Atos Group est de contribuer à façonner l’espace informationnel. Avec ses compétences et ses services, le Groupe supporte le développement de la connaissance, de l’éducation et de la recherche dans une approche pluriculturelle et contribue au développement de l’excellence scientifique et technologique. Partout dans le monde, le Groupe permet à ses clients et à ses collaborateurs, et plus généralement au plus grand nombre, de vivre, travailler et progresser durablement et en toute confiance dans l’espace informationnel.
Press contact
Isabelle Grangé | isabelle.grange@atos.net | +33 (0) 6 64 56 74 88" https://www.24matins.fr/pr/atos-et-graia-sunissent-pour-lever-les-barrieres-linguistiques-au-travail-grace-a-la-traduction-vocale-en-temps-reel #Metaglossia #metaglossia_mundus #métaglossie
"At The Hindu Lit for Life 2026, Deepa Bhasthi unpacked the choices, politics and power behind translating Kannada into English.
“There is no such thing as proper English.” That opening remark by writer and translator Ms. Bhasthi anchored one of the sessions at The Hindu Lit for Life 2026 titled ‘Re-imagining Stories’, moderated by writer and translator Nandini Krishnan. The discussion centred on Ms. Bhasthi’s acclaimed translation of Heart Lamp, the International Booker Prize-winning collection of Kannada stories by Banu Mushtaq.
Ms. Bhasthi was unequivocal that her role went far beyond that of a linguistic mediator. In selecting, editing and translating these works, she acted as editor, interpreter and cultural custodian, shaping what global readers now recognise as Ms. Mushtaq’s literary voice in English. “It wasn’t just the labour of translation,” she noted, “but the choices on what to include, what to leave out, that travelled internationally.”
Ms. Bhasthi opened the session by reading out an excerpt from the Kannada version of Heart Lamp. The discussion lingered on her decision to retain culturally specific terms such as ganda, pati, yajamana, and akki roti without any footnotes. Such choices allow the English version of Heart Lamp to carry an accent, a geography and a lived social texture. Translation, for her, is not about erasing difference but about inviting readers to inhabit another linguistic world. “I am more led by the language than the writer. I want the reader to understand as much of Kannada linguistic culture as possible… If someone can learn a new Kannada word, then why not?” Ms. Bhasthi says.
Ms. Bhasthi spoke about resisting the idea of a “neutral” or “global” English, particularly one shaped by colonial or metropolitan standards, which is why “South Indian speech rhythms deliberately find space in the stories,” says Ms. Bhasthi. She also acknowledged moments of negotiation, particularly with Western publishers unfamiliar with Indian languages. The term “nursing home,” she recalled with humour, proved to be one of the rare compromises.
The session also explored how Ms. Bhasthi’s translations across different authors demand different forms of attentiveness. With Heart Lamp, she said, the process required immersion into unfamiliar religious and cultural registers, including Urdu and Arabic influences within Kannada. On being asked about death as an underlying theme across the stories, Ms. Bhasthi spoke about how they are peppered with dark humour rather than being an easy or light read. “I wanted to ensure that all stages of womanhood are touched upon, from a young bride to an old woman. You realise patriarchy and religious fundamentalism affect women of all ages,” Ms. Bhasthi says.
Despite winning one of the world’s most prestigious translation prizes, Ms. Bhasthi noted that translators continue to remain marginal figures. “As long as reviewers keep calling translations ‘seamless’, they erase the labour and thinking behind every line,” she observes.
Ultimately, the session foregrounded Ms. Bhasthi not just as a translator of Heart Lamp, but as a thinker reshaping how Indian literature travels, insisting that readers learn to listen to stories that challenge them linguistically."
https://www.thehindu.com/lit-for-life/re-imagining-stories-when-language-travels-honestly/article70510354.ece
#Metaglossia
#metaglossia_mundus
#métaglossie
"Hugging Face has released FineTranslations, a large-scale multilingual dataset containing more than 1 trillion tokens of parallel text across English and 500+ languages. The dataset was created by translating non-English content from the FineWeb2 corpus into English using Gemma3 27B, with the full data generation pipeline designed to be reproducible and publicly documented.
The dataset is primarily intended to improve machine translation, particularly in the English→X direction, where performance remains weaker for many lower-resource languages. By starting from text originally written in non-English languages and translating it into English, FineTranslations provides large-scale parallel data suitable for fine-tuning existing translation models.
Beyond translation, Hugging Face reports that the resulting English corpus retains substantial cultural and contextual information from the source languages. In internal experiments, models trained on the translated English text achieved performance comparable to those trained on the original FineWeb dataset, suggesting that FineTranslations can also serve as a high-quality supplement for English-only model pretraining.
The dataset is sourced from FineWeb2, which aggregates multilingual web content from CommonCrawl snapshots collected between 2013 and 2024. To reduce skew toward highly repetitive or domain-specific material, such as religious texts and Wikipedia pages, only language subsets with a bible_wiki_ratio below 0.5 were included. For each language, up to 50 billion tokens were processed, with quality classifiers from FineWeb2-HQ applied where available, and random sampling used otherwise.
Translation was carried out at scale using the datatrove framework, which enabled robust checkpointing, asynchronous execution, and efficient GPU utilization on the Hugging Face cluster. Documents were split into chunks of up to 512 tokens, with a sliding-window strategy to preserve context across segments. Additional safeguards were introduced to mitigate common large-scale translation issues, including early classification of toxic or spam-like content, strict formatting constraints, and post-processing to ensure consistency of line breaks and structure.
Each dataset entry includes aligned original and translated text chunks, language and script identifiers, token counts, quality and educational scores, and references to the original CommonCrawl source. The dataset can be accessed through the Hugging Face datasets library...
FineTranslations is available now on Hugging Face. The dataset is released under the Open Data Commons Attribution (ODC-By) v1.0 license, and its use is subject to CommonCrawl’s terms."
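The 512-token sliding-window chunking described in the article can be sketched roughly as follows. This is a simplified illustration only, not the actual datatrove pipeline: the whitespace "tokenizer", the overlap size, and the function name are assumptions introduced for the example.

```python
# Sketch of sliding-window chunking: split a document into chunks of at
# most `max_tokens` tokens, repeating `overlap` tokens between consecutive
# chunks so each chunk carries some context from the previous one.
# Whitespace splitting stands in for a real tokenizer (an assumption).

def sliding_window_chunks(text: str, max_tokens: int = 512, overlap: int = 64):
    """Split `text` into overlapping chunks of at most `max_tokens` tokens."""
    tokens = text.split()  # stand-in tokenizer
    if not tokens:
        return []
    chunks = []
    step = max_tokens - overlap  # how far the window advances each step
    start = 0
    while start < len(tokens):
        chunks.append(" ".join(tokens[start : start + max_tokens]))
        if start + max_tokens >= len(tokens):
            break  # last chunk reached the end of the document
        start += step
    return chunks

# A 1200-token document yields 3 chunks, each adjacent pair sharing 64 tokens.
doc = " ".join(f"w{i}" for i in range(1200))
chunks = sliding_window_chunks(doc)
print(len(chunks))             # 3
print(len(chunks[0].split()))  # 512
```

In a real pipeline the chunk boundaries would be computed on model tokenizer output rather than whitespace words, and the translated chunks would then be re-joined with the post-processing consistency checks the article mentions.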
Robert Krzaczyński
Senior Software Engineer
https://www.infoq.com/news/2026/01/huggingface-fine-translations/
#Metaglossia
#metaglossia_mundus
#métaglossie
|
"Peter Burke once noted that translators, like historians, are “serving two masters and attempting to reconcile fidelity to the original with intelligibility to their readers” (2007). It is particularly interesting to examine how translation functioned during the emergence of a new political-economic reality and its language in the formation of a well-ordered police state. Among the texts on political economy translated from German into Russian, the most important were those on cameral and police sciences. The history of their translation is embedded in the broader history of cultural transfer (Espagne).
Manuscript translations of Wilhelm von Schröder (1640–1688) on the prince’s treasury from the early eighteenth century, the numerous multi-volume book and journal translations of Johann H. G. von Justi (1717–1771) on manufactures, welfare, and the science of state governance, and the influential translation of Joseph F. von Sonnenfels’s (1732–1817) book on politics and finance form the basis for reflections on the translation of the concepts of state, society, welfare, happiness (Glückseligkeit), and police (Policey). The paper is mainly focused on two questions: what was translated, and in what manner (with what intentions) were the translations carried out? At times, information about the translators and the intended audience provides additional context for understanding the adaptation, transfer, and reception of these ideas. In my research, I proceed from the non-neutrality of translation (not merely a linguistic transfer), but instead consider it as a process of profound transformation, appropriation, and semantic interaction. The political, aesthetic, and intellectual act of translation and transfer invites reflection on a broader “bracket” under the name of the Baroque that brings these translations together in the age of the Enlightenment.
Danila Raskov is a researcher at the Department of Economic History, Uppsala U... He is the author of the books "The Economic Institutions of Old Believers" (2012) and "The Rhetoric of Institutional Economics" (2023), both published in Russian. He is currently working on the book Cameralism in the Building of the Russian Empire: Administrative and Intellectual Discourses on the Well-Ordered State."
Date: 3 February 2026, 15:15–17:00
Location: IRES Library, Gamla torget 3, 3rd Floor
Type: Lecture, Seminar
Organiser: Institute for Russian and Eurasian Studies (IRES)
IRES higher seminar
Last modified: 2026-01-23
Contact
+46 18 471 00 00 (switchboard)
https://www.uu.se/en/department/russian-and-eurasian-studies/events/archive/2026-02-03-translations-of-cameralists-in-the-eighteenth-century-russian-empire
#Metaglossia
#metaglossia_mundus
#métaglossie