In this photo taken Wednesday, Nov. 24, 2021, Kola Tubosun is photographed in his house in Lagos, Nigeria. (AP Photo/Sunday Alamba)
LAGOS: Computers have become amazingly precise at translating spoken words to text messages and scouring huge troves of information for answers to complex questions. At least, that is, so long as you speak English or another of the world's dominant languages.

But try talking to your phone in Yoruba, Igbo or any number of widely spoken African languages and you'll find glitches that can hinder access to information, trade, personal communications, customer service and other benefits of the global tech economy.

"We are getting to the point where if a machine doesn't understand your language it will be like it never existed," said Vukosi Marivate, chief of data science at the University of Pretoria in South Africa, in a call to action before a December virtual gathering of the world's artificial intelligence researchers.

American tech giants don't have a great track record of making their language technology work well outside the wealthiest markets, a problem that's also made it harder for them to detect dangerous misinformation on their platforms.

Marivate is part of a coalition of African researchers who have been trying to change that. Among their projects is one that found machine translation tools failed to properly translate online COVID-19 surveys from English into several African languages.

"Most people want to be able to interact with the rest of the information highway in their local language," Marivate said in an interview. He's a founding member of Masakhane, a pan-African research project to improve how dozens of languages are represented in the branch of AI known as natural language processing. It's the biggest of a number of grassroots language technology projects that have popped up from the Andes to Sri Lanka.

Tech giants offer their products in numerous languages, but they don't always pay attention to the nuances necessary for those apps to work in the real world. Part of the problem is that there's just not enough online data in those languages - including scientific and medical terms - for the AI systems to effectively learn how to get better at understanding them.

Google, for instance, offended members of the Yoruba community several years ago when its language app mistranslated Esu, a benevolent trickster god, as the devil. Facebook's language misunderstandings have been tied to political strife around the world and its inability to tamp down harmful misinformation about COVID-19 vaccines. More mundane translation glitches have been turned into joking online memes.

Omolewa Adedipe has grown frustrated trying to share her thoughts on Twitter in the Yoruba language because her automatically translated tweets usually end up with different meanings.

One time, the 25-year-old content designer tweeted, "T'Ilu o ba dun, T'Ilu o ba t'oro. Eyin l'ęmo bi ę şe şe," which means, "If the land (or country, in this context) is not peaceful, or merry, you're responsible for it." Twitter, however, managed to end up with the translation: "If you are not happy, if you are not happy."

For complex Nigerian languages like Yoruba, those accent marks -- often associated with tones -- make all the difference in communication. 'Ogun', for instance, is a Yoruba word that means war, but it can also mean a state in Nigeria (Ogun), god of iron (Ogun), stab (Ogun), twenty or property (Ogun).

"Some of the bias is deliberate given our history," said Marivate, who has devoted some of his AI research to the southern African languages of Xitsonga and Setswana spoken by his family members, as well as to the common conversational practice of "code-switching" between languages.

"The history of the African continent and in general in colonized countries, is that when language had to be translated, it was translated in a very narrow way," he said. "You were not allowed to write a general text in any language because the colonizing country might be worried that people communicate and write books about insurrections or revolutions. But they would allow religious texts."

Google and Microsoft are among the companies that say they are trying to improve technology for so-called "low-resource" languages that AI systems don't have enough data for. Computer scientists at Meta, the company formerly known as Facebook, announced in November a breakthrough on the path to a "universal translator" that could translate multiple languages at once and work better with lower-resourced languages such as Icelandic or Hausa.

That's an important step, but at the moment, only large tech companies and big AI labs in developed countries can build these models, said David Ifeoluwa Adelani. He's a researcher at Saarland University in Germany and another member of Masakhane, which has a mission to strengthen and spur African-led research to address technology "that does not understand our names, our cultures, our places, our history."

Improving the systems requires not just more data but careful human review from native speakers who are underrepresented in the global tech workforce. It also requires a level of computing power that can be hard for independent researchers to access.

Writer and linguist Kola Tubosun created a multimedia dictionary for the Yoruba language, as well as a text-to-speech machine for it. He is now working on similar speech recognition technologies for Nigeria's two other major languages, Hausa and Igbo, to help people who want to write short sentences and passages.

"We are funding ourselves," he said. "The aim is to show these things can be profitable."

Tubosun led the team that created Google's "Nigerian English" voice and accent used in tools like maps. But he said it remains difficult to raise the money needed to build technology that might allow a farmer to use a voice-based tool to follow market or weather trends.

In Rwanda, software engineer Remy Muhire is helping to build a new open-source speech dataset for the Kinyarwanda language that involves a lot of volunteers recording themselves reading Kinyarwanda newspaper articles and other texts.

"They are native speakers. They understand the language," said Muhire, a fellow at Mozilla, maker of the Firefox internet browser. Part of the project involves a collaboration with a government-supported smartphone app that answers questions about COVID-19. To improve the AI systems in various African languages, Masakhane researchers are also tapping into news sources across the continent, including Voice of America's Hausa service and the BBC broadcast in Igbo.

Increasingly, people are banding together to develop their own language approaches instead of waiting for elite institutions to solve problems, said Damian Blasi, who researches linguistic diversity at the Harvard Data Science Initiative.

Blasi co-authored a recent study that analyzed the uneven development of language technology across the world's more than 6,000 languages. For instance, it found that while Dutch and Swahili both have tens of millions of speakers, there are hundreds of scientific reports on natural language processing in the Western European language and only about 20 in the East African one.

(O'Brien reported from Providence, Rhode Island)

Source: AP. Title: "In Africa, rescuing the languages that Western tech ignores." Published 2021-12-24 08:04:15; last updated 2021-12-24 08:06:29.
