Meta has developed a series of artificial intelligence models to preserve languages that are in danger of disappearing. The models can already cover 4,000 languages spoken around the world.
The Massively Multilingual Speech (MMS) models have expanded text-to-speech and speech-to-text technologies from about one hundred languages to more than 1,100. The leap is matched in spoken language identification, which now covers 4,000 languages, “40 times more than before”.
Meta said so this Monday, when it shared news about the artificial intelligence models it is working on to preserve languages that are in danger of disappearing. The aim is for users to be able to access information and use their devices in the language of their choice.
With this objective, the company has announced that it is opening up its models and code so that the research community can collaborate on the task, according to a statement shared on its official blog.
The company also explained the approach it applied to the massively multilingual speech models, which made the jump in the number of supported languages possible. For this, it turned to the Bible, since it has been translated into many languages.
Translations of the Bible have been used in text-based language translation research, and they are accompanied by publicly available recorded readings. From these recordings, Meta has created “a data set of New Testament readings in more than 1,100 languages, which provided an average of 32 hours of data per language.”
To reach the figure of 4,000 languages, the researchers also considered “unlabeled recordings of other Christian religious readings.” Despite the religious content, the company maintains that the models are not biased toward producing more religious language.
In the future, the company hopes to expand the number of languages that the massively multilingual speech models can support, and to incorporate dialects.