Intelligence Researchers Pursue Comprehensive Text Translation

March 27, 2017

The US Intelligence Advanced Research Projects Agency (IARPA) is seeking programmers to help develop a tool that can quickly search text in over 7,000 languages. Ars Technica reports on the initiative (dubbed Machine Translation for English Retrieval of Information in Any Language, or MATERIAL) in the article, “Intelligence Seeks a Universal Translator for Text Search in Any Language.” As it stands, it takes time to teach a search algorithm to translate each language. For the most widely used languages, this process is well along, but not so for “low-resource” languages. Writer Sean Gallagher explains:

To get reliable translation of text based on all variables could take years of language-specific training and development. Doing so for every language in a single system—even to just get a concise summary of what a document is about, as MATERIAL seeks to do—would be a tall order. Which is why one of the goals of MATERIAL, according to the IARPA announcement, ‘is to drastically decrease the time and data needed to field systems capable of fulfilling an English-in, English-out task.’

Those taking on the MATERIAL program will be given access to a limited set of machine translation and automatic speech recognition training data from multiple languages ‘to enable performers to learn how to quickly adapt their methods to a wide variety of materials in various genres and domains,’ the announcement explained. ‘As the program progresses, performers will apply and adapt these methods in increasingly shortened time frames to new languages.’

Interested developers should note that candidates are not expected to have foreign-language expertise. Gallagher notes that IARPA plans to publish its research publicly; he looks forward to wider access to foreign-language documents down the road, should the organization meet its goal.

Cynthia Murrell, March 27, 2017
