Raytheon Develops New Software to Bypass the Language Barrier in Research

Retrieving information has always been a key requirement of intelligence analysis. Yet, for centuries, the language barrier has been a major obstacle. In 2017, IARPA (Intelligence Advanced Research Projects Activity) launched its MATERIAL (Machine Translation for English Retrieval of Information in Any Language) program to solve the age-old problem of cross-language information retrieval. Five years later, Raytheon BBN Technologies seems to have found the solution. The American company, competing against teams such as Columbia University, Johns Hopkins University, and the University of Southern California Information Sciences Institute, designed software capable of translating search results from any language into English.

The software processes search queries in English, searches for results in other languages, and then translates them. It is essentially a universal tool: the program takes into account both the context in which a language is used and its syntactic structure, and it works on text documents, images, videos and audio recordings. Moreover, the algorithm was created with particular attention to low-data languages, for which little training data and expertise are available. Kazakh, Pashto, Somali, Swahili and Tagalog were used to train the software’s machine learning models. Additional tests showed it also held up with languages such as Farsi, Bulgarian, Lithuanian and Georgian. As Raytheon BBN Program Manager John Makhoul pointed out, low-resource languages have always been particularly difficult for search software to handle because there is too little data to train machine learning algorithms, but Raytheon managed to overcome this limitation. Agencies will thus no longer be forced to coordinate research among experts who each speak a different language.

Raytheon’s John Makhoul claimed the software “exceeded the goals of the program”, a view that the evaluation team, composed of MIT Lincoln Laboratory, the University of Maryland Center for Advanced Study of Language, the National Institute of Standards and Technology, and Tarragon Consulting, seemed to share. Given that the 2017 grant awarded to Columbia University was worth $14 million, the new contract awarded to Raytheon will likely exceed that figure.

Written by Matteo Acquarelli

Bibliography:

Florida, Robert, “Columbia University Awarded $14 Million Grant to Develop Computer System that Can Translate and Summarize Documents from Different Languages into English”, Columbia University, Data Science Institute, November 10, 2017, [online]. Available at: https://datascience.columbia.edu/news/2017/columbia-university-awarded-14-million-grant-to-develop-computer-system-that-can-translate-and-summarize-documents-from-different-languages-into-english/ [Accessed February 7, 2022].

Pomerleau, Mark, “IARPA wants those foreign documents translated … and fast”, C4ISRNET, December 27, 2017, [online]. Available at: https://www.c4isrnet.com/intel-geoint/2017/12/27/iarpa-wants-those-foreign-document-translated-and-fast/ [Accessed February 7, 2022].

Rubino, Carl, “MATERIAL: Designing an MT Program for IARPA”, 2018, [online]. Available at: https://www.iarpa.gov/images/PropsersDayPDFs/MATERIAL/AMTA_2018_Keynote_Carl-Rubino.pdf [Accessed February 7, 2022].

Strout, Nathan, “Here’s how intelligence agencies can search foreign documents without learning the language”, C4ISRNET, February 1, 2022, [online]. Available at: https://www.c4isrnet.com/intel-geoint/2022/01/31/heres-how-intelligence-agencies-can-search-foreign-documents-without-learning-the-language/ [Accessed February 7, 2022].