by ETRI

The Electronics and Telecommunications Research Institute (ETRI) announced on the 3rd that it has developed a 'conversational AI technology' that recognizes speech in 24 of the world's major languages and converts it into text.

ETRI said that its speech recognition technology outperforms that of global companies such as Google in Korean and performs comparably in the other languages.
 
The research team overcame the difficulties of extending the system to new languages through self-supervised learning from speech data, pseudo-labeling, a large-scale multilingual pre-trained model, and data augmentation with synthesized audio (TTS).
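Pseudo-labeling, one of the techniques the team names, generally means letting an existing recognizer transcribe unlabeled audio and keeping only the transcripts it is confident about as extra training data. The sketch below illustrates that general idea only; the announcement gives no implementation details, so the `pseudo_label` function, the 0.9 threshold, and the toy recognizer are all assumptions made for illustration and do not represent ETRI's system.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical interface: a recognizer maps an audio file name to
# (transcript, confidence in [0, 1]). This is not ETRI's actual API.
Recognizer = Callable[[str], Tuple[str, float]]


def pseudo_label(
    unlabeled: Iterable[str],
    recognize: Recognizer,
    threshold: float = 0.9,  # assumed cutoff, chosen for illustration
) -> List[Tuple[str, str]]:
    """Return (audio, transcript) pairs whose confidence meets the threshold."""
    selected = []
    for utterance in unlabeled:
        transcript, confidence = recognize(utterance)
        if confidence >= threshold:
            selected.append((utterance, transcript))
    return selected


def toy_recognizer(utterance: str) -> Tuple[str, float]:
    """Stand-in for a trained model: fixed outputs for three fake files."""
    table = {
        "a.wav": ("hello", 0.97),
        "b.wav": ("helo", 0.42),
        "c.wav": ("thank you", 0.91),
    }
    return table[utterance]


if __name__ == "__main__":
    # The low-confidence transcript for b.wav is filtered out.
    print(pseudo_label(["a.wav", "b.wav", "c.wav"], toy_recognizer))
```

In practice the selected pairs would be added to the training set and the model retrained, which is how unlabeled speech in a new language can stand in for scarce human transcriptions.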

The team also improved usability by addressing the shortcomings of commonly used end-to-end speech recognition technology. To tackle its slow response speed, the team developed a streaming inference technology that enables real-time processing.

In addition, the team developed and applied a hybrid end-to-end recognition technology that makes it easy to specialize the system for specific domains such as medicine, law, and science and technology.

Sang-Hoon Kim, Senior Researcher at ETRI's Complex Intelligence Lab, said, "It is significant that we have developed, with domestic technology, speech recognition that is comparable to that of global leaders. I hope it will be of great help."

Copyright © SmartTimes. Unauthorized reproduction and redistribution prohibited.