Ucom has developed an automatic Armenian speech recognition system, based on artificial intelligence.
Itel.am talked to Zaven Navoyan, team leader of Ucom’s speech recognition project, about the technology.
About the technology
“This system automatically recognizes and records Armenian speech during a telephone conversation,” Zaven Navoyan explained, emphasizing that the system does not understand the content and simply transcribes it into text.
According to Zaven Navoyan, the Ucom team has been developing the system for six years. The work involves specialists in natural language processing and machine learning, programmers, and linguists.
Operation of the system will be accompanied by the collection of new recordings, which will help improve its quality.
About the idea
The idea of developing the system was born six years ago.
Zaven Navoyan said that the idea was abstract at the beginning and that its complexity was impossible to estimate. Despite the risks, the team decided to pursue the goal.
“Your target is essential. The task is relatively easy if you are aiming for quality in quiet conditions, when there is only one speaker. However, it gets more complex when your system has to understand any caller with any diction in any conditions (including noisy places),” Zaven Navoyan noted.
A number of challenges arose during development.
According to Zaven Navoyan, one of them is the difference between the formal and colloquial forms of the Armenian language.
“Informal Armenian often contains sentences where 90% of the words are Russian or slang expressions. The technology has to operate and maintain quality in these difficult circumstances,” he explained.
Another issue was finding experts for the job. In all, ten people worked on the speech recognition system.
“When we were just starting, we couldn’t find the specialists we needed in Armenia, so we had to outsource part of the work to Europe,” noted Navoyan.
He pointed out noise as the third obstacle they encountered.
“It turns out Armenians are quite noisy. We sent the recordings to our European colleagues, and they were surprised to find that much noise in phone calls between Armenians. Over the past six years we collected and annotated a rather large noise database. The goal is for the technology to be able to recognize and differentiate noise instead of mistaking it for a word. The database helped us stabilize the technology when it’s used in noisy environments,” said Navoyan.
“The technology is ready. Now we intend to test it within Ucom, but we have yet to define what issues we want to solve in that period. At the moment, the technology can only be used for phone call recordings, because it only operates in certain acoustic conditions. The system we have developed makes it possible for it to work in other settings too, but we aren’t thinking about that yet,” concluded Zaven Navoyan.
Narine Daneghyan talked to Zaven Navoyan