Novosibirsk scientists have developed the world's first "smart" program for voice diagnostics of laryngeal diseases and depression (video)

1 June 2020

In early May, specialists from NSTU NETI, together with colleagues from NSPU and the phoniatric center, developed the world's first system that can diagnose, from a patient's voice, long-term psychoemotional abnormalities that are often masked. The system also helps to detect the early stages of tumor development in the vocal apparatus. In addition to medicine, the system can be used in teaching, social work, security and identification systems.

Scientists from the Novosibirsk State Technical University NETI came to the conclusion that psychoemotional disorders can be diagnosed by analyzing sound waves. After years of research, they found that voice changes are related to psychoemotional disorders. The patented technique allows abnormalities to be detected accurately through digital audio processing.

Previously, attempts to create similar voice diagnostics systems were also made abroad. In Russia, such developments were conducted mainly for military purposes. The scientists of the Radio Engineering and Electronics Faculty created a device whose unique feature is its algorithm for digital processing of the voice signal. This algorithm finds a correlation between voice changes and the psychoemotional disorders of the speaker.

Novosibirsk scientists have confirmed the hypothesis that human speech changes depending on the type of disorder. Through tests conducted with primary-school-age children, the experts identified a variable in a specially created mathematical model that captures the correlation between the voice and the psychoemotional state. According to the scientists, the number of disorders is now growing among both children and adults. These include anxiety, depression, aggression, and autoaggression.

The device performs its analysis using an algorithm based on acoustic analysis of the audio signal. A set of test phrases is recorded with a microphone and digitized by a high-resolution sound card. This is how the sound wave is converted into a digital signal. The program then processes this signal: the algorithm calculates the parameters of the high-frequency and low-frequency vibrations and the sound power, and builds a curve. The parameters obtained for the studied voice are compared with those of a reference sample. Based on the differences, the person's psychoemotional state is determined. The system will be easy to use: it only needs a microphone, a specialized audio card and a computer with the software.
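The processing chain described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the patented NSTU NETI algorithm: the article does not say which parameters are computed, so the two features used here (the share of spectral energy above a split frequency, and the mean signal power), the 1000 Hz split, the sample rate and the synthetic sine tones standing in for recorded phrases are all hypothetical.

```python
import cmath
import math


def tone(freq_hz, sample_rate, n, amplitude=1.0):
    """Synthetic sine wave standing in for a digitized recording (assumption)."""
    return [amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


def voice_features(samples, sample_rate, split_hz):
    """Return two illustrative parameters of a mono signal.

    high_fraction: share of spectral energy at or above split_hz
    mean_power:    average of the squared samples
    """
    n = len(samples)
    low = high = 0.0
    # Naive DFT over bins 1 .. n/2 - 1 (DC and Nyquist are skipped).
    # Slow, but dependency-free and fine for a short signal.
    for k in range(1, n // 2):
        coeff = sum(s * cmath.exp(-2j * math.pi * k * i / n)
                    for i, s in enumerate(samples))
        energy = abs(coeff) ** 2
        if k * sample_rate / n < split_hz:
            low += energy
        else:
            high += energy
    return {
        "high_fraction": high / (low + high),
        "mean_power": sum(s * s for s in samples) / n,
    }


SAMPLE_RATE = 8000  # Hz, hypothetical
N = 400             # number of samples, hypothetical
SPLIT_HZ = 1000     # hypothetical boundary between "low" and "high" frequencies

# Reference sample: a pure 200 Hz tone.
reference = tone(200, SAMPLE_RATE, N)
# Studied voice: the same tone plus an extra 2000 Hz component.
studied = [a + b for a, b in zip(reference, tone(2000, SAMPLE_RATE, N, 0.5))]

ref_features = voice_features(reference, SAMPLE_RATE, SPLIT_HZ)
studied_features = voice_features(studied, SAMPLE_RATE, SPLIT_HZ)

# Differences between the studied voice and the reference sample.
differences = {name: studied_features[name] - ref_features[name]
               for name in ref_features}
print(differences)
```

In a real system the differences would feed a model fitted on recordings of actual subjects; here they only show that an added high-frequency component changes both parameters relative to the reference.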

"To explain how the program works, let me give you an example. Let's assume that there is a sound parameter X, which in a normal person has a value of 0.2 to 0.3, and in a person with a clear psychoemotional disorder it equals 5. From this we can conclude that a person has certain symptoms, for example, autoaggression. This is the correlation we were looking for. Finally, we have found it," says Daria Borovikova, a co-author of the development and a young scientist at the Radio Engineering and Electronics Faculty.
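The comparison in this example can be written as a simple range check. This is a minimal sketch: X is the hypothetical parameter from the quote, the function names are invented for illustration, and the code only flags a deviation from the normal range. It does not map the value to a specific disorder, since the article does not describe how that is done.

```python
NORMAL_RANGE = (0.2, 0.3)  # normal values of parameter X quoted in the example


def outside_normal_range(x, normal_range=NORMAL_RANGE):
    """True if x lies outside the normal range (bounds included as normal)."""
    low, high = normal_range
    return not (low <= x <= high)


def ratio_to_upper_bound(x, normal_range=NORMAL_RANGE):
    """How many times x exceeds the upper bound of the normal range."""
    return x / normal_range[1]


print(outside_normal_range(0.25))  # value inside the normal range
print(outside_normal_range(5))     # value from the quoted example
print(ratio_to_upper_bound(5))
```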

In addition to psychoemotional disorders, the program can also detect functional disorders of the voice, which can later lead to permanent organic changes in the larynx: laryngitis, chorditis, nodules and polyps of the vocal cords, laryngeal papillomatosis and other diseases. According to the scientists, functional disorders are more difficult to diagnose than organic ones, which can be seen using laryngoscopes and endoscopes. However, they also pose a threat. Such disorders are usually associated with improper use of the vocal apparatus. They often occur in children who begin vocal training with incorrect technique at an early age. It is prolonged incorrect voice production that leads to inflammation.

Previously, psychological disorders and functional disorders in the patient's voice were detected by a specialist exclusively "by ear". The results of such diagnostics were subjective and depended on a particular doctor's competence. "The objective detection of abnormalities in the voice of a patient is a very complex and challenging task. Contacting highly qualified specialists is not always possible. Detecting functional disorders in the voice at an early stage makes it possible to prevent disease," said Olga Fetisova, a leading specialist of the regional consulting and diagnostic phoniatric center and a speech therapist of the highest qualification category.

Other developments based on voice analysis technologies nearly always involved sensors that had to be fixed on the patient's neck. The disadvantage of such methods is the need for contact between the doctor and the patient. Such contact may be impossible, for example, in conditions of self-isolation. Most of these methods also have weaknesses related to their signal processing technology. The voice analysis system developed by NSTU NETI scientists is free of these disadvantages and is designed to recognize signal characteristics remotely and to track changes in the signal.

Another possible application area for the digital voice processing system is security and military affairs. With additional research, the program will be able to analyze voices in the phone conversations of terrorists and criminals. It could also become an additional element of a lie detector and monitor the psychoemotional state of soldiers.

Besides medicine and security, the technology can be used in teaching and social work. In the future, the scientists plan to conduct a number of additional studies and increase the number of subjects. According to Daria Borovikova, creating such an expanded sample will help to adjust the system parameters and make the diagnostic system more accurate.

Besides NSTU NETI staff, the researchers involved in the project include psychology specialists from Novosibirsk State Pedagogical University and a phonopedist who studies the voice and voice training. According to Daria Borovikova, it was thanks to the collective work of engineers, a voice specialist and a psychologist that the team managed to achieve such results.

The research was founded by Vladimir Makukha, Doctor of Technical Sciences, Professor of the Radio Engineering and Electronics Faculty. For more than 10 years he worked on the objectification of voice and speech characteristics. Vladimir Makukha, an Honorary University Worker, died in July 2019. The project is now supervised by Oleg Grishin, Doctor of Medical Sciences, Professor of the Radio Engineering and Electronics Faculty, NSTU NETI.

Video of the "smart" program for psychoemotional and functional disorders diagnostics
