April 15, 2022 – As the war in Ukraine exploded in intensity, Japanese researchers analyzed snippets of Russian President Vladimir Putin's voice over several weeks. As his stress levels rose, the researchers said, his mental health distress became evident.
"The state of the psyche can be measured from speech sounds," said Kanji Okazaki, CEO of Risk Measurement Technologies in Tokyo, who did the Putin research and offers a voice analysis product to help companies track workers' mental health.
While U.S.-based experts urge caution about jumping to that conclusion about Putin based on the limited speech samples, they do agree that voice analysis, when done by experts and used with other vital pieces of information, has enormous potential for health care.
Voice, an Age-Old Clue to Health
Listening to a person's voice to evaluate physical or mental health status isn't new. Scientists have long known that diseases can affect organs such as the lungs, brain, and heart, along with muscles and vocal folds, and in turn can change a person's voice.
Mental health providers know that as patients with depression get better with treatment, they tend to talk faster than before, with fewer pauses, for example. Patients with Parkinson's disease tend to have a low-volume voice with a monotone quality. Those with multiple sclerosis may slur their words, or have a disrupted speech pattern, with long pauses between words.
Now, artificial intelligence technology – developing computer systems to perform tasks that usually need human intelligence, including speech recognition – promises to bump up the potential of voice analysis. Researchers can train an algorithm to listen for signs of stress in a voice, after researching voice differences in a diverse population of people and teasing out the differences between those who are stressed and those who are not.
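To give a rough sense of what "training an algorithm" on voice differences can mean, here is a minimal sketch in Python. It is not the method used by any researcher or company named in this article. It assumes each recording has already been reduced to two hypothetical measurements (average pitch and the fraction of time spent pausing), and every number in it is invented for illustration. The technique shown is a simple nearest-centroid classifier: average the measurements for each group, then label a new recording by whichever group average it sits closest to.

```python
from statistics import mean

# Hypothetical labeled examples: (mean pitch in Hz, pause fraction) per recording.
# These values are invented for illustration; they are not real study data.
TRAINING = {
    "calm": [(110.0, 0.30), (118.0, 0.28), (105.0, 0.33)],
    "stressed": [(140.0, 0.15), (150.0, 0.12), (135.0, 0.18)],
}


def centroid(rows):
    """Average each feature column across a group of recordings."""
    return tuple(mean(col) for col in zip(*rows))


def train(examples):
    """Learn one centroid (average feature vector) per label."""
    return {label: centroid(rows) for label, rows in examples.items()}


def predict(model, features):
    """Return the label whose centroid is closest by squared Euclidean distance."""
    def squared_distance(center):
        return sum((a - b) ** 2 for a, b in zip(features, center))

    return min(model, key=lambda label: squared_distance(model[label]))


model = train(TRAINING)
print(predict(model, (145.0, 0.14)))
```

A real system would likely use far more features, scale them so that no single measurement dominates, and be checked against recordings from a much larger and more diverse group of speakers.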
Voice analysis, researchers say, can potentially help diagnose mental health and other conditions, track how well treatments work, and even predict health problems such as heart attacks. Companies and researchers are already using voice analysis, and several apps are available for consumers to download and use on smartphones.
For the Putin analysis, Okazaki says, "the state of the vocal cords is read from the speech sound. When a person feels tension, under stress, the vocal cords become stiff. This is an involuntary reaction and cannot be controlled by oneself," he says. "Thus, the state of the psyche can be measured from speech sounds."
As an example, he suggests recalling how your voice rises or sounds "off" when you are nervous, because the tension causes the vocal cords to stiffen.
Okazaki analyzed more than an hour of Putin's speeches from Feb. 1 to March 19, then compared them to a calm talk he had given in September 2020 at the United Nations, where he praised international cooperation. By March 10, as the war outcome he envisioned had not come to be, stress levels detected were 40% above baseline, Okazaki said in previous reports.
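For readers wondering what "40% above baseline" means arithmetically, the short Python sketch below shows the usual percent-change calculation. The article does not describe how Okazaki's company scores stress, so the function name and the example scores are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def percent_above_baseline(baseline: float, current: float) -> float:
    """Return how far `current` sits above `baseline`, as a percentage."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (current - baseline) * 100.0 / baseline


# Hypothetical stress scores: 50 for the calm reference speech, 70 for a later one.
print(percent_above_baseline(50.0, 70.0))
```

With these made-up scores, the later speech comes out 40% above the calm reference.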
In an email interview, Okazaki said his company is continuing to analyze Putin's voice but stops short of making any predictions about whether or when he might surrender.
"It must be said that predicting yielding is difficult," he said. "This is because there is not enough data to make reliable predictions."
Experts: Not so Fast
"I would be very cautious about any suggestions that a psychiatric problem can be inferred [from Putin's speech recordings]," says Alexander S. Young, MD, a professor and interim chair of psychiatry and biobehavioral sciences at the David Geffen School of Medicine at UCLA. "In fact, I would suggest in this case not to make any such suggestion. True psychiatric evaluation would be required."
Voice analysis does hold promise but has a ways to go, experts in the field say.
"As a scientist who studies voice, I am excited about the potential," says Satrajit S. Ghosh, PhD, principal research scientist at the McGovern Institute for Brain Research at the Massachusetts Institute of Technology.
He uses neuroimaging, speech, and machine learning to improve mental health assessment and treatment. He has also evaluated published studies on the use of speech to assess psychiatric disorders.
"The field is in its infancy," he says. "It's very easy to be excited about these things. [But] I do feel the field is not advanced enough to know exactly the kinds of information we can extract from voice in relation to human behavior."
Like other experts, Ghosh says there are concerns about the technology. Databases used in the voice analyses need improvement, and privacy matters must be addressed, he says.
In the future, a high-quality voice sample, on its own, will be valuable, but the science needs to evolve, says Reza Hosseini Ghomi, MD, a neuropsychiatrist at the University of Washington and chief medical officer at BrainCheck, a cognitive health technology platform. While he says voice analysis will be useful across the board, and he has researched its use for depression, he says a more practical focus is on conditions such as dementia and the loss of nerve cell function.
"In those cases, I have a truth to point to," he says, referring to evidence of brain buildup of plaques that can support the voice analysis findings.
When diagnosing the mental health status of someone, including Putin, voice isn't the only thing important to gauge, says Lillian Glass, PhD, a Los Angeles communication and body language expert. Considering only voice, she says, "is like looking [only] at the elephant's tail when you are trying to describe an elephant."
"You have to look at the body language. Is he shaking, does he move other parts of the body? Is there lots of movement?" Speech content counts, too, as does tone. "If you want to know how any of your leaders are doing, look at those aspects."
For consumers drawn to use the apps to assess themselves, Ghomi offers this advice: "Think of it as participating in research at this point."
Range of Voice Analysis Research
Among the areas under study using voice analysis:
Cardiac risk: An AI-based computer algorithm predicted a person's likelihood of having heart issues related to clogged arteries based on speech recordings, Mayo Clinic researchers reported. The researchers evaluated three 30-second voice recordings from 108 patients using a smartphone app. The system analyzed more than 80 features of voice recordings.
Then, the researchers gave each person a score. Over a 2-year follow-up, those with a high score were 2.6 times more likely to have a cardiac issue and 3 times more likely to show plaque buildup on medical tests than those with low scores.
Depression: Other researchers, including Ghomi, were able to pinpoint voice features collected from patients with depression that accurately predicted how a patient would answer a single question on a questionnaire that assesses if the patient is a suicide risk.
In another study, Korean researchers found that voice analysis can help health care providers detect minor and major depression. They extracted 21 voice features from interview recordings and compared them among three groups: 33 non-depressed participants, 26 with minor depression, and 34 with major depression. They found seven voice indicators that showed differences between the three groups, even after adjusting for things like age and body weight.
PTSD: A speech-based algorithm can help identify posttraumatic stress disorder patients, other researchers found. They obtained speech samples from veterans assessed by their doctors to have PTSD or not and pinpointed voice features more likely to be found in those with PTSD.
Voice Analysis Programs on the Market
In 2021, Cigna International, a global health service company, launched its StressWaves test for people and for employers whose members have Cigna insurance. A user talks for 90 seconds, responding to questions, and then gets an analysis of whether their stress level is low or high.
Sonde Health offers a Mental Fitness app, based on research showing that voice changes are linked to mental health changes, says Jim Harper, founder and chief operating officer of the Boston-based company.
"The goal is to encourage engagement," he says, emphasizing the tool's intent is to promote general wellness, and not to diagnose. The company has a respiratory health tool, too, after finding certain voice features linked to patients with asthma and other lung issues.
Eleos Health, a startup, offers a program for behavioral health specialists that, with consent, records doctor-patient sessions and, using voice artificial intelligence, measures progress while also promising to save doctors' time.
Ghomi is an adviser for Kintsugi, developer of a voice biomarker technology for depression and anxiety, and is on the Biogen speakers' bureau.