AI could help detect early voice box cancer with just the sound of your voice

Publicly released:
International
Photo by Emmanuel Ikwuegbu on Unsplash

AI could one day help doctors distinguish between laryngeal cancer and benign lesions on the voice box, at least in men, according to international researchers. In a proof-of-concept study, US researchers used AI to analyse variations in tone, pitch, volume, and clarity in over 12,000 voice recordings from 306 people. They found certain features of the voice, especially the harmonic-to-noise ratio in men (which refers to the relationship between tone and noise), helped distinguish those without a voice disorder, those with benign vocal fold lesions, and those with laryngeal cancer. The team haven’t yet found such informative features of women’s voices, but say more data could help.

Media release

From: Frontiers

AI could soon detect early voice box cancer from the sound of your voice

Vocal fold lesions and early stages of laryngeal cancer alter acoustics of the voice, paving the way for AI recognition

Researchers have shown that patients with benign vocal fold lesions and laryngeal cancer can be distinguished through acoustic features of the voice, especially the harmonic-to-noise ratio and its variation within speech. Now that the proof-of-principle has been established, these features can be fed into machine learning algorithms to build AI applications that triage patients at risk for laryngeal cancer based on their voice.

Main text: Cancer of the voice box or larynx is an important public health burden. In 2021, there were an estimated 1.1 million cases of laryngeal cancer worldwide, and approximately 100,000 people died from it. Risk factors include smoking, alcohol abuse, and infection with human papillomavirus. The prognosis for laryngeal cancer ranges from 35% to 78% survival over five years when treated, depending on the tumor’s stage and its location within the voice box.

Catching cancer early is key for a patient’s prospects. At present, laryngeal cancers are diagnosed through video nasal endoscopy and biopsies – onerous, invasive procedures. Getting to a specialist who can perform these procedures can take time, causing delays in diagnosis. But now, researchers have shown in Frontiers in Digital Health that abnormalities of the vocal folds can be detected from the sound of the voice. Such ‘vocal fold lesions’ can be benign, like nodules or polyps, but may also represent the early stages of laryngeal cancer. These proof-of-principle results open the door for a new application of AI: namely, to recognize the early warning stages of laryngeal cancer from voice recordings.

“Here we show that with this dataset we could use vocal biomarkers to distinguish voices from patients with vocal fold lesions from those without such lesions,” said Dr Phillip Jenkins, a postdoctoral fellow in clinical informatics at Oregon Health & Science University, and the study’s corresponding author.

Voice messages

Jenkins and his colleagues are members of the ‘Bridge2AI-Voice’ project within the US National Institutes of Health’s ‘Bridge to Artificial Intelligence’ (Bridge2AI) consortium, a nationwide endeavor to apply AI to complex biomedical challenges. Here, they analyzed variations in tone, pitch, volume, and clarity within the first version of the public Bridge2AI-Voice dataset, with 12,523 voice recordings of 306 participants from across North America.

A minority were from patients with known laryngeal cancer, benign vocal fold lesions, or two other conditions of the voice box: spasmodic dysphonia and unilateral vocal fold paralysis.

The researchers focused on differences in a number of acoustic features of the voice: for example, the mean fundamental frequency (pitch); jitter, variation in pitch within speech; shimmer, variation of the amplitude; and the harmonic-to-noise ratio, a measure of the relation between harmonic and noise components of speech.
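For readers who want a concrete sense of these measures, the following is a minimal sketch of how they are commonly defined, not the pipeline used in the study. It assumes the glottal cycle lengths, per-cycle peak amplitudes, and harmonic and noise power have already been extracted from a recording; the function names and the sample values are hypothetical and chosen only for illustration.

```python
import math


def mean_f0(periods):
    """Mean fundamental frequency (Hz) from glottal cycle lengths in seconds."""
    return sum(1.0 / t for t in periods) / len(periods)


def _relative_perturbation(values):
    """Mean absolute difference between neighbours, divided by the mean value."""
    diffs = [abs(a - b) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def local_jitter(periods):
    """Cycle-to-cycle variation in pitch period (0.0 for a perfectly steady voice)."""
    return _relative_perturbation(periods)


def local_shimmer(amplitudes):
    """Cycle-to-cycle variation in peak amplitude."""
    return _relative_perturbation(amplitudes)


def hnr_db(harmonic_power, noise_power):
    """Harmonic-to-noise ratio in decibels; higher means a clearer, less noisy voice."""
    return 10.0 * math.log10(harmonic_power / noise_power)


# Hypothetical values for a short sustained vowel (assumed, not from the study).
periods = [0.0100, 0.0102, 0.0099, 0.0101]   # seconds per glottal cycle
amplitudes = [0.80, 0.78, 0.82, 0.79]        # peak amplitude per cycle

print(f"mean F0: {mean_f0(periods):.1f} Hz")
print(f"jitter:  {local_jitter(periods):.4f}")
print(f"shimmer: {local_shimmer(amplitudes):.4f}")
print(f"HNR:     {hnr_db(100.0, 1.0):.1f} dB")
```

In practice, dedicated speech-analysis software estimates these quantities directly from the audio signal; the sketch only shows what each number summarises.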

The researchers found marked differences in the harmonic-to-noise ratio and fundamental frequency between men without any voice disorder, men with benign vocal fold lesions, and men with laryngeal cancer. They didn’t find any informative acoustic features among women, but it is possible that a larger dataset would reveal such differences.

The authors concluded that variation in the harmonic-to-noise ratio in particular can be helpful to monitor the clinical evolution of vocal fold lesions, and to detect laryngeal cancer at an early stage, at least in men.

“Our results suggest that ethically sourced, large, multi‑institutional datasets like Bridge2AI‑Voice could soon help make our voice a practical biomarker for cancer risk in clinical care,” said Jenkins.

Building a bridge to AI

Now that the proof-of-principle has been established, the next step is to use these algorithms on more data and test them in clinical settings on patient voices.

“To move from this study to an AI tool that recognizes vocal fold lesions, we would train models using an even larger dataset of voice recordings, labeled by professionals. We then need to test the system to make sure it works equally well for women and men,” said Jenkins.

“Voice-based health tools are already being piloted. Building on our findings, I estimate that with larger datasets and clinical validation, similar tools to detect vocal fold lesions might enter pilot testing in the next couple of years,” predicted Jenkins.

Journal/conference: Frontiers in Digital Health
Research: Paper
Organisation/s: Oregon Health and Science University, USA
Funder: The author(s) declare that no financial support was received for the research and/or publication of this article.