Device translates nonverbal woman's thoughts into speech in real time

Publicly released:
International
Photo by Milad Fakurian on Unsplash

International researchers have turned a woman's thoughts into speech using a device trained on her brain activity that speaks in real time. Similar devices have been designed previously, the researchers say, but their aim was a device that speaks in real time, avoiding delays that disrupt conversation. They implanted the device in a 47-year-old woman with quadriplegia who had been unable to speak for 18 years, and trained it on her brain activity as she internally spoke sentences drawn from just over 1,000 unique words. The device's voice was also trained on a clip of the woman speaking before her injury, so that it sounds like her.

Media release

From: Springer Nature

Neuroscience: Developing a real-time thought-to-speech device for patients with paralysis 

A demonstration of online naturalistic streaming speech synthesis with synchronised text decoding from brain activity. Credit: Chang et al.
A demonstration of online streaming text-decoding and incremental text-to-speech synthesis from brain activity. Credit: Chang et al.

A new device capable of translating speech activity in the brain into spoken words in real time is presented in a Nature Neuroscience paper. This technology could help people with speech loss regain the ability to communicate more fluently in real time.

Current brain–computer interfaces for speech typically have a delay of a few seconds between the person silently attempting to say sentences and the computer’s verbal output, which prevents fluent and articulate communication. This can lead to miscommunication and frustration between the listener and speaker. A real-time system has the potential to restore the natural flow of conversation, which could improve the quality of life for patients who cannot speak.

Edward Chang, Gopala Anumanchipalli, and colleagues developed a brain–computer interface for silent speech and, as part of a clinical trial, implanted it in a 47-year-old woman with quadriplegia (paralysis of the limbs and torso) who had been unable to speak or vocalize for 18 years after a stroke. The authors trained a deep-learning neural network on the participant's brain activity, recorded by electrodes implanted over her speech sensorimotor cortex while she internally spoke complete sentences containing 1,024 unique words. The model was then used to decode speech online in 80-millisecond increments, simultaneously with the participant's intent to vocalize, and to produce audio mimicking her voice, trained on a clip of her speaking before the injury. The brain–computer interface could also generalize to words the participant had not been exposed to during training, and the authors found that the device could operate continuously, rather than in increments of a few seconds.
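The engineering point here is causal, incremental decoding: rather than waiting for a whole sentence, the system emits audio for each 80-millisecond slice of neural data as it arrives. The Python sketch below illustrates only that streaming loop structure; the feature rate, class names, and identity stand-in vocoder are assumptions for illustration and do not reflect the authors' actual model, which is described in the Nature Neuroscience paper.

    # A minimal sketch of decoding in fixed 80 ms increments.
    # Hypothetical throughout: the real model, features, and vocoder
    # are described in the paper, not reproduced here.
    import numpy as np

    FRAME_MS = 80                 # decoding increment from the release
    FEATURE_RATE_HZ = 200         # assumed neural feature rate
    FRAME_LEN = FEATURE_RATE_HZ * FRAME_MS // 1000  # samples per frame

    class CausalDecoder:
        """Stand-in for a causal deep network mapping neural features to
        acoustic parameters. Causality (no future context) is what lets
        audio be produced while the signal is still streaming in."""
        def step(self, frame: np.ndarray) -> np.ndarray:
            # A real model would update recurrent/attention state here.
            return np.zeros(FRAME_LEN)

    def stream_speech(neural_stream, decoder, vocoder):
        """Consume neural features frame by frame and yield audio
        incrementally, instead of after the full sentence."""
        for frame in neural_stream:          # one 80 ms feature frame
            acoustic = decoder.step(frame)   # decode this slice only
            yield vocoder(acoustic)          # emit audio immediately

    # Toy usage: ten frames of random "neural" features, identity vocoder.
    decoder = CausalDecoder()
    vocoder = lambda acoustic: acoustic
    frames = (np.random.randn(FRAME_LEN, 64) for _ in range(10))
    for i, audio in enumerate(stream_speech(frames, decoder, vocoder)):
        print(f"audio chunk {i} ready after ~{(i + 1) * FRAME_MS} ms of input")

The design choice worth noting is that each output chunk depends only on neural data already received, which is why latency stays at the scale of one frame rather than one sentence.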

Although further research is needed in more participants, the device could potentially help patients with speech paralysis to speak more naturally and seamlessly in real time and improve their quality of life, the authors suggest.

Multimedia

A demonstration of online naturalistic streaming speech synthesis
A demonstration of online streaming text-decoding and incremental text-to-speech

Attachments

Note: Not all attachments are visible to the general public. Research URLs will go live after the embargo ends.

Research: Springer Nature, Web page (the URL will go live after the embargo ends)
Journal/conference: Nature Neuroscience
Research: Paper
Organisation/s: University of California, USA
Funder: For this research, support was provided by the National Institutes of Health (grant NINDS 5U01DC018671), the Japan Science and Technology Agency’s Moonshot Research and Development Program, the Joan and Sandy Weill Foundation, Susan and Bill Oberndorf, Ron Conway, Graham and Christina Spencer and the William K. Bowes, Jr., Foundation for K.T.L., C.J.C., J.R.L., A.B.S., V.R.A., C.M.K.-M., S.B., I.P.H., D.A.M., G.K.A. and E.F.C. Additionally, K.T.L., C.J.C. and G.K.A. were supported by the UC Noyce Initiative, the Rose Hills Innovator program, a Google Research Scholar Award, an NSF award (2106928), and BAIR. G.K.A. holds the Robert E. and Beverly A. Brooks Professorship at UC Berkeley. The National Institute on Deafness and Other Communication Disorders of the National Institutes of Health (award number F30DC021872) supports A.B.S. K.T.L. is supported by the National Science Foundation GRFP. B.Y., A.P.K., A.S., A.T.-C. and K.G. did not have relevant funding.