By Adin Blumofe
Stephen Hawking famously interacted with the world through a craggy, static, electrified voice that spoke from his high wheelchair. The technology that gave him a voice piqued many people’s interest. Hawking’s ALS limited him to communicating through twitches of his cheek, which a computer rendered into words. While Hawking’s vocal technology was impressive, it does not compare to the advancements that science has since made in the field of Artificial Intelligence (AI).
A team of researchers from UC Berkeley and UC San Francisco has developed a technology to bring a voice to the voiceless and expressions to the expressionless. The idea behind the technology is that language can be broken down into its constituent sounds, called phonemes. As the brain conjures the sounds, an AI identifies the corresponding patterns of neural activity and strings them together into audible sentences. The benefit of this approach is the relative ‘simplicity’ of training a functional AI, as English employs only 39 unique sounds, compared to the thousands of words in the modern lexicon. Additionally, the scientists want to create a virtual face that moves in sync with the words, giving a semblance of motion to the motionless.
The development centers on one woman: Ann. She had an unexplained brain stem stroke in 2005, at the age of 30. The episode left her with “locked-in syndrome.” She possessed full neurological function and the capacity to experience all the senses, but was incapable of responding. She was trapped inside the box that was her body, not “able to wink and say a few words.”
Initially, physical therapists were at a loss for how to approach this unfortunate case. Through extensive physical therapy over the years, she has regained the most basic functions, such as breathing on her own. Ann found out about Dr. Chang’s research in 2021, when she read about Pancho, a patient with a similar condition whose brain waves the UCSF-UCB team had managed to convert into speech for the first time in history.
The team removed a portion of her skull and attached 253 electrodes directly to a critical portion of the brain that processes language. “The electrodes intercepted the brain signals that, if not for the stroke, would have gone to muscles in Ann’s lips, tongue, jaw and larynx, as well as her face. A cable, plugged into a port fixed to Ann’s head, connected the electrodes to a bank of computers,” according to UCSF. The researchers trained the AI for weeks by having Ann think about saying 1,024 words, teaching it to recognize the specific patterns of neural activity associated with each sound. The resulting system lets Ann speak at a rate of 80 words per minute. For reference, her previous speech tool, based on earlier technology, could only produce 14 words per minute.
Beyond just giving a voice to the voiceless, the team is giving Ann back her own voice. They used audio recordings from her wedding to make the speech sound like she did before the stroke. Upon hearing it, Ann said that the voice sounded like an old friend. Her youngest child, who was only one at the time of the stroke, can ostensibly hear her mother’s voice for the first time.
This remarkable technology is still experimental, but the researchers hope it will lead to an FDA-approved system; in a real sense, the future is here. Being able to speak and be expressive is a gift people rarely contemplate or appreciate unless they are deprived of it.