cross pond high tech
light views on high tech in both Europe and US

Scientists have found a way to decode brain signals into speech


You don’t have to think about it: when you speak, your brain sends signals to your lips, tongue, jaw, and larynx, which work together to produce the intended sounds.

Now scientists in San Francisco say they’ve tapped these brain signals to create a device capable of spitting out complete phrases, like “Don’t do Charlie’s dirty dishes” and “Critical equipment needs proper maintenance.”

The research is a step toward a system that could help severely paralyzed people speak, and perhaps, one day, toward consumer gadgets that let anyone send a text straight from the brain.

A team led by neurosurgeon Edward Chang at the University of California, San Francisco, recorded from the brains of five people with epilepsy, who were already undergoing brain surgery, as they spoke from a list of 100 phrases.

When Chang’s team subsequently fed the signals to a computer model of the human vocal system, it generated synthesized speech that was about half intelligible.

The effort doesn’t pick up on abstract thought, but instead listens for nerves firing as they tell your vocal organs to move. Previously, researchers have used such motor signals from other parts of the brain to control robotic arms.

“We are tapping into the parts of the brain that control these movements—we are trying to decode movements, rather than speech directly,” says Chang.
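To make that two-stage idea concrete, here is a minimal, purely illustrative sketch in Python: neural features are first mapped to articulator trajectories, and those trajectories are then mapped to acoustic parameters. The weights, dimensions, and functions are arbitrary stand-ins, not the UCSF team's actual models.

```python
import numpy as np

# Purely illustrative two-stage decoder: neural features -> articulator
# movements -> acoustic parameters. The real study used learned neural
# networks and a vocal-tract synthesis model; these toy weights are stand-ins.

def decode_articulation(neural_features, weights):
    """Map (frames, electrodes) neural features to articulator trajectories."""
    return np.tanh(neural_features @ weights)

def synthesize_acoustics(articulation, weights):
    """Map articulator trajectories to per-frame acoustic parameters."""
    return articulation @ weights

rng = np.random.default_rng(0)
ecog = rng.standard_normal((200, 64))       # 200 frames from 64 electrodes
w_move = rng.standard_normal((64, 12))      # 12 articulator dimensions
w_sound = rng.standard_normal((12, 32))     # 32 acoustic parameters per frame

movements = decode_articulation(ecog, w_move)
acoustics = synthesize_acoustics(movements, w_sound)
print(movements.shape, acoustics.shape)     # (200, 12) (200, 32)
```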

In Chang’s experiment, the signals were recorded using a flexible pad of electrodes called an electrocorticography array, or ECoG, that rests on the brain’s surface.

To test how well the signals could be used to re-create what the patients had said, the researchers played the synthesized results to people hired on Mechanical Turk, a crowdsourcing site, who tried to transcribe them using a pool of possible words. Those listeners could understand about 50 to 70% of the words, on average.
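As a rough illustration of that figure, word intelligibility can be scored as the fraction of reference words a listener reproduces; the scoring below is a simplified positional match, not the study's exact protocol.

```python
# Simplified scoring of listener transcriptions: fraction of reference
# words reproduced in position. The actual study had listeners pick from
# a closed pool of candidate words; this is only a rough proxy.

def word_accuracy(reference, transcript):
    ref = reference.lower().split()
    hyp = transcript.lower().split()
    correct = sum(r == h for r, h in zip(ref, hyp))
    return correct / len(ref)

print(word_accuracy("critical equipment needs proper maintenance",
                    "critical equipment needs proper attention"))  # 0.8
```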

“This is probably the best work being done in BCI [brain-computer interfaces] right now,” says Andrew Schwartz, a researcher on such technologies at the University of Pittsburgh. He says if researchers were to put probes within the brain tissue, not just overlying the brain, the accuracy could be far greater.

Previous efforts have sought to reconstruct words or word sounds from brain signals. In January of this year, for example, researchers at Columbia University measured signals in the auditory part of the brain as subjects heard someone else speak the numbers 0 to 9. They were then able to determine what number had been heard.

Brain-computer interfaces are not yet advanced enough, nor simple enough, to assist people who are paralyzed, although that is an objective of scientists.

Last year, another researcher at UCSF began recruiting people with ALS, or Lou Gehrig’s disease, to receive ECoG implants. That study will attempt to synthesize speech, according to a description of the trial, as well as ask patients to control an exoskeleton supporting their arms.


Chang says his own system is not being tested in patients, and it remains unclear whether it would work for people unable to move their mouths. The UCSF team says its setup didn’t work nearly as well when it asked speakers to silently mouth words instead of saying them aloud.

Some Silicon Valley companies have said they hope to develop commercial thought-to-text brain readers. One of them, Facebook, says it is funding related research at UCSF “to demonstrate the first silent speech interface capable of typing 100 words per minute,” according to a spokesperson.

Facebook didn’t pay for the current study, and UCSF declined to describe what further research it is doing on behalf of the social media giant. But Facebook says it sees the implanted system as a step toward the type of consumer device it wants to create.

“This goal is well aligned with UCSF's mission to develop an implantable communications prosthesis for people who cannot speak – a mission we support. Facebook is not developing products that require implantable devices, but the research at UCSF may inform research into non-invasive technologies,” the company said.

Chang says he is “not aware” of any technology able to work from outside the brain, where the signals mix together and become difficult to read.

“The study that we did was involving people having neurosurgery. We are really not aware of currently available noninvasive technology that could allow you to do this from outside the head,” he says. “Believe me, if it did exist it would have profound medical applications.”

Philippe J DEWOST's insight:

A still distant step towards a system that would ultimately let people send texts straight from their brains.


Amazon Alexa scientists find ways to improve speech and sound recognition


How do assistants like Alexa discern sound? The answer lies in two Amazon research papers scheduled to be presented at this year’s International Conference on Acoustics, Speech, and Signal Processing in Aachen, Germany. Ming Sun, a senior speech scientist in the Alexa Speech group, detailed them this morning in a blog post.

“We develop[ed] a way to better characterize media audio by examining longer-duration audio streams versus merely classifying short audio snippets,” he said, “[and] we used semisupervised learning to train a system developed from an external dataset to do audio event detection.”


The first paper addresses the problem of media detection — that is, recognizing when voices captured by an assistant originate from a TV or radio rather than a human speaker. To tackle this, Sun and colleagues devised a machine learning model that identifies characteristics common to media sound, regardless of content, in order to distinguish it from live speech.
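A minimal sketch of what such a media-versus-speech classifier could look like, assuming long-window pooled spectral statistics and a simple linear model; the synthetic data, features, and classifier below are placeholders rather than Amazon's method. The design intuition is that pooling over longer durations captures the steadier long-term character of broadcast audio, which short snippets miss.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Toy media-vs-speech classifier. Each clip is summarized by statistics
# pooled over a long window, on the intuition that media audio has steadier
# long-term characteristics than live speech. The random data and
# logistic-regression model are placeholders, not Amazon's setup.

def pooled_features(spectrogram):
    """Summarize a (frames, bands) spectrogram with long-window mean and std."""
    return np.concatenate([spectrogram.mean(axis=0), spectrogram.std(axis=0)])

rng = np.random.default_rng(1)
speech_clips = [rng.standard_normal((500, 40)) for _ in range(50)]
media_clips = [0.5 * rng.standard_normal((500, 40)) + 1.0 for _ in range(50)]

X = np.array([pooled_features(c) for c in speech_clips + media_clips])
y = np.array([0] * len(speech_clips) + [1] * len(media_clips))

model = LogisticRegression(max_iter=1000).fit(X, y)
print(model.score(X, y))
```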


Philippe J DEWOST's insight:

Alexa, listen to me, not the TV!
