AI-Powered Noise-Canceling Headphones Allow Only One Voice to Pass Through
In our bustling modern world, noise-canceling headphones offer respite from the cacophony. However, their indiscriminate sound reduction can inadvertently drown out important voices. Enter a new prototype AI system called “Target Speech Hearing.” It allows users to select a specific person’s voice to remain audible even amidst noise cancellation. Although still a proof of concept, this technology could soon enhance popular noise-canceling earbuds and even hearing aids, ensuring that we hear what truly matters.
The system lets users choose one person’s voice to hear clearly while all other sounds are canceled.
This technology is still in the early stages, but its creators are talking with popular noise-canceling earbud brands and also want to make it available for hearing aids.
“Listening to specific people is a key part of how we communicate and interact with others,” says Shyam Gollakota, a professor at the University of Washington who worked on the project. “It can be very difficult to focus on certain people in noisy environments, even if you don’t have hearing problems.”
Training AI to Recognize and Filter Voices
The same researchers had previously trained an AI to recognize and filter out specific sounds like babies crying, birds chirping, or alarms ringing. However, separating human voices is more difficult and needs more complex AI. This complexity is a problem because AI models must work in real-time on headphones with limited computing power and battery life. To handle these limits, the neural networks had to be small and energy-efficient. So, the team used a technique called knowledge distillation. This involved taking a large AI model trained on millions of voices (the “teacher”) and using it to train a much smaller model (the “student”) to perform just as well.
The smaller model was then taught to pick out specific voices from the surrounding noise using microphones on a pair of regular noise-canceling headphones.
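The core of knowledge distillation is training the small "student" network to reproduce the softened output distribution of the large "teacher." Below is a minimal sketch of the standard distillation loss in NumPy; the function names, temperature value, and use of raw logits are illustrative assumptions, not details from the researchers' system.

```python
import numpy as np

def softmax(z, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = np.asarray(z, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL divergence between the softened teacher and student distributions.

    The 'teacher' stands for the large model trained on millions of voices;
    the 'student' is the small, energy-efficient on-device network being
    trained to match the teacher's behavior.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student), averaged over the batch dimension
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1)
    return float(np.mean(kl))
```

During training, this loss is minimized by gradient descent on the student's parameters: when the student's logits match the teacher's, the loss is zero, so the small model learns to perform like the large one.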
How It Works
To use the Target Speech Hearing system, the wearer holds down a button on the headphones for a few seconds while facing the person they want to focus on. During this time, called “enrollment,” the system records an audio sample from both headphones to identify the speaker’s voice, even if there are other voices and noises around.
These voice features are sent to a neural network on a small computer connected to the headphones by a USB cable. This network runs all the time, separating the chosen voice from others and playing it back to the listener. Once the system locks onto a speaker, it keeps focusing on that person’s voice, even if the wearer turns away. The more the system listens to a speaker, the better it gets at isolating that voice.
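The enrollment-then-track pattern described above can be sketched as a speaker-embedding comparison. This is a toy illustration with invented function names: the `embed` stand-in simply averages feature frames, whereas the real system uses a learned neural speaker encoder, and the similarity threshold is an assumed value.

```python
import numpy as np

def embed(audio_frames):
    """Toy speaker 'embedding': the mean feature vector over the frames.
    A real system would run a trained speaker-encoder network here."""
    return np.mean(audio_frames, axis=0)

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def enroll(target_frames):
    """'Enrollment': capture a few seconds of audio while the wearer
    faces the speaker, and store that speaker's voice signature."""
    return embed(target_frames)

def matches_target(frames, target_embedding, threshold=0.8):
    """Decide whether an incoming audio chunk belongs to the enrolled
    speaker by comparing its embedding to the stored signature."""
    return cosine_similarity(embed(frames), target_embedding) >= threshold
```

Once enrolled, the running network applies this kind of comparison to every incoming chunk, passing through audio that matches the stored signature and suppressing the rest, which is why the focus persists even after the wearer turns away.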
Currently, the system can only successfully focus on a speaker if their voice is the loudest one. However, the team is working to make it function even when the loudest voice isn’t the target speaker.
Advancing Speech Separation: Practical Applications and Future Prospects
Picking out one voice in a noisy place is very hard, says Sefik Emre Eskimez, a senior researcher at Microsoft who works on speech and AI but didn’t work on this particular study. “I know companies want to do this,” he says. “If they can figure it out, it could be useful in many areas, especially during meetings.”
Although speech separation research is usually more about theory than practice, this study has clear practical uses, says Samuele Cornell, a researcher at Carnegie Mellon University’s Language Technologies Institute who also didn’t work on this study. “I think it’s a move in the right direction,” Cornell says. “It’s a refreshing change.”
Read the original article at MIT Technology Review