Voice Activity Detector
The challenge of power consumption in a voice-activated system
The interaction between humans and devices is increasingly carried out through voice control, whether the device is plugged in or battery-powered. This trend is made feasible by the emergence of powerful signal processing algorithms capable of converting human speech to text with a high success rate, and of extracting meaningful information such as keywords and commands.
Dolphin Integration introduces a breakthrough Voice Activity Detector, the WhisperTrigger™, to provide ultra-low power Always-On Voice Listening for the longest-lasting operation without battery recharge and for environmentally friendly devices.
Demonstration of our innovative Voice Activity Detector
by Cyprien Dumortier, application engineer for Audio IPs
Ultimate power savings
The WhisperTrigger™ is a stand-alone feature. It allows the audio DSP and ADC to be switched off, reducing power consumption at system level by up to 50 times compared to conventional approaches.
WhisperTrigger™'s self-adjustable algorithm ensures the best detection rate whatever the environmental conditions of the user.
Innovative Voice Activity Detector for ultimate power savings
Conventional Always-On Keyword Spotting cannot be ultra-low power
To minimize the power consumption of a voice-controlled device, the current approach is to run speech recognition only after Keyword Spotting. This requires keeping the voice ADC and the DSP always on.
Breakthrough Voice Activity Detectors for ultimate power savings
The breakthrough WhisperTrigger™ continuously tracks the surrounding sounds to identify voice activity, and awakens the voice processing blocks (audio ADC and DSP) through a trigger signal as soon as, but not sooner than, a voice is detected.
Two types of WhisperTrigger™ (WT-d and WT-a) are proposed for integration into the SoC, to support both analog/electric and digital microphones. WhisperTrigger™ can also be embedded into a DMIC.
WhisperTrigger™–digital (WT-d): The fully synthesizable VAD to switch off the audio DSP
The WT-d detects the presence of a voice at the output of a DMIC, or at the output of a voice ADC embedded in the SoC, so that the DSP is switched on to perform Keyword Spotting only once a voice is detected.
WhisperTrigger™–analog (WT-a): The ultimate power saving solution
The WT-a allows both the ADC and the DSP to be switched off, and wakes them only when, but as soon as, a voice is detected.
Self-adaptable algorithm (patent-pending): the must-have
Users of wireless smart home devices, IoT products or smartphones are exposed to different types of ambient noise throughout the day. As a result, the only way to ensure the best detection rate is to rely on a trigger capable of continuously adapting itself to the environment of the device user.
WhisperTrigger™'s powerful algorithm wakes up the DSP handling Keyword Spotting with the best success rate, no matter the environment of the user or the distance between the user and the voice-activated device.
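To make the idea of a self-adapting trigger concrete, here is a minimal Python sketch of a generic energy-based detector whose threshold follows the background noise. It is not the patent-pending WhisperTrigger™ algorithm, whose details are not described here; the class name, the parameters (`margin`, `rise`, `fall`) and their values are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AdaptiveEnergyTrigger:
    """Toy voice trigger with a noise floor that tracks the environment.

    Hypothetical illustration only: names and default values are assumptions,
    not taken from any WhisperTrigger documentation.
    """
    margin: float = 4.0        # frame energy must exceed floor * margin to trigger
    rise: float = 0.05         # slow upward tracking, so short voice bursts barely move the floor
    fall: float = 0.5          # fast downward tracking when the room gets quieter
    noise_floor: float = 1e-6  # initial estimate of the background energy

    def process(self, frame):
        """Return True if this frame of samples should wake the voice processing."""
        # Mean-square energy of the frame.
        energy = sum(s * s for s in frame) / len(frame)
        # Decide against the floor estimated from previous frames.
        triggered = energy > self.noise_floor * self.margin
        # Asymmetric smoothing: the floor rises slowly and falls quickly.
        rate = self.rise if energy > self.noise_floor else self.fall
        self.noise_floor += rate * (energy - self.noise_floor)
        return triggered
```

Fed with a steady background, the floor converges to the background energy and the trigger stops firing on it; a louder frame then fires until the floor catches up. The same instance can therefore move from a quiet room to a noisy one without retuning, which is the behaviour the paragraph above describes in general terms.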
See how WhisperTrigger™ adapts to background noise
MIWOK™, an objective benchmark for Voice Activity Detectors
Dolphin Integration has developed MIWOK™, the first benchmark in the public domain to assess the 3 critical characteristics of voice detection performance, in near-field or far-field conditions and in noisy or quiet user environments:
- the capability to detect the beginning of a voice event with fast response
- the capability to avoid a wrong response to a noise event, meaning a high rate of Voice Detected as Voice (VDV) and a low rate of Noise Detected as Voice (NDV)
- the evaluation of the Detection Latency (DL)
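As a rough illustration of how the three figures in this list could be computed from a labelled test run, the Python sketch below scores a list of events. The `Event` structure, the `score` function, and the choice to report DL as the mean latency over detected voice events are assumptions made for this example; they are not taken from the MIWOK™ specification.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Event:
    """One labelled test event (hypothetical structure for illustration)."""
    kind: str                    # "voice" or "noise"
    onset_ms: float              # time at which the event starts
    trigger_ms: Optional[float]  # time at which the detector fired, None if it never did


def score(events: Sequence[Event]) -> dict:
    """Compute VDV, NDV (percent) and mean detection latency (ms)."""
    voice = [e for e in events if e.kind == "voice"]
    noise = [e for e in events if e.kind == "noise"]
    detected_voice = [e for e in voice if e.trigger_ms is not None]
    detected_noise = [e for e in noise if e.trigger_ms is not None]
    # Latency is measured from event onset to the trigger, on detected voice only.
    latencies = [e.trigger_ms - e.onset_ms for e in detected_voice]
    return {
        "VDV_percent": 100.0 * len(detected_voice) / len(voice) if voice else None,
        "NDV_percent": 100.0 * len(detected_noise) / len(noise) if noise else None,
        "DL_ms": sum(latencies) / len(latencies) if latencies else None,
    }
```

A good detector pushes `VDV_percent` towards 100 and `NDV_percent` towards 0 while keeping `DL_ms` small; the three values pull against each other, which is why they are reported together.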
WhisperTrigger™ performance
|Metric|WhisperTrigger™-d|WhisperTrigger™-a|
|---|---|---|
|Detection Latency (DL)|Down to 4.6 ms|Down to 3 ms|
|Voice Detected as Voice (VDV)|Up to 100%|Up to 100%|
|Noise Detected as Voice (NDV)|Down to 7%|Down to 3.6%|
|Power consumption of the WT|25 µA|10 µA|
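To show how the trigger consumption feeds into a system-level budget, the Python sketch below compares an always-on chain with chains gated by WT-d and WT-a. Only the 25 µA and 10 µA trigger figures come from the table above. The ADC and DSP currents and the fraction of time a voice is present are hypothetical placeholders, so the results are illustrative and are not meant to reproduce the "up to 50 times" figure.

```python
def average_current_ua(trigger_ua, always_on_ua, gated_ua, voice_duty):
    """Average current in µA for a voice front end.

    trigger_ua   -- current of the voice trigger itself (0 if there is none)
    always_on_ua -- current of the blocks that are never switched off
    gated_ua     -- current of the blocks that run only while a voice is present
    voice_duty   -- fraction of time (0..1) during which a voice is present
    """
    return trigger_ua + always_on_ua + voice_duty * gated_ua


# Hypothetical block currents and voice duty cycle, for illustration only.
ADC_UA = 200.0
DSP_UA = 800.0
VOICE_DUTY = 0.05

# Conventional approach: ADC and DSP always on, no trigger.
conventional = average_current_ua(0.0, ADC_UA + DSP_UA, 0.0, VOICE_DUTY)
# WT-d (25 µA from the table): ADC stays on, only the DSP is gated.
with_wt_d = average_current_ua(25.0, ADC_UA, DSP_UA, VOICE_DUTY)
# WT-a (10 µA from the table): both ADC and DSP are gated.
with_wt_a = average_current_ua(10.0, 0.0, ADC_UA + DSP_UA, VOICE_DUTY)
```

The model ignores wake-up time and false triggers. A non-zero NDV raises the effective duty cycle above the true voice duty, so the detection metrics and the power figure are linked.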