EnergyDetection: Energy Detection
Warning
Do not use this software-based energy detector in conjunction with a hardware-based VAD. The interaction between the two will adversely affect performance.
Overview
The Amazon Wake Word Engine includes a software VAD implementation based on acoustic energy detection (EnergyDetection). It triggers on any acoustic energy found in the input signal, whether or not it is human voice. EnergyDetection is disabled by default and must be explicitly enabled at initialization time. It is based on absolute signal level and triggers on signal levels above -52 dBFS. After signal energy has stopped, EnergyDetection remains active for a further period of approximately 500 ms, referred to as the "hangover" time.
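The threshold and hangover behavior described above can be illustrated with a minimal sketch. This is not the engine's implementation: the RMS level measure, the 16-bit PCM full-scale value, the 10 ms frame size, and the names `frame_level_dbfs` and `EnergyGate` are all assumptions made for illustration. Only the -52 dBFS threshold and the roughly 500 ms hangover come from the text.

```python
import math

THRESHOLD_DBFS = -52.0   # trigger level stated in the text
HANGOVER_MS = 500        # approximate hangover stated in the text
FRAME_MS = 10            # assumption: audio arrives in 10 ms frames
FULL_SCALE = 32768.0     # assumption: 16-bit signed PCM samples


def frame_level_dbfs(samples):
    """Return the RMS level of one frame in dBFS (-inf for digital silence)."""
    if not samples:
        return float("-inf")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0:
        return float("-inf")
    return 10.0 * math.log10(mean_square / (FULL_SCALE * FULL_SCALE))


class EnergyGate:
    """Hypothetical gate: active above the threshold, then held through the hangover."""

    def __init__(self, threshold_dbfs=THRESHOLD_DBFS,
                 hangover_ms=HANGOVER_MS, frame_ms=FRAME_MS):
        self.threshold_dbfs = threshold_dbfs
        self.hangover_frames = hangover_ms // frame_ms
        self.frames_left = 0

    def process(self, samples):
        """Return True while energy is present or the hangover has not expired."""
        if frame_level_dbfs(samples) > self.threshold_dbfs:
            # Energy detected: (re)arm the hangover counter.
            self.frames_left = self.hangover_frames
            return True
        if self.frames_left > 0:
            # Below threshold, but still inside the hangover window.
            self.frames_left -= 1
            return True
        return False
```

With these assumed settings, one loud frame keeps the gate open for the following 50 quiet frames (50 x 10 ms = 500 ms) before it closes again.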
EnergyDetection can be used to reduce CPU usage during periods of very low or zero acoustic energy in the input stream. In systems with automatic CPU schedulers built into the OS (Linux, iOS, Android, etc.), CPU savings are obtained automatically: EnergyDetection gates wake word detection and other dependent algorithms when configured alongside those features, unless explicitly disabled. In systems without automatic CPU scaling (e.g. low-power DSPs/MCUs with no OS), the client application can use EnergyDetection events to scale the clock speed of the host processor up or down programmatically.
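For the no-OS case, the client-side logic might look like the sketch below. It is written in Python for brevity, although a real DSP/MCU client would more likely be in C. The `ClockScaler` class, the `on_energy_event` callback, the `set_clock_mhz` function, and the two clock frequencies are all hypothetical. The actual event API of the engine is not shown here.

```python
LOW_CLOCK_MHZ = 24    # assumption: idle clock speed for the host processor
HIGH_CLOCK_MHZ = 96   # assumption: clock speed needed for wake word detection


class ClockScaler:
    """Hypothetical client helper that follows EnergyDetection state changes."""

    def __init__(self, set_clock_mhz):
        # set_clock_mhz is a platform-specific function supplied by the client.
        self._set_clock_mhz = set_clock_mhz
        self._active = False
        set_clock_mhz(LOW_CLOCK_MHZ)  # start in the low-power state

    def on_energy_event(self, energy_active):
        """Call with True when energy is detected, False when it ends."""
        if energy_active == self._active:
            return  # no state change, so leave the clock alone
        self._active = energy_active
        self._set_clock_mhz(HIGH_CLOCK_MHZ if energy_active else LOW_CLOCK_MHZ)
```

Changing the clock only on state transitions avoids reprogramming the clock on every audio frame.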
EnergyDetection CPU consumption varies with processor architecture, but is typically less than 1 MHz. Note that using EnergyDetection very slightly degrades the accuracy of the Wake Word Engine, typically by 0.5-1.0% relative in FAR/FRR, depending on the model being used.
Dependencies
- None (v1 and v2 both come with EnergyDetection)
Resource Requirements
EnergyDetection is built into all versions of the Wake Word Engine. Its memory footprint is in the 1-2 KB range, and EnergyDetection itself consumes between 0.25 and 0.75 MHz on typical ARM processors (armv7a).