avs-device-sdk/extension/avs-weakup-sdk/docs/search/search_index.js

1 line
298 KiB
JavaScript

const local_index = {"config":{"lang":["en"],"min_search_length":3,"prebuild_index":false,"separator":"[\\s\\-]+"},"docs":[{"location":"index.html","text":"Amazon's Wake Word Engine, aka \"PryonLite\", is a highly optimized & portable wake word engine for use in various classes of embedded processors. This documentation is dynamically generated based on the features and targets supported in this release. The provided engine contains the following features: Wake Word Wake Word Diagnostic Information (WWDI) Fingerprinting Watermarking Device Artifact Vending Service (DAVS) This release contains engines built for the following targets: The general architecture of the engine and its usage is described here . To get started integrating right away, see Getting Started , and be sure the check out any architecture-specific information for your processor architectures in Architecture Notes . For detailed descriptions of all features included with this release and how to integrate them, see Features . If integrating Wake Word into the AVS Device SDK, see the AVS Device SDK section. To get help and browe the FAQ, see Support .","title":"Welcome"},{"location":"api-versions-features.html","text":"API Versions & Features API Versions There are currently 2 API Versions for the Amazon Wake Word Engine, they are not compatible. API v1 (deprecated) API Version 1 supports Wake Word detection only. Depending on the engine type, it can support all wakeword model types except for \"X\" type models. \"X\" type models require API Version 2. API v2 (PRLXXXX) API Version 2 is required for features other than Wake Word detection. V2 binaries and API usage samples are named with a -PRLXXXX suffix that indicates the features set supported. API Version 2 is not backward compatible with API V1. Features The following table shows the features supported by the v1 API and v2 API (PRLXXXX) versions: Version Wake Word Wake Word (X-class)* Feature Extraction Media Suppression Acoustic Event Detection (AED) Speaker Verification (SV) v1 \u2705 PRL1000 (v2) \u2705 PRL1100 (v2) \u2705 \u2705 PRL1200 (v2) \u2705 \u2705 \u2705 PRL1300 (v2) \u2705 \u2705 \u2705 PRL2000 (v2) \u2705 \u2705 \u2705 PRL3000 (v2) \u2705 \u2705 \u2705 \u2705 PRL5000 (v2) \u2705 \u2705 \u2705 \u2705 \u2705 * v1 API and PRL1000 (v2) do not support \"X\" class models. They are for ultra-low power implementations only . For details on the above features, see the Features section.","title":"API Versions & Features"},{"location":"api-versions-features.html#api-versions-features","text":"","title":"API Versions &amp; Features"},{"location":"api-versions-features.html#api-versions","text":"There are currently 2 API Versions for the Amazon Wake Word Engine, they are not compatible.","title":"API Versions"},{"location":"api-versions-features.html#api-v1-deprecated","text":"API Version 1 supports Wake Word detection only. Depending on the engine type, it can support all wakeword model types except for \"X\" type models. \"X\" type models require API Version 2.","title":"API v1 (deprecated)"},{"location":"api-versions-features.html#api-v2-prlxxxx","text":"API Version 2 is required for features other than Wake Word detection. V2 binaries and API usage samples are named with a -PRLXXXX suffix that indicates the features set supported. API Version 2 is not backward compatible with API V1.","title":"API v2 (PRLXXXX)"},{"location":"api-versions-features.html#features","text":"The following table shows the features supported by the v1 API and v2 API (PRLXXXX) versions: Version Wake Word Wake Word (X-class)* Feature Extraction Media Suppression Acoustic Event Detection (AED) Speaker Verification (SV) v1 \u2705 PRL1000 (v2) \u2705 PRL1100 (v2) \u2705 \u2705 PRL1200 (v2) \u2705 \u2705 \u2705 PRL1300 (v2) \u2705 \u2705 \u2705 PRL2000 (v2) \u2705 \u2705 \u2705 PRL3000 (v2) \u2705 \u2705 \u2705 \u2705 PRL5000 (v2) \u2705 \u2705 \u2705 \u2705 \u2705 * v1 API and PRL1000 (v2) do not support \"X\" class models. They are for ultra-low power implementations only . For details on the above features, see the Features section.","title":"Features"},{"location":"general-architecture.html","text":"General Architecture Overall Operation The following diagram illustrates the general operation of the Wake Word Engine inside an application: During an initialization phase, the application loads the Wake Word Model into memory. It then passes the model, along with configuration parameters to the engine. The engine stores a reference to this model memory to use during its processing, so it needs to remain persistent throughout the lifetime of the application. The application queries the engine for the size of scratch memory needed to run the loaded model. It allocates the required amount and completes the initialization. The application then runs a continuous loop, feeding 16kHz audio samples into the engine, 10ms (160 samples) at a time. When a wake word is detected, or any other event such as VAD or events generated from other components, the engine calls an event handling callback (registered during initialization). Wake Word detection events will trigger a callback after all audio containing the wake word has been passed in, as illustrated in the waveform / timeline below. v2 API Callbacks The v2 API uses a single callback function to pass back all supported event types. The top level PryonLiteV2Event structure has pointers to payloads for all supported event types. When an event is triggered pointers to valid payloads will be set and pointers to payloads not applicable to the event will be NULL. See the API sample code for an example of how to handle events.","title":"General Architecture"},{"location":"general-architecture.html#general-architecture","text":"","title":"General Architecture"},{"location":"general-architecture.html#overall-operation","text":"The following diagram illustrates the general operation of the Wake Word Engine inside an application: During an initialization phase, the application loads the Wake Word Model into memory. It then passes the model, along with configuration parameters to the engine. The engine stores a reference to this model memory to use during its processing, so it needs to remain persistent throughout the lifetime of the application. The application queries the engine for the size of scratch memory needed to run the loaded model. It allocates the required amount and completes the initialization. The application then runs a continuous loop, feeding 16kHz audio samples into the engine, 10ms (160 samples) at a time. When a wake word is detected, or any other event such as VAD or events generated from other components, the engine calls an event handling callback (registered during initialization). Wake Word detection events will trigger a callback after all audio containing the wake word has been passed in, as illustrated in the waveform / timeline below.","title":"Overall Operation"},{"location":"general-architecture.html#v2-api-callbacks","text":"The v2 API uses a single callback function to pass back all supported event types. The top level PryonLiteV2Event structure has pointers to payloads for all supported event types. When an event is triggered pointers to valid payloads will be set and pointers to payloads not applicable to the event will be NULL. See the API sample code for an example of how to handle events.","title":"v2 API Callbacks"},{"location":"release-history.html","text":"Release History 2.16.1 (2022-Apr-19) Changes Bugfix: Corrected default config initializer for VAD Compatibility WW models with ECID = 12 2.16.0 (2022-Mar-26) Changes Added API for standalone VAD . Existing inline WW VAD enableVad has been replaced. To replicate the original functionality, enable/disable EnergyDetection alongside wakeword detection. Compatibility WW models with ECID = 12 2.15.1 (2022-Feb-8) Changes The following model classes were renamed: WR_50k -> WR_50k_SA WS_50k -> WS_50k_SA U_1S_50k -> U_50k_SA U_50k -> U_50k_CC Compatibility WW models with ECID = 12 2.15.0 (2022-Jan-6) Changes Update not affecting this library. Compatibility WW models with ECID = 12 2.14.0 (2021-Oct-15) Changes Update not affecting this library. Compatibility WW models with ECID = 11 2.13.0 (2021-Sep-13) Changes Model format change. Not backward compatible. Compatibility WW models with ECID = 10 2.12.0 (2021-Sep-7) Changes Update not affecting this library. Compatibility WW models with ECID = 9 2.11.1 (2021-Aug-10) Changes Metadata Update Compatibility - backwards compatible 2.11.0 (2021-May-26) Changes Added watermarking feature for PRL2000 Bugfix: Fixed an issue where fingerprint media suppression did not work when X_* models were loaded Compatibility WW models with ECID = 8 2.10.6 (2021-May-13) Changes Bugfix: fixed an issue where STOP keyword might be detected without a preceding wakeword Compatibility WW models with ECID = 8 2.10.5 (2021-May-12) Changes Bugfix: Added checks for initialized PryonLiteDecoder Compatibility WW models with ECID = 8 2.10.4 (2021-May-07) Changes iOS build upgraded to xcframeworks Compatibility WW models with ECID = 8 2.10.3 (2021-April-20) Changes Fixed potential early detections on ephemeral use-cases with X class models Compatibility WW models with ECID = 8 2.10.2 (2021-April-05) Changes Update engine API Compatibility WW models with ECID = 8 2.10.1 (2021-March-31) Changes DNN accelerator callback functions' signatures updated in pryon_lite_accel.h for architectures using an external neural network accelerator. Compatibility WW models with ECID = 8 2.10.0 (2021-March-19) Changes Model format change to support a new architecture. The Models are backwards compatible with 2.9.0 engine for other architectures. Compatibility WW models with ECID = 8 2.9.0 (2020-October-12) Changes Model format change. Not backward compatible Modified public API PryonLite_GetConfigAttributes() output structure PryonLiteWakewordConfigAttributes to return the list of supported keywords for a wake word configuration Added wake word API PryonLiteWakeword_EnableKeyword() to enable/disable one or all keywords Compatibility WW models with ECID = 7 2.8.2 (2020-September-21) Changes Wake Word end index accuracy improvement Reduced wake word detection latency (up to 4 frames). Bugfix: fixed detections start-end indexes when VAD is enabled. Time spent with VAD active was not taken into account. Compatibility WW models with ECID = 6 2.8.1 (2020-August-28) Changes Bugfix: fixed a memory tracking issue for kalimba architecture The workaround is to ensure engineMem argument of PryonLite_Initialize() is 64-bit aligned. Compatibility WW models with ECID = 6 2.8.0 (2020-July-17) Changes Update not affecting this library Compatibility WW models with ECID = 6 2.7.1 (2020-June-23) Changes Optimized x86 architecture using AVX2 instructions Compatibility WW models with ECID = 6 2.7.0 (2020-June-10) Changes Model format change. Not backward compatible Wake Word end index accuracy improvement Compatibility WW models with ECID = 6 2.6.1 (2020-June-04) Changes Metadata Update Compatibility WW models with ECID = 5 2.6.0 (2020-May-15) Changes Add PryonLite_SetClientProperty V2 API Compatibility WW models with ECID = 5 2.5.0 (2020-May-11) Changes Update not affecting this library Compatibility WW models with ECID = 4 2.4.4 (2020-April-20) Changes Bugfix: fixed an issue where the same wake word could be detected twice Compatibility WW models with ECID = 4 2.4.3 (2020-April-17) Changes Update not affecting this library Compatibility WW models with ECID = 4 2.4.2 (2020-March-20) Changes Improve accuracy of classification scores Compatibility WW models with ECID = 4 2.4.1 (2020-March-18) Changes Prevent false-detections early in the stream Improved accuracy of wake word detection for W models, at the cost of 100ms latency on average Compatibility WW models with ECID = 4 2.4.0 (2020-March-13) Changes Model format change. Not backward compatible Wake Word detection improvements Compatibility WW models with ECID = 4 2.3.2 (2020-January-22) Changes Model format change. Backward compatible Compatibility WW models with ECID = 2, 3 2.3.1 (2019-December-23) Changes Metadata Update Compatibility WW models with ECID = 2 2.3.0 (2019-December-10) Changes Fingerprinting API update for configuration value name changes to 0.2.0; not backwards compatible with 0.1.0. Compatibility WW models with ECID = 2 2.2.0 (2019-October-26) Changes PryonLiteError enum replaced with PryonLiteStatus structure for every version 2.x API function return value. Compatibility WW models with ECID = 2 2.1.0 (2019-Sep-26) Changes Models format change Note: This engine version is not backward compatible with previous models format Compatibility WW models with ECID = 2 2.0.1 (2019-Aug-21) Changes Bugfix: fixed detection lockout Compatibility WW models with ECID = 1 2.0.0 (2019-August-15) Changes Addition of a multi-feature capable API. Note on backwards compatibility: API for version 1.x is still supported but will be deprecated. Only wake word is supported by API version 1.x. Compatibility WW models with ECID = 1 1.13.5 (2019-July-16) Changes Refactored header files Compatibility backwards compatible 1.13.4 (2019-Jun-14) Changes Bugfix: fixed negative StartIndex when detection happens very early in the stream Compatibility backwards compatible 1.13.3 (2019-May-20) Changes Improved VAD performance. Compatibility backwards compatible 1.13.2 (2019-May-16) Changes Bugfix: fixed VAD counter overflow. Compatibility backwards compatible 1.13.1 (2019-May-15) Changes Bugfix: Memory layout fixes Compatibility backwards compatible 1.13.0 (2019-March-04) Changes More wake word start index accuracy improvement. Compatibility backwards compatible 1.12.2 (2019-February-28) Changes Wake Word start index accuracy improved for U-class models. Compatibility backwards compatible 1.12.1 (2019-February-18) Changes Bugfix: reported overwritten debug data Compatibility backwards compatible 1.12.0 (2019-January-31) Changes Models format change Compatibility Not backwards compatible with previous models format 1.11.3 (2019-January-10) Changes Added model_compatibility.json in models folder. Moved the common/x86 format model files from models folder to models/common folder. Compatibility backwards compatible 1.11.2 (2019-January-8) Changes Metadata Update Compatibility backwards compatible 1.11.1 (2018-December 27) Changes Updated LICENSE.txt Compatibility backwards compatible 1.11.0 (2018-December 10) Changes Models format change. Optimized arm-neon model format introduced. ARM architectures using neon must now use these models from models/arm-neon/ .* Compatibility backwards compatible for API. Arm architectures must use the new interleaved models. 1.10.1 (2018-December 4) Changes Add support for \"W\" type models. These are usable with the \"U\" and universal type engines. Compatibility backwards compatible 1.10.0 (2018-November-26) Changes Models format change Compatibility backwards compatible 1.9.6 (2018-November 8) Changes Bugfix: improve accuracy of detection indices Compatibility backwards compatible 1.9.5 (2018-October-16) Changes Metadata Update Compatibility backwards compatible 1.9.4 (2018-October-15) Changes Bugfix: Allow optional vadCallback parameter to be NULL when useVad = true Compatibility backwards compatible 1.9.3 (2018-October-12) Changes Bugfix: Validation of model configuration Compatibility backwards compatible 1.9.2 (2018-October-10) Changes Moved api_sample.cpp to / Compatibility backwards compatible 1.9.1 (2018-September-21) Changes Add support for larger model sizes Compatibility the engine is backwards compatible, but newer larger models will not work with older engines 1.9.0 (2018-August-14) Changes Revert change of start and end indices from 1.6.0, back to long long absolute sample indices. Compatibility backwards compatible 1.8.0 (2018-August-8) Changes Added a userData variable in the PryonLiteDecoderConfig and PryonLiteResult structures user can pass data while initializing the decoder and can obtain the the data in the PryonLiteResult in the callback. Added the userData in VAD callback. Compatibility backwards compatible 1.7.0 (2018-August-2) Changes Change sample data container from int to short. Sample must be 16 bit right aligned. Make pryon_lite.h architecture-specific and move it into the architecture folder Compatibility backwards compatible 1.6.0 (2018-July-19) Changes Change sample data container from a short to an int. Samples must be 16 bit right aligned data in this integer container. Change start and end indices in the result structure to be negative offsets from the most recent pushed frame. Prior to this change the beginSampleIndex and endSampleIndex in the PryonLiteResult struct were long long values. The values were absolute sample indices. After this change the two indices are integers. These values are negative offsets from the most recent sample pushed to the engine through the PryonLiteDecoder_PushAudioSamples call. If the only information needed from the result is the relative index or the duration of the wakeword then there are no application code changes needed. If the absolute indices are still required, the application code must keep track of how many samples are pushed to the decoder and add that value to the relative indices. In general this means adding the value of the sampleCount parameter passed to PryonLiteDecoder_PushAudioSamples to a global counter whenever the function succeeds. Then when a detection is seen, add this global counter to the beginSampleIndex and endSampleIndex. Compatibility Both a result structure change and an API change. 1.5.0 (2018-July-16) Changes Models format change Compatibility backwards compatible 1.4.0 (2018-June-29) Changes Models format change Compatibility backwards compatible 1.3.0 (2018-May-15) Changes Addition of near-miss threshold to report possible wake word utterances that did not break the required confidence threshold Compatibility backwards compatible 1.2.0 (2018-May-14) Changes Changed the boolean values in PryonLiteDecoderConfig struct to integer values. Compatibility backwards compatible 1.1.0 (2018-May-10) Changes Addition of support for querying the maximum size of the metadata blob returned as part of a detection result, via the PryonLite_GetEngineAttributes() API call. Compatibility backwards compatible 1.0.3 (2018-Apr-26) Changes Addition of \"confidence\" to detection result. The confidence is an integer between 0 (lowest) and 1000 (highest) Compatibility Model/engine are not backwards-compatible 1.0.2 (2018-Apr-03) Changes Addition of PryonLiteDecoder_SetDetectionThreshold() API call to set the detection threshold at run-time. This is useful for scenarios such as music playback when the application might want more sensitive detection behavior. Compatibility backwards compatible 1.0.1 (2018-Feb-01) Changes Addition of \"lowLatency\" mode, valid only for \"U\" class models. This configuration parameter will reduce latency by 225 ms on average. Disabled by default - to enable, set PryonLiteDecoderConfig.lowLatency = true. Compatibility backwards compatible 1.0.0 (2018-Jan-22) Changes Initial revision Compatibility backwards compatible Fingerprint-Based Media Suppression Version History 0.1.0 (2019-August-15) Changes Initial revision.","title":"Release History"},{"location":"release-history.html#release-history","text":"","title":"Release History"},{"location":"release-history.html#2161-2022-apr-19","text":"","title":"2.16.1 (2022-Apr-19)"},{"location":"release-history.html#2160-2022-mar-26","text":"","title":"2.16.0 (2022-Mar-26)"},{"location":"release-history.html#2151-2022-feb-8","text":"","title":"2.15.1 (2022-Feb-8)"},{"location":"release-history.html#2150-2022-jan-6","text":"","title":"2.15.0 (2022-Jan-6)"},{"location":"release-history.html#2140-2021-oct-15","text":"","title":"2.14.0 (2021-Oct-15)"},{"location":"release-history.html#2130-2021-sep-13","text":"","title":"2.13.0 (2021-Sep-13)"},{"location":"release-history.html#2120-2021-sep-7","text":"","title":"2.12.0 (2021-Sep-7)"},{"location":"release-history.html#2111-2021-aug-10","text":"","title":"2.11.1 (2021-Aug-10)"},{"location":"release-history.html#-backwards-compatible","text":"","title":"- backwards compatible"},{"location":"release-history.html#2110-2021-may-26","text":"","title":"2.11.0 (2021-May-26)"},{"location":"release-history.html#2106-2021-may-13","text":"","title":"2.10.6 (2021-May-13)"},{"location":"release-history.html#2105-2021-may-12","text":"","title":"2.10.5 (2021-May-12)"},{"location":"release-history.html#2104-2021-may-07","text":"","title":"2.10.4 (2021-May-07)"},{"location":"release-history.html#2103-2021-april-20","text":"","title":"2.10.3 (2021-April-20)"},{"location":"release-history.html#2102-2021-april-05","text":"","title":"2.10.2 (2021-April-05)"},{"location":"release-history.html#2101-2021-march-31","text":"","title":"2.10.1 (2021-March-31)"},{"location":"release-history.html#2100-2021-march-19","text":"","title":"2.10.0 (2021-March-19)"},{"location":"release-history.html#290-2020-october-12","text":"","title":"2.9.0 (2020-October-12)"},{"location":"release-history.html#282-2020-september-21","text":"","title":"2.8.2 (2020-September-21)"},{"location":"release-history.html#281-2020-august-28","text":"","title":"2.8.1 (2020-August-28)"},{"location":"release-history.html#280-2020-july-17","text":"","title":"2.8.0 (2020-July-17)"},{"location":"release-history.html#271-2020-june-23","text":"","title":"2.7.1 (2020-June-23)"},{"location":"release-history.html#270-2020-june-10","text":"","title":"2.7.0 (2020-June-10)"},{"location":"release-history.html#261-2020-june-04","text":"","title":"2.6.1 (2020-June-04)"},{"location":"release-history.html#260-2020-may-15","text":"","title":"2.6.0 (2020-May-15)"},{"location":"release-history.html#250-2020-may-11","text":"","title":"2.5.0 (2020-May-11)"},{"location":"release-history.html#244-2020-april-20","text":"","title":"2.4.4 (2020-April-20)"},{"location":"release-history.html#243-2020-april-17","text":"","title":"2.4.3 (2020-April-17)"},{"location":"release-history.html#242-2020-march-20","text":"","title":"2.4.2 (2020-March-20)"},{"location":"release-history.html#241-2020-march-18","text":"","title":"2.4.1 (2020-March-18)"},{"location":"release-history.html#240-2020-march-13","text":"","title":"2.4.0 (2020-March-13)"},{"location":"release-history.html#232-2020-january-22","text":"","title":"2.3.2 (2020-January-22)"},{"location":"release-history.html#231-2019-december-23","text":"","title":"2.3.1 (2019-December-23)"},{"location":"release-history.html#230-2019-december-10","text":"","title":"2.3.0 (2019-December-10)"},{"location":"release-history.html#220-2019-october-26","text":"","title":"2.2.0 (2019-October-26)"},{"location":"release-history.html#210-2019-sep-26","text":"","title":"2.1.0 (2019-Sep-26)"},{"location":"release-history.html#201-2019-aug-21","text":"","title":"2.0.1 (2019-Aug-21)"},{"location":"release-history.html#200-2019-august-15","text":"","title":"2.0.0 (2019-August-15)"},{"location":"release-history.html#1135-2019-july-16","text":"","title":"1.13.5 (2019-July-16)"},{"location":"release-history.html#1134-2019-jun-14","text":"","title":"1.13.4 (2019-Jun-14)"},{"location":"release-history.html#1133-2019-may-20","text":"","title":"1.13.3 (2019-May-20)"},{"location":"release-history.html#1132-2019-may-16","text":"","title":"1.13.2 (2019-May-16)"},{"location":"release-history.html#1131-2019-may-15","text":"","title":"1.13.1 (2019-May-15)"},{"location":"release-history.html#1130-2019-march-04","text":"","title":"1.13.0 (2019-March-04)"},{"location":"release-history.html#1122-2019-february-28","text":"","title":"1.12.2 (2019-February-28)"},{"location":"release-history.html#1121-2019-february-18","text":"","title":"1.12.1 (2019-February-18)"},{"location":"release-history.html#1120-2019-january-31","text":"","title":"1.12.0 (2019-January-31)"},{"location":"release-history.html#1113-2019-january-10","text":"","title":"1.11.3 (2019-January-10)"},{"location":"release-history.html#1112-2019-january-8","text":"","title":"1.11.2 (2019-January-8)"},{"location":"release-history.html#1111-2018-december-27","text":"","title":"1.11.1 (2018-December 27)"},{"location":"release-history.html#1110-2018-december-10","text":"","title":"1.11.0 (2018-December 10)"},{"location":"release-history.html#1101-2018-december-4","text":"","title":"1.10.1 (2018-December 4)"},{"location":"release-history.html#1100-2018-november-26","text":"","title":"1.10.0 (2018-November-26)"},{"location":"release-history.html#196-2018-november-8","text":"","title":"1.9.6 (2018-November 8)"},{"location":"release-history.html#195-2018-october-16","text":"","title":"1.9.5 (2018-October-16)"},{"location":"release-history.html#194-2018-october-15","text":"","title":"1.9.4 (2018-October-15)"},{"location":"release-history.html#193-2018-october-12","text":"","title":"1.9.3 (2018-October-12)"},{"location":"release-history.html#192-2018-october-10","text":"","title":"1.9.2 (2018-October-10)"},{"location":"release-history.html#191-2018-september-21","text":"","title":"1.9.1 (2018-September-21)"},{"location":"release-history.html#190-2018-august-14","text":"","title":"1.9.0 (2018-August-14)"},{"location":"release-history.html#180-2018-august-8","text":"","title":"1.8.0 (2018-August-8)"},{"location":"release-history.html#170-2018-august-2","text":"","title":"1.7.0 (2018-August-2)"},{"location":"release-history.html#160-2018-july-19","text":"","title":"1.6.0 (2018-July-19)"},{"location":"release-history.html#150-2018-july-16","text":"","title":"1.5.0 (2018-July-16)"},{"location":"release-history.html#140-2018-june-29","text":"","title":"1.4.0 (2018-June-29)"},{"location":"release-history.html#130-2018-may-15","text":"","title":"1.3.0 (2018-May-15)"},{"location":"release-history.html#120-2018-may-14","text":"","title":"1.2.0 (2018-May-14)"},{"location":"release-history.html#110-2018-may-10","text":"","title":"1.1.0 (2018-May-10)"},{"location":"release-history.html#103-2018-apr-26","text":"","title":"1.0.3 (2018-Apr-26)"},{"location":"release-history.html#102-2018-apr-03","text":"","title":"1.0.2 (2018-Apr-03)"},{"location":"release-history.html#101-2018-feb-01","text":"","title":"1.0.1 (2018-Feb-01)"},{"location":"release-history.html#100-2018-jan-22","text":"","title":"1.0.0 (2018-Jan-22)"},{"location":"release-history.html#fingerprint-based-media-suppression-version-history","text":"0.1.0 (2019-August-15)","title":"Fingerprint-Based Media Suppression Version History"},{"location":"supported-architectures.html","text":"Supported Architectures Amazon's Wake Word engine is available for the following architectures: Architecture Target System Examples Optimized? Notes x86/64: Linux (Ubuntu, Amazon Linux) Desktop/Cloud PC Yes Supported by the AVS Device SDK x86/64: MacOS Mac No Supported by the AVS Device SDK x86/64: Windows Desktop / Cloud PC No Supported by the AVS Device SDK Raspberry Pi 3/4 Raspberry Pi Yes Supported by the AVS Device SDK Android Android Phone Yes Various NDK versions >= 16, supports armv7a, armv8a & x86 armv7a MediaTek C4x, Qualcomm APQx/IPQx, ARM Cortex A-Series Yes armv8a ARM Cortex A-Series Yes ARM9 (ARMv5TE) NXP i.MX28 No ARM Cortex M4 Mediatek AB1552, STMicro STM32Fx Yes ARM Cortex M7 NXP i.MX RT1050 CM7, STMicro STM32Hx Yes ARM Cortex M33 Yes ARM Cortex M55 Yes CEVA TeakLite 3 DSPG DBM10L Yes Supports NNLite accelerator CEVA TeakLite 4 TL410 Beken Corp BK3268 Yes CEVA TeakLite 4 TL420 Beken Corp BK3268 Yes CEVA X2 Beken Corp BK3288, BK7271 Yes Cirrus Logic ADSP2 Cirrus Logic CS48LV40 Yes Cirrus Logic Halo Cirrus Logic CS48L32 Yes EMSDK WebAssembly No iOS Apple iPhone/iPad No Delivered as an xcframework MediaTek RISC MRV 33 No MediaTek RISC MRV 55 No Tensilica Hifi Mini MediaTek MT2811, RT5518 Yes Tensilica Hifi 3 Knowles IA8201 Yes Tensilica Hifi4 MediaTek MT8168, MT8512 Yes Tensilica Hifi5 Yes Tensilica Fusion F1 NXP RT500 No Qualcomm Kalimba4 Stretto Yes Qualcomm Kalimba4 StrettoPlus Yes Qualcomm Hexagon v62 Yes Qualcomm Hexagon v65 Yes Qualcomm Hexagon v66 Yes Microsemi Starcore Starcore SC140 No MIPS XBurst Yes XMOS XS3A XMOS XVF3510 Yes","title":"Supported Architectures"},{"location":"supported-architectures.html#supported-architectures","text":"Amazon's Wake Word engine is available for the following architectures: Architecture Target System Examples Optimized? Notes x86/64: Linux (Ubuntu, Amazon Linux) Desktop/Cloud PC Yes Supported by the AVS Device SDK x86/64: MacOS Mac No Supported by the AVS Device SDK x86/64: Windows Desktop / Cloud PC No Supported by the AVS Device SDK Raspberry Pi 3/4 Raspberry Pi Yes Supported by the AVS Device SDK Android Android Phone Yes Various NDK versions >= 16, supports armv7a, armv8a & x86 armv7a MediaTek C4x, Qualcomm APQx/IPQx, ARM Cortex A-Series Yes armv8a ARM Cortex A-Series Yes ARM9 (ARMv5TE) NXP i.MX28 No ARM Cortex M4 Mediatek AB1552, STMicro STM32Fx Yes ARM Cortex M7 NXP i.MX RT1050 CM7, STMicro STM32Hx Yes ARM Cortex M33 Yes ARM Cortex M55 Yes CEVA TeakLite 3 DSPG DBM10L Yes Supports NNLite accelerator CEVA TeakLite 4 TL410 Beken Corp BK3268 Yes CEVA TeakLite 4 TL420 Beken Corp BK3268 Yes CEVA X2 Beken Corp BK3288, BK7271 Yes Cirrus Logic ADSP2 Cirrus Logic CS48LV40 Yes Cirrus Logic Halo Cirrus Logic CS48L32 Yes EMSDK WebAssembly No iOS Apple iPhone/iPad No Delivered as an xcframework MediaTek RISC MRV 33 No MediaTek RISC MRV 55 No Tensilica Hifi Mini MediaTek MT2811, RT5518 Yes Tensilica Hifi 3 Knowles IA8201 Yes Tensilica Hifi4 MediaTek MT8168, MT8512 Yes Tensilica Hifi5 Yes Tensilica Fusion F1 NXP RT500 No Qualcomm Kalimba4 Stretto Yes Qualcomm Kalimba4 StrettoPlus Yes Qualcomm Hexagon v62 Yes Qualcomm Hexagon v65 Yes Qualcomm Hexagon v66 Yes Microsemi Starcore Starcore SC140 No MIPS XBurst Yes XMOS XS3A XMOS XVF3510 Yes","title":"Supported Architectures"},{"location":"wrappers-sdk-integrations.html","text":"Wrappers & SDK Integrations Wrappers While the engine is natively written in C, there are wrappers available for higher level languages so that the engine can be integrated into mobile applications for iOS, Android, and others. These wrappers are as close to a 1:1 binding of the native API as possible for the given wrapper language. Currently there are wrappers available for: Java (Android/JNI) iOS (Swift) SDK Integrations One architectural level above the wrappers are SDK Integrations. These are fully functional integrations with existing Client SDKs, and usually come in the form of a set of scripts / patches to be applied to a specific version of an Alexa Client SDK. Currently there are adapters for the following SDKs: AVS Device SDK","title":"Wrappers & SDK Integrations"},{"location":"wrappers-sdk-integrations.html#wrappers-sdk-integrations","text":"","title":"Wrappers &amp; SDK Integrations"},{"location":"wrappers-sdk-integrations.html#wrappers","text":"While the engine is natively written in C, there are wrappers available for higher level languages so that the engine can be integrated into mobile applications for iOS, Android, and others. These wrappers are as close to a 1:1 binding of the native API as possible for the given wrapper language. Currently there are wrappers available for: Java (Android/JNI) iOS (Swift)","title":"Wrappers"},{"location":"wrappers-sdk-integrations.html#sdk-integrations","text":"One architectural level above the wrappers are SDK Integrations. These are fully functional integrations with existing Client SDKs, and usually come in the form of a set of scripts / patches to be applied to a specific version of an Alexa Client SDK. Currently there are adapters for the following SDKs: AVS Device SDK","title":"SDK Integrations"},{"location":"api-reference/index.html","text":"API Reference Coming Soon This section is currently being authored","title":"API Reference"},{"location":"api-reference/index.html#api-reference","text":"Coming Soon This section is currently being authored","title":"API Reference"},{"location":"api-reference/wrappers/java/java-binding-engine-reference.html","text":"PryonLite Java Engine API At its core, one creates a PryonLite engine by instanciating and initializing a PryonLite5000 java object. Event handlers are registered on object creation prior to initialization. 1. Engine Configuration Prior to engine use, you must prepare a PryonLite5000.Config structure with object configuration parameters, consisting of the following: /** * Configuration parameters for attribute query and instance initialization. */ public static class Config { // Wake Word /** * Wake Word model binary. */ public byte [] wakewordModel ; /** * Wake Word detection threshold. Integer in range [1, 1000]. * Recommended value is 500. * 1 = most permissive threshold, most detections. * 1000 = least permissive threshold, fewest detections. */ public int detectThreshold ; /** * Flag for enabling voice activity detector pre-stage. * For most application-core integrations, this should be set to false. */ public boolean useVad ; /** * Flag for enabling low-latency detection mode. * Only valid for type 'U' models. Results in ~200ms lower detection * latency, at the cost of less accurate ww end index reporting. */ public boolean lowLatency ; // Fingerprint (Media-induced wake suppression) /** * Binary containing media-derived reference fingerprints for matching and suppression. * This may be set to null to disable fingerprint-matching-based suppression. */ public byte [] fingerprintList ; // Speaker Verification /** * Speaker Verification model binary. * This may be set to null to disable speaker verification functionality. */ public byte [] speakerVerificationModel ; /** * Number of enrollment examples required to generate a Speaker Verification profile. */ public int numEnrollmentExamples ; /** * Minimum dB value for average SNR to accept wake word eample for enrollment. */ public int minEnrollmentSnr ; /** * Maximum number of Speaker Verification profiles that can be simultaneously loaded. */ public int maxLoadableProfiles ; /** * Maximum size of Profile ID used for enrollment; must be a multiple of 4. */ public int maxProfileIdSize ; } We can see above that for PRL5000, we have three distinct sections for Wake Word, Fingerprinting, and Speaker Verification. 2. Engine Attributes Attributes for the configuration structure can be retrieved via a getAttributes() method, and consist of the following: /** * Attributes associated with a specified PryonLite configuration. * * @note Class not marked as static, as it causes JNI GetMethodID for the constructor to fail. */ public class Attributes { public String engineVersion ; ///< PryonLite engine version. public int maxMetadataBlobSize ; ///< Maximum size of metadata blob returned with wake word detection result. public int requiredMem ; ///< Memory in bytes required by an engine instance using this configuration. public int samplesPerFrame ; ///< Samples per frame for PushAudio // Wake Word public String wakewordApiVersion ; ///< Wake Word API version public String wakewordConfigVersion ; ///< Wake Word configuration version (wakeword model) // Fingerprint public String fingerprintApiVersion ; ///< Fingerprint API version public int fingerprintListVersion ; ///< Fingerprint List Version // Speaker Verification public String speakerVerificationApiVersion ; ///< Speaker Verification API version public String speakerVerificationConfigVersion ; ///< Speaker Verification configuration version (the model) public byte [] speakerVerificationModelId ; ///< Speaker Verification Model ID public int speakerVerificationMaxProfileSize ; ///< Speaker Verification maximum profile size public List < String > speakerVerificationKeywords ; ///< List of keywords supported by Speaker Verification Model. public List < String > speakerVerificationLocales ; ///< List of locales supported by Speaker Verification Model. } We can see again that fields are grouped by functionality into sections for wake word, fingerprinting, and speaker verification. The following are some of the attributes associated with the engine as a whole (and not tied to specific features like the wake word detector, fingerprinting, or speaker verification). public String engineVersion ; ///< PryonLite engine version. public int maxMetadataBlobSize ; ///< Maximum size of metadata blob returned with wake word detection result. public int requiredMem ; ///< Memory in bytes required by an engine instance using this configuration. public int samplesPerFrame ; ///< Samples per frame for PushAudio 3. Engine Commands 3.1 Engine Object Creation A PryonLite Java object can be created with the following constructor - note that there are additional steps needed to initialize the engine so that it can process audio streams. /** * Constructor for the PryonLite engine. * * @param callbacks The callbacks to register for event notifications. */ public PryonLite5000 ( final Callbacks callbacks ) { this . callbacks = callbacks ; this . nativeMem = 0 ; } This creates the environment in which attribute query, initialization, and all further interactions with the engine may proceed. Callbacks The callbacks that must be registered with the PryonLite engine instance on object creation are as follows: /** * Set of callbacks for handling event from the PryonLite object. */ public interface Callbacks { /** * Function signature of event handler for wake word detection events. * * @param wakeWord [in] The detected wake word. * @param beginSampleIndex [in] The first sample of the wake word, relative to the last sample pushed. * @param endSampleIndex [in] The last sample of the wake word, relative to the last sample pushed. * @param metadata [in] Wake Word Engine Metadata associated with wake word event. */ void wakeWordDetected ( final String wakeWord , final long beginSampleIndex , final long endSampleIndex , final byte [] metadata ); /** * Function signature of event handler for voice activity detection (VAD) state changes. * * @param state - new VAD state. */ void vadStateChanged ( final int state ); /** * Function signature of event handler for serious errors encountered during execution * of native methods where the error cannot be returned through the method's return value. * This would generally be used in a context leading up to the invocation of another event * handler in the Callbacks interface, like wakeWordDetected(). * * @param errorCode [in] Internal code, consult vendor. */ void errorEvent ( final int errorCode ); /** * Function signature of event handler for Speaker Verification enrollment events. * * @param profileId [in] Profile ID associated with the enrollment session. * @param notification [in] The type of notification (example accepted, example rejected, profile generated) * @param profile [in] If not null, the voice profile blob generated by an enrollment session. * @param metadata [in] speaker verification enrollment event metadata blob. */ void speakerVerificationEnrollmentEvent ( final byte [] profileId , final int notification , final byte [] profile , final byte [] metadata ); /** * Function signature of event handler for Speaker Verification classification events. * * @param profileId [in] Profile ID associated with the event. * @param score [in] The classification score. * @param metadata [in] speaker verification classification event metadata blob. */ void speakerVerificationClassificationEvent ( final byte [] profileId , final int score , final byte [] metadata ); /** * Function signature of event handler for Speaker Verification enrollment example capture events. * * @param wakeWord [in] The detected wake word. * @param startIndexInSamples [in] Index of the first sample of the wake word, relative to the start * of the wake word example buffer. * @param endIndexInSamples [in] Index of the last sample of the wake word, relative to the start of the * wake word example buffer. * @param example [in] The wake word example audio buffer. * @param metadata [in] Wake Word detection metadata associated with the wake word example. */ void speakerVerificationWakewordExampleEvent ( final String wakeWord , final int startIndexInSamples , final int endIndexInSamples , final short [] example , final byte [] metadata ); } Aside from the errorEvent callback, the remaining callbacks are associated with functionalities like wake word, and speaker verification. 3.2 Engine Attribute Query Attributes for a particular engine configuration can be requested by calling the following: /** * Query attributes associated with a given PryonLite configuration. * * @param config [in] Input configuration parameters. * @return Attributes if successful; NULL otherwise. */ public Attributes getAttributes ( final Config config ); Note that attributes for a particular set of configuration parameters may and should be retrieved prior to initialization, to retrieve attributes relevant to the engine while it is in use. The getAttributes() may also be called after engine initialization, as it is one of the few methods that is not marked with the synchronized keyword. 3.3 Engine Object Initialization PryonLite engine initialization is through the following method: /** * Initializes the PryonLite instance. * * @param config [in] Input configuration parameters. * @return Zero if successful, non-zero error code otherwise. * @note Here is a list of error codes for this function and their meanings. * 0: Success, nominal operation. * -1: Config parameter is invalid. * -2: The object has already been initialized. * -3: JNI memory allocation for the instance failed. * -4: A global reference could not be created for the object. * -5: Query of class properties failed. * -6: Import of Java config parameters failed. (DEPRECATED) * -7: Native memory allocation for the instance failed. * -8: Audio buffer allocation for the instance failed. * -9: Saving of JNI memory pointer to object failed. * -10: Speaker Verification voice profile memory allocation failed. * All other values: Internal code, consult vendor. */ public synchronized int initialize ( final Config config ); 3.4 Pushing Audio to the Engine Audio buffers are streamed to the PryonLite engine by repeatedly calling the following function: /** * Pushes a frame of audio data to the PryonLite instance. * * @param samples [in] Audio frame from input audio stream. * @return Zero if successful, non-zero error code otherwise. * @note This function should be invoked only after initialize() has been called, * and should not be invoked after a destroy(). * @note Here is a list of error codes for this function and their meanings. * 0: Success, nominal operation. * -32: The object has not been initialized. * -33: Input sample array is null. * -34: Length of input sample array is invalid. * -35: Retrieval of the input sample array failed. * -36: The object passed into pushAudio() does not match the object initialized. * All other values: Internal code, consult vendor. * @note This method does not Log.i as part of nominal operation, both due to its invocation * frequency/regularity, and to respect the high priority context within which it is typically * invoked. If you have use cases where you are invoking pushAudio irregularly, and need to * understand the timing of such invocations, you are free to add a log statement in the calling code. */ public synchronized int pushAudio ( final short [] samples ); 3.5 Engine Object Destruction Prior to deleting the PryonLite5000 Java object, any engine initialization MUST be mirrored by a call to this method. /** * Release PryonLite instance. * * @return Zero if successful, non-zero error code otherwise. * @note Here is a list of error codes for this function and their meanings. * 0: Success, nominal operation. * -31: The object has not been initialized. * All other values: Internal code, consult vendor. */ public synchronized int destroy (); 3.6 Retrieval of Audio Frame Size The following function is redundant, as samples per frame is returned as an attribute. However, it has been maintained for backwards compatibility. Future versions of this Java binding may deprecate this method. /** * Returns the number of samples per frame for audio pushes. * * @return Samples per frame for audio pushes, <= 0 on error. * @note This value must be used when subsequent calls to pushAudio() are made. * @note This function should be invoked only after initialize() has been called, * and should not be invoked after a destroy(). * @note Also returned as part of attribute query, which can be invoked prior to * initializing a PryonLite object. * @note Here is a list of error codes for this function and their meanings. * 0: This code is not used and reserved for future use. * -37: The object has not been initialized. * All other values: Internal code, consult vendor. */ public synchronized int getSamplesPerFrame (); We can see from the comments above that the scope of validity for invocation of the getSamplesPerFrame() method is more limited than that of getAttributes() . 3.7 Querying Engine Object Initialization Status The following convenience function allows the PryonLite Java client to check if a given object has been initialized or not: /** * Checks if the instance has been initialized. * * @return 1 if initialized, 0 if not initialized */ public synchronized int isInitialized (); 4. Engine Events 4.1 Error Reporting Errors that are not easily returned through a command's return code are returned via the following callback: /** * Function signature of event handler for serious errors encountered during execution * of native methods where the error cannot be returned through the method's return value. * This would generally be used in a context leading up to the invocation of another event * handler in the Callbacks interface, like wakeWordDetected(). * * @param errorCode [in] Internal code, consult vendor. */ void errorEvent ( final int errorCode );","title":"Engine API"},{"location":"api-reference/wrappers/java/java-binding-engine-reference.html#pryonlite-java-engine-api","text":"At its core, one creates a PryonLite engine by instanciating and initializing a PryonLite5000 java object. Event handlers are registered on object creation prior to initialization.","title":"PryonLite Java Engine API"},{"location":"api-reference/wrappers/java/java-binding-engine-reference.html#1-engine-configuration","text":"Prior to engine use, you must prepare a PryonLite5000.Config structure with object configuration parameters, consisting of the following: /** * Configuration parameters for attribute query and instance initialization. */ public static class Config { // Wake Word /** * Wake Word model binary. */ public byte [] wakewordModel ; /** * Wake Word detection threshold. Integer in range [1, 1000]. * Recommended value is 500. * 1 = most permissive threshold, most detections. * 1000 = least permissive threshold, fewest detections. */ public int detectThreshold ; /** * Flag for enabling voice activity detector pre-stage. * For most application-core integrations, this should be set to false. */ public boolean useVad ; /** * Flag for enabling low-latency detection mode. * Only valid for type 'U' models. Results in ~200ms lower detection * latency, at the cost of less accurate ww end index reporting. */ public boolean lowLatency ; // Fingerprint (Media-induced wake suppression) /** * Binary containing media-derived reference fingerprints for matching and suppression. * This may be set to null to disable fingerprint-matching-based suppression. */ public byte [] fingerprintList ; // Speaker Verification /** * Speaker Verification model binary. * This may be set to null to disable speaker verification functionality. */ public byte [] speakerVerificationModel ; /** * Number of enrollment examples required to generate a Speaker Verification profile. */ public int numEnrollmentExamples ; /** * Minimum dB value for average SNR to accept wake word eample for enrollment. */ public int minEnrollmentSnr ; /** * Maximum number of Speaker Verification profiles that can be simultaneously loaded. */ public int maxLoadableProfiles ; /** * Maximum size of Profile ID used for enrollment; must be a multiple of 4. */ public int maxProfileIdSize ; } We can see above that for PRL5000, we have three distinct sections for Wake Word, Fingerprinting, and Speaker Verification.","title":"1. Engine Configuration"},{"location":"api-reference/wrappers/java/java-binding-engine-reference.html#2-engine-attributes","text":"Attributes for the configuration structure can be retrieved via a getAttributes() method, and consist of the following: /** * Attributes associated with a specified PryonLite configuration. * * @note Class not marked as static, as it causes JNI GetMethodID for the constructor to fail. */ public class Attributes { public String engineVersion ; ///< PryonLite engine version. public int maxMetadataBlobSize ; ///< Maximum size of metadata blob returned with wake word detection result. public int requiredMem ; ///< Memory in bytes required by an engine instance using this configuration. public int samplesPerFrame ; ///< Samples per frame for PushAudio // Wake Word public String wakewordApiVersion ; ///< Wake Word API version public String wakewordConfigVersion ; ///< Wake Word configuration version (wakeword model) // Fingerprint public String fingerprintApiVersion ; ///< Fingerprint API version public int fingerprintListVersion ; ///< Fingerprint List Version // Speaker Verification public String speakerVerificationApiVersion ; ///< Speaker Verification API version public String speakerVerificationConfigVersion ; ///< Speaker Verification configuration version (the model) public byte [] speakerVerificationModelId ; ///< Speaker Verification Model ID public int speakerVerificationMaxProfileSize ; ///< Speaker Verification maximum profile size public List < String > speakerVerificationKeywords ; ///< List of keywords supported by Speaker Verification Model. public List < String > speakerVerificationLocales ; ///< List of locales supported by Speaker Verification Model. } We can see again that fields are grouped by functionality into sections for wake word, fingerprinting, and speaker verification. The following are some of the attributes associated with the engine as a whole (and not tied to specific features like the wake word detector, fingerprinting, or speaker verification). public String engineVersion ; ///< PryonLite engine version. public int maxMetadataBlobSize ; ///< Maximum size of metadata blob returned with wake word detection result. public int requiredMem ; ///< Memory in bytes required by an engine instance using this configuration. public int samplesPerFrame ; ///< Samples per frame for PushAudio","title":"2. Engine Attributes"},{"location":"api-reference/wrappers/java/java-binding-engine-reference.html#3-engine-commands","text":"","title":"3. Engine Commands"},{"location":"api-reference/wrappers/java/java-binding-engine-reference.html#31-engine-object-creation","text":"A PryonLite Java object can be created with the following constructor - note that there are additional steps needed to initialize the engine so that it can process audio streams. /** * Constructor for the PryonLite engine. * * @param callbacks The callbacks to register for event notifications. */ public PryonLite5000 ( final Callbacks callbacks ) { this . callbacks = callbacks ; this . nativeMem = 0 ; } This creates the environment in which attribute query, initialization, and all further interactions with the engine may proceed.","title":"3.1 Engine Object Creation"},{"location":"api-reference/wrappers/java/java-binding-engine-reference.html#32-engine-attribute-query","text":"Attributes for a particular engine configuration can be requested by calling the following: /** * Query attributes associated with a given PryonLite configuration. * * @param config [in] Input configuration parameters. * @return Attributes if successful; NULL otherwise. */ public Attributes getAttributes ( final Config config ); Note that attributes for a particular set of configuration parameters may and should be retrieved prior to initialization, to retrieve attributes relevant to the engine while it is in use. The getAttributes() may also be called after engine initialization, as it is one of the few methods that is not marked with the synchronized keyword.","title":"3.2 Engine Attribute Query"},{"location":"api-reference/wrappers/java/java-binding-engine-reference.html#33-engine-object-initialization","text":"PryonLite engine initialization is through the following method: /** * Initializes the PryonLite instance. * * @param config [in] Input configuration parameters. * @return Zero if successful, non-zero error code otherwise. * @note Here is a list of error codes for this function and their meanings. * 0: Success, nominal operation. * -1: Config parameter is invalid. * -2: The object has already been initialized. * -3: JNI memory allocation for the instance failed. * -4: A global reference could not be created for the object. * -5: Query of class properties failed. * -6: Import of Java config parameters failed. (DEPRECATED) * -7: Native memory allocation for the instance failed. * -8: Audio buffer allocation for the instance failed. * -9: Saving of JNI memory pointer to object failed. * -10: Speaker Verification voice profile memory allocation failed. * All other values: Internal code, consult vendor. */ public synchronized int initialize ( final Config config );","title":"3.3 Engine Object Initialization"},{"location":"api-reference/wrappers/java/java-binding-engine-reference.html#34-pushing-audio-to-the-engine","text":"Audio buffers are streamed to the PryonLite engine by repeatedly calling the following function: /** * Pushes a frame of audio data to the PryonLite instance. * * @param samples [in] Audio frame from input audio stream. * @return Zero if successful, non-zero error code otherwise. * @note This function should be invoked only after initialize() has been called, * and should not be invoked after a destroy(). * @note Here is a list of error codes for this function and their meanings. * 0: Success, nominal operation. * -32: The object has not been initialized. * -33: Input sample array is null. * -34: Length of input sample array is invalid. * -35: Retrieval of the input sample array failed. * -36: The object passed into pushAudio() does not match the object initialized. * All other values: Internal code, consult vendor. * @note This method does not Log.i as part of nominal operation, both due to its invocation * frequency/regularity, and to respect the high priority context within which it is typically * invoked. If you have use cases where you are invoking pushAudio irregularly, and need to * understand the timing of such invocations, you are free to add a log statement in the calling code. */ public synchronized int pushAudio ( final short [] samples );","title":"3.4 Pushing Audio to the Engine"},{"location":"api-reference/wrappers/java/java-binding-engine-reference.html#35-engine-object-destruction","text":"Prior to deleting the PryonLite5000 Java object, any engine initialization MUST be mirrored by a call to this method. /** * Release PryonLite instance. * * @return Zero if successful, non-zero error code otherwise. * @note Here is a list of error codes for this function and their meanings. * 0: Success, nominal operation. * -31: The object has not been initialized. * All other values: Internal code, consult vendor. */ public synchronized int destroy ();","title":"3.5 Engine Object Destruction"},{"location":"api-reference/wrappers/java/java-binding-engine-reference.html#36-retrieval-of-audio-frame-size","text":"The following function is redundant, as samples per frame is returned as an attribute. However, it has been maintained for backwards compatibility. Future versions of this Java binding may deprecate this method. /** * Returns the number of samples per frame for audio pushes. * * @return Samples per frame for audio pushes, <= 0 on error. * @note This value must be used when subsequent calls to pushAudio() are made. * @note This function should be invoked only after initialize() has been called, * and should not be invoked after a destroy(). * @note Also returned as part of attribute query, which can be invoked prior to * initializing a PryonLite object. * @note Here is a list of error codes for this function and their meanings. * 0: This code is not used and reserved for future use. * -37: The object has not been initialized. * All other values: Internal code, consult vendor. */ public synchronized int getSamplesPerFrame (); We can see from the comments above that the scope of validity for invocation of the getSamplesPerFrame() method is more limited than that of getAttributes() .","title":"3.6 Retrieval of Audio Frame Size"},{"location":"api-reference/wrappers/java/java-binding-engine-reference.html#37-querying-engine-object-initialization-status","text":"The following convenience function allows the PryonLite Java client to check if a given object has been initialized or not: /** * Checks if the instance has been initialized. * * @return 1 if initialized, 0 if not initialized */ public synchronized int isInitialized ();","title":"3.7 Querying Engine Object Initialization Status"},{"location":"api-reference/wrappers/java/java-binding-engine-reference.html#4-engine-events","text":"","title":"4. Engine Events"},{"location":"api-reference/wrappers/java/java-binding-engine-reference.html#41-error-reporting","text":"Errors that are not easily returned through a command's return code are returned via the following callback: /** * Function signature of event handler for serious errors encountered during execution * of native methods where the error cannot be returned through the method's return value. * This would generally be used in a context leading up to the invocation of another event * handler in the Callbacks interface, like wakeWordDetected(). * * @param errorCode [in] Internal code, consult vendor. */ void errorEvent ( final int errorCode );","title":"4.1 Error Reporting"},{"location":"api-reference/wrappers/java/java-binding-fingerprinting-reference.html","text":"PryonLite Java Fingerprinting API Details specific to fingerprinting (fingerprint-based media-induced wake suppression) functionality are briefly described below. 1. Fingerprinting Configuration The Java client must supply arguments for the following parameters as part of fingerprinting configuration: /** * Binary containing media-derived reference fingerprints for matching and suppression. * This may be set to null to disable fingerprint-matching-based suppression. */ public byte [] fingerprintList ; 2. Fingerprinting Attributes Attributes associated with the fingerprinting component of the PryonLite engine are listed below: public String fingerprintApiVersion ; ///< Fingerprint API version public int fingerprintListVersion ; ///< Fingerprint List Version 3. Fingerprinting Commands There are no fingerprinting commands. 4. Fingerprinting Events There are no fingerprinting events exposed in the Java API.","title":"Fingerprinting API"},{"location":"api-reference/wrappers/java/java-binding-fingerprinting-reference.html#pryonlite-java-fingerprinting-api","text":"Details specific to fingerprinting (fingerprint-based media-induced wake suppression) functionality are briefly described below.","title":"PryonLite Java Fingerprinting API"},{"location":"api-reference/wrappers/java/java-binding-fingerprinting-reference.html#1-fingerprinting-configuration","text":"The Java client must supply arguments for the following parameters as part of fingerprinting configuration: /** * Binary containing media-derived reference fingerprints for matching and suppression. * This may be set to null to disable fingerprint-matching-based suppression. */ public byte [] fingerprintList ;","title":"1. Fingerprinting Configuration"},{"location":"api-reference/wrappers/java/java-binding-fingerprinting-reference.html#2-fingerprinting-attributes","text":"Attributes associated with the fingerprinting component of the PryonLite engine are listed below: public String fingerprintApiVersion ; ///< Fingerprint API version public int fingerprintListVersion ; ///< Fingerprint List Version","title":"2. Fingerprinting Attributes"},{"location":"api-reference/wrappers/java/java-binding-fingerprinting-reference.html#3-fingerprinting-commands","text":"There are no fingerprinting commands.","title":"3. Fingerprinting Commands"},{"location":"api-reference/wrappers/java/java-binding-fingerprinting-reference.html#4-fingerprinting-events","text":"There are no fingerprinting events exposed in the Java API.","title":"4. Fingerprinting Events"},{"location":"api-reference/wrappers/java/java-binding-overview.html","text":"PryonLite Java Binding Overview Description A Java binding for the PryonLite API is supplied via a JNI wrapper. This binding provides an interface that is similar but not identical to the PryonLite C language binding. A primary feature of the JNI wrapper is to implement what would normally be a key responsibility of the PryonLite C client - the dynamic memory management (allocation and alignment) for the PryonLite C object instance and various data buffers (models, profiles) associated with the PryonLite Java object. Another feature of the Java binding is to load the native C shared library components that implement the JNI wrapper and the core PryonLite engine. The last notable feature of the Java binding is to provide a means for forwarding events from the PryonLite C binding to the PryonLite Java client. This is achieved by having the PryonLite Java client register event handlers with the PryonLite object constructor. Diving Deeper See PryonLite5000.java for the PRL5000 PryonLite library Java API and the Integration Guide for more details.","title":"Overview"},{"location":"api-reference/wrappers/java/java-binding-overview.html#pryonlite-java-binding-overview","text":"","title":"PryonLite Java Binding Overview"},{"location":"api-reference/wrappers/java/java-binding-overview.html#description","text":"A Java binding for the PryonLite API is supplied via a JNI wrapper. This binding provides an interface that is similar but not identical to the PryonLite C language binding. A primary feature of the JNI wrapper is to implement what would normally be a key responsibility of the PryonLite C client - the dynamic memory management (allocation and alignment) for the PryonLite C object instance and various data buffers (models, profiles) associated with the PryonLite Java object. Another feature of the Java binding is to load the native C shared library components that implement the JNI wrapper and the core PryonLite engine. The last notable feature of the Java binding is to provide a means for forwarding events from the PryonLite C binding to the PryonLite Java client. This is achieved by having the PryonLite Java client register event handlers with the PryonLite object constructor.","title":"Description"},{"location":"api-reference/wrappers/java/java-binding-overview.html#diving-deeper","text":"See PryonLite5000.java for the PRL5000 PryonLite library Java API and the Integration Guide for more details.","title":"Diving Deeper"},{"location":"api-reference/wrappers/java/java-binding-reference.html","text":"PryonLite Java API Reference See Overview for an introduction to the Java API. Components Java Engine API Java Wake Word API Java Fingerprinting API Java Speaker Verification API","title":"Reference"},{"location":"api-reference/wrappers/java/java-binding-reference.html#pryonlite-java-api-reference","text":"See Overview for an introduction to the Java API.","title":"PryonLite Java API Reference"},{"location":"api-reference/wrappers/java/java-binding-reference.html#components","text":"Java Engine API Java Wake Word API Java Fingerprinting API Java Speaker Verification API","title":"Components"},{"location":"api-reference/wrappers/java/java-binding-wake-word-reference.html","text":"PryonLite Java Wake Word API Details specific to wake word functionality are briefly described below. 1. Wake Word Configuration The Java client must supply arguments for the following parameters as part of wake word configuration: /** * Wake Word model binary. */ public byte [] wakewordModel ; /** * Wake Word detection threshold. Integer in range [1, 1000]. * Recommended value is 500. * 1 = most permissive threshold, most detections. * 1000 = least permissive threshold, fewest detections. */ public int detectThreshold ; /** * Flag for enabling voice activity detector pre-stage. * For most application-core integrations, this should be set to false. */ public boolean useVad ; /** * Flag for enabling low-latency detection mode. * Only valid for type 'U' models. Results in ~200ms lower detection * latency, at the cost of less accurate ww end index reporting. */ public boolean lowLatency ; 2. Wake Word Attributes Attributes associated with the wake word component of the PryonLite engine are listed below: public String wakewordApiVersion ; ///< Wake Word API version public String wakewordConfigVersion ; ///< Wake Word configuration version (wakeword model) 3. Wake Word Commands 3.1 Selecting Detection Threshold The Wake Word detection threshold can be modified at run-time via the following method: /** * Sets the wake word detection threshold. * * @param threshold [in] Threshold in range [1, 1000], with 1 being least and 1000 being most permissive. * @return Zero if successful, non-zero error code otherwise. * @note This function should be invoked only after initialize() has been called, * and should not be invoked after a destroy(). * @note Here is a list of error codes for this function and their meanings. * 0: Success, nominal operation. * -38: The object has not been initialized. * All other values: Internal code, consult vendor. */ public synchronized int wakewordSetDetectionThreshold ( final int threshold ); 4. Wake Word Events 4.1 Wake Word Detection /** * Function signature of event handler for wake word detection events. * * @param wakeWord [in] The detected wake word. * @param beginSampleIndex [in] The first sample of the wake word, relative to the last sample pushed. * @param endSampleIndex [in] The last sample of the wake word, relative to the last sample pushed. * @param metadata [in] Wake Word Engine Metadata associated with wake word event. */ void wakeWordDetected ( final String wakeWord , final long beginSampleIndex , final long endSampleIndex , final byte [] metadata ); 4.2 Voice Activity Detection Historically, voice activity detection has been tightly coupled with the wake word detector; however, one could consider it a more generic engine event that may be emitted in future engine configurations where wake word detection is disabled. For the purpose of documentation, we leave this event in the Wake Word section, but reserve the right to move it to the Engine section. /** * Function signature of event handler for voice activity detection (VAD) state changes. * * @param state - new VAD state. */ void vadStateChanged ( final int state ); This event notifies the client when the PryonLite voice activity detector transitions between 'voice present' and 'voice not present' states.","title":"Wake Word API"},{"location":"api-reference/wrappers/java/java-binding-wake-word-reference.html#pryonlite-java-wake-word-api","text":"Details specific to wake word functionality are briefly described below.","title":"PryonLite Java Wake Word API"},{"location":"api-reference/wrappers/java/java-binding-wake-word-reference.html#1-wake-word-configuration","text":"The Java client must supply arguments for the following parameters as part of wake word configuration: /** * Wake Word model binary. */ public byte [] wakewordModel ; /** * Wake Word detection threshold. Integer in range [1, 1000]. * Recommended value is 500. * 1 = most permissive threshold, most detections. * 1000 = least permissive threshold, fewest detections. */ public int detectThreshold ; /** * Flag for enabling voice activity detector pre-stage. * For most application-core integrations, this should be set to false. */ public boolean useVad ; /** * Flag for enabling low-latency detection mode. * Only valid for type 'U' models. Results in ~200ms lower detection * latency, at the cost of less accurate ww end index reporting. */ public boolean lowLatency ;","title":"1. Wake Word Configuration"},{"location":"api-reference/wrappers/java/java-binding-wake-word-reference.html#2-wake-word-attributes","text":"Attributes associated with the wake word component of the PryonLite engine are listed below: public String wakewordApiVersion ; ///< Wake Word API version public String wakewordConfigVersion ; ///< Wake Word configuration version (wakeword model)","title":"2. Wake Word Attributes"},{"location":"api-reference/wrappers/java/java-binding-wake-word-reference.html#3-wake-word-commands","text":"","title":"3. Wake Word Commands"},{"location":"api-reference/wrappers/java/java-binding-wake-word-reference.html#31-selecting-detection-threshold","text":"The Wake Word detection threshold can be modified at run-time via the following method: /** * Sets the wake word detection threshold. * * @param threshold [in] Threshold in range [1, 1000], with 1 being least and 1000 being most permissive. * @return Zero if successful, non-zero error code otherwise. * @note This function should be invoked only after initialize() has been called, * and should not be invoked after a destroy(). * @note Here is a list of error codes for this function and their meanings. * 0: Success, nominal operation. * -38: The object has not been initialized. * All other values: Internal code, consult vendor. */ public synchronized int wakewordSetDetectionThreshold ( final int threshold );","title":"3.1 Selecting Detection Threshold"},{"location":"api-reference/wrappers/java/java-binding-wake-word-reference.html#4-wake-word-events","text":"","title":"4. Wake Word Events"},{"location":"api-reference/wrappers/java/java-binding-wake-word-reference.html#41-wake-word-detection","text":"/** * Function signature of event handler for wake word detection events. * * @param wakeWord [in] The detected wake word. * @param beginSampleIndex [in] The first sample of the wake word, relative to the last sample pushed. * @param endSampleIndex [in] The last sample of the wake word, relative to the last sample pushed. * @param metadata [in] Wake Word Engine Metadata associated with wake word event. */ void wakeWordDetected ( final String wakeWord , final long beginSampleIndex , final long endSampleIndex , final byte [] metadata );","title":"4.1 Wake Word Detection"},{"location":"api-reference/wrappers/java/java-binding-wake-word-reference.html#42-voice-activity-detection","text":"Historically, voice activity detection has been tightly coupled with the wake word detector; however, one could consider it a more generic engine event that may be emitted in future engine configurations where wake word detection is disabled. For the purpose of documentation, we leave this event in the Wake Word section, but reserve the right to move it to the Engine section. /** * Function signature of event handler for voice activity detection (VAD) state changes. * * @param state - new VAD state. */ void vadStateChanged ( final int state ); This event notifies the client when the PryonLite voice activity detector transitions between 'voice present' and 'voice not present' states.","title":"4.2 Voice Activity Detection"},{"location":"api-reference/wrappers/swift/index.html","text":"Swift Wrapper The Pryonlite Swift wrapper provides a Swift interface around the PryonLite library's C APIs. Files required for integration The following files need to be added to the target iOS App's workspace. BridgingHeader.h Imports the pryonlite C header files and provides a bridging interface needed by the Swift wrapper. PryonWrapper.swift Provides the Swift interface for iOS Apps to use the pryonlite library. PryonWrapper APIs: /// Initializer method of the PronLite wrapper class. /// Configures and sets up the PryonLite Wake Word detector. /// /// - Parameters: /// - detectThreshold: wake word detection threshold in the range [1-1000], /// 1 = lowest threshold, most detections, 1000 = highest threshold, fewer restrictive. /// - modelFile: Wake Word model file. /// - modelDir: Directory of the wake word model file, optional, default is nil /// - internalConfigReserved: Reserved config pointer, optional, default is nil. /// - useVad: Use Pryonlite's Voice Activity Detector, optional, default is 0. /// - lowLatency: enable Low latency mode. Only valid for type 'U' models. /// Results in ~200ms lower detection latency, at the cost of less accurate ww end indexes reported in /// the detection callback. /// init ( detectThreshold : Int32 , modelFile : NSString , modelDir : String ? = nil , internalConfigReserved : UnsafeMutableRawPointer ? = nil , useVad : Int32 ? = 0 , lowLatency : Int32 ? = 0 ) /// Push audio samples for PryonLite's processing. /// /// - Parameters: /// - samples: Pointer to Int16 array of the input audio samples. /// - sampleCount: Number of samples in the samples array. This should be equal to 160. /// - Returns: PryonLiteError code, nominally PRYON_LITE_ERROR_OK. /// Refer to pryon_lite_error.h for the list of error values. /// pushSamples ( samples : UnsafeMutablePointer < Int16 >, sampleCount : Int32 ) -> PryonLiteError /// Enable or disable one or all wakeword keywords /// - Parameters: /// - keyword: Pointer to C CHAR array type for keyword to enable or disable; ex. \"ALEXA\". /// Pass keyword \"ALL\" to enable or disable all keywords. /// - enable: (Int32): Integer 0: disable keyword, 1 enable keyword. /// wakewordEnableKeyword ( keyword : UnsafeMutablePointer < CChar >, enable : Int32 ) -> PryonLiteStatus /// Destory the PryonLite Wrapper Instance. /// /// - Returns: PryonLiteError code, nominally PRYON_LITE_ERROR_OK. /// Refer to pryon_lite_error.h for the list of error values. /// destroy () -> PryonLiteError Usage The iOS App can invoke the pryonlite library in it's various stages as follows: Initialization stage to configure and initialize the pryonlite library. var pryonWrapper : PryonWrapper let detectionThreshold = 500 let useVad = 0 let lowLatency = 0 modelFilePath = \"D.en-US.alexa\" modelDir = \"models/common\" pryonWrapper = PryonWrapper ( detectThreshold : detectionThreshold , modelFile : modelFilePath , modelDir : modelDir , internalConfigReserved : nil , useVad : useVad , lowLatency : lowLatency ) Audio processing loop to push the audio samples to processing. let pryonliteStatusCode = pryonWrapper . pushSamples ( samples : & samplesArray , sampleCount : Int32 ( samplesArray . count )) if ( pryonliteStatusCode . publicCode != Int32 ( PRYON_LITE_ERROR_OK . rawValue )) { print ( \"\"\" Error pushing Samples with index \\( readSampleIndex ) Public Error Code: \\( pryonliteStatusCode . publicCode ) Internal Error Code: \\( pryonliteStatusCode . internalCode ) \"\"\" , to :& LogFileOutput ) } Teardown stage of the application to destory the pryonlite instance. // Flush the decoder let pryonliteStatusCode = pryonWrapper . destroy () if ( pryonliteStatusCode . publicCode != Int32 ( PRYON_LITE_ERROR_OK . rawValue )) { print ( \"\"\" Error flushing decoder. Public Error Code: \\( pryonliteStatusCode . publicCode ) Internal Error Code: \\( pryonliteStatusCode . internalCode ) \"\"\" , to :& LogFileOutput ) }","title":"Swift"},{"location":"api-reference/wrappers/swift/index.html#swift-wrapper","text":"The Pryonlite Swift wrapper provides a Swift interface around the PryonLite library's C APIs.","title":"Swift Wrapper"},{"location":"api-reference/wrappers/swift/index.html#files-required-for-integration","text":"The following files need to be added to the target iOS App's workspace.","title":"Files required for integration"},{"location":"api-reference/wrappers/swift/index.html#usage","text":"The iOS App can invoke the pryonlite library in it's various stages as follows: Initialization stage to configure and initialize the pryonlite library. var pryonWrapper : PryonWrapper let detectionThreshold = 500 let useVad = 0 let lowLatency = 0 modelFilePath = \"D.en-US.alexa\" modelDir = \"models/common\" pryonWrapper = PryonWrapper ( detectThreshold : detectionThreshold , modelFile : modelFilePath , modelDir : modelDir , internalConfigReserved : nil , useVad : useVad , lowLatency : lowLatency ) Audio processing loop to push the audio samples to processing. let pryonliteStatusCode = pryonWrapper . pushSamples ( samples : & samplesArray , sampleCount : Int32 ( samplesArray . count )) if ( pryonliteStatusCode . publicCode != Int32 ( PRYON_LITE_ERROR_OK . rawValue )) { print ( \"\"\" Error pushing Samples with index \\( readSampleIndex ) Public Error Code: \\( pryonliteStatusCode . publicCode ) Internal Error Code: \\( pryonliteStatusCode . internalCode ) \"\"\" , to :& LogFileOutput ) } Teardown stage of the application to destory the pryonlite instance. // Flush the decoder let pryonliteStatusCode = pryonWrapper . destroy () if ( pryonliteStatusCode . publicCode != Int32 ( PRYON_LITE_ERROR_OK . rawValue )) { print ( \"\"\" Error flushing decoder. Public Error Code: \\( pryonliteStatusCode . publicCode ) Internal Error Code: \\( pryonliteStatusCode . internalCode ) \"\"\" , to :& LogFileOutput ) }","title":"Usage"},{"location":"avs-device-sdk/index.html","text":"AVS Device SDK Integration For applications using the AVS Device SDK , an adapter (\"Amazon Lite\") is available that integrates Amazon's Wake Word Engine into the Alexa client. Note In order to access Amazon's Wake Word, you will need a managed account with Alexa Voice Services (AVS). This adapter, \"Amazon Lite\", is available on the AVS Developer Console. Talk to your AVS Solution Architect to get access. The adapter comes in the form of an archive containing patches and scripts that are applied to the AVS Device SDK. These scripts add and modify files in the SDK to enable Amazon's Wake Word. Once the adapter is installed, the Sample App can be run for verification. Currently the adapter integrates the following features in the AVS Client SDK: Wake Word (+ VAD and Preroll) Wake Word Diagnostic Information (WWDI) Other wake word features such as DAVS and Fingerprinting will soon be added to the adapter. Check with your AVS Solution Architect for availability. Once you have access to the adapter, follow the Integration Guide to install it into your AVS Device SDK Sample App.","title":"Overview"},{"location":"avs-device-sdk/index.html#avs-device-sdk-integration","text":"For applications using the AVS Device SDK , an adapter (\"Amazon Lite\") is available that integrates Amazon's Wake Word Engine into the Alexa client. Note In order to access Amazon's Wake Word, you will need a managed account with Alexa Voice Services (AVS). This adapter, \"Amazon Lite\", is available on the AVS Developer Console. Talk to your AVS Solution Architect to get access. The adapter comes in the form of an archive containing patches and scripts that are applied to the AVS Device SDK. These scripts add and modify files in the SDK to enable Amazon's Wake Word. Once the adapter is installed, the Sample App can be run for verification. Currently the adapter integrates the following features in the AVS Client SDK: Wake Word (+ VAD and Preroll) Wake Word Diagnostic Information (WWDI) Other wake word features such as DAVS and Fingerprinting will soon be added to the adapter. Check with your AVS Solution Architect for availability. Once you have access to the adapter, follow the Integration Guide to install it into your AVS Device SDK Sample App.","title":"AVS Device SDK Integration"},{"location":"avs-device-sdk/integration-guide.html","text":"AVS Device SDK Integration Guide Dependencies Note These instructions are for AmazonLite version 2.5.0. If using any version other than 2.5.0, please refer to to the README.md packaged in the Amazon Wake Word Engine Wake Word Adapter archive for up-to-date instructions. AVS Device SDK 1.26 or higher, set up as per the official documentation Amazon Wake Word PRL2000 2.16.x or newer AmazonLite Wake Word Engine Adapter (\"AmazonLite\") 2.5.0 for AVS Device SDK. Python 2.7+ Compatibility Chart Certain versions of the AmazonLite Wake Word Engine Adapters are only compatible with certain versions of the AVS Device SDK. Here is the list of supported adapter versions and which wakeword engine and AVS Device SDK versions they are compatible with. AmazonLite Wake Word Engine Adapter Version Wakeword Engine Versions AVS Device SDK Versions 2.5.0 2.16.x + 1.26+ 2.4.0 2.11.x - 2.15.x 1.26+ 2.3.1 2.2.x - 2.15.x 1.23 - 1.25 Integration Steps To build the AVS Device SDK SampleApp with the Amazon Wake Word enabled, follow the steps below: Set up the AVS Device SDK and the SampleApp in the AVS Device SDK as per the official documentation and its dependencies. Verify that this works before applying the Amazon Wake Word Adapter patches. Unzip the Amazon Wake Word package containing the PRL2000 library and models. Unzip the Amazon Wake Word Adapter for the AVS Device SDK. Up to date instructions for integration can also be found in the README.md in the Adapter. Run CMAKE on the AVS C++ SDK passing SampleApp and Amazon Wake Word Engine arguments . See the AVS Device SDK official documentation for the SampleApp-specific arguments ( Ubuntu and MacOS ). MacOS example: cmake /Users/username/my_project/source/avs-device-sdk \\ -DGSTREAMER_MEDIA_PLAYER = ON \\ -DCURL_LIBRARY = /usr/local/opt/curl-openssl/lib/libcurl.dylib \\ -DCURL_INCLUDE_DIR = /usr/local/opt/curl-openssl/include \\ -DPORTAUDIO = ON \\ -DPORTAUDIO_LIB_PATH = /Users/username/my_project/third-party/portaudio/lib/.libs/libportaudio.a \\ -DPORTAUDIO_INCLUDE_DIR = /Users/username/my_project/third-party/portaudio/include \\ -DCMAKE_BUILD_TYPE = DEBUG \\ -DAMAZONLITE_KEY_WORD_DETECTOR = ON \\ -DAMAZONLITE_KEY_WORD_DETECTOR_LIB_PATH = /amazonlite_package_dir/target/PRL2000/libpryon_lite-PRL2000.a \\ -DAMAZONLITE_KEY_WORD_DETECTOR_INCLUDE_DIR = /amazonlite_package_dir/target \\ -DEXTENSION_PATH = <existing-extension-paths> ; <path-to-source-dir/avs-cpp-sdk/KWD> Configure AlexaClientSDKConfig.json to select the desired wake word if using SampleApp. The configuration is explained in detail in the next section for specifics. Run make SampleApp in the cmake build directory as if building the SampleApp. See the applicable Quick Start guide in the AVS Device SDK documentation . SampleApp Configuration Model selection As per the information in wake word models , a model must be selected for the SampleApp to detect wake words. There are two methods of loading the model: loading the model from storage at run-time (most common), and embedding the model in the application by statically linking the model at compile time. This is done during CMake call time. See this section for the CMake arguments. Note that both can be done: If the binaries have an embedded model, the Amazon Lite wake word adapter will default to the embedded model if an external file is not passed in and if an external file is passed in will load the external file. If such embedded model is not present, passing the location of an external file containing the model becomes mandatory. On the SampleApp, this file location should be added to the sampleApp.PryonLiteModelPath field in the JSON configuration file. See an example below: { (...) \"sampleApp\": { \"PryonLiteModelPath\":\"/amazonlite/models/D.en-US.alexa.bin\" } } Fingerprint List Selection A fingerprint list is a file containing a list of acoustic characteristics identifying a wake word in media submitted to the wake word team, otherwise known as \" fingerprints \". The wakeword engine matches the audio stream input to the engine to these \"fingerprints\" to determine if the wake word detected should be suppressed or sent to the cloud. The location of the fingerprint list file must be passed to the adapter. On the SampleApp, this file location must be added to sampleApp.PryonLiteFingerprintListPath field on the JSON configuration file for this feature. See an example below. { (...) \"sampleApp\": { \"PryonLiteFingerprintListPath\":\"/amazonlite/fingerprints/fingerprint_test_list\" } } Watermark Configuration Selection A watermark configuration is a file containing watermarks (similar to visual watermarks) the audio stream input to the wakeword engine is matched to in order to determine whether or not a wake word should be suppressed or sent to the cloud. The location of the watermark configuration file must be passed to the adapter. On the SampleApp, this file location must be added to sampleApp.PryonLiteWatermarkConfigPath field on the JSON configuration file for this feature. See an example below. { (...) \"sampleApp\": { \"PryonLiteWatermarkConfigPath\":\"/amazonlite/watermarks/watermark_cfg.bin\" } } Amazon Wake Word Engine Build Arguments -DAMAZONLITE_KEY_WORD_DETECTOR=ON -DAMAZONLITE_KEY_WORD_DETECTOR_LIB_PATH=<path-to-lib> -DAMAZONLITE_KEY_WORD_DETECTOR_INCLUDE_DIR=<path-to-directory> -DAMAZONLITE_KEY_WORD_DETECTOR_EMBEDDED_MODEL_CPP_PATH=<path-to-model-cpp-file> -DEXTENSION_PATH=<existing-extension-paths>;<path-to-source-dir/avs-cpp-sdk/KWD> Where: AMAZONLITE_KEY_WORD_DETECTOR=ON - Enables the Amazon Wake Word Engine adapter. AMAZONLITE_KEY_WORD_DETECTOR_LIB_PATH - The location of the Amazon Wake Word Engine library. AMAZONLITE_KEY_WORD_DETECTOR_INCLUDE_DIR - The directory where the Amazon Wake Word Engine header files are located. AMAZONLITE_KEY_WORD_DETECTOR_EMBEDDED_MODEL_CPP_PATH - (optional) The location of a C++ file containing a wake word model. If you pass this argument, the model in the C++ file will be embedded into the binaries generated during compilation and therefore will be always available as a fallback option. You can still dynamically load another model. Example: -DAMAZONLITE_KEY_WORD_DETECTOR_EMBEDDED_MODEL_CPP_PATH=/amazonlite/models/D.en-US.alexa.cpp EXTENSION_PATH - A semi-colon separated list of paths to search for CMake projects - the adapter path needs to be included here. Running the AmazonLite Unit Tests A unit-test is packaged with the AmazonLite Wakeword Engine Adapter in avs-cpp-sdk/KWD/AmazonLite/test/PryonLiteKeywordDetectorTest.cpp. In order to run the unit-test, the following steps must be followed: Set up the AVS Device SDK and the SampleApp in the AVS Device SDK as per the official documentation and its dependencies. Verify that this works before applying the Amazon Wake Word Adapter patches. Unzip the Amazon Wake Word package containing the PRL2000 library and models. Unzip the Amazon Wake Word Adapter for the AVS Device SDK. Up to date instructions for integration can also be found in the README.md in the Adapter. Run CMAKE on the AVS C++ SDK passing SampleApp and Amazon Wake Word Engine arguments and a few additional unit-test specific arguments noted below. See the AVS Device SDK official documentation for the SampleApp-specific arguments ( Ubuntu and MacOS ). AMAZONLITE_KEY_WORD_DETECTOR_EMBEDDED_MODEL_CPP_PATH MUST BE provided - this should be the location of a C++ file containing a wake word model. (optional) To enable fingerprint testing, AMAZONLITE_KEY_WORD_DETECTOR_FINGERPRINT_TEST_DIR must be added to the CMAKE arguments and the sample/fingerprint folder moved to the testing file inputs folder, typically found in the AVS Device SDK source at avs-device-sdk/shared/KWD/acsdkKWDImplementations/inputs/. (optional) To enable watermark testing, AMAZONLITE_KEY_WORD_DETECTOR_WATERMARK_TEST_DIR must be added to the CMAKE Arguments and the sample/watermark folder moved to the testing inputs folder, typically found in the AVS Device SDK source at avs-device-sdk/shared/KWD/acsdkKWDImplementations/inputs/. MacOS example: cmake /Users/username/my_project/source/avs-device-sdk \\ -DGSTREAMER_MEDIA_PLAYER = ON \\ -DCURL_LIBRARY = /usr/local/opt/curl-openssl/lib/libcurl.dylib \\ -DCURL_INCLUDE_DIR = /usr/local/opt/curl-openssl/include \\ -DPORTAUDIO = ON \\ -DPORTAUDIO_LIB_PATH = /Users/username/my_project/third-party/portaudio/lib/.libs/libportaudio.a \\ -DPORTAUDIO_INCLUDE_DIR = /Users/username/my_project/third-party/portaudio/include \\ -DCMAKE_BUILD_TYPE = DEBUG \\ -DAMAZONLITE_KEY_WORD_DETECTOR = ON \\ -DAMAZONLITE_KEY_WORD_DETECTOR_LIB_PATH = /amazonlite_package_dir/target/PRL2000/libpryon_lite-PRL2000.a \\ -DAMAZONLITE_KEY_WORD_DETECTOR_INCLUDE_DIR = /amazonlite_package_dir/target \\ -DEXTENSION_PATH = <existing-extension-paths> ; <path-to-source-dir/avs-cpp-sdk/KWD> \\ -DAMAZONLITE_KEY_WORD_DETECTOR_EMBEDDED_MODEL_CPP_PATH = /amazonlite_package_dir/models/doppler.en-US.cpp \\ -DAMAZONLITE_KEY_WORD_DETECTOR_WATERMARK_TEST_DIR -DAMAZONLITE_KEY_WORD_DETECTOR_FINGERPRINT_TEST_DIR Run make PryonLiteKeywordDetectorTest in the cmake build directory. Run ./EXTENSION/KWD/AmazonLite/test/PryonLiteKeywordDetectorTest ../source/avs-device-sdk/shared/KWD/acsdkKWDImplementations/inputs/ in the cmake build directory to run the unit-tests.","title":"Integration Guide"},{"location":"avs-device-sdk/integration-guide.html#avs-device-sdk-integration-guide","text":"","title":"AVS Device SDK Integration Guide"},{"location":"avs-device-sdk/integration-guide.html#dependencies","text":"Note These instructions are for AmazonLite version 2.5.0. If using any version other than 2.5.0, please refer to to the README.md packaged in the Amazon Wake Word Engine Wake Word Adapter archive for up-to-date instructions. AVS Device SDK 1.26 or higher, set up as per the official documentation Amazon Wake Word PRL2000 2.16.x or newer AmazonLite Wake Word Engine Adapter (\"AmazonLite\") 2.5.0 for AVS Device SDK. Python 2.7+","title":"Dependencies"},{"location":"avs-device-sdk/integration-guide.html#compatibility-chart","text":"Certain versions of the AmazonLite Wake Word Engine Adapters are only compatible with certain versions of the AVS Device SDK. Here is the list of supported adapter versions and which wakeword engine and AVS Device SDK versions they are compatible with. AmazonLite Wake Word Engine Adapter Version Wakeword Engine Versions AVS Device SDK Versions 2.5.0 2.16.x + 1.26+ 2.4.0 2.11.x - 2.15.x 1.26+ 2.3.1 2.2.x - 2.15.x 1.23 - 1.25","title":"Compatibility Chart"},{"location":"avs-device-sdk/integration-guide.html#integration-steps","text":"To build the AVS Device SDK SampleApp with the Amazon Wake Word enabled, follow the steps below: Set up the AVS Device SDK and the SampleApp in the AVS Device SDK as per the official documentation and its dependencies. Verify that this works before applying the Amazon Wake Word Adapter patches. Unzip the Amazon Wake Word package containing the PRL2000 library and models. Unzip the Amazon Wake Word Adapter for the AVS Device SDK. Up to date instructions for integration can also be found in the README.md in the Adapter. Run CMAKE on the AVS C++ SDK passing SampleApp and Amazon Wake Word Engine arguments . See the AVS Device SDK official documentation for the SampleApp-specific arguments ( Ubuntu and MacOS ). MacOS example: cmake /Users/username/my_project/source/avs-device-sdk \\ -DGSTREAMER_MEDIA_PLAYER = ON \\ -DCURL_LIBRARY = /usr/local/opt/curl-openssl/lib/libcurl.dylib \\ -DCURL_INCLUDE_DIR = /usr/local/opt/curl-openssl/include \\ -DPORTAUDIO = ON \\ -DPORTAUDIO_LIB_PATH = /Users/username/my_project/third-party/portaudio/lib/.libs/libportaudio.a \\ -DPORTAUDIO_INCLUDE_DIR = /Users/username/my_project/third-party/portaudio/include \\ -DCMAKE_BUILD_TYPE = DEBUG \\ -DAMAZONLITE_KEY_WORD_DETECTOR = ON \\ -DAMAZONLITE_KEY_WORD_DETECTOR_LIB_PATH = /amazonlite_package_dir/target/PRL2000/libpryon_lite-PRL2000.a \\ -DAMAZONLITE_KEY_WORD_DETECTOR_INCLUDE_DIR = /amazonlite_package_dir/target \\ -DEXTENSION_PATH = <existing-extension-paths> ; <path-to-source-dir/avs-cpp-sdk/KWD> Configure AlexaClientSDKConfig.json to select the desired wake word if using SampleApp. The configuration is explained in detail in the next section for specifics. Run make SampleApp in the cmake build directory as if building the SampleApp. See the applicable Quick Start guide in the AVS Device SDK documentation .","title":"Integration Steps"},{"location":"features/davs/davs-filters.html","text":"DAVS: Filter maps in wake word engine package The <target>/davs directory in the wake word engine package contains the DAVS filters to be used for each artifact type. Eg. <target>/davs/lowpower-wakeword contains DAVS filters for wake word models. Wake word artifacts The wake word engine package might include multiple DAVS filter map files in <target>/davs/lowpower-wakeword . The JSON content in each file serves as a reference DAVS filter for fetching the appropriate artifact. The names of these filter map JSON files indicate the purpose for using the respective filter. Wake word artifacts for a single wake word Below is a sample of the DAVS filter map for requesting an artifact for a single wake word \"alexa\" in en-US locale: { \"artifactType\" : \"lowpower-wakeword\" , \"artifactKey\" : \"alexa\" , \"locale\" : \"en-US\" , \"encodedFilters\" : \"eyJlbmdpbmVDb21wYXRpYmlsaXR5SWRMaXN0IjogWyI...\" , \"urlEndpoint\" : \"<DAVS_ENDPOINT eg. 9876...>\" } In the above, \"artifactType\" indicates the type of this DAVS artifact (\"lowpower-wakeword\"). \"artifactKey\" indicates the wake word to fetch the artifact for. \"locale\" is used to filter down to a wake word model suitable to the requested locale. \"encodedFilters\" is a base64 encoded string of attributes used internally to make sure the right artifact is returned for a specific device. It is encoded to reduce the complexity and potential confusion. \"urlEndpoint\" is the DAVS URL Endpoint assigned to your device. Please reach out to your AVS Solutions Architect for the value to use here. Note The DAVS filter map JSON files in the wake word engine package do not contain the \"urlEndpoint\" key value pair. You would need to add the \"urlEndpoint\": \"<DAVS_ENDPOINT>\" to these JSONs. The filter map JSON filenames for single wake word artifacts have the following naming convention: <artifactKey>.<locale>.<useCase>.<useGroup>.davs.json where \"userGroup\" is nominally \"production\". It is a value assigned to you for vending artifacts to your group of devices. \"useCase\" is a string between the <locale> and <userGroup> describing the device's use case for the wake word artifact. Wake word artifacts for multiple wake words Below is a sample of the DAVS filter map for requesting an artifact for multiple wake words \"alexa\" + \"hey_custom\" in en-US locale: { \"artifactType\" : \"lowpower-wakeword\" , \"artifactKey\" : \"multiwakeword\" , \"locale\" : \"en-US\" , \"encodedFilters\" : \"eyJlbmdpbmVDb21wYXRpYmlsaXR5SWRMaXN0IjogWyI...\" , \"urlEndpoint\" : \"<DAVS_ENDPOINT eg. 9876...>\" } Note that in this JSON, the artifactKey is \"multiwakeword\". The filter map JSON filenames for multi-wakeword artifacts have the following naming convention: multiwakeword.<locale>.<useCase>.<useGroup>.davs.json Fingerprinting artifacts The wake word engine package might include multiple DAVS filter map files in <target>/davs/fingerprint . Similar to wake word artifact filter Map files, the JSON content in each fingerprint DAVS filter map file serves as a reference DAVS filter for fetching the appropriate artifact. The names of these filter map JSON files indicate the purpose for using the respective filter. Below is a sample of the DAVS filter map for requesting a fingerprint artifact for the \"alexa\" wake word in en-US locale: { \"artifactType\" : \"fingerprint\" , \"artifactKey\" : \"alexa\" , \"locale\" : \"en-US\" , \"encodedFilters\" : \"eyJlbmdpbmVDb21wYXRpYmlsaXR5SWRMaXN0IjogWyI...\" , \"urlEndpoint\" : \"<DAVS_ENDPOINT eg. 9876...>\" } In the above, \"artifactType\" indicates the type of this DAVS artifact (\"fingerprint\"). \"artifactKey\" indicates the wake word to fetch the fingerprint artifact for. \"locale\" is used to filter down to a fingerprint artifact suitable to the requested locale. \"encodedFilters\" is a base64 encoded string of attributes used internally to make sure the right artifact is returned for a specific device. It is encoded to reduce the complexity and potential confusion. \"urlEndpoint\" is the DAVS URL Endpoint assigned to your device. Please reach out to your AVS Solutions Architect for the value to use here. Note The DAVS filter map JSON files in the wake word engine package do not contain the \"urlEndpoint\" key value pair. You would need to add the \"urlEndpoint\": \"<DAVS_ENDPOINT>\" to these JSONs. The filter map JSON filenames for fingerprint artifacts have the following naming convention: <artifactKey>.<locale>.davs.json Watermarking artifacts The wake word engine package might include multiple DAVS filter map files in <target>/davs/lowpower-watermark . Similar to wake word artifact filter Map files, the JSON content in each watermark DAVS filter map file serves as a reference DAVS filter for fetching the appropriate artifact. The names of these filter map JSON files indicate the purpose for using the respective filter. Below is a sample of the DAVS filter map for requesting a fingerprint artifact for the \"alexa\" wake word in en-US locale: { \"artifactType\" : \"lowpower-watermark\" , \"artifactKey\" : \"watermark\" , \"encodedFilters\" : \"eyJlbmdpbmVDb21wYXRpYmlsaXR5SWRMaXN0IjogWyI...\" , \"urlEndpoint\" : \"<DAVS_ENDPOINT eg. 9876...>\" } In the above, \"artifactType\" indicates the type of this DAVS artifact (\"lowpower-watermark\"). \"artifactKey\" is assigned the value of \"watermark\". \"encodedFilters\" is a base64 encoded string of attributes used internally to make sure the right artifact is returned for a specific device. It is encoded to reduce the complexity and potential confusion. \"urlEndpoint\" is the DAVS URL Endpoint assigned to your device. Please reach out to your AVS Solutions Architect for the value to use here. Note The DAVS filter map JSON files in the wake word engine package do not contain the \"urlEndpoint\" key value pair. You would need to add the \"urlEndpoint\": \"<DAVS_ENDPOINT>\" to these JSONs. The filter map JSON filenames for watermark artifacts have the following naming convention: <artifactKey>.<wakewords>.davs.json","title":"DAVS filters in wakeword engine package"},{"location":"features/davs/davs-filters.html#davs-filter-maps-in-wake-word-engine-package","text":"The <target>/davs directory in the wake word engine package contains the DAVS filters to be used for each artifact type. Eg. <target>/davs/lowpower-wakeword contains DAVS filters for wake word models.","title":"DAVS: Filter maps in wake word engine package"},{"location":"features/davs/davs-filters.html#wake-word-artifacts","text":"The wake word engine package might include multiple DAVS filter map files in <target>/davs/lowpower-wakeword . The JSON content in each file serves as a reference DAVS filter for fetching the appropriate artifact. The names of these filter map JSON files indicate the purpose for using the respective filter.","title":"Wake word artifacts"},{"location":"features/davs/davs-filters.html#fingerprinting-artifacts","text":"The wake word engine package might include multiple DAVS filter map files in <target>/davs/fingerprint . Similar to wake word artifact filter Map files, the JSON content in each fingerprint DAVS filter map file serves as a reference DAVS filter for fetching the appropriate artifact. The names of these filter map JSON files indicate the purpose for using the respective filter. Below is a sample of the DAVS filter map for requesting a fingerprint artifact for the \"alexa\" wake word in en-US locale: { \"artifactType\" : \"fingerprint\" , \"artifactKey\" : \"alexa\" , \"locale\" : \"en-US\" , \"encodedFilters\" : \"eyJlbmdpbmVDb21wYXRpYmlsaXR5SWRMaXN0IjogWyI...\" , \"urlEndpoint\" : \"<DAVS_ENDPOINT eg. 9876...>\" } In the above, \"artifactType\" indicates the type of this DAVS artifact (\"fingerprint\"). \"artifactKey\" indicates the wake word to fetch the fingerprint artifact for. \"locale\" is used to filter down to a fingerprint artifact suitable to the requested locale. \"encodedFilters\" is a base64 encoded string of attributes used internally to make sure the right artifact is returned for a specific device. It is encoded to reduce the complexity and potential confusion. \"urlEndpoint\" is the DAVS URL Endpoint assigned to your device. Please reach out to your AVS Solutions Architect for the value to use here. Note The DAVS filter map JSON files in the wake word engine package do not contain the \"urlEndpoint\" key value pair. You would need to add the \"urlEndpoint\": \"<DAVS_ENDPOINT>\" to these JSONs. The filter map JSON filenames for fingerprint artifacts have the following naming convention: <artifactKey>.<locale>.davs.json","title":"Fingerprinting artifacts"},{"location":"features/davs/davs-filters.html#watermarking-artifacts","text":"The wake word engine package might include multiple DAVS filter map files in <target>/davs/lowpower-watermark . Similar to wake word artifact filter Map files, the JSON content in each watermark DAVS filter map file serves as a reference DAVS filter for fetching the appropriate artifact. The names of these filter map JSON files indicate the purpose for using the respective filter. Below is a sample of the DAVS filter map for requesting a fingerprint artifact for the \"alexa\" wake word in en-US locale: { \"artifactType\" : \"lowpower-watermark\" , \"artifactKey\" : \"watermark\" , \"encodedFilters\" : \"eyJlbmdpbmVDb21wYXRpYmlsaXR5SWRMaXN0IjogWyI...\" , \"urlEndpoint\" : \"<DAVS_ENDPOINT eg. 9876...>\" } In the above, \"artifactType\" indicates the type of this DAVS artifact (\"lowpower-watermark\"). \"artifactKey\" is assigned the value of \"watermark\". \"encodedFilters\" is a base64 encoded string of attributes used internally to make sure the right artifact is returned for a specific device. It is encoded to reduce the complexity and potential confusion. \"urlEndpoint\" is the DAVS URL Endpoint assigned to your device. Please reach out to your AVS Solutions Architect for the value to use here. Note The DAVS filter map JSON files in the wake word engine package do not contain the \"urlEndpoint\" key value pair. You would need to add the \"urlEndpoint\": \"<DAVS_ENDPOINT>\" to these JSONs. The filter map JSON filenames for watermark artifacts have the following naming convention: <artifactKey>.<wakewords>.davs.json","title":"Watermarking artifacts"},{"location":"features/davs/integration-guide.html","text":"DAVS: Integration Guide Where to get the DAVS client Contact your AVS Solutions Architect to get access to DAVS. Authentication DAVS uses Login with Amazon (LWA) as user authentication system. For details, see LWA Overview and Steps to retrieve an Access Token and Refresh Token . Sample LWA implementation is provided in the DAVS client code. Dependencies The DAVS Client application requires libcurl and cJSON. To install the libcurl, simply go the libcurl homepage and follow the installation instructions. cJSON is built in as source within the DAVS library, thus no need to install separately. Steps to run DAVS Client 1. Build DAVS Client From the DAVS package, run the following command to build reference DAVS client: cd referenceClient cmake CMakeLists.txt -DINSTALL_LIBDIR = <PATH_TO_LIBRARY_DIRECTORY> -DINSTALL_INCLUDEDIR = <PATH_TO_HEADER_DIRECTORY> -DINSTALL_RUNTIMEDIR = <PATH_TO_RUNTIME_DIRECTORY> -DCMAKE_INSTALL_PREFIX = <PREFIX_OF_INSTALLATION_PATH> make make install The DAVS shared library should be built under <PREFIX_OF_INSTALLATION_PATH>/<PATH_TO_LIBRARY_DIRECTORY> , and headers can be found at <PREFIX_OF_INSTALLATION_PATH>/<PATH_TO_HEADER_DIRECTORY> . The executable DavsClient should be built under <PREFIX_OF_INSTALLATION_PATH>/<PATH_TO_RUNTIME_DIRECTORY> . 2. Input to DAVS Client 2.1. LWA Auth Config JSON This is a JSON file that holds LWA configurations. Sample content { \"refresh_token\" : \"<REPLACE_WITH_YOUR_LWA_REFRESH_TOKEN>\" , \"LWA_endpoint\" : \"<REPLACE_WITH_YOUR_LWA_ENDPOINT>\" } 2.2. Filter Map JSON This is the JSON file that describes the artifact request. You can find this JSON file from the artifact vendor. For example, you can get artifact filter map for wake word model along with wake word engine package release. For more details, please see DAVS filter maps in the wake word engine package Sample content { \"locale\" : \"<LOCALE>\" , \"encodedFilters\" : \"<ENCODED_FILTERS>\" , \"urlEndpoint\" : \"<DAVS_ENDPOINT>\" , \"artifactType\" : \"<ARTIFACT_TYPE>\" , \"artifactKey\" : \"<ARTIFACT_KEY>\" } 3. File System Before running the DAVS client, make sure the download directory specified by command line flag exists on system with read/write permissions. 4. Run DAVS Client ./DavsClient --auth-config <PATH_TO_AUTH_CONFIG> --filter-map <PATH_TO_FILTER_MAP> --download-directory <PATH_TO_DOWNLOAD_DIRECTORY> 5. Expected Output The executable will download the artifact and exit if successful. If anything is wrong, it could possibly fall back on exponential backoff retry. The following logs might be helpful to debug. DavsClient.log : Logs all overall info/debug/error. Metric.log : Logs metadata regarding artifacts. ArtifactIDs.log : Logs existing unique artifact IDs on device.","title":"Integration Guide"},{"location":"features/davs/integration-guide.html#davs-integration-guide","text":"","title":"DAVS: Integration Guide"},{"location":"features/davs/integration-guide.html#where-to-get-the-davs-client","text":"Contact your AVS Solutions Architect to get access to DAVS.","title":"Where to get the DAVS client"},{"location":"features/davs/integration-guide.html#authentication","text":"DAVS uses Login with Amazon (LWA) as user authentication system. For details, see LWA Overview and Steps to retrieve an Access Token and Refresh Token . Sample LWA implementation is provided in the DAVS client code.","title":"Authentication"},{"location":"features/davs/integration-guide.html#dependencies","text":"The DAVS Client application requires libcurl and cJSON. To install the libcurl, simply go the libcurl homepage and follow the installation instructions. cJSON is built in as source within the DAVS library, thus no need to install separately.","title":"Dependencies"},{"location":"features/davs/integration-guide.html#steps-to-run-davs-client","text":"","title":"Steps to run DAVS Client"},{"location":"features/davs/overview.html","text":"DAVS: Overview Overview Regular software updates to devices deployed in the field are an essential part of maintaining any product, from upgrading to the latest version of an application and its accompanying data to responding to field issues. Amazon uses Device Artifacts Vending Service (DAVS) to accomplish this. DAVS decouples the delivery of certain artifacts to the device from the firmware OTA cycles, which are typically 2-3 months. DAVS is a client/server mechanism that provides Alexa-enabled devices self-service updates to software components of Amazon's wake word Engine. The DAVS client application on device queries the DAVS service periodically for software update, and downloads the update based on the request. It can be used to update wake word models, fingerprint lists, and more artifacts that support other feature. An AVS device can use DAVS to check whether an update for any artifact (wake word model, fingerprint list, etc) is available and downloads / updates the artifact automatically. No manual delivery of updates need to be coordinated between the Amazon and the AVS device. Dependencies DAVS is an independent process from Wake Word detection on the device. The only requirement is that the device have a network connection and persistent storage with read/write access. Resource Requirement Since DAVS runs as a background process that is spawned periodically (once every hour or so), the CPU overhead is minimal. Memory (KB) CPU (MIPS) Disk Space (KB) 340 N/A 750 KB How it works As shown by the chart above, Amazon teams who act as artifact vender would upload new artifacts to the DAVS service with a set of parameters that describe the attributes of the artifact. The set of parameters as a whole is called filter. More specifically, a filter is a string representation of a JSON structure comprising of key-value pairs of artifact-properties. For instance, for wake word artifact, the actual wake word \"Alexa\" would form a part of the filter. A filter might include Base64 encoded component depending on the use case to hide the unnecessary complexity and potential confusion. The encoded parameters are usually attributes that are used internally to make sure right artifact is returned for a specific device. Below is a example of filter for a wake word model in en-US locale. { \"artifactType\" : \"lowpower-wakeword\" , \"artifactKey\" : \"alexa\" , \"locale\" : \"en-US\" , \"encodedFilters\" : \"eyJmaWx0ZXJWZXJzaW9uIjpbIjEiXSwiZW5naW5lQ29tcGF0aWJpbGl0eUlkTGlzdCI6WyIxIl0sIm1vZGVsQ2xhc3MiOlsiRCJdLCJhY291c3RpY0Vudmlyb25tZW50IjpbImZhci1maWVsZCJdLCJvcGVyYXRpbmdNb2RlIjpbInN0YW5kYWxvbmUiXSwibW9kZWxGb3JtYXQiOlsiZjgiXX0K\" } DAVS server contains database of artifacts and matching filters. On device DAVS client contacts DAVS periodically and sends a filter describing the requested artifact. DAVS server matches the request to the latest available artifact and sends back a link to the artifact to the client. The client can then download the artifact. DAVS Artifact Types Wake Word Models Amazon is continuously developing new models that provide better wake word detection performance. These models are made available to Alexa-enabled devices through DAVS. This enables faster response to field issues (Traditional OTAs taking weeks to months vs. with DAVS we can update in hours to days) and keeps the partner device performance on par with Echo Family Devices performance when Amazon deploys new models. It also provides an automatic model rollback mechanism. Any poor performed model can be rolled back by DAVS. Fingerprint Lists Amazon regularly compiles a fingerprint list of the highest exposure media broadcasts and makes this list available to Alexa-enabled devices through DAVS. The list is usually updated weekly. The downloaded fingerprint list is then used for on-device fingerprint matching. Please refer to the Fingerprinting section for more information. Note The Wake Word Engine and other executable binaries/libraries are not updated via DAVS Watermark Configuration Binaries Amazon provids a watermarking configuration binary for the wakeword engine and makes this binary availabe to Alexa-enabled devices through DAVS. The downloaded watermark configuration binary is then used to identify watermarked audio on device. Please refer to the [Watermarking][../wakeword/media-wakes/watermarking/overview.md] section for more information. Note The Wake Word Engine and other executable binaries/libraries are not updated via DAVS Why should I use DAVS? DAVS greatly simplifies artifact update process. With DAVS, no more OTA is required for artifact update. All it takes is a piece of on-device code that runs occasionally checking for update. It is also an essential component for features that requires frequent model/database update, such as Fingerprinting . Another benefit is disk space savings, since a device would need to store only the model for the active locale. i.e. a device in Germany would only download the de-DE model, and a device in Japan would only download the ja-JP model. DAVS client does not require significant CPU MIPS allocation, as it is not a real-time module, and it can run in the background. Usually the client would access the server every hour to check whether a new artifact is available. Artifact update frequency for fingerprints is about once a week, for models it is less frequent.","title":"Overview"},{"location":"features/davs/overview.html#davs-overview","text":"","title":"DAVS: Overview"},{"location":"features/davs/overview.html#overview","text":"Regular software updates to devices deployed in the field are an essential part of maintaining any product, from upgrading to the latest version of an application and its accompanying data to responding to field issues. Amazon uses Device Artifacts Vending Service (DAVS) to accomplish this. DAVS decouples the delivery of certain artifacts to the device from the firmware OTA cycles, which are typically 2-3 months. DAVS is a client/server mechanism that provides Alexa-enabled devices self-service updates to software components of Amazon's wake word Engine. The DAVS client application on device queries the DAVS service periodically for software update, and downloads the update based on the request. It can be used to update wake word models, fingerprint lists, and more artifacts that support other feature. An AVS device can use DAVS to check whether an update for any artifact (wake word model, fingerprint list, etc) is available and downloads / updates the artifact automatically. No manual delivery of updates need to be coordinated between the Amazon and the AVS device.","title":"Overview"},{"location":"features/davs/overview.html#dependencies","text":"DAVS is an independent process from Wake Word detection on the device. The only requirement is that the device have a network connection and persistent storage with read/write access.","title":"Dependencies"},{"location":"features/davs/overview.html#resource-requirement","text":"Since DAVS runs as a background process that is spawned periodically (once every hour or so), the CPU overhead is minimal. Memory (KB) CPU (MIPS) Disk Space (KB) 340 N/A 750 KB","title":"Resource Requirement"},{"location":"features/davs/overview.html#how-it-works","text":"As shown by the chart above, Amazon teams who act as artifact vender would upload new artifacts to the DAVS service with a set of parameters that describe the attributes of the artifact. The set of parameters as a whole is called filter. More specifically, a filter is a string representation of a JSON structure comprising of key-value pairs of artifact-properties. For instance, for wake word artifact, the actual wake word \"Alexa\" would form a part of the filter. A filter might include Base64 encoded component depending on the use case to hide the unnecessary complexity and potential confusion. The encoded parameters are usually attributes that are used internally to make sure right artifact is returned for a specific device. Below is a example of filter for a wake word model in en-US locale. { \"artifactType\" : \"lowpower-wakeword\" , \"artifactKey\" : \"alexa\" , \"locale\" : \"en-US\" , \"encodedFilters\" : \"eyJmaWx0ZXJWZXJzaW9uIjpbIjEiXSwiZW5naW5lQ29tcGF0aWJpbGl0eUlkTGlzdCI6WyIxIl0sIm1vZGVsQ2xhc3MiOlsiRCJdLCJhY291c3RpY0Vudmlyb25tZW50IjpbImZhci1maWVsZCJdLCJvcGVyYXRpbmdNb2RlIjpbInN0YW5kYWxvbmUiXSwibW9kZWxGb3JtYXQiOlsiZjgiXX0K\" } DAVS server contains database of artifacts and matching filters. On device DAVS client contacts DAVS periodically and sends a filter describing the requested artifact. DAVS server matches the request to the latest available artifact and sends back a link to the artifact to the client. The client can then download the artifact.","title":"How it works"},{"location":"features/davs/overview.html#davs-artifact-types","text":"","title":"DAVS Artifact Types"},{"location":"features/davs/overview.html#wake-word-models","text":"Amazon is continuously developing new models that provide better wake word detection performance. These models are made available to Alexa-enabled devices through DAVS. This enables faster response to field issues (Traditional OTAs taking weeks to months vs. with DAVS we can update in hours to days) and keeps the partner device performance on par with Echo Family Devices performance when Amazon deploys new models. It also provides an automatic model rollback mechanism. Any poor performed model can be rolled back by DAVS.","title":"Wake Word Models"},{"location":"features/davs/overview.html#fingerprint-lists","text":"Amazon regularly compiles a fingerprint list of the highest exposure media broadcasts and makes this list available to Alexa-enabled devices through DAVS. The list is usually updated weekly. The downloaded fingerprint list is then used for on-device fingerprint matching. Please refer to the Fingerprinting section for more information. Note The Wake Word Engine and other executable binaries/libraries are not updated via DAVS","title":"Fingerprint Lists"},{"location":"features/davs/overview.html#watermark-configuration-binaries","text":"Amazon provids a watermarking configuration binary for the wakeword engine and makes this binary availabe to Alexa-enabled devices through DAVS. The downloaded watermark configuration binary is then used to identify watermarked audio on device. Please refer to the [Watermarking][../wakeword/media-wakes/watermarking/overview.md] section for more information. Note The Wake Word Engine and other executable binaries/libraries are not updated via DAVS","title":"Watermark Configuration Binaries"},{"location":"features/davs/overview.html#why-should-i-use-davs","text":"DAVS greatly simplifies artifact update process. With DAVS, no more OTA is required for artifact update. All it takes is a piece of on-device code that runs occasionally checking for update. It is also an essential component for features that requires frequent model/database update, such as Fingerprinting . Another benefit is disk space savings, since a device would need to store only the model for the active locale. i.e. a device in Germany would only download the de-DE model, and a device in Japan would only download the ja-JP model. DAVS client does not require significant CPU MIPS allocation, as it is not a real-time module, and it can run in the background. Usually the client would access the server every hour to check whether a new artifact is available. Artifact update frequency for fingerprints is about once a week, for models it is less frequent.","title":"Why should I use DAVS?"},{"location":"features/vad/overview.html","text":"Standalone VAD: Standalone Voice Activity Detection Warning Do not use this software-based VAD in conjunction with a hardware-based VAD. The interaction between the two will adversely affect performance. Overview Voice Activity Detection (VAD) preprocesses an audio stream to determine whether or not it contains voice. It is typically used as a gate in front of higher-CPU algorithms that perform more complex tasks such as speech recognition, keyword spotting, voice authentication, etc. in order to optimize power consumption. There are several different methods for performing VAD, each with its own accuracy, latency and resource trade-offs. Amazon Wake Word Engine offers a way to do this through its own software-based implementations. EnergyDetection EnergyDetection is an acoustic energy detector that triggers off any acoustic energy in the input, be it human or otherwise. See EnergyDetector for more details. FAQ See the VAD section of the FAQ.","title":"Overview"},{"location":"features/vad/overview.html#standalone-vad-standalone-voice-activity-detection","text":"Warning Do not use this software-based VAD in conjunction with a hardware-based VAD. The interaction between the two will adversely affect performance.","title":"Standalone VAD: Standalone Voice Activity Detection"},{"location":"features/vad/overview.html#overview","text":"Voice Activity Detection (VAD) preprocesses an audio stream to determine whether or not it contains voice. It is typically used as a gate in front of higher-CPU algorithms that perform more complex tasks such as speech recognition, keyword spotting, voice authentication, etc. in order to optimize power consumption. There are several different methods for performing VAD, each with its own accuracy, latency and resource trade-offs. Amazon Wake Word Engine offers a way to do this through its own software-based implementations.","title":"Overview"},{"location":"features/vad/overview.html#energydetection","text":"EnergyDetection is an acoustic energy detector that triggers off any acoustic energy in the input, be it human or otherwise. See EnergyDetector for more details.","title":"EnergyDetection"},{"location":"features/vad/overview.html#faq","text":"See the VAD section of the FAQ.","title":"FAQ"},{"location":"features/vad/energydetection/integration-guide.html","text":"EnergyDetection: Energy Detection Integration Guide EnergyDetection must be explicitly enabled during engine initialization by: Initializing a PryonLiteEnergyDetectionConfig struct to the appropriate values. Set enableGate to 1. Initializing a PryonLiteVadConfig struct. Set the energyDetection pointer in the PryonLiteVadConfig to point to the first step's PryonLiteEnergyDetectionConfig struct. Setting vad in PryonLiteV2Config to the above PryonLiteVadConfig. (optional) Enabling the PryonLiteVadEvent, if a notification when the EnergyDetection state changes is desired. See the api sample included in your release for a full example. // Optionally define a VAD event callback // if enabled, the callback will be called when the EnergyDetection state changes // Client will have until next frame to potentially scale CPU frequency static void vadEventHandler ( PryonLiteV2Handle * handle , const PryonLiteVadEvent * vadEvent ) { printf ( \"VAD state %d \\n \" , ( int ) vadEvent -> vadState ); // perform a change in CPU clock speed if necessary } // Event handler // Must be passed to PryonLite_Initialize static void handleEvent ( PryonLiteV2Handle * handle , const PryonLiteV2Event * event ) { if ( event -> vadEvent != NULL ) { vadEventHandler ( handle , event -> vadEvent ); } // ... handle other events here } void init () { PryonLiteV2EventConfig engineEventConfig = { 0 }; PryonLiteV2Config engineConfig = { 0 }; // Initialize the VAD Configuration to use energy detection. PryonLiteVadConfig vadConfig = PryonLiteVadConfig_Default ; PryonLiteEnergyDetectionConfig energyDetection = PryonLiteEnergyDetectionConfig_Default ; energyDetection . enableGate = 1 ; ///< When used inline with any other functionality, true to prevent continued processing of non voice frames vadConfig . energyDetection = & energyDetection ; engineConfig . vad = & vadConfig ; // Optionally enable the VAD event to receive EnergyDetection state changes notifications engineEventConfig . enableVadEvent = true ; // ... rest of init code }","title":"Integration Guide"},{"location":"features/vad/energydetection/integration-guide.html#energydetection-energy-detection","text":"","title":"EnergyDetection: Energy Detection"},{"location":"features/vad/energydetection/integration-guide.html#integration-guide","text":"EnergyDetection must be explicitly enabled during engine initialization by: Initializing a PryonLiteEnergyDetectionConfig struct to the appropriate values. Set enableGate to 1. Initializing a PryonLiteVadConfig struct. Set the energyDetection pointer in the PryonLiteVadConfig to point to the first step's PryonLiteEnergyDetectionConfig struct. Setting vad in PryonLiteV2Config to the above PryonLiteVadConfig. (optional) Enabling the PryonLiteVadEvent, if a notification when the EnergyDetection state changes is desired. See the api sample included in your release for a full example. // Optionally define a VAD event callback // if enabled, the callback will be called when the EnergyDetection state changes // Client will have until next frame to potentially scale CPU frequency static void vadEventHandler ( PryonLiteV2Handle * handle , const PryonLiteVadEvent * vadEvent ) { printf ( \"VAD state %d \\n \" , ( int ) vadEvent -> vadState ); // perform a change in CPU clock speed if necessary } // Event handler // Must be passed to PryonLite_Initialize static void handleEvent ( PryonLiteV2Handle * handle , const PryonLiteV2Event * event ) { if ( event -> vadEvent != NULL ) { vadEventHandler ( handle , event -> vadEvent ); } // ... handle other events here } void init () { PryonLiteV2EventConfig engineEventConfig = { 0 }; PryonLiteV2Config engineConfig = { 0 }; // Initialize the VAD Configuration to use energy detection. PryonLiteVadConfig vadConfig = PryonLiteVadConfig_Default ; PryonLiteEnergyDetectionConfig energyDetection = PryonLiteEnergyDetectionConfig_Default ; energyDetection . enableGate = 1 ; ///< When used inline with any other functionality, true to prevent continued processing of non voice frames vadConfig . energyDetection = & energyDetection ; engineConfig . vad = & vadConfig ; // Optionally enable the VAD event to receive EnergyDetection state changes notifications engineEventConfig . enableVadEvent = true ; // ... rest of init code }","title":"Integration Guide"},{"location":"features/vad/energydetection/overview.html","text":"EnergyDetection: Energy Detection Warning Do not use this software-based energy detector in conjunction with a hardware-based VAD. The interaction between the two will adversely affect performance. Overview Amazon Wake Word Engine comes with an acoustic energy detection (EnergyDetection) VAD implementation. It will trigger on any acoustic energy found in the input signal, whether it be human voice or not. This EnergyDetection is disabled by default - it must be explicitly enabled at initialization time. Note that the EnergyDetection remains active for a certain period of time after signal energy has stopped - this is referred to as the \"hangover\" time, and is approximately 500 ms. EnergyDetection is based on absolute signal level , and will trigger on signal levels above -52dBFS . EnergyDetection can be used to reduce CPU usage during periods of very low or zero acoustic energy in the input stream. In systems with automatic CPU schedulers built in to the OS (Linux, iOS, Android, etc.), CPU savings are acquired automatically - EnergyDetection will gate running wakeword detection and other dependent algorithms if configured alongside those features unless explicitly disabled. In systems without automatic CPU scaling (i.e. low power DSPs/MCUs with no OS), the client application can use the EnergyDetection event detection to scale up or down the clock speed of the host processor programmatically. EnergyDetection CPU consumption varies depending on processor architecture, but it typically consumes less than 1 MHz on most processors. Note that the use of EnergyDetection does very slightly degrade the accuracy of the WW Engine , typically 0.5-1.0% relative decrease in FAR/FRR, depending on the model being used. Dependencies None (v1 and v2 both come with EnergyDetection) Resource Requirements EnergyDetection is built in to all versions of the Wake Word engine. It's memory footprint is in the 1-2K range and the EnergyDetection itself consumes between 0.25 and .75 Mhz on typical ARM processors (armv7a).","title":"Overview"},{"location":"features/vad/energydetection/overview.html#energydetection-energy-detection","text":"Warning Do not use this software-based energy detector in conjunction with a hardware-based VAD. The interaction between the two will adversely affect performance.","title":"EnergyDetection: Energy Detection"},{"location":"features/vad/energydetection/overview.html#overview","text":"Amazon Wake Word Engine comes with an acoustic energy detection (EnergyDetection) VAD implementation. It will trigger on any acoustic energy found in the input signal, whether it be human voice or not. This EnergyDetection is disabled by default - it must be explicitly enabled at initialization time. Note that the EnergyDetection remains active for a certain period of time after signal energy has stopped - this is referred to as the \"hangover\" time, and is approximately 500 ms. EnergyDetection is based on absolute signal level , and will trigger on signal levels above -52dBFS . EnergyDetection can be used to reduce CPU usage during periods of very low or zero acoustic energy in the input stream. In systems with automatic CPU schedulers built in to the OS (Linux, iOS, Android, etc.), CPU savings are acquired automatically - EnergyDetection will gate running wakeword detection and other dependent algorithms if configured alongside those features unless explicitly disabled. In systems without automatic CPU scaling (i.e. low power DSPs/MCUs with no OS), the client application can use the EnergyDetection event detection to scale up or down the clock speed of the host processor programmatically. EnergyDetection CPU consumption varies depending on processor architecture, but it typically consumes less than 1 MHz on most processors. Note that the use of EnergyDetection does very slightly degrade the accuracy of the WW Engine , typically 0.5-1.0% relative decrease in FAR/FRR, depending on the model being used.","title":"Overview"},{"location":"features/vad/energydetection/overview.html#dependencies","text":"None (v1 and v2 both come with EnergyDetection)","title":"Dependencies"},{"location":"features/vad/energydetection/overview.html#resource-requirements","text":"EnergyDetection is built in to all versions of the Wake Word engine. It's memory footprint is in the 1-2K range and the EnergyDetection itself consumes between 0.25 and .75 Mhz on typical ARM processors (armv7a).","title":"Resource Requirements"},{"location":"features/wakeword/cascade-mode.html","text":"Cascade mode Overview Wake Word is said to be operated in cascade mode when a relatively permissive first-stage WW detector is cascaded with a more conservative second-stage WW detector. This page describes the operation of wake word in cascade mode in more detail. Description In cascade mode, on-device wake word detection occurs in two stages. The first stage runs a smaller, permissive wake word model that is tuned to have lower false rejects and higher false accepts, while the second-stage runs a larger, more accurate model to perform a second-pass verification. Cascade or two-stage detection is typically required in scenarios when the device needs to be operated in low-power conditions, for instance, in tablets, in battery-powered devices and in other such low-power devices. When wake word detection is cascaded, the same level of accuracy can be achieved with considerable power savings because the second stage gets invoked only when the smaller first-stage model makes a detection. The first-stage detector can also be combined with a low-cost Voice Activity Detector which triggers the first-stage detector only when there is valid speech activity. This further reduces power consumption since CPU is spent running only the VAD for most of the time. Typical Use Case: Cascade WW Detection using VAD VAD and a first-stage Wake Word Engine (50K model) runs on the DSP. The SoC is powered down. Most of the time spent is in VAD, so the DSP can be clocked to run at an extremely low rate When VAD triggers, DSP clock speed is increased and runs the 50K (more permissive) 1st stage model If 1st stage triggers, power up the SoC, transfer audio, and do WW detection with a large rmodel Reduces power consumption greatly as most of the time is spent running only VAD However, using this power-saving cascade configuration has the following trade-offs: Increased latency due to two-pass verification. The exact amount is dependent on system hardware in transferring audio from the first-stage chip to the second-stage chip. System Complexity: Two wake word engines & models, and code to coordinate the two between two chips Extra memory required on the DSP for a 2 second audio ring buffer FAQ See the Cascade Mode section of the FAQ.","title":"Cascade Mode"},{"location":"features/wakeword/cascade-mode.html#cascade-mode","text":"","title":"Cascade mode"},{"location":"features/wakeword/cascade-mode.html#overview","text":"Wake Word is said to be operated in cascade mode when a relatively permissive first-stage WW detector is cascaded with a more conservative second-stage WW detector. This page describes the operation of wake word in cascade mode in more detail.","title":"Overview"},{"location":"features/wakeword/cascade-mode.html#description","text":"In cascade mode, on-device wake word detection occurs in two stages. The first stage runs a smaller, permissive wake word model that is tuned to have lower false rejects and higher false accepts, while the second-stage runs a larger, more accurate model to perform a second-pass verification. Cascade or two-stage detection is typically required in scenarios when the device needs to be operated in low-power conditions, for instance, in tablets, in battery-powered devices and in other such low-power devices. When wake word detection is cascaded, the same level of accuracy can be achieved with considerable power savings because the second stage gets invoked only when the smaller first-stage model makes a detection. The first-stage detector can also be combined with a low-cost Voice Activity Detector which triggers the first-stage detector only when there is valid speech activity. This further reduces power consumption since CPU is spent running only the VAD for most of the time.","title":"Description"},{"location":"features/wakeword/cascade-mode.html#typical-use-case-cascade-ww-detection-using-vad","text":"VAD and a first-stage Wake Word Engine (50K model) runs on the DSP. The SoC is powered down. Most of the time spent is in VAD, so the DSP can be clocked to run at an extremely low rate When VAD triggers, DSP clock speed is increased and runs the 50K (more permissive) 1st stage model If 1st stage triggers, power up the SoC, transfer audio, and do WW detection with a large rmodel Reduces power consumption greatly as most of the time is spent running only VAD However, using this power-saving cascade configuration has the following trade-offs: Increased latency due to two-pass verification. The exact amount is dependent on system hardware in transferring audio from the first-stage chip to the second-stage chip. System Complexity: Two wake word engines & models, and code to coordinate the two between two chips Extra memory required on the DSP for a 2 second audio ring buffer","title":"Typical Use Case: Cascade WW Detection using VAD"},{"location":"features/wakeword/cascade-mode.html#faq","text":"See the Cascade Mode section of the FAQ.","title":"FAQ"},{"location":"features/wakeword/client-properties.html","text":"Client Properties The engine can benefit from knowing various states of the device. This information can be taken into account when determining if a wake word is present to make the engine more or less sensitive. Use the public API call PryonLite_SetClientProperty to inform the engine of any device state changes at run-time. Common Client Properties The table below lists all of the common client properties defined in pryon_lite_common_client_properties.h Group Property Description CLIENT_PROP_GROUP_COMMON CLIENT_PROP_COMMON_AUDIO_PLAYBACK Generic audio playback (any sound) CLIENT_PROP_GROUP_COMMON CLIENT_PROP_COMMON_ALARM_STATE Alarm playback CLIENT_PROP_GROUP_COMMON CLIENT_PROP_COMMON_MEDIA_PLAYER_STATE Media playback CLIENT_PROP_GROUP_COMMON CLIENT_PROP_COMMON_EARCON_PLAYER_STATE Earcon playback CLIENT_PROP_GROUP_COMMON CLIENT_PROP_COMMON_TTS_PLAYER_STATE Text to speech playback All common client properties share the following possible values Value Meaning -1 Unknown State 0 Not Playing 1 Playing","title":"Client Properties"},{"location":"features/wakeword/client-properties.html#client-properties","text":"The engine can benefit from knowing various states of the device. This information can be taken into account when determining if a wake word is present to make the engine more or less sensitive. Use the public API call PryonLite_SetClientProperty to inform the engine of any device state changes at run-time.","title":"Client Properties"},{"location":"features/wakeword/client-properties.html#common-client-properties","text":"The table below lists all of the common client properties defined in pryon_lite_common_client_properties.h Group Property Description CLIENT_PROP_GROUP_COMMON CLIENT_PROP_COMMON_AUDIO_PLAYBACK Generic audio playback (any sound) CLIENT_PROP_GROUP_COMMON CLIENT_PROP_COMMON_ALARM_STATE Alarm playback CLIENT_PROP_GROUP_COMMON CLIENT_PROP_COMMON_MEDIA_PLAYER_STATE Media playback CLIENT_PROP_GROUP_COMMON CLIENT_PROP_COMMON_EARCON_PLAYER_STATE Earcon playback CLIENT_PROP_GROUP_COMMON CLIENT_PROP_COMMON_TTS_PLAYER_STATE Text to speech playback All common client properties share the following possible values Value Meaning -1 Unknown State 0 Not Playing 1 Playing","title":"Common Client Properties"},{"location":"features/wakeword/detection-threshold.html","text":"Detection Threshold Overview The detection threshold determines the sensitivity of the wake word detector. Lower thresholds are more permissive (most detections), while higher thresholds are more restrictive. While this threshold can be adjusted by the application, it's important to note that adjusting the threshold will result in opposite changes in FAR and FRR. For example, if the threshold is set to its minimum, the FRR will be reduced (which may seem ideal), but because the detection is more permissive, the FAR will also increase. And vice-versa if the threshold to its maximum value. Adjusting the Threshold The Amazon Wake Word Engine provides an API to adjust the detection threshold. This can be done at any time during the lifetime of the engine instance (i.e. at both initialization and run-time). Acceptable values for the threshold are 1 to 1000, where 1 is the most permissive and 1000 is the most restrictive. The default value is 500. See the API Reference for details. The following illustrates the effect of changing the threshold on FRR and FAR. The \"Higher Sensitivity\" arrow trends toward a threshold of 1 while the \"Lower Sensitivity\" arrow trends toward a threshold of 1000 .","title":"Detection Threshold"},{"location":"features/wakeword/detection-threshold.html#detection-threshold","text":"","title":"Detection Threshold"},{"location":"features/wakeword/detection-threshold.html#overview","text":"The detection threshold determines the sensitivity of the wake word detector. Lower thresholds are more permissive (most detections), while higher thresholds are more restrictive. While this threshold can be adjusted by the application, it's important to note that adjusting the threshold will result in opposite changes in FAR and FRR. For example, if the threshold is set to its minimum, the FRR will be reduced (which may seem ideal), but because the detection is more permissive, the FAR will also increase. And vice-versa if the threshold to its maximum value.","title":"Overview"},{"location":"features/wakeword/detection-threshold.html#adjusting-the-threshold","text":"The Amazon Wake Word Engine provides an API to adjust the detection threshold. This can be done at any time during the lifetime of the engine instance (i.e. at both initialization and run-time). Acceptable values for the threshold are 1 to 1000, where 1 is the most permissive and 1000 is the most restrictive. The default value is 500. See the API Reference for details. The following illustrates the effect of changing the threshold on FRR and FAR. The \"Higher Sensitivity\" arrow trends toward a threshold of 1 while the \"Lower Sensitivity\" arrow trends toward a threshold of 1000 .","title":"Adjusting the Threshold"},{"location":"features/wakeword/overview.html","text":"Wake Word Detection Wake Word detection is the process of detecting a specific word in an audio stream in order to activate Alexa on the device. This is the primary function provided by the engine. There are two components: a library (the \"engine\") and one or more models trained to detect one or more specific keywords. Wake Word Engine The Wake Word Engine is the code (library) that runs the wake word detection algorithm. This library comes either as a static library (.a) or shared object (.so/.dll). A single instance of a wake word engine is capable of loading one or more models, the capacity being determined by the CPU and memory resources available on the system. Each release will contain an engine built for one or more processor architectures. See Release Contents for more information. Wake Word Models Wake Word Models are required by the engine to recognize keywords. The model is a binary blob specifying the parameters that will be used by the engine to recognize specify the wake word(s). Models can be loaded into the engine at run-time or compiled statically into the application. Each model comes in two forms: a .bin file (for loading at runtime), and a .cpp file (for statically linking at compile time). Model Classes, Sizes, and Names The Amazon Wake Word Engine is capable of running different classes of models. Not all engines are capable of running all models. The type of model is described in the filename itself. The model name tells a few characteristics of the model - for example, the following model name: WR_250k.en-US.alexa Is made up of the following elements: WR : The first group of letters is the model class . This is used for class identification only. 250K : A number representing the model complexity. What this number represents is dependent on the model class, so cannot be compared between model classes. For W class models, it represents the approximate size of the model, in bytes. For other model classes such as X , it may indicate a different property. Note Some model classes do not have this numeric entry in their name, namely D class models, which are generally 750K in size. en-US The model locale, in ISO format (i.e. language-country ) alexa The wake word(s) the model is trained to detect. For multi-word models, each wake word is separated by a + sign. Model Formats Inside the release package is a models folder. Inside this folder there are subfolders, one for each model format . At minimum, there will be a common folder, which contains models in a generic, unoptimized format. If your release contains an architecture that can benefit from a custom model format, you will have additional subdirectories, each with different formats (but functionally equivalent) optimized for your architecture. For example, most ARM architectures will use the models in the f8 folder - these have been optimized for use on ARM cores. Important To determine which model your application should use, see Model Selection . Engine / Model Compatibility In general, the models used must match the version of the engine provided in the same release. While we are currently working on improving backward compatibility, this constraint must be followed. If a newer model is used with an older engine, or vice-versa, the library will return PRL_ERR_MODEL_INCOMPATIBLE during the initialization phase. In exceptional circumstances, if a newer model must be used with an older engine, the team can take this on as a special request. Please contact us if this is the case. Resource Requirements Wake word resource requirements are dependent on 3 characteristics: Wake Word Engine Type / Capabilities Wake Word Model Processor Architecture Below are resource requirements of commonly used engine types and models, measured on armv7a architecture, for reference. Engine Type Model Size/Class Memory (kB) Average CPU (MHz) v1 / v2 PRL1000 50K 250 12 v1 / v2 PRL1000 WR_250K 450 17 v1 / v2 PRL1000 D 1100 35 v2 PRL2000 X 390 39 Note These are estimates only, based on a typical ARM Cortex A53 processor. Exact numbers will vary on other processors.","title":"Overview"},{"location":"features/wakeword/overview.html#wake-word-detection","text":"Wake Word detection is the process of detecting a specific word in an audio stream in order to activate Alexa on the device. This is the primary function provided by the engine. There are two components: a library (the \"engine\") and one or more models trained to detect one or more specific keywords.","title":"Wake Word Detection"},{"location":"features/wakeword/overview.html#wake-word-engine","text":"The Wake Word Engine is the code (library) that runs the wake word detection algorithm. This library comes either as a static library (.a) or shared object (.so/.dll). A single instance of a wake word engine is capable of loading one or more models, the capacity being determined by the CPU and memory resources available on the system. Each release will contain an engine built for one or more processor architectures. See Release Contents for more information.","title":"Wake Word Engine"},{"location":"features/wakeword/overview.html#wake-word-models","text":"Wake Word Models are required by the engine to recognize keywords. The model is a binary blob specifying the parameters that will be used by the engine to recognize specify the wake word(s). Models can be loaded into the engine at run-time or compiled statically into the application. Each model comes in two forms: a .bin file (for loading at runtime), and a .cpp file (for statically linking at compile time).","title":"Wake Word Models"},{"location":"features/wakeword/overview.html#model-classes-sizes-and-names","text":"The Amazon Wake Word Engine is capable of running different classes of models. Not all engines are capable of running all models. The type of model is described in the filename itself. The model name tells a few characteristics of the model - for example, the following model name: WR_250k.en-US.alexa Is made up of the following elements: WR : The first group of letters is the model class . This is used for class identification only. 250K : A number representing the model complexity. What this number represents is dependent on the model class, so cannot be compared between model classes. For W class models, it represents the approximate size of the model, in bytes. For other model classes such as X , it may indicate a different property. Note Some model classes do not have this numeric entry in their name, namely D class models, which are generally 750K in size. en-US The model locale, in ISO format (i.e. language-country ) alexa The wake word(s) the model is trained to detect. For multi-word models, each wake word is separated by a + sign.","title":"Model Classes, Sizes, and Names"},{"location":"features/wakeword/overview.html#engine-model-compatibility","text":"In general, the models used must match the version of the engine provided in the same release. While we are currently working on improving backward compatibility, this constraint must be followed. If a newer model is used with an older engine, or vice-versa, the library will return PRL_ERR_MODEL_INCOMPATIBLE during the initialization phase. In exceptional circumstances, if a newer model must be used with an older engine, the team can take this on as a special request. Please contact us if this is the case.","title":"Engine / Model Compatibility"},{"location":"features/wakeword/overview.html#resource-requirements","text":"Wake word resource requirements are dependent on 3 characteristics: Wake Word Engine Type / Capabilities Wake Word Model Processor Architecture Below are resource requirements of commonly used engine types and models, measured on armv7a architecture, for reference. Engine Type Model Size/Class Memory (kB) Average CPU (MHz) v1 / v2 PRL1000 50K 250 12 v1 / v2 PRL1000 WR_250K 450 17 v1 / v2 PRL1000 D 1100 35 v2 PRL2000 X 390 39 Note These are estimates only, based on a typical ARM Cortex A53 processor. Exact numbers will vary on other processors.","title":"Resource Requirements"},{"location":"features/wakeword/performance.html","text":"Evaluating Wake Word Performance Overview The standard method of evaluating detection performance of a Wake Word engine involves two metrics: False Reject Rate (FRR) and False Accept Rate (FAR). False Reject Rate (FRR) False Reject Rate, also known as \"Recall Rate\" or \"Hit Count\" (more specifically, these two are the inverse of FRR), is a measure of how many wakewords are rejected in a sample of audio containing a known number of wakewords. This is typically expressed as a percentage, as there is known number of true examples in the test audio. For example, if a test set has 100 utterances, each beginning with the wake word, and 8 of them are NOT detected by the engine, the FRR is 8/100 = 0.08 = 8% . False Accept Rate (FAR) False Accept Rate is defined as the number of detections made by the engine over a certain period of time when subject to test audio that does not contain the wake word. This is expressed as a number of detections over a period of time. For example, if a wake word engine is fed 24 hours of NPR news (without any wake word present), and it detects 3 wakewords, the FAR would be 3/24hr (or 0.125/hr ). IMPORTANT FRR and FAR are entirely dependent on the test set being used. It is invalid to compare FAR/FRR performance of one wake word detector against another unless the test set and environment are identical. AVS Certification The AVS Certification process defines requirements/limits for FRR and FAR to ensure sufficient wake word performance of Alexa enabled devices before going to production. Contact your AVS Solution Architect for details.","title":"Wake Word Performance"},{"location":"features/wakeword/performance.html#evaluating-wake-word-performance","text":"","title":"Evaluating Wake Word Performance"},{"location":"features/wakeword/performance.html#overview","text":"The standard method of evaluating detection performance of a Wake Word engine involves two metrics: False Reject Rate (FRR) and False Accept Rate (FAR).","title":"Overview"},{"location":"features/wakeword/performance.html#false-reject-rate-frr","text":"False Reject Rate, also known as \"Recall Rate\" or \"Hit Count\" (more specifically, these two are the inverse of FRR), is a measure of how many wakewords are rejected in a sample of audio containing a known number of wakewords. This is typically expressed as a percentage, as there is known number of true examples in the test audio. For example, if a test set has 100 utterances, each beginning with the wake word, and 8 of them are NOT detected by the engine, the FRR is 8/100 = 0.08 = 8% .","title":"False Reject Rate (FRR)"},{"location":"features/wakeword/performance.html#false-accept-rate-far","text":"False Accept Rate is defined as the number of detections made by the engine over a certain period of time when subject to test audio that does not contain the wake word. This is expressed as a number of detections over a period of time. For example, if a wake word engine is fed 24 hours of NPR news (without any wake word present), and it detects 3 wakewords, the FAR would be 3/24hr (or 0.125/hr ). IMPORTANT FRR and FAR are entirely dependent on the test set being used. It is invalid to compare FAR/FRR performance of one wake word detector against another unless the test set and environment are identical.","title":"False Accept Rate (FAR)"},{"location":"features/wakeword/performance.html#avs-certification","text":"The AVS Certification process defines requirements/limits for FRR and FAR to ensure sufficient wake word performance of Alexa enabled devices before going to production. Contact your AVS Solution Architect for details.","title":"AVS Certification"},{"location":"features/wakeword/media-wakes/media-wakes.html","text":"False Media Wake Suppression False Media Wake Suppression refers to mechanisms that prevent Alexa-enabled devices from waking up to media sources of speech containing \"Alexa\". These media induced wakes can come from a variety of sources such as radio, TV, online videos, and others. There several mechanisms in place to prevent these - some cloud-based, and some locally on the device. Such \"false wakes\" can result in unintended audio streaming to the cloud, and ultimately poor customer experience and privacy breaches. Cloud Media Wake Suppression The default line of defense towards preventing these media induced wakes, which all utterances from all Alexa devices are subject to, is performed in the cloud. The wake word portion of all streams sent to Alexa services is compared against a large database of known media occurrences of the word \"Alexa\" - if a media-based wake is detected, the cloud immediately sends a StopCapture Directive to the device to shutdown the stream an ignore any subsequent audio. While this is effective in that large databases of known media can be stored, and more complex detection algorithms can be run in the cloud, the cloud based methods suffer from the fact that the device itself does in fact wake - the blue ring (or other audio/visual indicator) is activated on the device, a small portion of audio is sent to the cloud, and there a slight delay until the stream is terminated. In order to more quickly and securely detect and prevent these media wakes, the Wake Word engine has features that can provide similar detection performance on the device itself. Device Media Wake Suppression There are two methods of suppressing media induced wakes on the device. Both of these methods can detect and suppress the media wake in real-time as the wake word is processed, preventing any audio from streaming to the cloud and also preventing any audio/visual indication of audio streaming to the cloud. For these reasons, these device-side media wake rejection mechanisms provides the best customer experience. Fingerprinting Fingerprinting uses a local database of acoustic \"fingerprints\" to compare against the captured wake word in real-time on the device. For more information see the Fingerprinting section in this guide. Watermarking Watermarking is similar to photo watermarking - it embeds an inaudible \"key\" into the media's audio that can also be detected in real-time and further prevent these media induced wakes. For more information see the Watermarking section in this guide.","title":"Overview"},{"location":"features/wakeword/media-wakes/media-wakes.html#false-media-wake-suppression","text":"False Media Wake Suppression refers to mechanisms that prevent Alexa-enabled devices from waking up to media sources of speech containing \"Alexa\". These media induced wakes can come from a variety of sources such as radio, TV, online videos, and others. There several mechanisms in place to prevent these - some cloud-based, and some locally on the device. Such \"false wakes\" can result in unintended audio streaming to the cloud, and ultimately poor customer experience and privacy breaches.","title":"False Media Wake Suppression"},{"location":"features/wakeword/media-wakes/media-wakes.html#cloud-media-wake-suppression","text":"The default line of defense towards preventing these media induced wakes, which all utterances from all Alexa devices are subject to, is performed in the cloud. The wake word portion of all streams sent to Alexa services is compared against a large database of known media occurrences of the word \"Alexa\" - if a media-based wake is detected, the cloud immediately sends a StopCapture Directive to the device to shutdown the stream an ignore any subsequent audio. While this is effective in that large databases of known media can be stored, and more complex detection algorithms can be run in the cloud, the cloud based methods suffer from the fact that the device itself does in fact wake - the blue ring (or other audio/visual indicator) is activated on the device, a small portion of audio is sent to the cloud, and there a slight delay until the stream is terminated. In order to more quickly and securely detect and prevent these media wakes, the Wake Word engine has features that can provide similar detection performance on the device itself.","title":"Cloud Media Wake Suppression"},{"location":"features/wakeword/media-wakes/media-wakes.html#device-media-wake-suppression","text":"There are two methods of suppressing media induced wakes on the device. Both of these methods can detect and suppress the media wake in real-time as the wake word is processed, preventing any audio from streaming to the cloud and also preventing any audio/visual indication of audio streaming to the cloud. For these reasons, these device-side media wake rejection mechanisms provides the best customer experience.","title":"Device Media Wake Suppression"},{"location":"features/wakeword/media-wakes/fingerprinting/integration-guide.html","text":"On-Device Fingerprinting Integration Guide The following outlines the steps required to implement On-Device Fingerprinting. Add fingerprinting to existing Wake Word Application. To enable on-device fingerprinting, additional configuration values must be set in the configuration structure. PryonLiteV2Config has a fingerprinter field. This field should be a pointer to a structure containing three additional fields for fingerprinting. To enable fingerprinting, the fingerprintList field must be set to a buffer containing the fingerprint list binary, and the sizeofFingerprintList field must pass in the size of the list. PryonLiteFingerprintConfig engineConfig ; // Fingerprinting configuration PryonLiteFingerprintConfig fingerprintConfig = PryonLiteFingerprintConfig_Default ; // Example value; size of the fingerprint list binary in bytes fingerprintConfig -> sizeofFingerprintList = 27045 ; fingerprintConfig -> fingerprintList = & fingerprintListBuffer ; // Load the fingerprint config into the engine configuration. engineConfig . fingerprinter = & fingerprintConfig ; The complete setup can be found in the example file api_sample_PRL2000.cpp packaged with the library. Testing On-Device Fingerprinting A test fingerprint list binary and a matching sample audio file are found in the samples/fingerprinting folder. PryonLiteV2Config.fingerprinter.fingerprintList must be set to the memory address of the the test fingerprint list binary. Modify the sizeofFingerprintList field to the size of the fingerprint list binary in bytes. The audio can then be pushed to the PryonLite engine instance initialized above. The wake word in the sample audio file will be matched and suppressed and no wake word detection event happens. If the PryonLiteFingerprintMatchEvent is enabled, the engine will call the handler provided by the application. The handler may then print out the event to a console log. Sample Wake Word Engine Application A sample PryonLite engine application is included in the package. It can be run on the sample audio file and fingerprint list by preparing the audio file list with the samples/fingerprinting folder as per Prepare Input Audio List and using the below command. For PRL1000: PRL1000 does not support fingerprinting. For PRL2000 : ./<architecture>/amazon_filesim-PRL2000 -m models/<model_format>/D.en-US.alexa.bin samples/fingerprinting/all_wavs.list -f samples/fingerprinting/fingerprint_test_list","title":"Integration Guide"},{"location":"features/wakeword/media-wakes/fingerprinting/integration-guide.html#on-device-fingerprinting-integration-guide","text":"The following outlines the steps required to implement On-Device Fingerprinting.","title":"On-Device Fingerprinting Integration Guide"},{"location":"features/wakeword/media-wakes/fingerprinting/integration-guide.html#add-fingerprinting-to-existing-wake-word-application","text":"To enable on-device fingerprinting, additional configuration values must be set in the configuration structure. PryonLiteV2Config has a fingerprinter field. This field should be a pointer to a structure containing three additional fields for fingerprinting. To enable fingerprinting, the fingerprintList field must be set to a buffer containing the fingerprint list binary, and the sizeofFingerprintList field must pass in the size of the list. PryonLiteFingerprintConfig engineConfig ; // Fingerprinting configuration PryonLiteFingerprintConfig fingerprintConfig = PryonLiteFingerprintConfig_Default ; // Example value; size of the fingerprint list binary in bytes fingerprintConfig -> sizeofFingerprintList = 27045 ; fingerprintConfig -> fingerprintList = & fingerprintListBuffer ; // Load the fingerprint config into the engine configuration. engineConfig . fingerprinter = & fingerprintConfig ; The complete setup can be found in the example file api_sample_PRL2000.cpp packaged with the library.","title":"Add fingerprinting to existing Wake Word Application."},{"location":"features/wakeword/media-wakes/fingerprinting/integration-guide.html#testing-on-device-fingerprinting","text":"A test fingerprint list binary and a matching sample audio file are found in the samples/fingerprinting folder. PryonLiteV2Config.fingerprinter.fingerprintList must be set to the memory address of the the test fingerprint list binary. Modify the sizeofFingerprintList field to the size of the fingerprint list binary in bytes. The audio can then be pushed to the PryonLite engine instance initialized above. The wake word in the sample audio file will be matched and suppressed and no wake word detection event happens. If the PryonLiteFingerprintMatchEvent is enabled, the engine will call the handler provided by the application. The handler may then print out the event to a console log.","title":"Testing On-Device Fingerprinting"},{"location":"features/wakeword/media-wakes/fingerprinting/integration-guide.html#sample-wake-word-engine-application","text":"A sample PryonLite engine application is included in the package. It can be run on the sample audio file and fingerprint list by preparing the audio file list with the samples/fingerprinting folder as per Prepare Input Audio List and using the below command. For PRL1000: PRL1000 does not support fingerprinting. For PRL2000 : ./<architecture>/amazon_filesim-PRL2000 -m models/<model_format>/D.en-US.alexa.bin samples/fingerprinting/all_wavs.list -f samples/fingerprinting/fingerprint_test_list","title":"Sample Wake Word Engine Application"},{"location":"features/wakeword/media-wakes/fingerprinting/overview.html","text":"On-Device Fingerprinting Overview Fingerprinting requires that media be known \"apriori\" to the Alexa Wake Word team. Prior to media being broadcast, the media is submitted to the Wake Word team and scanned for wake words. For every wake word detected in the media, an acoustic \"fingerprint\" is generated and added to a database. These fingerprints are used for both cloud-side and device-side media-induced wake suppression. On-device fingerprinting is currently supported on all Amazon Echo Family devices. It is now also available to AVS partners that integrate an Amazon DAVS client. Dependencies Device Artifact Vending Service (DAVS) V2 API ( PRL2000 or above) Note DAVS is a prerequisite for fingerprinting, as the smaller, device side fingerprint database is smaller in capacity than the cloud. It must be updated on a weekly basis to ensure it contains the media fingerprints for the most highly broadcast media for any given week. Resource Requirements The on-device fingerprinting will require additional CPU and memory on top of the memory required for wake word detection . On an ARMv7A, the increase is as below: Fingerprint Database Capacity Memory (kB) CPU (MIPS) 50 75 5 100 115 13 FAQ See the Fingerprinting section of the FAQ.","title":"Overview"},{"location":"features/wakeword/media-wakes/fingerprinting/overview.html#on-device-fingerprinting","text":"","title":"On-Device Fingerprinting"},{"location":"features/wakeword/media-wakes/fingerprinting/overview.html#overview","text":"Fingerprinting requires that media be known \"apriori\" to the Alexa Wake Word team. Prior to media being broadcast, the media is submitted to the Wake Word team and scanned for wake words. For every wake word detected in the media, an acoustic \"fingerprint\" is generated and added to a database. These fingerprints are used for both cloud-side and device-side media-induced wake suppression. On-device fingerprinting is currently supported on all Amazon Echo Family devices. It is now also available to AVS partners that integrate an Amazon DAVS client.","title":"Overview"},{"location":"features/wakeword/media-wakes/fingerprinting/overview.html#dependencies","text":"Device Artifact Vending Service (DAVS) V2 API ( PRL2000 or above) Note DAVS is a prerequisite for fingerprinting, as the smaller, device side fingerprint database is smaller in capacity than the cloud. It must be updated on a weekly basis to ensure it contains the media fingerprints for the most highly broadcast media for any given week.","title":"Dependencies"},{"location":"features/wakeword/media-wakes/fingerprinting/overview.html#resource-requirements","text":"The on-device fingerprinting will require additional CPU and memory on top of the memory required for wake word detection . On an ARMv7A, the increase is as below: Fingerprint Database Capacity Memory (kB) CPU (MIPS) 50 75 5 100 115 13","title":"Resource Requirements"},{"location":"features/wakeword/media-wakes/fingerprinting/overview.html#faq","text":"See the Fingerprinting section of the FAQ.","title":"FAQ"},{"location":"features/wakeword/media-wakes/watermarking/integration.html","text":"Watermarking Integration Guide Add watermarking to existing Wake Word Application. Watermark based media wake suppression must be explicitly enabled during engine initialization by: Loading a Watermark config blob into memory and tracking a pointer to the memory and the sizeof the config blob loaded char * loadedWatermarkConfigBlob ; size_t sizeofLoadedWatermarkConfigBlob ; Defining and configuring an instance of PryonLiteWatermarkConfig which points to the prior loaded config blob PryonLiteWatermarkConfig watermarkConfig = PryonLiteWatermarkConfig_Default ; watermarkConfig . config = loadedWatermarkConfigBlob ; watermarkConfig . sizeofConfig = sizeofLoadedWatermarkConfigBlob ; Pointing to the configured PryonLiteWatermarkConfig instance from the PryonLiteV2Config instance used to configure the engine engineConfig . watermark = & watermarkConfig ; Testing A test watermark configuration binary (watermark.alexa+echo.bin) and matching sample audio files (alexa-watermark-media.##.wav) are found in the samples/watermarking folder. PryonLiteV2Config.watermark.config must be set to the memory address of the the test watermark configuration binary. Modify the sizeofConfig field to the size of the watermark configuration binary in bytes. The audio can then be pushed to the PryonLite engine instance initialized above. The wake word in the sample audio file will be matched and suppressed and no wake word detection event happens. Sample Wake Word Engine Application A sample PryonLite engine application is included in the package. It can be run on the sample audio file and watermark configuration binary by preparing the audio file list with the samples/watermaring folder as per Prepare Input Audio List and using the below command.. For PRL1000: PRL1000 does not support watermarking. For PRL2000 : ./<architecture>/amazon_filesim-PRL2000 -m models/<model_format>/D.en-US.alexa.bin samples/watermark/all_wavs.list -w samples/watermark/watermark.alexa+echo.bin","title":"Integration Guide"},{"location":"features/wakeword/media-wakes/watermarking/integration.html#watermarking-integration-guide","text":"","title":"Watermarking Integration Guide"},{"location":"features/wakeword/media-wakes/watermarking/integration.html#add-watermarking-to-existing-wake-word-application","text":"Watermark based media wake suppression must be explicitly enabled during engine initialization by: Loading a Watermark config blob into memory and tracking a pointer to the memory and the sizeof the config blob loaded char * loadedWatermarkConfigBlob ; size_t sizeofLoadedWatermarkConfigBlob ; Defining and configuring an instance of PryonLiteWatermarkConfig which points to the prior loaded config blob PryonLiteWatermarkConfig watermarkConfig = PryonLiteWatermarkConfig_Default ; watermarkConfig . config = loadedWatermarkConfigBlob ; watermarkConfig . sizeofConfig = sizeofLoadedWatermarkConfigBlob ; Pointing to the configured PryonLiteWatermarkConfig instance from the PryonLiteV2Config instance used to configure the engine engineConfig . watermark = & watermarkConfig ;","title":"Add watermarking to existing Wake Word Application."},{"location":"features/wakeword/media-wakes/watermarking/integration.html#testing","text":"A test watermark configuration binary (watermark.alexa+echo.bin) and matching sample audio files (alexa-watermark-media.##.wav) are found in the samples/watermarking folder. PryonLiteV2Config.watermark.config must be set to the memory address of the the test watermark configuration binary. Modify the sizeofConfig field to the size of the watermark configuration binary in bytes. The audio can then be pushed to the PryonLite engine instance initialized above. The wake word in the sample audio file will be matched and suppressed and no wake word detection event happens.","title":"Testing"},{"location":"features/wakeword/media-wakes/watermarking/integration.html#sample-wake-word-engine-application","text":"A sample PryonLite engine application is included in the package. It can be run on the sample audio file and watermark configuration binary by preparing the audio file list with the samples/watermaring folder as per Prepare Input Audio List and using the below command.. For PRL1000: PRL1000 does not support watermarking. For PRL2000 : ./<architecture>/amazon_filesim-PRL2000 -m models/<model_format>/D.en-US.alexa.bin samples/watermark/all_wavs.list -w samples/watermark/watermark.alexa+echo.bin","title":"Sample Wake Word Engine Application"},{"location":"features/wakeword/media-wakes/watermarking/overview.html","text":"Watermarking Overview Similar to a visual watermark , an audio watermark is a short bit of audio embedded in a larger audio file that can be used to identify the host audio. This watermark is encoded into the host audio, and at a later time, the host audio can be decoded to determine if there is a watermark present in the audio or not. As part of the the on-device wake word engine, the Amazon Wake Word Engine is capable of detecting these watermarks. If a watermark is present when the engine hears the word \"Alexa\", we suppress the wake word. Our audio watermarks are designed to be inaudible, while also remaining robust to different environments. This means our watermark is still detectable even when watermarked advertisements are played by a variety of different televisions, home living room acoustic environments, and noisy backgrounds. More information can be found in this Alexa Science blog post . Audio watermarks (red squiggles) are embedded imperceptibly in a media signal (black). Each watermark consists of a repeating sequence of audio building blocks (colored shapes). A detector segments the watermark and aligns the segments to see if they match. Randomly inverting the building blocks prevents rhythmic patterns in the media signal from triggering the detector; the detector uses a binary key to restore the inverted blocks. Dependencies Device Artifact Vending Service (DAVS) V2 API ( PRL2000 or above) Note DAVS is a prerequisite for watermarking, as new watermarks must have a mechanism of being deployed rapidly to Alexa devices in the field. Resource Requirements Watermarking resource requirement estimates for armv7a: Memory (KB) CPU (MIPS) 70 32 Note These are estimates only, based on a typical ARM Cortex A53 processor. Exact numbers will vary on other processors. FAQ See the Watermarking section of the FAQ.","title":"Overview"},{"location":"features/wakeword/media-wakes/watermarking/overview.html#watermarking-overview","text":"Similar to a visual watermark , an audio watermark is a short bit of audio embedded in a larger audio file that can be used to identify the host audio. This watermark is encoded into the host audio, and at a later time, the host audio can be decoded to determine if there is a watermark present in the audio or not. As part of the the on-device wake word engine, the Amazon Wake Word Engine is capable of detecting these watermarks. If a watermark is present when the engine hears the word \"Alexa\", we suppress the wake word. Our audio watermarks are designed to be inaudible, while also remaining robust to different environments. This means our watermark is still detectable even when watermarked advertisements are played by a variety of different televisions, home living room acoustic environments, and noisy backgrounds. More information can be found in this Alexa Science blog post . Audio watermarks (red squiggles) are embedded imperceptibly in a media signal (black). Each watermark consists of a repeating sequence of audio building blocks (colored shapes). A detector segments the watermark and aligns the segments to see if they match. Randomly inverting the building blocks prevents rhythmic patterns in the media signal from triggering the detector; the detector uses a binary key to restore the inverted blocks.","title":"Watermarking Overview"},{"location":"features/wakeword/media-wakes/watermarking/overview.html#dependencies","text":"Device Artifact Vending Service (DAVS) V2 API ( PRL2000 or above) Note DAVS is a prerequisite for watermarking, as new watermarks must have a mechanism of being deployed rapidly to Alexa devices in the field.","title":"Dependencies"},{"location":"features/wakeword/media-wakes/watermarking/overview.html#resource-requirements","text":"Watermarking resource requirement estimates for armv7a: Memory (KB) CPU (MIPS) 70 32 Note These are estimates only, based on a typical ARM Cortex A53 processor. Exact numbers will vary on other processors.","title":"Resource Requirements"},{"location":"features/wakeword/media-wakes/watermarking/overview.html#faq","text":"See the Watermarking section of the FAQ.","title":"FAQ"},{"location":"features/wakeword/preroll/preroll-integration-guide.html","text":"Wake Word Pre-Roll Integration Guide Background For the definition and requirements for pre-roll for AVS devices, see the Overview . Implementation The basic idea is to identify the pre-roll start index as being 500 milliseconds ahead of the wake word start index, as illustrated in the AVS Streaming Requirements Shared Memory Ring Buffer recommendation. We also have a detailed reference available to guide general implementation and integration, in the AVS Device SDK . Before we continue with highlighting details specific to how the AVS Device SDK prepends pre-roll to wake word audio for AVS Recognize events, we reiterate that while an audio ring buffer implementation is recommended for maintaining history of audio associated with the PryonLite wake word engine, integrators are free to meet the pre-roll requirements using alternative means. Basic AVS Device SDK Architecture There is an AudioInputProcessor that implements the AVS SpeechRecognizer interface. The SDK relies on an instance of a Shared Data Stream to collect and share audio data associated with (among other things) wake word detection. The SDK's Keyword Detector emits detection events identifying the wake word segment within the shared data stream. The SDK's AudioInputProcessor Capability Agent then rolls the start reference back by 500 milliseconds, to identify where the pre-roll segment starts, for the purpose of composing the AVS Recognize event. The specific code where this pre-roll is managed is the executeRecognize method, where we see the following : begin -= preroll ; where preroll is defined as // 500ms preroll. avsCommon :: avs :: AudioInputStream :: Index preroll = provider . format . sampleRateHz / 2 ; Note that the 500 millisecond pre-roll duration is equivalent to 8000 samples at the 16 kHz sampling rate used for wake word detection. In addition, a 500 millisecond pre-roll definition /// Preroll duration is a fixed 500ms. static const std :: chrono :: milliseconds PREROLL_DURATION = std :: chrono :: milliseconds ( 500 ); is also used to adjust timestamps in related parts of the audio input processor: startOfStreamTimestamp -= PREROLL_DURATION ; For further details on the AVS C++ Device SDK, and integration of the PryonLite wake word engine as a KeywordDetectorProvider adapter, see the AVS Device SDK section. Collecting audio from an empty state. Q: What about the initial startup period where audio history is being collected? A: If (a) - the audio history collection buffer and wake word engine are streamed audio starting from the same time index, and (b) a wake word exists very early on in that audio stream, the PryonLite wake word detector may emit a detection event, such that the start index of the detected wake word lies within the first 500 milliseconds of audio streamed to the engine (which is the same audio collected in the audio history buffer). This leaves us with a possibility where we have collected insufficient pre-roll. This is illustrated below. Case where wake word is late enough into audio stream for sufficient pre-roll to be available: Case wake word is encountered early enough into audio stream such that pre-roll collected is insufficient . Your integration should be designed to minimize or eliminate the insufficient pre-roll condition, by avoiding unnecessary tear-downs and restarts of the audio history collection mechanism that would flush collection history, as well as unnecessary tear-downs and restarts of the wake word engine. With a wake word detector instance supporting an always-listening use case, the frequency with which these 'possible that insufficient pre-roll has been collected' periods should be confined to a one-time occurrence at the beginning of the listening period. One must keep in mind that each time streaming to the wake word engine is interrupted, this can lead to a service outage that impairs the \"always listening\" nature of the AVS product. If you absolutely cannot avoid the 'insufficient preroll' condition, prepending zero samples to make up for any pre-roll history shortfall is not advised . If a wake word engine tear-down and restart is absolutely necessary, one can still ensure that the necessary pre-roll is still available, by avoiding a teardown of the associated audio history collector: We can see in the middle of the diagram above, that if the audio history collector retains its state - even if a wake word is detected very early on into the audio streaming of a restarted wake word detector, there is sufficient pre-roll history available. Note that we can see above in the left of the diagram, that we still have the period of time from when we start filling our audio history collection buffer, that there may be insufficient history to meet pre-roll requirements. In that case, one can offset the streaming of audio to the wake word detector, such that the first 500 milliseconds of audio in the audio ring buffer are not transferred as part of the audio stream to the wake word detector: We can see that in the diagram above, even if the wake word engine detects a wake word at the very beginning of the audio stream it receives, the surrounding history collection framework is guaranteed to be able to meet the AVS system pre-roll requirements because it retains audio state older than that streamed to the wake word detector. Note that while this delays the time from which wake words can be detected by 500 milliseconds, it does not introduce any latency to wake word detection itself, nor does it delay the immediate transfer of new audio samples from the source microphone to the wake word detector. Be careful not to implement any mechanisms that would unnecessarily introduce any such buffering latency in the microphone-to-wake-word-detector path, just to meet pre-roll requirements. For most products, a delay of about 500 milliseconds to fully engage to \"always listening\" wake word detection mode should be acceptable. Also note that if subsequent teardowns and restarts do not flush any collected audio history, this 500 millisecond delay is unnecessary, because the audio history collection buffer is not starting from a completely empty state.","title":"Integration Guide"},{"location":"features/wakeword/preroll/preroll-integration-guide.html#wake-word-pre-roll-integration-guide","text":"","title":"Wake Word Pre-Roll Integration Guide"},{"location":"features/wakeword/preroll/preroll-integration-guide.html#background","text":"For the definition and requirements for pre-roll for AVS devices, see the Overview .","title":"Background"},{"location":"features/wakeword/preroll/preroll-integration-guide.html#implementation","text":"The basic idea is to identify the pre-roll start index as being 500 milliseconds ahead of the wake word start index, as illustrated in the AVS Streaming Requirements Shared Memory Ring Buffer recommendation. We also have a detailed reference available to guide general implementation and integration, in the AVS Device SDK . Before we continue with highlighting details specific to how the AVS Device SDK prepends pre-roll to wake word audio for AVS Recognize events, we reiterate that while an audio ring buffer implementation is recommended for maintaining history of audio associated with the PryonLite wake word engine, integrators are free to meet the pre-roll requirements using alternative means.","title":"Implementation"},{"location":"features/wakeword/preroll/preroll-integration-guide.html#basic-avs-device-sdk-architecture","text":"There is an AudioInputProcessor that implements the AVS SpeechRecognizer interface. The SDK relies on an instance of a Shared Data Stream to collect and share audio data associated with (among other things) wake word detection. The SDK's Keyword Detector emits detection events identifying the wake word segment within the shared data stream. The SDK's AudioInputProcessor Capability Agent then rolls the start reference back by 500 milliseconds, to identify where the pre-roll segment starts, for the purpose of composing the AVS Recognize event. The specific code where this pre-roll is managed is the executeRecognize method, where we see the following : begin -= preroll ; where preroll is defined as // 500ms preroll. avsCommon :: avs :: AudioInputStream :: Index preroll = provider . format . sampleRateHz / 2 ; Note that the 500 millisecond pre-roll duration is equivalent to 8000 samples at the 16 kHz sampling rate used for wake word detection. In addition, a 500 millisecond pre-roll definition /// Preroll duration is a fixed 500ms. static const std :: chrono :: milliseconds PREROLL_DURATION = std :: chrono :: milliseconds ( 500 ); is also used to adjust timestamps in related parts of the audio input processor: startOfStreamTimestamp -= PREROLL_DURATION ; For further details on the AVS C++ Device SDK, and integration of the PryonLite wake word engine as a KeywordDetectorProvider adapter, see the AVS Device SDK section.","title":"Basic AVS Device SDK Architecture"},{"location":"features/wakeword/preroll/preroll-integration-guide.html#collecting-audio-from-an-empty-state","text":"Q: What about the initial startup period where audio history is being collected? A: If (a) - the audio history collection buffer and wake word engine are streamed audio starting from the same time index, and (b) a wake word exists very early on in that audio stream, the PryonLite wake word detector may emit a detection event, such that the start index of the detected wake word lies within the first 500 milliseconds of audio streamed to the engine (which is the same audio collected in the audio history buffer). This leaves us with a possibility where we have collected insufficient pre-roll. This is illustrated below. Case where wake word is late enough into audio stream for sufficient pre-roll to be available: Case wake word is encountered early enough into audio stream such that pre-roll collected is insufficient . Your integration should be designed to minimize or eliminate the insufficient pre-roll condition, by avoiding unnecessary tear-downs and restarts of the audio history collection mechanism that would flush collection history, as well as unnecessary tear-downs and restarts of the wake word engine. With a wake word detector instance supporting an always-listening use case, the frequency with which these 'possible that insufficient pre-roll has been collected' periods should be confined to a one-time occurrence at the beginning of the listening period. One must keep in mind that each time streaming to the wake word engine is interrupted, this can lead to a service outage that impairs the \"always listening\" nature of the AVS product. If you absolutely cannot avoid the 'insufficient preroll' condition, prepending zero samples to make up for any pre-roll history shortfall is not advised . If a wake word engine tear-down and restart is absolutely necessary, one can still ensure that the necessary pre-roll is still available, by avoiding a teardown of the associated audio history collector: We can see in the middle of the diagram above, that if the audio history collector retains its state - even if a wake word is detected very early on into the audio streaming of a restarted wake word detector, there is sufficient pre-roll history available. Note that we can see above in the left of the diagram, that we still have the period of time from when we start filling our audio history collection buffer, that there may be insufficient history to meet pre-roll requirements. In that case, one can offset the streaming of audio to the wake word detector, such that the first 500 milliseconds of audio in the audio ring buffer are not transferred as part of the audio stream to the wake word detector: We can see that in the diagram above, even if the wake word engine detects a wake word at the very beginning of the audio stream it receives, the surrounding history collection framework is guaranteed to be able to meet the AVS system pre-roll requirements because it retains audio state older than that streamed to the wake word detector. Note that while this delays the time from which wake words can be detected by 500 milliseconds, it does not introduce any latency to wake word detection itself, nor does it delay the immediate transfer of new audio samples from the source microphone to the wake word detector. Be careful not to implement any mechanisms that would unnecessarily introduce any such buffering latency in the microphone-to-wake-word-detector path, just to meet pre-roll requirements. For most products, a delay of about 500 milliseconds to fully engage to \"always listening\" wake word detection mode should be acceptable. Also note that if subsequent teardowns and restarts do not flush any collected audio history, this 500 millisecond delay is unnecessary, because the audio history collection buffer is not starting from a completely empty state.","title":"Collecting audio from an empty state."},{"location":"features/wakeword/preroll/preroll-overview.html","text":"Wake Word Pre-Roll Overview Definition The following diagram illustrates the structure of a typical Alexa utterance, and defines the term \"wake word pre-roll\". The pre-roll audio segment is the first segment of a wake word-initiated audio utterance, preceding the wake word audio segment. AVS Pre-Roll Streaming Requirements AVS Streaming Requirements mandate that each AVS SpeechRecognizer request MUST provide a 500 millisecond span of pre-roll audio ahead of the wake word audio segment. Additional context is provided in the linked requirements document regarding the importance of pre-roll for optimal AVS performance, taking cloud-based wake word verification as an example. Please read the AVS streaming requirements document linked above in full, and also note that although the document is titled \"Requirements for Cloud-Based Wake Word Verification\", pre-roll audio has an importance over and above just cloud-based wake word verification. Many downstream audio algorithms require or benefit from the additional pre-wake-word acoustic context that pre-roll provides, including media-induced wake suppression and automatic speech recognition. Failure to send sufficient pre-roll will degrade the AVS response to your device's requests. More Information See the Pre-Roll Integration section in this guide for more details.","title":"Overview"},{"location":"features/wakeword/preroll/preroll-overview.html#wake-word-pre-roll-overview","text":"","title":"Wake Word Pre-Roll Overview"},{"location":"features/wakeword/preroll/preroll-overview.html#definition","text":"The following diagram illustrates the structure of a typical Alexa utterance, and defines the term \"wake word pre-roll\". The pre-roll audio segment is the first segment of a wake word-initiated audio utterance, preceding the wake word audio segment.","title":"Definition"},{"location":"features/wakeword/preroll/preroll-overview.html#avs-pre-roll-streaming-requirements","text":"AVS Streaming Requirements mandate that each AVS SpeechRecognizer request MUST provide a 500 millisecond span of pre-roll audio ahead of the wake word audio segment. Additional context is provided in the linked requirements document regarding the importance of pre-roll for optimal AVS performance, taking cloud-based wake word verification as an example. Please read the AVS streaming requirements document linked above in full, and also note that although the document is titled \"Requirements for Cloud-Based Wake Word Verification\", pre-roll audio has an importance over and above just cloud-based wake word verification. Many downstream audio algorithms require or benefit from the additional pre-wake-word acoustic context that pre-roll provides, including media-induced wake suppression and automatic speech recognition. Failure to send sufficient pre-roll will degrade the AVS response to your device's requests.","title":"AVS Pre-Roll Streaming Requirements"},{"location":"features/wakeword/preroll/preroll-overview.html#more-information","text":"See the Pre-Roll Integration section in this guide for more details.","title":"More Information"},{"location":"features/wakeword/self-wake/self-wake-overview.html","text":"Self-Wake Mitigation Overview Self-wakes are a special type of \"media induced wake\" where the audio containing the Wake Word comes from the device itself. That is, an AVS device's own audio playback through its loudspeaker is picked up by its microphone, and ultimately leads to the device waking up from it's own audio output. Mitigating Self-Wakes Acoustic Echo Cancellation (AEC) This is the standard method of preventing self-wakes. Since the device itself has access to the audio being played back, if an Acoustic Echo Canceller is enabled and functioning properly, it should eliminate enough of the playback signal from the microphone input to prevent self-wakes. Tip Most modern multimedia products have an echo canceller built-in at the audio codec level. For example, Android devices have an Acoustic Echo Canceller API that can be used out of the box. Similarly, iOS also has a system level AEC that's built in. Benefits: Addresses the root cause of self-wake - acoustic echo feeding back from device loudspeakers to device microphones is removed through digital signal processing techniques. Does not care if the wake word in playback media was real speech or TTS-synthesized, or whether the media was fingerprinted or watermarked. Drawbacks: Can be power and resource intensive, especially for always-on applications. One may need to enable/disable an AEC based on the necessity - only required when loudspeaker playback is active. Playback-Path Wake Word Detection Run a secondary wake word detector on the playback audio audio before it's sent to the loudspeakers. One can time-align loudspeaker-path detection events with those from the 'primary' microphone-path wake word detector, and ignore the microphone-path events closely aligned with loudspeaker-path events. Benefits: Analyzes a playback-path signal in real-time with generally less cost than an AEC, and can be used in a system where an AEC is not available. Drawbacks: A somewhat heavy-handed approach that should have no effect when an AEC is working properly, but may also block a real near-end talker from invoking a wake word request if a wake word is being played back through the loudspeaker at the same time. Requires tightly synchronized communications between the playback path and microphone-capture-path audio processing blocks, and may be of sufficiently high integration complexity to be easily integrated.","title":"Self-Wake Suppression"},{"location":"features/wakeword/self-wake/self-wake-overview.html#self-wake-mitigation-overview","text":"Self-wakes are a special type of \"media induced wake\" where the audio containing the Wake Word comes from the device itself. That is, an AVS device's own audio playback through its loudspeaker is picked up by its microphone, and ultimately leads to the device waking up from it's own audio output.","title":"Self-Wake Mitigation Overview"},{"location":"features/wakeword/self-wake/self-wake-overview.html#mitigating-self-wakes","text":"","title":"Mitigating Self-Wakes"},{"location":"features/wakeword/self-wake/self-wake-overview.html#acoustic-echo-cancellation-aec","text":"This is the standard method of preventing self-wakes. Since the device itself has access to the audio being played back, if an Acoustic Echo Canceller is enabled and functioning properly, it should eliminate enough of the playback signal from the microphone input to prevent self-wakes. Tip Most modern multimedia products have an echo canceller built-in at the audio codec level. For example, Android devices have an Acoustic Echo Canceller API that can be used out of the box. Similarly, iOS also has a system level AEC that's built in. Benefits: Addresses the root cause of self-wake - acoustic echo feeding back from device loudspeakers to device microphones is removed through digital signal processing techniques. Does not care if the wake word in playback media was real speech or TTS-synthesized, or whether the media was fingerprinted or watermarked. Drawbacks: Can be power and resource intensive, especially for always-on applications. One may need to enable/disable an AEC based on the necessity - only required when loudspeaker playback is active.","title":"Acoustic Echo Cancellation (AEC)"},{"location":"features/wakeword/self-wake/self-wake-overview.html#playback-path-wake-word-detection","text":"Run a secondary wake word detector on the playback audio audio before it's sent to the loudspeakers. One can time-align loudspeaker-path detection events with those from the 'primary' microphone-path wake word detector, and ignore the microphone-path events closely aligned with loudspeaker-path events. Benefits: Analyzes a playback-path signal in real-time with generally less cost than an AEC, and can be used in a system where an AEC is not available. Drawbacks: A somewhat heavy-handed approach that should have no effect when an AEC is working properly, but may also block a real near-end talker from invoking a wake word request if a wake word is being played back through the loudspeaker at the same time. Requires tightly synchronized communications between the playback path and microphone-capture-path audio processing blocks, and may be of sufficiently high integration complexity to be easily integrated.","title":"Playback-Path Wake Word Detection"},{"location":"features/wwdi/integration-guide.html","text":"WWDI: Wake Word Diagnostic Information Integration Guide This section provides guidelines on incorporating Amazon WWDI into SpeechRecognizer Recognize Events WWDI is opaquely emitted during a wake word detection event from the PryonLite library as a byte array in PryonLiteWakewordResult. typedef struct PryonLiteWakewordResult { ... PryonLiteMetadataBlob metadataBlob ; ... }; The application is responsible for prepending this WWDI byte array to the SpeechRecognizer Recognize event. Since the integration of WWDI is very application-specific and depends on the network transport code, we illustrate this integration using the AVS Device SDK here instead of providing generic instructions. WWDI should be attached to SpeechRecognizer Recognize events as an HTTP/2 encoded multipart message part with name \" wakewordEngineMetadata \". With WWDI incorporated into the Recognize event, the parts are ordered in the HTTP/2 payload as follows: Multipart Message Part 1, name = \"metadata\" Multipart Message Part 2, name = \" wakewordEngineMetadata \" Multipart Message Part 3, name = \"audio\" The following illustrates the placement of the wake word diagnostic information: WWDI integration in AVS Device SDK Amazon WWE functionality is integrated into the AVS Device SDK via the AbstractKeywordDetector object. A class called KeywordDetectorProvider has been implemented specifically for PryonLite by the Amazon Lite adapter . This KeywordDetectorProvider implementation may be used as a reference to identify how various Amazon WWE interfaces should be connected to the AbstractKeywordDetector interfaces. Those partners that use the PryonLite KeywordDetectorProvider should have little or no need to make any modifications. In the AVS Device SDK, WWDI is referred to as KWDMetadata (Keyword Detector metadata). The Amazon Lite adapter's KeywordDetectorProvider implementation emits WWDI to the AIP layer (Audio Input Processor) of the AVS Device SDK using the KeyWordObserverInterface . Interface returning WWDI in KeyWordObserverInterface.h virtual void onKeyWordDetected ( std :: shared_ptr < avs :: AudioInputStream > stream , std :: string keyword , avs :: AudioInputStream :: Index beginIndex = UNSPECIFIED_INDEX , avs :: AudioInputStream :: Index endIndex = UNSPECIFIED_INDEX , std :: shared_ptr < const std :: vector < char >> KWDMetadata = nullptr ) = 0 ; }; The AIP layer is where the Recognize message is composed. During the Recognize multi-part MIME message composition, a 'wakewordEngineMetadata' message is inserted between the 'metadata' and 'audio' messages, as shown here : Composition of Recognize Event in AudioInputProcessor.cpp m_recognizeRequest = std :: make_shared < avsCommon :: avs :: MessageRequest > ( msgIdAndJsonEvent . second ); if ( m_KWDMetadataReader ) { m_recognizeRequest -> addAttachmentReader ( KWD_METADATA_FIELD_NAME , m_KWDMetadataReader ); } m_recognizeRequest -> addAttachmentReader ( AUDIO_ATTACHMENT_FIELD_NAME , m_reader ); Definitions of Wake word audio and WWDI MIME attachment names /// The field name for the user voice attachment. static const std :: string AUDIO_ATTACHMENT_FIELD_NAME = \"audio\" ; /// The field name for the wake word engine metadata. static const std :: string KWD_METADATA_FIELD_NAME = \"wakewordEngineMetadata\" ;","title":"Integration Guide"},{"location":"features/wwdi/integration-guide.html#wwdi-wake-word-diagnostic-information","text":"","title":"WWDI: Wake Word Diagnostic Information"},{"location":"features/wwdi/integration-guide.html#integration-guide","text":"This section provides guidelines on incorporating Amazon WWDI into SpeechRecognizer Recognize Events WWDI is opaquely emitted during a wake word detection event from the PryonLite library as a byte array in PryonLiteWakewordResult. typedef struct PryonLiteWakewordResult { ... PryonLiteMetadataBlob metadataBlob ; ... }; The application is responsible for prepending this WWDI byte array to the SpeechRecognizer Recognize event. Since the integration of WWDI is very application-specific and depends on the network transport code, we illustrate this integration using the AVS Device SDK here instead of providing generic instructions. WWDI should be attached to SpeechRecognizer Recognize events as an HTTP/2 encoded multipart message part with name \" wakewordEngineMetadata \". With WWDI incorporated into the Recognize event, the parts are ordered in the HTTP/2 payload as follows: Multipart Message Part 1, name = \"metadata\" Multipart Message Part 2, name = \" wakewordEngineMetadata \" Multipart Message Part 3, name = \"audio\" The following illustrates the placement of the wake word diagnostic information:","title":"Integration Guide"},{"location":"features/wwdi/integration-guide.html#wwdi-integration-in-avs-device-sdk","text":"Amazon WWE functionality is integrated into the AVS Device SDK via the AbstractKeywordDetector object. A class called KeywordDetectorProvider has been implemented specifically for PryonLite by the Amazon Lite adapter . This KeywordDetectorProvider implementation may be used as a reference to identify how various Amazon WWE interfaces should be connected to the AbstractKeywordDetector interfaces. Those partners that use the PryonLite KeywordDetectorProvider should have little or no need to make any modifications. In the AVS Device SDK, WWDI is referred to as KWDMetadata (Keyword Detector metadata). The Amazon Lite adapter's KeywordDetectorProvider implementation emits WWDI to the AIP layer (Audio Input Processor) of the AVS Device SDK using the KeyWordObserverInterface . Interface returning WWDI in KeyWordObserverInterface.h virtual void onKeyWordDetected ( std :: shared_ptr < avs :: AudioInputStream > stream , std :: string keyword , avs :: AudioInputStream :: Index beginIndex = UNSPECIFIED_INDEX , avs :: AudioInputStream :: Index endIndex = UNSPECIFIED_INDEX , std :: shared_ptr < const std :: vector < char >> KWDMetadata = nullptr ) = 0 ; }; The AIP layer is where the Recognize message is composed. During the Recognize multi-part MIME message composition, a 'wakewordEngineMetadata' message is inserted between the 'metadata' and 'audio' messages, as shown here : Composition of Recognize Event in AudioInputProcessor.cpp m_recognizeRequest = std :: make_shared < avsCommon :: avs :: MessageRequest > ( msgIdAndJsonEvent . second ); if ( m_KWDMetadataReader ) { m_recognizeRequest -> addAttachmentReader ( KWD_METADATA_FIELD_NAME , m_KWDMetadataReader ); } m_recognizeRequest -> addAttachmentReader ( AUDIO_ATTACHMENT_FIELD_NAME , m_reader ); Definitions of Wake word audio and WWDI MIME attachment names /// The field name for the user voice attachment. static const std :: string AUDIO_ATTACHMENT_FIELD_NAME = \"audio\" ; /// The field name for the wake word engine metadata. static const std :: string KWD_METADATA_FIELD_NAME = \"wakewordEngineMetadata\" ;","title":"WWDI integration in AVS Device SDK"},{"location":"features/wwdi/overview.html","text":"WWDI: Wake Word Diagnostic Information Notice As of August 2020, implementing WWDI is a requirement to pass AVS Certification Wake word diagnostic information provides Alexa Voice Service a way to improve customer experience by attaching certain diagnostic information with detection events emitted by Amazon wake word engine (also known as Keyword Detectors/KWD in the AVS Device SDK ). The diagnostic information is used for understanding wake word engine performance and assess health of devices in the field. It is minimal in size (on the order of 2KB) to ensure there is no effect on user perceived latency. This feature also improves the ability for AVS endpoints to detect and reject false wakes. It does NOT contain audio that would breach privacy, nor does it contain customer specific information. The diagnostic information is prepended to SpeechRecognizer Recognize events as an HTTP/2 encoded multipart message. Wake word diagnostic metadata contains information such as the following: WWDI version Detected wake word name Engine Version Model Version Uptime of the wake word engine Wake word detection score Wake word detection thresholds In future, Amazon may add fields as new diagnostic information becomes available with new features and engine versions. Wake word diagnostic information is emitted only when wake word event is detected. The diagnostic information is pre-pended to the audio stream for SpeechRecognizer Recognize events. The WWDI is in the range of 2KB in size. There is a hard limit of 4KB before the WWDI is rejected by the Alexa Service. For details on integration WWDI into your application, see the Integration Guide . Dependencies None. WWDI is output by all versions of the Wake Word Engine. Resource Requirements None FAQ See the WWDI section of the FAQ.","title":"Overview"},{"location":"features/wwdi/overview.html#wwdi-wake-word-diagnostic-information","text":"Notice As of August 2020, implementing WWDI is a requirement to pass AVS Certification Wake word diagnostic information provides Alexa Voice Service a way to improve customer experience by attaching certain diagnostic information with detection events emitted by Amazon wake word engine (also known as Keyword Detectors/KWD in the AVS Device SDK ). The diagnostic information is used for understanding wake word engine performance and assess health of devices in the field. It is minimal in size (on the order of 2KB) to ensure there is no effect on user perceived latency. This feature also improves the ability for AVS endpoints to detect and reject false wakes. It does NOT contain audio that would breach privacy, nor does it contain customer specific information. The diagnostic information is prepended to SpeechRecognizer Recognize events as an HTTP/2 encoded multipart message. Wake word diagnostic metadata contains information such as the following: WWDI version Detected wake word name Engine Version Model Version Uptime of the wake word engine Wake word detection score Wake word detection thresholds In future, Amazon may add fields as new diagnostic information becomes available with new features and engine versions. Wake word diagnostic information is emitted only when wake word event is detected. The diagnostic information is pre-pended to the audio stream for SpeechRecognizer Recognize events. The WWDI is in the range of 2KB in size. There is a hard limit of 4KB before the WWDI is rejected by the Alexa Service. For details on integration WWDI into your application, see the Integration Guide .","title":"WWDI: Wake Word Diagnostic Information"},{"location":"features/wwdi/overview.html#dependencies","text":"None. WWDI is output by all versions of the Wake Word Engine.","title":"Dependencies"},{"location":"features/wwdi/overview.html#resource-requirements","text":"None","title":"Resource Requirements"},{"location":"features/wwdi/overview.html#faq","text":"See the WWDI section of the FAQ.","title":"FAQ"},{"location":"getting-started/filesim.html","text":"File Simulators File simulators are offline .wav file processor utilities to aid in evaluation of the Wake Word engine on pre-recoded audio files. It takes WAV files containing in a mono 16-bit 16 kHz audio format, a wake word model, and outputs detection events to stdout. The file simulator application is provided only for those architectures where file I/O and stdio C run-time library functionality is readily available (i.e. x86, darwin, linux, etc.). How to run the application For an exhaustive list of options, run the application with -h . Locate the executable in the package If supplied as part of the PryonLite package, an amazon_ww_filesim application binary will be present in a given <target> subfolder. For example, an Ubuntu x86 version of the filesim application would be found in the x86 folder. File simulators for V2 API are located in subfolders of the architecture folder. The following table shows locations of file simulator applications built for various API versions: API / Version File Simulator Location v2 PRL1000 ./<architecture>/PRL1000/amazon_filesim-PRL1000 v2 PRL2000 ./<architecture>/PRL2000/amazon_filesim-PRL2000 Note Some target subfolders will not contain an amazon filesim application binary. This is due to lack of the target's file i/o capabilities. Prepare Input Audio List If you have a directory containing the .wav files to process, run the following command to generate a list file with all the .wav files in that directory: ls -1 *.wav > all_wavs.list The wake word engine uses adaptation logic to learn the characteristics of the acoustic environment. This improves detection quality. However if the audio clips are from different environments and do not contain at least 1 second of background noise prior to the wake word, we suggest using -c which will force the engine to clear the adaptation statistics after each file. Tip You can also run a single .wav file without a list. Just use the .wav file path instead of a list file path on the command line Sample input A sample WAV list file and associated .WAV files can be found in the following package locations: ./sample-wakeword/alexas.list ./sample-wakeword/alexa-*.wav The list file will look like this: alexa-01.wav alexa-02.wav alexa-03.wav alexa-04.wav alexa-05.wav alexa-06.wav alexa-07.wav alexa-08.wav alexa-09.wav alexa-10.wav Select a wake word model The wake word model must be selected based on the paths specified by a particular architecture's WakewordModelMapping.json configuration file. For example, for the x86 (Ubuntu) architecture, the WakewordModelMapping.json file specifies that the following model is suitable for the x86 Ubuntu PryonLite engine: common/D.en-US.alexa The appropriate wake word model is then found in the 'models' subfolder, with a .bin extension. For example: path_to_model = models/common/D.en-US.alexa.bin For detailed information on model selection, see this section Run the application The application takes the wake word model and an audio list file in as command line parameters: ./<filesim application name> -m <path_to_model> all_wavs.list To display a full list of options, run the filesim application with the '-h' option. Reference file simulation output Reference output from file simulation applications is provided in the package. In the example below, the reference file name is specifically for an x86 Ubuntu architecture using a D.en-US.alexa wake word model. Your package may contain a sample output for a different architecture or model, depending on your package request. ./sample-wakeword/ref-output-x86-D.en-US.alexa-alexas.list The application output will look similar to the following: Loaded model from path: models/common/D.en-US.alexa.bin Decoder instance memory allocated: 266344 Model Version: en-US_D_ALEXA+STOP_2018SuperBowl_v5.0 Engine Version: 2.9.0 Supported keywords: ALEXA, STOP Threshold: 500 alexa-01: 'ALEXA' detected during [0, 9600] alexa-02: 'ALEXA' detected during [8640, 17920] alexa-03: 'ALEXA' detected during [3680, 13920] alexa-04: 'ALEXA' detected during [4480, 14720] alexa-05: 'ALEXA' detected during [3040, 13280] alexa-06: 'ALEXA' detected during [5440, 15680] alexa-07: 'ALEXA' detected during [8800, 20480] alexa-08: 'ALEXA' detected during [7840, 19520] alexa-09: 'ALEXA' detected during [12000, 24160] alexa-10: 'ALEXA' detected during [30400, 43040] *** 10 wake word(s) detected in 10 files *** Fingerprinting File Simulator See the Fingerprinting Integration Guide for instructions on how to pass a fingerprint list to the file simulator.","title":"File Simulators"},{"location":"getting-started/filesim.html#file-simulators","text":"File simulators are offline .wav file processor utilities to aid in evaluation of the Wake Word engine on pre-recoded audio files. It takes WAV files containing in a mono 16-bit 16 kHz audio format, a wake word model, and outputs detection events to stdout. The file simulator application is provided only for those architectures where file I/O and stdio C run-time library functionality is readily available (i.e. x86, darwin, linux, etc.).","title":"File Simulators"},{"location":"getting-started/filesim.html#how-to-run-the-application","text":"For an exhaustive list of options, run the application with -h .","title":"How to run the application"},{"location":"getting-started/filesim.html#locate-the-executable-in-the-package","text":"If supplied as part of the PryonLite package, an amazon_ww_filesim application binary will be present in a given <target> subfolder. For example, an Ubuntu x86 version of the filesim application would be found in the x86 folder. File simulators for V2 API are located in subfolders of the architecture folder. The following table shows locations of file simulator applications built for various API versions: API / Version File Simulator Location v2 PRL1000 ./<architecture>/PRL1000/amazon_filesim-PRL1000 v2 PRL2000 ./<architecture>/PRL2000/amazon_filesim-PRL2000 Note Some target subfolders will not contain an amazon filesim application binary. This is due to lack of the target's file i/o capabilities.","title":"Locate the executable in the package"},{"location":"getting-started/filesim.html#prepare-input-audio-list","text":"If you have a directory containing the .wav files to process, run the following command to generate a list file with all the .wav files in that directory: ls -1 *.wav > all_wavs.list The wake word engine uses adaptation logic to learn the characteristics of the acoustic environment. This improves detection quality. However if the audio clips are from different environments and do not contain at least 1 second of background noise prior to the wake word, we suggest using -c which will force the engine to clear the adaptation statistics after each file. Tip You can also run a single .wav file without a list. Just use the .wav file path instead of a list file path on the command line","title":"Prepare Input Audio List"},{"location":"getting-started/filesim.html#select-a-wake-word-model","text":"The wake word model must be selected based on the paths specified by a particular architecture's WakewordModelMapping.json configuration file. For example, for the x86 (Ubuntu) architecture, the WakewordModelMapping.json file specifies that the following model is suitable for the x86 Ubuntu PryonLite engine: common/D.en-US.alexa The appropriate wake word model is then found in the 'models' subfolder, with a .bin extension. For example: path_to_model = models/common/D.en-US.alexa.bin For detailed information on model selection, see this section","title":"Select a wake word model"},{"location":"getting-started/filesim.html#run-the-application","text":"The application takes the wake word model and an audio list file in as command line parameters: ./<filesim application name> -m <path_to_model> all_wavs.list To display a full list of options, run the filesim application with the '-h' option.","title":"Run the application"},{"location":"getting-started/filesim.html#reference-file-simulation-output","text":"Reference output from file simulation applications is provided in the package. In the example below, the reference file name is specifically for an x86 Ubuntu architecture using a D.en-US.alexa wake word model. Your package may contain a sample output for a different architecture or model, depending on your package request. ./sample-wakeword/ref-output-x86-D.en-US.alexa-alexas.list The application output will look similar to the following: Loaded model from path: models/common/D.en-US.alexa.bin Decoder instance memory allocated: 266344 Model Version: en-US_D_ALEXA+STOP_2018SuperBowl_v5.0 Engine Version: 2.9.0 Supported keywords: ALEXA, STOP Threshold: 500 alexa-01: 'ALEXA' detected during [0, 9600] alexa-02: 'ALEXA' detected during [8640, 17920] alexa-03: 'ALEXA' detected during [3680, 13920] alexa-04: 'ALEXA' detected during [4480, 14720] alexa-05: 'ALEXA' detected during [3040, 13280] alexa-06: 'ALEXA' detected during [5440, 15680] alexa-07: 'ALEXA' detected during [8800, 20480] alexa-08: 'ALEXA' detected during [7840, 19520] alexa-09: 'ALEXA' detected during [12000, 24160] alexa-10: 'ALEXA' detected during [30400, 43040] *** 10 wake word(s) detected in 10 files ***","title":"Reference file simulation output"},{"location":"getting-started/filesim.html#fingerprinting-file-simulator","text":"See the Fingerprinting Integration Guide for instructions on how to pass a fingerprint list to the file simulator.","title":"Fingerprinting File Simulator"},{"location":"getting-started/model-selection.html","text":"Model Selection Overview To determine which model should be used for a given locale and architecture, refer the `localesToModels.json' file in the corresponding architecture folder (alongside the Engine ). This file is a mapping of locales/wakeword combination to model files included in the release. Important There may not be a direct 1:1 mapping between locale and model name. This is because a) some models are multi-locale and b) some models are used for other locales as \"proxy\" or substitute models where a locale-specific model is not required. How To Use The Mapping File Find the WakewordModelMapping.json file in the architecture directory being integrated In the json file find the section for the wake word being used In the wake word section find the target locale Select the first model from the list which fits the architecture budget File Format The json file is a architecture-specific mapping of a requested wake word and locale to the model that should be used in this scenario. The snippet below shows the schema for the WakewordModelMapping.json file. { \"Wake Word\" : { \"Locale\" : [ \"path/to/modelA\" , \"path/to/modelB\" ] } } In most cases, there will be one entry. In some cases (releases which have multiple target architectures with different resources), there may be multiple entries. In this case, choose the 1st entry ( modelA in the example above) for the highest CPU/Memory architecture and the remaining entries ( modelB in the example above) for lower power processors. What Is A Locale? A locale defines a language and country combination using ISO 639-1 for country. Locales are typically stylized as language-COUNTRY. The table below shows a couple example locales along with their meanings. Locale Meaning en-US English spoken in the United States fr-FR French spoken in France fr-CA French spoken in Canada","title":"Model Selection"},{"location":"getting-started/model-selection.html#model-selection","text":"","title":"Model Selection"},{"location":"getting-started/model-selection.html#overview","text":"To determine which model should be used for a given locale and architecture, refer the `localesToModels.json' file in the corresponding architecture folder (alongside the Engine ). This file is a mapping of locales/wakeword combination to model files included in the release. Important There may not be a direct 1:1 mapping between locale and model name. This is because a) some models are multi-locale and b) some models are used for other locales as \"proxy\" or substitute models where a locale-specific model is not required.","title":"Overview"},{"location":"getting-started/model-selection.html#how-to-use-the-mapping-file","text":"Find the WakewordModelMapping.json file in the architecture directory being integrated In the json file find the section for the wake word being used In the wake word section find the target locale Select the first model from the list which fits the architecture budget","title":"How To Use The Mapping File"},{"location":"getting-started/model-selection.html#file-format","text":"The json file is a architecture-specific mapping of a requested wake word and locale to the model that should be used in this scenario. The snippet below shows the schema for the WakewordModelMapping.json file. { \"Wake Word\" : { \"Locale\" : [ \"path/to/modelA\" , \"path/to/modelB\" ] } } In most cases, there will be one entry. In some cases (releases which have multiple target architectures with different resources), there may be multiple entries. In this case, choose the 1st entry ( modelA in the example above) for the highest CPU/Memory architecture and the remaining entries ( modelB in the example above) for lower power processors.","title":"File Format"},{"location":"getting-started/model-selection.html#what-is-a-locale","text":"A locale defines a language and country combination using ISO 639-1 for country. Locales are typically stylized as language-COUNTRY. The table below shows a couple example locales along with their meanings. Locale Meaning en-US English spoken in the United States fr-FR French spoken in France fr-CA French spoken in Canada","title":"What Is A Locale?"},{"location":"getting-started/api-samples/index.html","text":"API Samples This section contains sample reference code for initializing and running the Wake Word engine. These samples are for manual integration of Amazon's Wake Word into a custom Alexa Client SDK. For integrating with the AVS Device SDK, see this section .","title":"Overview"},{"location":"getting-started/api-samples/index.html#api-samples","text":"This section contains sample reference code for initializing and running the Wake Word engine. These samples are for manual integration of Amazon's Wake Word into a custom Alexa Client SDK. For integrating with the AVS Device SDK, see this section .","title":"API Samples"},{"location":"getting-started/api-samples/api-sample-v1.html","text":"API Sample (V1 API) The following code demonstrates the general operation of the Wake Word Engine using the V1 API. /////////////////////////////////////////////////////////////////////////// // Copyright 2017 Amazon.com, Inc. or its affiliates. All Rights Reserved. /////////////////////////////////////////////////////////////////////////// #include <string.h> #include <stdio.h> #include <stdlib.h> #include \"pryon_lite.h\" #define SAMPLES_PER_FRAME (160) // global flag to stop processing, set by application static int quit = 0 ; // decoder handle static PryonLiteDecoderHandle sDecoder = NULL ; // binary model buffer, allocated by application // this buffer can be read-only memory as PryonLite will not modify the contents #define ALIGN(n) __attribute__((aligned(n))) ALIGN ( 4 ) static const char * modelBuffer = { 0 }; // should be an array large enough to hold the largest model static char * decoderBuffer = { 0 }; // should be an array large enough to hold the largest decoder //---- Application functions to be implemented by the client ------------------- static void loadModel ( const char ** model , size_t * sizeofModel ) { // In order to detect keywords, the decoder uses a model which defines the parameters, // neural network weights, classifiers, etc that are used at runtime to process the audio // and give detection results. // Each model is packaged in two formats: // 1. A .bin file that can be loaded from disk (via fopen, fread, etc) // *sizeofModel will be the size of the binary model byte array // *model will be a pointer to the model read into memory // // 2. A .cpp file that can be hard-coded at compile time // *sizeofModel will be prlBinaryModelLen // *model will be prlBinaryModelData * sizeofModel = 1 ; // example value, will be the size of the binary model byte array * model = modelBuffer ; // pointer to model in memory } // client implemented function to read audio samples static void readAudio ( short * samples , int sampleCount ) { // todo - read samples from file, audio system, etc. } //---- Decoder callback functions to be implemented by the client -------------- /// /// @brief Callback function triggered by the decoder when a wake word is detected. /// /// @param handle [in] Handle for the decoder which detected the wake word /// @param result [in] Result structure indicating which information about the wake word detection /// See pryon_lite_ww.h for a full description of this structure /// /// @return void /// static void detectionCallback ( PryonLiteDecoderHandle handle , const PryonLiteResult * result ) { printf ( \"Detected keyword '%s'\" , result -> keyword ); } /// /// @brief Callback function triggered by the decoder when a VAD state transition is detected. /// /// @param handle [in] Handle for the decoder which detected the VAD state transition /// @param vadEvent [in] Result structure indicating the current VAD state /// /// @return void /// static void vadCallback ( PryonLiteDecoderHandle handle , const PryonLiteVadEvent * vadEvent ) { printf ( \"VAD state %d \", (int) vadEvent->vadState); } //---- Main processing loop ---------------------------------------------------- // The main loop below shows the full life cycle of the wake word engine. This // life cycle is broken down into 3 phases. // // Phase 1 - Initialization // STEP 1.1 - Load the model // STEP 1.2 - Query for the size of instance memory required by the decoder // STEP 1.3 - Allocate/Check decoder buffer // STEP 1.4 - Configure Decoder // STEP 1.5 - Initialize Decoder // STEP 1.6 - Optional - Runtime configuration functions // Phase 2 - Audio Processing // STEP 2.1 - Gather audio // STEP 2.2 - Push audio to Decoder // STEP 2.3 - Handle Decoder events // Phase 3 - Cleanup // // The sample below is for a single locale/model. To change the locale/model // being used, complete Phase 3 - Cleanup for the engine instance and then // create a new instance by going back through Phase 1 - Initialization with // the new model. int main ( int argc , char ** argv ) { // Start Phase 1 - Initialization // // The initialization phase begins with nothing and ends with a fully // initialized instance of the wake word engine. // // STEP 1.1 - Load the model // This step covers loading the model data from source. Models are // delivered in two different forms, a C file with an array // containing the model source or a separate bin file. If using // the C file, the model source and size are already defined. If // using the bin file, the model source needs to be read into an // array and the length needs to be calculated. const char * model ; size_t sizeofModel ; loadModel ( & model , & sizeofModel ); // STEP 1.2 - Query for the size of instance memory required by the decoder // The wake word engine initialization requires a buffer be passed // in which is owned by the application layer. The size of this // buffer is dynamic depending on the model being used. Use the // PryonLite_GetModelAttributes function below to determine the // size of the decoder buffer needed. PryonLiteModelAttributes modelAttributes ; PryonLiteError status = PryonLite_GetModelAttributes ( model , sizeofModel , & modelAttributes ); if ( status != PRYON_LITE_ERROR_OK ) { return -1 ; } // STEP 1.3 - Allocate/Check decoder buffer // Once the size of the decoder buffer has been determined, the // application layer must create the buffer. This example uses // a statically defined buffer. If applicable to the device, // this buffer can be dynamically allocated as well. The // requirement is that a buffer that is at least // modelAttributes.requiredDecoderMem size, in bytes, is created. if ( modelAttributes . requiredDecoderMem > sizeof ( decoderBuffer )) { // handle error return -1 ; } // STEP 1.4 - Configure Decoder // PryonLiteDecoderConfig is used to configure the wake word engine. // Use PryonLiteDecodeConfig_Default to set up this structure with // default values. There are required fields which must be set // after the default values, see the example below. PryonLiteDecoderConfig config = PryonLiteDecoderConfig_Default ; // Required fields: model, sizeofModel loaded in STEP 1 config . model = model ; config . sizeofModel = sizeofModel ; // Required fields: decoderMem, sizeofDecoderMem created in STEP 3 config . decoderMem = decoderBuffer ; config . sizeofDecoderMem = modelAttributes . requiredDecoderMem ; // Required field: resultCallback, callback function which handles wake word detections config . resultCallback = detectionCallback ; // Optional fields: VAD Configuration // Enabling will use VAD in the wake word engine. If the // vadCallback is configured, it will be called whenever // there is a VAD state transition identified. int enableVad = 0 ; // disable voice activity detector, set to 1 to enable if ( enableVad ) { config . vadCallback = vadCallback ; // register VAD handler // this parameter is optional, // and may be set to NULL when VAD is enabled config . useVad = 1 ; // enable voice activity detector } // STEP 1.5 - Initialize Decoder // Pass the configuration from STEP 4 to PryonLiteDecoder_Initialize // to create an instance of the wake word engine. After this function // is called, the decoder instance pointed to by sDecoder is fully // functional. PryonLiteSessionInfo sessionInfo ; status = PryonLiteDecoder_Initialize ( & config , & sessionInfo , & sDecoder ); if ( status != PRYON_LITE_ERROR_OK ) { return -1 ; } // STEP 1.6 - Optional - Runtime configuration functions // The optional functions below allow for the runtime overrides of // configuration options. These functions can be called on a decoder // instance any time after a successful PryonLiteDecoder_Initialize // and before PryonLiteDecoder_Destroy. // Set detection threshold for all keywords int detectionThreshold = 500 ; status = PryonLiteDecoder_SetDetectionThreshold ( sDecoder , NULL , detectionThreshold ); if ( status != PRYON_LITE_ERROR_OK ) { return -1 ; } // End Phase 1 - Initialization // allocate buffer to hold audio samples short samples [ SAMPLES_PER_FRAME ]; // run decoder while ( 1 ) { // Start Phase 2 - Audio Processing // // The audio processing phase is where audio is pushed into an initialized // decoder instance and decoder events are handled. // // STEP 2.1 - Gather audio // Audio must be gathered into frames of length // sessionInfo.samplesPerFrame before sending to the decoder. readAudio ( samples , sessionInfo . samplesPerFrame ); // STEP 2.2 - Push audio to Decoder // Once an appropriate amount of audio has been gathered, push // the frame into the decoder using the PushAudioSamples. This // tells the decoder to process the audio and trigger any // generated events through the callback functions. status = PryonLiteDecoder_PushAudioSamples ( sDecoder , samples , sessionInfo . samplesPerFrame ); if ( status != PRYON_LITE_ERROR_OK ) { // handle error return -1 ; } // STEP 2.3 - Handle Decoder events // The decoder passes back event information to the application // layer through the callback functions configured in Phase 1. // Different event types trigger different callback functions. // This sample supports two different events, see the callback // function documnentation for more information. // Event | Callback // Wake Word Detection | detectionCallback // VAD State Change | vadCallback // // End Phase 2 - Audio Processing if ( quit ) { // Start Phase 3 - Cleanup // // Cleanup should only be run when there is no more use for the // decoder. This will flush any unprocessed audio that has been // pushed and destroy the decoder instance. status = PryonLiteDecoder_Destroy ( & sDecoder ); if ( status != PRYON_LITE_ERROR_OK ) { // handle error return -1 ; } break ; // End Phase 3 - Cleanup } } return 0 ; }","title":"API v1"},{"location":"getting-started/api-samples/api-sample-v1.html#api-sample-v1-api","text":"The following code demonstrates the general operation of the Wake Word Engine using the V1 API. /////////////////////////////////////////////////////////////////////////// // Copyright 2017 Amazon.com, Inc. or its affiliates. All Rights Reserved. /////////////////////////////////////////////////////////////////////////// #include <string.h> #include <stdio.h> #include <stdlib.h> #include \"pryon_lite.h\" #define SAMPLES_PER_FRAME (160) // global flag to stop processing, set by application static int quit = 0 ; // decoder handle static PryonLiteDecoderHandle sDecoder = NULL ; // binary model buffer, allocated by application // this buffer can be read-only memory as PryonLite will not modify the contents #define ALIGN(n) __attribute__((aligned(n))) ALIGN ( 4 ) static const char * modelBuffer = { 0 }; // should be an array large enough to hold the largest model static char * decoderBuffer = { 0 }; // should be an array large enough to hold the largest decoder //---- Application functions to be implemented by the client ------------------- static void loadModel ( const char ** model , size_t * sizeofModel ) { // In order to detect keywords, the decoder uses a model which defines the parameters, // neural network weights, classifiers, etc that are used at runtime to process the audio // and give detection results. // Each model is packaged in two formats: // 1. A .bin file that can be loaded from disk (via fopen, fread, etc) // *sizeofModel will be the size of the binary model byte array // *model will be a pointer to the model read into memory // // 2. A .cpp file that can be hard-coded at compile time // *sizeofModel will be prlBinaryModelLen // *model will be prlBinaryModelData * sizeofModel = 1 ; // example value, will be the size of the binary model byte array * model = modelBuffer ; // pointer to model in memory } // client implemented function to read audio samples static void readAudio ( short * samples , int sampleCount ) { // todo - read samples from file, audio system, etc. } //---- Decoder callback functions to be implemented by the client -------------- /// /// @brief Callback function triggered by the decoder when a wake word is detected. /// /// @param handle [in] Handle for the decoder which detected the wake word /// @param result [in] Result structure indicating which information about the wake word detection /// See pryon_lite_ww.h for a full description of this structure /// /// @return void /// static void detectionCallback ( PryonLiteDecoderHandle handle , const PryonLiteResult * result ) { printf ( \"Detected keyword '%s'\" , result -> keyword ); } /// /// @brief Callback function triggered by the decoder when a VAD state transition is detected. /// /// @param handle [in] Handle for the decoder which detected the VAD state transition /// @param vadEvent [in] Result structure indicating the current VAD state /// /// @return void /// static void vadCallback ( PryonLiteDecoderHandle handle , const PryonLiteVadEvent * vadEvent ) { printf ( \"VAD state %d \", (int) vadEvent->vadState); } //---- Main processing loop ---------------------------------------------------- // The main loop below shows the full life cycle of the wake word engine. This // life cycle is broken down into 3 phases. // // Phase 1 - Initialization // STEP 1.1 - Load the model // STEP 1.2 - Query for the size of instance memory required by the decoder // STEP 1.3 - Allocate/Check decoder buffer // STEP 1.4 - Configure Decoder // STEP 1.5 - Initialize Decoder // STEP 1.6 - Optional - Runtime configuration functions // Phase 2 - Audio Processing // STEP 2.1 - Gather audio // STEP 2.2 - Push audio to Decoder // STEP 2.3 - Handle Decoder events // Phase 3 - Cleanup // // The sample below is for a single locale/model. To change the locale/model // being used, complete Phase 3 - Cleanup for the engine instance and then // create a new instance by going back through Phase 1 - Initialization with // the new model. int main ( int argc , char ** argv ) { // Start Phase 1 - Initialization // // The initialization phase begins with nothing and ends with a fully // initialized instance of the wake word engine. // // STEP 1.1 - Load the model // This step covers loading the model data from source. Models are // delivered in two different forms, a C file with an array // containing the model source or a separate bin file. If using // the C file, the model source and size are already defined. If // using the bin file, the model source needs to be read into an // array and the length needs to be calculated. const char * model ; size_t sizeofModel ; loadModel ( & model , & sizeofModel ); // STEP 1.2 - Query for the size of instance memory required by the decoder // The wake word engine initialization requires a buffer be passed // in which is owned by the application layer. The size of this // buffer is dynamic depending on the model being used. Use the // PryonLite_GetModelAttributes function below to determine the // size of the decoder buffer needed. PryonLiteModelAttributes modelAttributes ; PryonLiteError status = PryonLite_GetModelAttributes ( model , sizeofModel , & modelAttributes ); if ( status != PRYON_LITE_ERROR_OK ) { return -1 ; } // STEP 1.3 - Allocate/Check decoder buffer // Once the size of the decoder buffer has been determined, the // application layer must create the buffer. This example uses // a statically defined buffer. If applicable to the device, // this buffer can be dynamically allocated as well. The // requirement is that a buffer that is at least // modelAttributes.requiredDecoderMem size, in bytes, is created. if ( modelAttributes . requiredDecoderMem > sizeof ( decoderBuffer )) { // handle error return -1 ; } // STEP 1.4 - Configure Decoder // PryonLiteDecoderConfig is used to configure the wake word engine. // Use PryonLiteDecodeConfig_Default to set up this structure with // default values. There are required fields which must be set // after the default values, see the example below. PryonLiteDecoderConfig config = PryonLiteDecoderConfig_Default ; // Required fields: model, sizeofModel loaded in STEP 1 config . model = model ; config . sizeofModel = sizeofModel ; // Required fields: decoderMem, sizeofDecoderMem created in STEP 3 config . decoderMem = decoderBuffer ; config . sizeofDecoderMem = modelAttributes . requiredDecoderMem ; // Required field: resultCallback, callback function which handles wake word detections config . resultCallback = detectionCallback ; // Optional fields: VAD Configuration // Enabling will use VAD in the wake word engine. If the // vadCallback is configured, it will be called whenever // there is a VAD state transition identified. int enableVad = 0 ; // disable voice activity detector, set to 1 to enable if ( enableVad ) { config . vadCallback = vadCallback ; // register VAD handler // this parameter is optional, // and may be set to NULL when VAD is enabled config . useVad = 1 ; // enable voice activity detector } // STEP 1.5 - Initialize Decoder // Pass the configuration from STEP 4 to PryonLiteDecoder_Initialize // to create an instance of the wake word engine. After this function // is called, the decoder instance pointed to by sDecoder is fully // functional. PryonLiteSessionInfo sessionInfo ; status = PryonLiteDecoder_Initialize ( & config , & sessionInfo , & sDecoder ); if ( status != PRYON_LITE_ERROR_OK ) { return -1 ; } // STEP 1.6 - Optional - Runtime configuration functions // The optional functions below allow for the runtime overrides of // configuration options. These functions can be called on a decoder // instance any time after a successful PryonLiteDecoder_Initialize // and before PryonLiteDecoder_Destroy. // Set detection threshold for all keywords int detectionThreshold = 500 ; status = PryonLiteDecoder_SetDetectionThreshold ( sDecoder , NULL , detectionThreshold ); if ( status != PRYON_LITE_ERROR_OK ) { return -1 ; } // End Phase 1 - Initialization // allocate buffer to hold audio samples short samples [ SAMPLES_PER_FRAME ]; // run decoder while ( 1 ) { // Start Phase 2 - Audio Processing // // The audio processing phase is where audio is pushed into an initialized // decoder instance and decoder events are handled. // // STEP 2.1 - Gather audio // Audio must be gathered into frames of length // sessionInfo.samplesPerFrame before sending to the decoder. readAudio ( samples , sessionInfo . samplesPerFrame ); // STEP 2.2 - Push audio to Decoder // Once an appropriate amount of audio has been gathered, push // the frame into the decoder using the PushAudioSamples. This // tells the decoder to process the audio and trigger any // generated events through the callback functions. status = PryonLiteDecoder_PushAudioSamples ( sDecoder , samples , sessionInfo . samplesPerFrame ); if ( status != PRYON_LITE_ERROR_OK ) { // handle error return -1 ; } // STEP 2.3 - Handle Decoder events // The decoder passes back event information to the application // layer through the callback functions configured in Phase 1. // Different event types trigger different callback functions. // This sample supports two different events, see the callback // function documnentation for more information. // Event | Callback // Wake Word Detection | detectionCallback // VAD State Change | vadCallback // // End Phase 2 - Audio Processing if ( quit ) { // Start Phase 3 - Cleanup // // Cleanup should only be run when there is no more use for the // decoder. This will flush any unprocessed audio that has been // pushed and destroy the decoder instance. status = PryonLiteDecoder_Destroy ( & sDecoder ); if ( status != PRYON_LITE_ERROR_OK ) { // handle error return -1 ; } break ; // End Phase 3 - Cleanup } } return 0 ; }","title":"API Sample (V1 API)"},{"location":"getting-started/api-samples/api-sample-v2.html","text":"API Sample (V2 API) The following code demonstrates the general operation of the Wake Word Engine using the V2 API. /////////////////////////////////////////////////////////////////////////// // Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. /////////////////////////////////////////////////////////////////////////// #include <string.h> #include <stdio.h> #include <stdlib.h> #include \"pryon_lite_common_client_properties.h\" #include \"pryon_lite_v2.h\" #define SAMPLES_PER_FRAME (160) #define false 0 #define true 1 // global flag to stop processing, set by application static int quit = 0 ; // engine handle static PryonLiteV2Handle sHandle = { 0 }; #define ALIGN(n) __attribute__((aligned(n))) static char * engineBuffer = { 0 }; // should be an array large enough to hold the largest engine //---- Application functions to be implemented by the client ------------------- // ---- Wake Word ---- // binary model buffer, allocated by application // this buffer can be read-only memory as PryonLite will not modify the contents ALIGN ( 4 ) static const char * wakewordModelBuffer = { 0 }; // should be an array large enough to hold the largest wake word model static void loadWakewordModel ( const char ** model , size_t * sizeofModel ) { // In order to detect keywords, the decoder uses a model which defines the parameters, // neural network weights, classifiers, etc that are used at runtime to process the audio // and give detection results. // Each model is packaged in two formats: // 1. A .bin file that can be loaded from disk (via fopen, fread, etc) // 2. A .cpp file that can be hard-coded at compile time * sizeofModel = 1 ; // example value, will be the size of the binary model byte array * model = wakewordModelBuffer ; // pointer to model in memory } // ---- Feature Extraction ---- // binary config buffer, allocated by application // this buffer can be read-only memory as PryonLite will not modify the contents ALIGN ( 4 ) static const char * featexConfigBuffer = { 0 }; // should be an array large enough to hold the largest featex config static void loadFeatexConfig ( const char ** blob , size_t * sizeofBlob ) { // In order to calculate features, the engine uses a config which defines the parameters, // that are used at runtime to process the audio // Each config is packaged in two formats: // 1. A .bin file that can be loaded from disk (via fopen, fread, etc) // 2. A .cpp file that can be hard-coded at compile time * sizeofBlob = 1 ; // example value, will be the size of the binary config byte array * blob = featexConfigBuffer ; // pointer to config in memory } // ---- Fingerprinting ---- // binary fingerprint list buffer, allocated by application // this buffer can be read-only memory as PryonLite will not modify the contents ALIGN ( 4 ) static const char * fingerprintListBuffer = { 0 }; // should be an array large enough to hold the fingerprint list data. static void loadFingerprintList ( const char ** fingerprintList , size_t * sizeofFingerprintList ) { // In order to suppress wakes from fingerprinted media, PryonLite uses a binary list which // tells its engine which audio to suppress. // Each list is a binary file that can be loaded from disk and should be downloaded // via a DAVS client. * sizeofFingerprintList = 1 ; // example value, will be the size of the binary fingerprint list byte array * fingerprintList = fingerprintListBuffer ; // pointer to fingerprint list in memory } // client implemented function to read audio samples static void readAudio ( short * samples , int sampleCount ) { // todo - read samples from file, audio system, etc. } //---- Engine callback functions to be implemented by the client -------------- // ---- Wake Word ---- // VAD event handler static void vadEventHandler ( PryonLiteV2Handle * handle , const PryonLiteVadEvent * vadEvent ) { printf ( \"VAD state %d \", (int) vadEvent->vadState); } // Wake Word event handler static void wakewordEventHandler ( PryonLiteV2Handle * handle , const PryonLiteWakewordResult * wwEvent ) { printf ( \"Detected wake word '%s' \", wwEvent->keyword); } // ---- Feature Extraction ---- // Featex event handler static void featexEventHandler ( PryonLiteV2Handle * handle , const PryonLiteFeatexResult * featexResult ) { printf ( \"Received %d features \", featexResult->featureVectorLength); } // ---- Fingerprinting ---- // Fingerprint match event handler static void fingerprintMatchEventHandler ( PryonLiteV2Handle * handle , const PryonLiteFingerprintMatchEvent * fingerprintMatchEvent ) { printf ( \"Detected fingerprint match with keyword '%s' \", fingerprintMatchEvent->keyword); } // ---- Wake Word Internal API ---- // DNN Score handler static void dnnScoreHandler ( PryonLiteV2Handle * handle , const PryonLiteDnnScoreInternal * dnnScore ) { int i ; printf ( \"DNN Scores Q: %d \", dnnScore->scoresQ); printf ( \"DNN Scores : \" ); for ( i = 0 ; i < dnnScore -> numScores ; i ++ ) { if ( i > 0 ) { printf ( \", \" ); } printf ( \"%d\" , dnnScore -> scores [ i ]); } printf ( \" \"); } /// /// @brief Callback function triggered by the engine when any event occurs. /// /// @param handle [in] Handle for the engine which created the event /// @param event [in] Event that occurred /// /// @return void /// static void handleEvent ( PryonLiteV2Handle * handle , const PryonLiteV2Event * event ) { // ---- Wake Word ---- if ( event -> vadEvent != NULL ) { vadEventHandler ( handle , event -> vadEvent ); } if ( event -> wwEvent != NULL ) { wakewordEventHandler ( handle , event -> wwEvent ); } // ---- Feature Extraction ---- if ( event -> featexEvent != NULL ) { featexEventHandler ( handle , event -> featexEvent ); } // ---- Fingerprinting ---- if ( event -> fingerprintMatchEvent != NULL ) { fingerprintMatchEventHandler ( handle , event -> fingerprintMatchEvent ); } // ---- Wake Word Internal API ---- if ( event -> dnnScore != NULL ) { dnnScoreHandler ( handle , event -> dnnScore ); } } //---- Main processing loop ---------------------------------------------------- // The main loop below shows the full life cycle of the engine. This // life cycle is broken down into three phases. // // Phase 1 - Initialization // STEP 1.1 - Load the models // STEP 1.2 - Configure engine // STEP 1.3 - Enable engine events // STEP 1.4 - Query for configuration specific attributes // STEP 1.5 - Allocate/Check engine buffer // STEP 1.6 - Initialize engine // STEP 1.7 - Post-init functionality setup // STEP 1.8 - Optional : Runtime configuration functions // Phase 2 - Audio Processing // STEP 2.1 - Gather audio // STEP 2.2 - Push audio to engine // STEP 2.3 - Handle engine events // Phase 3 - Cleanup // STEP 3.1 - Functionality specific cleanup // STEP 3.2 - Engine cleanup // // The sample below is for a single locale/model. To change the locale/model // being used, complete Phase 3 - Cleanup for the engine instance and then // create a new instance by going back through Phase 1 - Initialization with // the new model. int main ( int argc , char ** argv ) { PryonLiteV2ConfigAttributes configAttributes = { 0 }; // Start Phase 1 - Initialization // // The initialization phase begins with nothing and ends with a fully // initialized instance of the engine. // // STEP 1.1 - Load the models // This step covers loading the model data from source. Models are // delivered in two different forms, a C file with an array // containing the model source or a separate bin file. If using // the C file, the model source and size are already defined. If // using the bin file, the model source needs to be read into an // array and the length needs to be retrieved. // ---- Wake Word ---- const char * wakewordModel ; size_t wakewordModelSize ; loadWakewordModel ( & wakewordModel , & wakewordModelSize ); // ---- Feature Extraction ---- const char * featexBlob ; size_t sizeofFeatexBlob ; loadFeatexConfig ( & featexBlob , & sizeofFeatexBlob ); // ---- Fingerprinting ---- const char * fingerprintList ; size_t sizeofFingerprintList ; loadFingerprintList ( & fingerprintList , & sizeofFingerprintList ); // STEP 1.2 - Configure engine // PryonLiteV2Config contains initialization-time configuration // parameters. Each feature to be enabled must be configured // individually and then hooked to the top level engine // configuration. For each feature use the _Default macro to set // up the initial values of the configuration structure. There are // required fields which must be modified from their default // values; see the example below. PryonLiteV2Config engineConfig = { 0 }; // ---- Wake Word ---- PryonLiteWakewordConfig wakewordConfig = PryonLiteWakewordConfig_Default ; // Required fields: model, sizeofModel loaded in STEP 1 wakewordConfig . model = wakewordModel ; wakewordConfig . sizeofModel = wakewordModelSize ; // Optional fields: VAD Configuration // Enabling will use VAD in the wake word engine. PryonLiteVadConfig vadConfig ; PryonLiteEnergyDetectionConfig energyDetection ; vadConfig . energyDetection = & energyDetection ; wakewordConfig . vadConfig = & vadConfig ; energyDetection . enableGate = 0 ; // disable voice activity detection energy detection based gate, set to 1 to enable // Required: Link the wake word configuration to the engine configuration engineConfig . ww = & wakewordConfig ; // ---- Feature Extraction ---- PryonLiteFeatexConfig featexConfig = PryonLiteFeatexConfig_Default ; // Required fields: blob, sizeofBlob loaded in STEP 1 featexConfig . blob = featexBlob ; featexConfig . sizeofBlob = sizeofFeatexBlob ; // Optional fields: useFloatingPointOutput // Selects either fixed or floating point output in the events featexConfig . useFloatingPointOutput = true ; // Return floating point values // Required: Link the featex configuration to the engine configuration engineConfig . featex = & featexConfig ; // ---- Fingerprinting ---- PryonLiteFingerprintConfig fingerprintConfig = PryonLiteFingerprintConfig_Default ; // Required fields: fingerprintList, sizeofFingerprintList loaded in STEP 1 fingerprintConfig . fingerprintList = fingerprintList ; fingerprintConfig . sizeofFingerprintList = sizeofFingerprintList ; // Required: Disable VAD in the wake word config as it is incompatible with fingerprint match suppression wakewordConfig . useVad = 0 ; // Required: Link the fingerprinting configuration to the engine configuration engineConfig . fingerprinter = & fingerprintConfig ; // STEP 1.3 - Enable engine events // PryonLiteV2EventConfig is used to select which events the // engine will pass back to the application layer. Each field in // this structure is a flag that enables or disables the emission // of the event. PryonLiteV2EventConfig engineEventConfig = { 0 }; // ---- Wake Word ---- engineEventConfig . enableVadEvent = false ; // disable VAD event, set to true to receive VAD events engineEventConfig . enableWwEvent = true ; // ---- Feature Extraction ---- engineEventConfig . enableFeatexEvent = true ; engineEventConfig . enableVadEvent = false ; // disable VAD event, set to true to receive VAD events // ---- Fingerprinting ---- engineEventConfig . enableFingerprintMatchEvent = true ; // ---- Wake Word Internal API ---- engineEventConfig . enableDnnScoreEvent = true ; // STEP 1.4 - Query for configuration specific attributes // The engine initialization requires a buffer be passed in which // is owned by the application layer. This instance memory buffer // must persist for the life of the engine instance. The size of // this buffer is variable, and dependent on the client-specified // configuration. Use PryonLite_GetConfigAttributes to determine // the size of the buffer and other information about the // configuration. PryonLiteStatus status = PryonLite_GetConfigAttributes ( & engineConfig , & engineEventConfig , & configAttributes ); if ( status . publicCode != PRYON_LITE_ERROR_OK ) { // handle error return -1 ; } // ---- Wake Word ---- // Optional - Sample code showing how to list supported keywords printf ( \"Supported keywords: \" ); int keyword ; for ( keyword = 0 ; keyword < configAttributes . wwConfigAttributes . numKeywords ; keyword ++ ) { if ( keyword > 0 ) { printf ( \", \" ); } printf ( \"%s\" , configAttributes . wwConfigAttributes . keywords [ keyword ]); } printf ( \" \"); // STEP 1.5 - Allocate/Check engine buffer // Once the size of the engine buffer has been determined, the // application layer must create the buffer. This example uses // a statically-defined buffer. If applicable to the device, // this buffer can be dynamically allocated as well, but must // remain allocated for the duration that the engine is in use. // The requirement is that a buffer that is at least // configAttributes.requiredMem size, in bytes, is created. if ( configAttributes . requiredMem > sizeof ( engineBuffer )) { // handle error return -1 ; } // STEP 1.6 - Initialize engine // Pass the engine configuration from STEP 2, event configuration // from STEP 3, and engine buffer from STEP 5 to PryonLite_Initialize // to create an instance of the engine. After this function is // called, the engine instance referenced by sHandle is fully // functional. status = PryonLite_Initialize ( & engineConfig , & sHandle , handleEvent , & engineEventConfig , engineBuffer , sizeof ( engineBuffer )); if ( status . publicCode != PRYON_LITE_ERROR_OK ) { // handle error return -1 ; } // STEP 1.7 - Post-init functionality setup // Some functionalities require additional setup steps after the // engine is initialized. If such a step is required for a // functionality it will be implemented here. // STEP 1.8 - Optional : Runtime configuration functions // The optional functions below allow for the runtime configuration of // certain aspects of the engine. These functions can be called on // an engine instance any time after a successful PryonLite_Initialize // and before PryonLite_Destroy. // ---- Wake Word ---- // Set detection threshold for all keywords int detectionThreshold = 500 ; status = PryonLiteWakeword_SetDetectionThreshold ( sHandle . ww , NULL , detectionThreshold ); if ( status . publicCode != PRYON_LITE_ERROR_OK ) { // handle error return -1 ; } // End Phase 1 - Initialization // Examples - Optional runtime functions // The runtime functions below can be called on an engine // instance any time after a successful PryonLite_Initialize // and before PryonLite_Destroy. // ---- General ---- // Call the set client property API to inform the engine of client state changes status = PryonLite_SetClientProperty ( & sHandle , CLIENT_PROP_GROUP_COMMON , CLIENT_PROP_COMMON_AUDIO_PLAYBACK , 1 ); if ( status . publicCode != PRYON_LITE_ERROR_OK ) { return -1 ; } // ---- Wake Word Internal API ---- PryonLiteInternalWakewordAttributes wakewordAttributes ; // Fetch model properties status = PryonLiteWakeword_GetWakewordAttributes ( sHandle . ww , & wakewordAttributes ); if ( status . publicCode != PRYON_LITE_ERROR_OK ) { return -1 ; } // Iterate over all available wakewords for ( int i = 0 ; i < wakewordAttributes . numKeywordTargets ; i ++ ) { /* The following properties are available for each wake word wakewordAttributes.target[i].keyword wakewordAttributes.target[i].acceptThreshold wakewordAttributes.target[i].acceptMin wakewordAttributes.target[i].acceptMax wakewordAttributes.target[i].notify */ } // allocate buffer to hold audio samples short samples [ SAMPLES_PER_FRAME ]; // run engine while ( 1 ) { // Start Phase 2 - Audio Processing // // The audio processing phase is where audio is pushed into an initialized // engine instance, and engine events are emitted for handling by // registered application/client callbacks. // // STEP 1 - Gather audio // Audio must be gathered into frames of length // SAMPLES_PER_FRAME before sending to the engine. readAudio ( samples , SAMPLES_PER_FRAME ); // STEP 2.2 - Push audio to engine // Once the required amount of audio has been gathered, push // the frame into the engine using PushAudioSamples. This // signals the engine to process the audio and invokes callback // functions to pass any resulting events to the client. status = PryonLite_PushAudioSamples ( & sHandle , samples , SAMPLES_PER_FRAME ); if ( status . publicCode != PRYON_LITE_ERROR_OK ) { // handle error return -1 ; } // STEP 2.3 - Handle engine events // The engine passes back event information to the application // layer through a single callback function passed into // PryonLite_Initialize. The event types emitted depend on the // event configuration setup in Phase 1 Step 3. See the // EventConfig structure definition in the header files for more // information. // // End Phase 2 - Audio Processing // Examples - Optional runtime loop functions // The runtime functions below can be invoked any time // between a successful PryonLite_Initialize and // PryonLite_Destroy. These functions are typically // invoked during the core audio processing loop. if ( quit ) { // Start Phase 3 - Cleanup // // Cleanup should only occur when the engine is no longer needed // This will flush any unprocessed audio that has been // pushed and destroy the engine instance. // // STEP 3.1 - Functionality-specific cleanup // These functionality-specific cleanup functions should // be called before engine cleanup. If this step is // required it will be implemented below. // STEP 3.2 - Engine cleanup // This will flush any unprocessed audio that has been // pushed and destroy the engine instance. status = PryonLite_Destroy ( & sHandle ); if ( status . publicCode != PRYON_LITE_ERROR_OK ) { // handle error return -1 ; } break ; // End Phase 3 - Cleanup } } return 0 ; }","title":"API v2"},{"location":"getting-started/api-samples/api-sample-v2.html#api-sample-v2-api","text":"The following code demonstrates the general operation of the Wake Word Engine using the V2 API. /////////////////////////////////////////////////////////////////////////// // Copyright 2019 Amazon.com, Inc. or its affiliates. All Rights Reserved. /////////////////////////////////////////////////////////////////////////// #include <string.h> #include <stdio.h> #include <stdlib.h> #include \"pryon_lite_common_client_properties.h\" #include \"pryon_lite_v2.h\" #define SAMPLES_PER_FRAME (160) #define false 0 #define true 1 // global flag to stop processing, set by application static int quit = 0 ; // engine handle static PryonLiteV2Handle sHandle = { 0 }; #define ALIGN(n) __attribute__((aligned(n))) static char * engineBuffer = { 0 }; // should be an array large enough to hold the largest engine //---- Application functions to be implemented by the client ------------------- // ---- Wake Word ---- // binary model buffer, allocated by application // this buffer can be read-only memory as PryonLite will not modify the contents ALIGN ( 4 ) static const char * wakewordModelBuffer = { 0 }; // should be an array large enough to hold the largest wake word model static void loadWakewordModel ( const char ** model , size_t * sizeofModel ) { // In order to detect keywords, the decoder uses a model which defines the parameters, // neural network weights, classifiers, etc that are used at runtime to process the audio // and give detection results. // Each model is packaged in two formats: // 1. A .bin file that can be loaded from disk (via fopen, fread, etc) // 2. A .cpp file that can be hard-coded at compile time * sizeofModel = 1 ; // example value, will be the size of the binary model byte array * model = wakewordModelBuffer ; // pointer to model in memory } // ---- Feature Extraction ---- // binary config buffer, allocated by application // this buffer can be read-only memory as PryonLite will not modify the contents ALIGN ( 4 ) static const char * featexConfigBuffer = { 0 }; // should be an array large enough to hold the largest featex config static void loadFeatexConfig ( const char ** blob , size_t * sizeofBlob ) { // In order to calculate features, the engine uses a config which defines the parameters, // that are used at runtime to process the audio // Each config is packaged in two formats: // 1. A .bin file that can be loaded from disk (via fopen, fread, etc) // 2. A .cpp file that can be hard-coded at compile time * sizeofBlob = 1 ; // example value, will be the size of the binary config byte array * blob = featexConfigBuffer ; // pointer to config in memory } // ---- Fingerprinting ---- // binary fingerprint list buffer, allocated by application // this buffer can be read-only memory as PryonLite will not modify the contents ALIGN ( 4 ) static const char * fingerprintListBuffer = { 0 }; // should be an array large enough to hold the fingerprint list data. static void loadFingerprintList ( const char ** fingerprintList , size_t * sizeofFingerprintList ) { // In order to suppress wakes from fingerprinted media, PryonLite uses a binary list which // tells its engine which audio to suppress. // Each list is a binary file that can be loaded from disk and should be downloaded // via a DAVS client. * sizeofFingerprintList = 1 ; // example value, will be the size of the binary fingerprint list byte array * fingerprintList = fingerprintListBuffer ; // pointer to fingerprint list in memory } // client implemented function to read audio samples static void readAudio ( short * samples , int sampleCount ) { // todo - read samples from file, audio system, etc. } //---- Engine callback functions to be implemented by the client -------------- // ---- Wake Word ---- // VAD event handler static void vadEventHandler ( PryonLiteV2Handle * handle , const PryonLiteVadEvent * vadEvent ) { printf ( \"VAD state %d \", (int) vadEvent->vadState); } // Wake Word event handler static void wakewordEventHandler ( PryonLiteV2Handle * handle , const PryonLiteWakewordResult * wwEvent ) { printf ( \"Detected wake word '%s' \", wwEvent->keyword); } // ---- Feature Extraction ---- // Featex event handler static void featexEventHandler ( PryonLiteV2Handle * handle , const PryonLiteFeatexResult * featexResult ) { printf ( \"Received %d features \", featexResult->featureVectorLength); } // ---- Fingerprinting ---- // Fingerprint match event handler static void fingerprintMatchEventHandler ( PryonLiteV2Handle * handle , const PryonLiteFingerprintMatchEvent * fingerprintMatchEvent ) { printf ( \"Detected fingerprint match with keyword '%s' \", fingerprintMatchEvent->keyword); } // ---- Wake Word Internal API ---- // DNN Score handler static void dnnScoreHandler ( PryonLiteV2Handle * handle , const PryonLiteDnnScoreInternal * dnnScore ) { int i ; printf ( \"DNN Scores Q: %d \", dnnScore->scoresQ); printf ( \"DNN Scores : \" ); for ( i = 0 ; i < dnnScore -> numScores ; i ++ ) { if ( i > 0 ) { printf ( \", \" ); } printf ( \"%d\" , dnnScore -> scores [ i ]); } printf ( \" \"); } /// /// @brief Callback function triggered by the engine when any event occurs. /// /// @param handle [in] Handle for the engine which created the event /// @param event [in] Event that occurred /// /// @return void /// static void handleEvent ( PryonLiteV2Handle * handle , const PryonLiteV2Event * event ) { // ---- Wake Word ---- if ( event -> vadEvent != NULL ) { vadEventHandler ( handle , event -> vadEvent ); } if ( event -> wwEvent != NULL ) { wakewordEventHandler ( handle , event -> wwEvent ); } // ---- Feature Extraction ---- if ( event -> featexEvent != NULL ) { featexEventHandler ( handle , event -> featexEvent ); } // ---- Fingerprinting ---- if ( event -> fingerprintMatchEvent != NULL ) { fingerprintMatchEventHandler ( handle , event -> fingerprintMatchEvent ); } // ---- Wake Word Internal API ---- if ( event -> dnnScore != NULL ) { dnnScoreHandler ( handle , event -> dnnScore ); } } //---- Main processing loop ---------------------------------------------------- // The main loop below shows the full life cycle of the engine. This // life cycle is broken down into three phases. // // Phase 1 - Initialization // STEP 1.1 - Load the models // STEP 1.2 - Configure engine // STEP 1.3 - Enable engine events // STEP 1.4 - Query for configuration specific attributes // STEP 1.5 - Allocate/Check engine buffer // STEP 1.6 - Initialize engine // STEP 1.7 - Post-init functionality setup // STEP 1.8 - Optional : Runtime configuration functions // Phase 2 - Audio Processing // STEP 2.1 - Gather audio // STEP 2.2 - Push audio to engine // STEP 2.3 - Handle engine events // Phase 3 - Cleanup // STEP 3.1 - Functionality specific cleanup // STEP 3.2 - Engine cleanup // // The sample below is for a single locale/model. To change the locale/model // being used, complete Phase 3 - Cleanup for the engine instance and then // create a new instance by going back through Phase 1 - Initialization with // the new model. int main ( int argc , char ** argv ) { PryonLiteV2ConfigAttributes configAttributes = { 0 }; // Start Phase 1 - Initialization // // The initialization phase begins with nothing and ends with a fully // initialized instance of the engine. // // STEP 1.1 - Load the models // This step covers loading the model data from source. Models are // delivered in two different forms, a C file with an array // containing the model source or a separate bin file. If using // the C file, the model source and size are already defined. If // using the bin file, the model source needs to be read into an // array and the length needs to be retrieved. // ---- Wake Word ---- const char * wakewordModel ; size_t wakewordModelSize ; loadWakewordModel ( & wakewordModel , & wakewordModelSize ); // ---- Feature Extraction ---- const char * featexBlob ; size_t sizeofFeatexBlob ; loadFeatexConfig ( & featexBlob , & sizeofFeatexBlob ); // ---- Fingerprinting ---- const char * fingerprintList ; size_t sizeofFingerprintList ; loadFingerprintList ( & fingerprintList , & sizeofFingerprintList ); // STEP 1.2 - Configure engine // PryonLiteV2Config contains initialization-time configuration // parameters. Each feature to be enabled must be configured // individually and then hooked to the top level engine // configuration. For each feature use the _Default macro to set // up the initial values of the configuration structure. There are // required fields which must be modified from their default // values; see the example below. PryonLiteV2Config engineConfig = { 0 }; // ---- Wake Word ---- PryonLiteWakewordConfig wakewordConfig = PryonLiteWakewordConfig_Default ; // Required fields: model, sizeofModel loaded in STEP 1 wakewordConfig . model = wakewordModel ; wakewordConfig . sizeofModel = wakewordModelSize ; // Optional fields: VAD Configuration // Enabling will use VAD in the wake word engine. PryonLiteVadConfig vadConfig ; PryonLiteEnergyDetectionConfig energyDetection ; vadConfig . energyDetection = & energyDetection ; wakewordConfig . vadConfig = & vadConfig ; energyDetection . enableGate = 0 ; // disable voice activity detection energy detection based gate, set to 1 to enable // Required: Link the wake word configuration to the engine configuration engineConfig . ww = & wakewordConfig ; // ---- Feature Extraction ---- PryonLiteFeatexConfig featexConfig = PryonLiteFeatexConfig_Default ; // Required fields: blob, sizeofBlob loaded in STEP 1 featexConfig . blob = featexBlob ; featexConfig . sizeofBlob = sizeofFeatexBlob ; // Optional fields: useFloatingPointOutput // Selects either fixed or floating point output in the events featexConfig . useFloatingPointOutput = true ; // Return floating point values // Required: Link the featex configuration to the engine configuration engineConfig . featex = & featexConfig ; // ---- Fingerprinting ---- PryonLiteFingerprintConfig fingerprintConfig = PryonLiteFingerprintConfig_Default ; // Required fields: fingerprintList, sizeofFingerprintList loaded in STEP 1 fingerprintConfig . fingerprintList = fingerprintList ; fingerprintConfig . sizeofFingerprintList = sizeofFingerprintList ; // Required: Disable VAD in the wake word config as it is incompatible with fingerprint match suppression wakewordConfig . useVad = 0 ; // Required: Link the fingerprinting configuration to the engine configuration engineConfig . fingerprinter = & fingerprintConfig ; // STEP 1.3 - Enable engine events // PryonLiteV2EventConfig is used to select which events the // engine will pass back to the application layer. Each field in // this structure is a flag that enables or disables the emission // of the event. PryonLiteV2EventConfig engineEventConfig = { 0 }; // ---- Wake Word ---- engineEventConfig . enableVadEvent = false ; // disable VAD event, set to true to receive VAD events engineEventConfig . enableWwEvent = true ; // ---- Feature Extraction ---- engineEventConfig . enableFeatexEvent = true ; engineEventConfig . enableVadEvent = false ; // disable VAD event, set to true to receive VAD events // ---- Fingerprinting ---- engineEventConfig . enableFingerprintMatchEvent = true ; // ---- Wake Word Internal API ---- engineEventConfig . enableDnnScoreEvent = true ; // STEP 1.4 - Query for configuration specific attributes // The engine initialization requires a buffer be passed in which // is owned by the application layer. This instance memory buffer // must persist for the life of the engine instance. The size of // this buffer is variable, and dependent on the client-specified // configuration. Use PryonLite_GetConfigAttributes to determine // the size of the buffer and other information about the // configuration. PryonLiteStatus status = PryonLite_GetConfigAttributes ( & engineConfig , & engineEventConfig , & configAttributes ); if ( status . publicCode != PRYON_LITE_ERROR_OK ) { // handle error return -1 ; } // ---- Wake Word ---- // Optional - Sample code showing how to list supported keywords printf ( \"Supported keywords: \" ); int keyword ; for ( keyword = 0 ; keyword < configAttributes . wwConfigAttributes . numKeywords ; keyword ++ ) { if ( keyword > 0 ) { printf ( \", \" ); } printf ( \"%s\" , configAttributes . wwConfigAttributes . keywords [ keyword ]); } printf ( \" \"); // STEP 1.5 - Allocate/Check engine buffer // Once the size of the engine buffer has been determined, the // application layer must create the buffer. This example uses // a statically-defined buffer. If applicable to the device, // this buffer can be dynamically allocated as well, but must // remain allocated for the duration that the engine is in use. // The requirement is that a buffer that is at least // configAttributes.requiredMem size, in bytes, is created. if ( configAttributes . requiredMem > sizeof ( engineBuffer )) { // handle error return -1 ; } // STEP 1.6 - Initialize engine // Pass the engine configuration from STEP 2, event configuration // from STEP 3, and engine buffer from STEP 5 to PryonLite_Initialize // to create an instance of the engine. After this function is // called, the engine instance referenced by sHandle is fully // functional. status = PryonLite_Initialize ( & engineConfig , & sHandle , handleEvent , & engineEventConfig , engineBuffer , sizeof ( engineBuffer )); if ( status . publicCode != PRYON_LITE_ERROR_OK ) { // handle error return -1 ; } // STEP 1.7 - Post-init functionality setup // Some functionalities require additional setup steps after the // engine is initialized. If such a step is required for a // functionality it will be implemented here. // STEP 1.8 - Optional : Runtime configuration functions // The optional functions below allow for the runtime configuration of // certain aspects of the engine. These functions can be called on // an engine instance any time after a successful PryonLite_Initialize // and before PryonLite_Destroy. // ---- Wake Word ---- // Set detection threshold for all keywords int detectionThreshold = 500 ; status = PryonLiteWakeword_SetDetectionThreshold ( sHandle . ww , NULL , detectionThreshold ); if ( status . publicCode != PRYON_LITE_ERROR_OK ) { // handle error return -1 ; } // End Phase 1 - Initialization // Examples - Optional runtime functions // The runtime functions below can be called on an engine // instance any time after a successful PryonLite_Initialize // and before PryonLite_Destroy. // ---- General ---- // Call the set client property API to inform the engine of client state changes status = PryonLite_SetClientProperty ( & sHandle , CLIENT_PROP_GROUP_COMMON , CLIENT_PROP_COMMON_AUDIO_PLAYBACK , 1 ); if ( status . publicCode != PRYON_LITE_ERROR_OK ) { return -1 ; } // ---- Wake Word Internal API ---- PryonLiteInternalWakewordAttributes wakewordAttributes ; // Fetch model properties status = PryonLiteWakeword_GetWakewordAttributes ( sHandle . ww , & wakewordAttributes ); if ( status . publicCode != PRYON_LITE_ERROR_OK ) { return -1 ; } // Iterate over all available wakewords for ( int i = 0 ; i < wakewordAttributes . numKeywordTargets ; i ++ ) { /* The following properties are available for each wake word wakewordAttributes.target[i].keyword wakewordAttributes.target[i].acceptThreshold wakewordAttributes.target[i].acceptMin wakewordAttributes.target[i].acceptMax wakewordAttributes.target[i].notify */ } // allocate buffer to hold audio samples short samples [ SAMPLES_PER_FRAME ]; // run engine while ( 1 ) { // Start Phase 2 - Audio Processing // // The audio processing phase is where audio is pushed into an initialized // engine instance, and engine events are emitted for handling by // registered application/client callbacks. // // STEP 1 - Gather audio // Audio must be gathered into frames of length // SAMPLES_PER_FRAME before sending to the engine. readAudio ( samples , SAMPLES_PER_FRAME ); // STEP 2.2 - Push audio to engine // Once the required amount of audio has been gathered, push // the frame into the engine using PushAudioSamples. This // signals the engine to process the audio and invokes callback // functions to pass any resulting events to the client. status = PryonLite_PushAudioSamples ( & sHandle , samples , SAMPLES_PER_FRAME ); if ( status . publicCode != PRYON_LITE_ERROR_OK ) { // handle error return -1 ; } // STEP 2.3 - Handle engine events // The engine passes back event information to the application // layer through a single callback function passed into // PryonLite_Initialize. The event types emitted depend on the // event configuration setup in Phase 1 Step 3. See the // EventConfig structure definition in the header files for more // information. // // End Phase 2 - Audio Processing // Examples - Optional runtime loop functions // The runtime functions below can be invoked any time // between a successful PryonLite_Initialize and // PryonLite_Destroy. These functions are typically // invoked during the core audio processing loop. if ( quit ) { // Start Phase 3 - Cleanup // // Cleanup should only occur when the engine is no longer needed // This will flush any unprocessed audio that has been // pushed and destroy the engine instance. // // STEP 3.1 - Functionality-specific cleanup // These functionality-specific cleanup functions should // be called before engine cleanup. If this step is // required it will be implemented below. // STEP 3.2 - Engine cleanup // This will flush any unprocessed audio that has been // pushed and destroy the engine instance. status = PryonLite_Destroy ( & sHandle ); if ( status . publicCode != PRYON_LITE_ERROR_OK ) { // handle error return -1 ; } break ; // End Phase 3 - Cleanup } } return 0 ; }","title":"API Sample (V2 API)"},{"location":"getting-started/architecture-notes/index.html","text":"Architecture Specific Notes Some targets/architectures have specific, customized integration requirements. Be sure to read any of the instructions before integrating the Amazon Wake Word Engine on the corresponding architectures listed on the left. Note This section may be empty, if this release contains no architectures with special integration instructions.","title":"Architecture Specific Notes"},{"location":"getting-started/architecture-notes/index.html#architecture-specific-notes","text":"Some targets/architectures have specific, customized integration requirements. Be sure to read any of the instructions before integrating the Amazon Wake Word Engine on the corresponding architectures listed on the left. Note This section may be empty, if this release contains no architectures with special integration instructions.","title":"Architecture Specific Notes"},{"location":"getting-started/release-contents/index.html","text":"Release Contents This section describes the contents of the Amazon Wake Word Engine release package. Contents File/Folder Description LICENSE.txt The license under which the Amazon Wake Word Engine is provided <arch_name> Folder containing binaries for a given processor architecture. A release will contain engines for one or processor architectures. See the Engines section below. models/ Folder containing models for the wake word engine in source code and binary format for all architectures samples/wakeword Sample reference data for wake word. See the section File Simulators for details on this content samples/fingerprinting Sample reference data for fingerprinting. See the section on Fingerprinting Integration for details on the content. samples/watermarking Sample reference data for watermarking. See the section on Watermarking Integration for details on the content. Engines For each processor architecture provided in the release, there will be a folder at the root level with the architecture name. Inside each of these folders will be the following: File/Folder Description api_sample_PRLXXXX.cpp Sample code demonstrating usage of the V2 (PRLXXXX) library pryon_lite_PRLXXXX.h Header file defining V2 API for wake word and other features pryon_lite_common_client_properties.h Header file defining common client properties pryon_lite_metadata.h Header file defining metadata structures pryon_lite_vad_PRLXXXX.h Header file defining VAD structures pryon_lite_ww.h Header file defining Wake Word structures pryon_lite_error.h Header file defining error codes PRLXXXX Folder containing V2 API binaries for specific PRL versions WakewordModelMapping.json Lists the models compatible with a given architecture and has the locale mapping wwDavsFiltersMapping.json Lists the sample client side Wakeword DAVS filters for a given architecture and has the locale mapping davs/lowpower-wakeword Folder containing sample client side Wakeword DAVS filters There may be suffixes on the above binaries ( .so , .a , amazon_ww_filesim ) depending on what feature set your release contains. These are customized versions of the library to minimize footprint by providing a restricted set of functionality. The table below describes the functionality of binaries with suffixes: Suffix Description No suffix V1 API (Wake Word only). Supports all types of wake word models -U V1 API (Wake Word only). Supports \"U\" and \"W*\" type wake word models -PRLXXXX V2 API. Supports various features. See PRL Versions table for a comparison of the different PRL versions Models The models folder contains subfolders with various model formats for the provided targets. These format names are used only to distinguish between the formats and have no relationship to the target or locale name. To determine what model should be used for a given locale and architecture, see the section Model Selection . Each model will have a source code version (.cpp) and a binary (.bin) version. Depending on how your application will load the model (dynamically at run-time or statically at compile time), choose the corresponding version.","title":"Release Contents"},{"location":"getting-started/release-contents/index.html#release-contents","text":"This section describes the contents of the Amazon Wake Word Engine release package.","title":"Release Contents"},{"location":"getting-started/release-contents/index.html#contents","text":"File/Folder Description LICENSE.txt The license under which the Amazon Wake Word Engine is provided <arch_name> Folder containing binaries for a given processor architecture. A release will contain engines for one or processor architectures. See the Engines section below. models/ Folder containing models for the wake word engine in source code and binary format for all architectures samples/wakeword Sample reference data for wake word. See the section File Simulators for details on this content samples/fingerprinting Sample reference data for fingerprinting. See the section on Fingerprinting Integration for details on the content. samples/watermarking Sample reference data for watermarking. See the section on Watermarking Integration for details on the content.","title":"Contents"},{"location":"getting-started/release-contents/index.html#engines","text":"For each processor architecture provided in the release, there will be a folder at the root level with the architecture name. Inside each of these folders will be the following: File/Folder Description api_sample_PRLXXXX.cpp Sample code demonstrating usage of the V2 (PRLXXXX) library pryon_lite_PRLXXXX.h Header file defining V2 API for wake word and other features pryon_lite_common_client_properties.h Header file defining common client properties pryon_lite_metadata.h Header file defining metadata structures pryon_lite_vad_PRLXXXX.h Header file defining VAD structures pryon_lite_ww.h Header file defining Wake Word structures pryon_lite_error.h Header file defining error codes PRLXXXX Folder containing V2 API binaries for specific PRL versions WakewordModelMapping.json Lists the models compatible with a given architecture and has the locale mapping wwDavsFiltersMapping.json Lists the sample client side Wakeword DAVS filters for a given architecture and has the locale mapping davs/lowpower-wakeword Folder containing sample client side Wakeword DAVS filters There may be suffixes on the above binaries ( .so , .a , amazon_ww_filesim ) depending on what feature set your release contains. These are customized versions of the library to minimize footprint by providing a restricted set of functionality. The table below describes the functionality of binaries with suffixes: Suffix Description No suffix V1 API (Wake Word only). Supports all types of wake word models -U V1 API (Wake Word only). Supports \"U\" and \"W*\" type wake word models -PRLXXXX V2 API. Supports various features. See PRL Versions table for a comparison of the different PRL versions","title":"Engines"},{"location":"getting-started/release-contents/index.html#models","text":"The models folder contains subfolders with various model formats for the provided targets. These format names are used only to distinguish between the formats and have no relationship to the target or locale name. To determine what model should be used for a given locale and architecture, see the section Model Selection . Each model will have a source code version (.cpp) and a binary (.bin) version. Depending on how your application will load the model (dynamically at run-time or statically at compile time), choose the corresponding version.","title":"Models"},{"location":"support/getting-help.html","text":"Getting Help For all technical support inquiries please contact your AVS Solutions Architect (or the contact within Amazon that provided you this package). If you don't have a point of contact within Amazon, reach out to avs-sa-questions@amazon.com .","title":"Getting Help"},{"location":"support/getting-help.html#getting-help","text":"For all technical support inquiries please contact your AVS Solutions Architect (or the contact within Amazon that provided you this package). If you don't have a point of contact within Amazon, reach out to avs-sa-questions@amazon.com .","title":"Getting Help"},{"location":"support/md-cheatsheet.html","text":"Markdown Cheatsheet Headings # Heading 1 ## Heading 2 ### Heading 3 #### Heading 4 ##### Heading 5 Tables | Method | Description | | ------------ | -------------------- | | `getParam()` | Gets a parameter | | `setParam()` | Sets a parameter | | `destroy()` | The end of the world | Method Description getParam() Gets a parameter setParam() Sets a parameter destroy() The end of the world Images Use Relaive Paths For global or shared images, put them in resources/images and reference them using relative path: ![logo](../resources/images/logo-alexa-text-horiz-dark.png){style=\"height:50px\"} For local images, put them in a folder called <page_name>.assets and use a relative path: ![logo](md-cheatsheet.assets/logo-alexa-text-horiz-dark.png){style=\"height:50px;\"} Image Size Use the style block as follows after the image to specify with and/or height ![logo](md-cheatsheet.assets/logo-alexa-text-horiz-dark.png){style=\"height:25px;\"} Alignment In the style section after the image block, use .center , .left , or .right ![logo](md-cheatsheet.assets/logo-alexa-text-horiz-dark.png){.center style=\"height:50px;\"} Captions Captions (and other advanced features) require falling back to HTML. Use a standard HTML image tag inside a figure, and a <figcaption> block after your image. <figure> <img src='/Support/md-cheatsheet.assets/logo-alexa-text-horiz-dark.png' style=\"height:50px;\"} /> <figcaption>This is the logo for Alexa</figcaption> </figure> This is the logo for Alexa Code Blocks Use triple backticks followed by language (optional) on the first line, triple backticks again to close the block. \u200b```c typedef struct STrio { int one; int two; int three; } STrio; ``` typedef struct STrio { int one ; int two ; int three ; } STrio ; Inline Code For `inline code` wrap the text in single backticks. For inline code wrap the text in single backticks. Admonitions (Note/Info/Warn etc) !!! note You might want to note this. Only caveat: if you have a linebreak in here, make sure following lines are indented for the entire paragraph so it stays inside the block. Note You might want to note this. Only caveat: if you have a newline in here, make sure following lines are indented for the entire paragraph so it stays inside the block. !!! note \"Custom Title\" You might want add a custom title like this one has. Custom Title You might want add a custom title like this one has. ... and a few others Info Here's some info. Just make sure everything is indented for the entire paragraph so it stays inside the block. Tip Try doing this. It'll make your life way easier. Warning Don't do this. Wouldn't be a good idea. Error Well, you did it. Your fault. Collapsible Sections ??? \"Click to see more\" Glad you found me. Be sure to put my title in quotes on the first line. You can default me to start expanded with \"+\" next to the question marks. Click to see more Glad you found me. Be sure to put my title in quotes. You can default me to start expanded with \"+\" next to the question marks in the markdown. Tabbed Panels === \"C\" ``` c #include <stdio.h> int main(void) { printf(\"Hello world!\\n\"); return 0; } === \"C++\" ``` c++ #include <iostream> int main(void) { std::cout << \"Hello world!\" << std::endl; return 0; } C #include <stdio.h> int main ( void ) { printf ( \"Hello world! \\n \" ); return 0 ; } C++ #include <iostream> int main ( void ) { std :: cout << \"Hello world!\" << std :: endl ; return 0 ; }","title":"Markdown Cheatsheet"},{"location":"support/md-cheatsheet.html#markdown-cheatsheet","text":"","title":"Markdown Cheatsheet"},{"location":"support/md-cheatsheet.html#headings","text":"# Heading 1 ## Heading 2 ### Heading 3 #### Heading 4 ##### Heading 5","title":"Headings"},{"location":"support/md-cheatsheet.html#tables","text":"| Method | Description | | ------------ | -------------------- | | `getParam()` | Gets a parameter | | `setParam()` | Sets a parameter | | `destroy()` | The end of the world | Method Description getParam() Gets a parameter setParam() Sets a parameter destroy() The end of the world","title":"Tables"},{"location":"support/md-cheatsheet.html#images","text":"","title":"Images"},{"location":"support/md-cheatsheet.html#use-relaive-paths","text":"For global or shared images, put them in resources/images and reference them using relative path: ![logo](../resources/images/logo-alexa-text-horiz-dark.png){style=\"height:50px\"} For local images, put them in a folder called <page_name>.assets and use a relative path: ![logo](md-cheatsheet.assets/logo-alexa-text-horiz-dark.png){style=\"height:50px;\"}","title":"Use Relaive Paths"},{"location":"support/md-cheatsheet.html#image-size","text":"Use the style block as follows after the image to specify with and/or height ![logo](md-cheatsheet.assets/logo-alexa-text-horiz-dark.png){style=\"height:25px;\"}","title":"Image Size"},{"location":"support/md-cheatsheet.html#alignment","text":"In the style section after the image block, use .center , .left , or .right ![logo](md-cheatsheet.assets/logo-alexa-text-horiz-dark.png){.center style=\"height:50px;\"}","title":"Alignment"},{"location":"support/md-cheatsheet.html#captions","text":"Captions (and other advanced features) require falling back to HTML. Use a standard HTML image tag inside a figure, and a <figcaption> block after your image. <figure> <img src='/Support/md-cheatsheet.assets/logo-alexa-text-horiz-dark.png' style=\"height:50px;\"} /> <figcaption>This is the logo for Alexa</figcaption> </figure> This is the logo for Alexa","title":"Captions"},{"location":"support/md-cheatsheet.html#code-blocks","text":"Use triple backticks followed by language (optional) on the first line, triple backticks again to close the block. \u200b```c typedef struct STrio { int one; int two; int three; } STrio; ``` typedef struct STrio { int one ; int two ; int three ; } STrio ;","title":"Code Blocks"},{"location":"support/md-cheatsheet.html#inline-code","text":"For `inline code` wrap the text in single backticks. For inline code wrap the text in single backticks.","title":"Inline Code"},{"location":"support/md-cheatsheet.html#admonitions-noteinfowarn-etc","text":"!!! note You might want to note this. Only caveat: if you have a linebreak in here, make sure following lines are indented for the entire paragraph so it stays inside the block. Note You might want to note this. Only caveat: if you have a newline in here, make sure following lines are indented for the entire paragraph so it stays inside the block. !!! note \"Custom Title\" You might want add a custom title like this one has. Custom Title You might want add a custom title like this one has. ... and a few others Info Here's some info. Just make sure everything is indented for the entire paragraph so it stays inside the block. Tip Try doing this. It'll make your life way easier. Warning Don't do this. Wouldn't be a good idea. Error Well, you did it. Your fault.","title":"Admonitions (Note/Info/Warn etc)"},{"location":"support/md-cheatsheet.html#collapsible-sections","text":"??? \"Click to see more\" Glad you found me. Be sure to put my title in quotes on the first line. You can default me to start expanded with \"+\" next to the question marks. Click to see more Glad you found me. Be sure to put my title in quotes. You can default me to start expanded with \"+\" next to the question marks in the markdown.","title":"Collapsible Sections"},{"location":"support/md-cheatsheet.html#tabbed-panels","text":"=== \"C\" ``` c #include <stdio.h> int main(void) { printf(\"Hello world!\\n\"); return 0; } === \"C++\" ``` c++ #include <iostream> int main(void) { std::cout << \"Hello world!\" << std::endl; return 0; } C #include <stdio.h> int main ( void ) { printf ( \"Hello world! \\n \" ); return 0 ; } C++ #include <iostream> int main ( void ) { std :: cout << \"Hello world!\" << std :: endl ; return 0 ; }","title":"Tabbed Panels"},{"location":"support/faq/index.html","text":"FAQ This section addresses frequently asked questions related to the Amazon Wake Word Engine. General Questions What's the difference between V1 API and V2 API? V2 API lays the foundation for supporting media wake suppression and other features. V2 API needs minimal integration effort. V2 API is not backwards compatible with V1 API which will be deprecated in the future. libpryon_lite-PRL2000 uses the new Pryonlite V2 API and is a replacement for libpryon_lite which uses V1 API. There are many models under the models folder, which one do I use? See Model Selection . There are multiple binaries (libpryon_lite.a, libpryon_lite.so, amazon_ww_filesim) in the package, which one should I use? See Release Contents . Why is the wake word model performance worse on the device compared to results from simulation using filesim? Differences in audio front ends can cause differences in model performance. Our models are trained using data with background noise, and some processing algorithms (e.g. noise suppression) ahead of the wake word engine may adversely affect the wake word and Automatic Speech Recognition (ASR) performance. Under device playback, we see very high False Reject Rate (FRR), what can be done? Although the wake word detection threshold does not relate to the same performance for all models, we can tweak the threshold to adjust the overall sensitivity of the engine based solely on the device/front end in use. See: #### How do I set the wake word detection threshold? Client properties with model specific overrides for device playback state can be a better solution, but as of today, we do not have these override thresholds configured for all models. How do I enable Low Latency? The \"lowLatency\" mode, is valid only for \"U\" class models. This configuration parameter will reduce latency by 225 ms on average. It's disabled by default. To enable it, set PryonLiteDecoderConfig.lowLatency = true . The lower detection latency is at the cost of less accurate wake word end indices. How do I set the wake word detection threshold? Amazon PryonLite engine uses a detection threshold for trading off between False Accept (FA) and False Reject Ratio (FRR). The default is 500, while 1 is most sensitive (lowest FRR, highest FA) and 1000 is most restrictive (Lowest FA, highest FRR). It is generally recommend that if you have representative audio from the device\u2019s AFE, testing in steps of 50 around 500 (500, 450, 550, 400, 600, 350, 650) to see which gives the best balance of FA/FRR. Most devices should land near 500 +/- 100-150. To set the detection threshold, update wakewordConfig.detectThreshold in the main() loop or call PryonLiteWakeword_SetDetectionThreshold() any time after decoder initialization. Please refer to Detection Threshold . V2 API - Examples are provided in api_sample_PRL2000.cpp (or for PRL1000, PRL5000 ): wakewordConfig.detectThreshold = 500; // default threshold status = PryonLiteWakeword_SetDetectionThreshold(sHandle.ww, NULL, detectionThreshold) V1 API \u2013 Examples are provided in api_sample.cpp: config.detectThreshold = 500; // default threshold status = PryonLiteDecoder_SetDetectionThreshold(sDecoder, NULL, detectionThreshold); How do I interpret errors? PryonLite library API functions return PryonLiteStatus, which has 2 fields, publicCode and internalCode . publicCode type is enum PryonLiteError (defined in pryon_lite_error.h), and internalCode is an internal error code. For the public error code, please refer to pryon_lite_error.h in the package, under the folder for the architecture you are using (e.g. armv7a-linaro820). There is more information in the internal error code. Important When submitting a support ticket related to errors, be sure to provide both error codes to help diagnose the problem as quickly as possible. For example, the most common error when calling PryonLite_GetModelAttributes() is 8 ( PRYON_LITE_ERROR_MODEL_INCOMPATIBLE ). Most often the solution is to ensure that the engine and the model are compatible. This could be caused by mixing models and engine from different versions, or using model binaries from one architecture for another one (e.g. using x86 model binaries for aarch64-linaro541 architecture). See question \"There are many models under the models folder, which one do I use?\" The device wakes up to itself when the speech includes \"Alexa\". How can this be prevented? This type of false positives is known as \"self-wake\". Typically, we would expect an acoustic echo canceler (AEC) to be removing most or all of any echo in the microphone signal, and reduce self-wakes. Please refer to Self-Wake Mitigation Overview . Fingerprinting How many commercials are supported? Different versions of the list (small, medium, large) are available for download. Devices are programmed to ask for the right size based on their storage capacity. How often are the lists updated? The media list is compiled and updated on a weekly basis. The devices contact DAVS services periodically and check whether a new list is available. If there is a new list, the device will download it. Do I need the engine with V2 API before I can use fingerprinting? Yes, if you are using PryonLite engine with V1 API, you need to switch to V2 API. The effort is minimal, and should be about a week. VAD Does using VAD to gate Wakeword functionality increase the accuracy of the WW Engine? No. Its purpose is power saving only. For EnergyDetection in specific, it does very slightly degrade the accuracy of the WW Engine, typically 0.5-1.0% relative decrease in FAR/FRR, depending on the model being used. Can I increase or decrease the sensitivity of VAD? No. Currently the sensitivity is hard-coded inside the engine. Cascade Mode How does a customer make a request for models to run in cascade configuration? Please reach out to your AVS Solution Architect to make a new package request. Cascade or stand-alone model(s) will be provided based on the information entered in the request. What is the minimum CPU and memory required to run the first-stage detector? Around 20 MIPS and 125 KB of memory is required to run the smallest ultra-low-power model. Does VAD come built in with the first-stage detector? Yes, it does, but it is disabled by default. The customer is also free to use their custom VAD instead if preferred. Do I need to transmit Pre-Roll from the first stage to the second stage? Yes; in fact, you will need to transfer more than 500 milliseconds of audio to downstream stages. For devices that implement a 'cascade architecture' for low-power wake word detection, the final device-side wake word verification stage must meet pre-roll AVS device-to-cloud streaming requirements. So all upstream stages - in particular, the first-stage ultra-low-power wake word detector - must forward sufficient pre-roll for the final stage to be guaranteed to have the pre-roll it needs. It is recommended that a cascade architecture be implemented such that the pre-roll from the first stage be variable via run-time configuration from a minimum of 500 milliseconds, to upwards of 700 milliseconds, for a final on-device wake word verification stage that applies only wake word verification. If the final verification stage also applies on-device media-induced wake suppression techniques like fingerprint matching, or watermark detection, those features may require upwards of two to three seconds of pre-roll. Contact the PryonLite team for the latest requirements specific to your device integration. WWDI Why should my device implement WWDI? WakeWord Diagnostic Information (WWDI), also known as metadata, is used for monitoring wake word (WW) engine health in the field. WWDI includes WW engine/model versions, WW engine state, and WW detection threshold. Without this data we are blind when debugging customer issues related to WW detection, on-device fingerprinting, speaker ID, and other WW related features. WWDI also provides data for tracking of model performance in the field. For more information please see: Wake Word Docs WWDI What are start and end indices for Cloud-Based Wake Word Verification? Start index and end index are part of the SpeechRecognize Event HTTP request Sample Message . The start index represents the index in the audio stream where the wake word starts, in samples. The start index should be accurate to within 50 msec of wake word detection. We require 500 msec pre-roll before the wake word, which results in a start index of 8000. During the certification start index and end index are verified to ensure 500 msec pre-roll is available in the audio stream. Please refer to Streaming Requirements for Cloud-based Wake Word Verification . What if I leave the WWDI part of recognizer event empty? Leaving WWDI part empty will lead to AVS Certification failure. WWDI is used for understanding WWE performance and assessing health of devices in the field. Not implementing WWDI or modifying any content intended for WWDI hampers diagnostics efforts. Implementing WWDI is required as specified in the license.txt in the WWE package. Do I have to integrate WWDI even for the first stage wake word detector in a cascade architecture ? Currently, it is not mandatory for the first stage wake word detector to send its WWDI to the second stage wake word detector. However, it is still recommended to send the first stage WWDI to the second stage as it would enable smoother integration of upcoming features that require the first stage WWDI. AVS certification Why do I need to certify my device? If your product is intended for commercial use, you must submit your product for certification. To help you meet Amazon standards and build the best possible Alexa integration, all devices must go through Amazon\u2019s testing and certification process before receiving approval for launch. Please refer to Alexa Built-in Testing and Certification Process . When do I need to re-certify my device? Please refer to Launch and post-certification Over-the-Air (OTA) updates . How do I certify the Alexa App on an OEM phone? (Certifying a Mobile OEM for Alexa App launch) For a Mobile OEM to be able to launch with Alexa Mobile App, the OEM needs to be an AVS partner. For further details, please refer to Solution Providers and ODM Solution Providers . AVS Device SDK Where do I get the latest version of AVS device SDK? Please refer to ( link ). What is the advantage of using the AVS Device SDK rather than my own integration? The SDK implements cloud communication, WakeWord Diagnostic Information (WWDI), Communications, Music, and other features. Using the SDK saves considerable (over 50 hrs) development time. You would benefit from maintenance releases. Reduce chance of introducing new bugs in the custom application. For more information, please refer to AVS Device SDK . Where do I get the wake word adapter? For customers with access to AVS Developer Portal, the wake word adapter is available here: AVS SDK Adapter for Wake Word Lite . Please note, you have to be added to the allow list in order to get the access. Please reach out to the Amazon SAs for instructions. File Simulators (filesims) How do I use the File Simulators? See the instructions in the File Simulators section. When I try a file by itself, the wake word is detected, but when I try the same file as part of a list of files, the wake word is not detected, what is wrong? The wake word engine uses adaptation logic to learn the characteristics of the acoustic environment. This improves detection quality of all audio files in the list are recorded in the same acoustic environment. However if the audio clips are from different environments and do not contain at least 1 second of background noise prior to the wake word, the detection of the wakewords in subsequent files is impacted by changing characteristics. To avoid this problem, we suggest using -c which will cause the engine to clear the adaptation stats after each file.","title":"FAQ"},{"location":"support/faq/index.html#faq","text":"This section addresses frequently asked questions related to the Amazon Wake Word Engine.","title":"FAQ"},{"location":"support/faq/index.html#general-questions","text":"","title":"General Questions"},{"location":"support/faq/index.html#fingerprinting","text":"","title":"Fingerprinting"},{"location":"support/faq/index.html#vad","text":"","title":"VAD"},{"location":"support/faq/index.html#cascade-mode","text":"","title":"Cascade Mode"},{"location":"support/faq/index.html#wwdi","text":"","title":"WWDI"},{"location":"support/faq/index.html#avs-certification","text":"","title":"AVS certification"},{"location":"support/faq/index.html#avs-device-sdk","text":"","title":"AVS Device SDK"},{"location":"support/faq/index.html#file-simulators-filesims","text":"","title":"File Simulators (filesims)"}]}; var search = { index: new Promise(resolve => setTimeout(() => resolve(local_index), 0)) }