Building a Multilingual Speech Corpus

DataForce supports a global audio hardware leader with high-quality data for fine-tuning its ASR engine.

The Problem

Automatic speech recognition (ASR) systems convert spoken user commands into text that is then processed by natural language processing systems. An effective ASR implementation must account for variation in sound and voice across genders, age groups, accents, and dialects, as well as the background noise of the environment where the system will be used. In this case, the client needed to collect training and test data from multiple demographic groups in English, Hindi, German, French, and Italian.

The Solution

DataForce collected voice data and background noise across several scenarios using our proprietary mobile app, DataForce Contribute. The app ensured that the audio files met all technical requirements, such as signal-to-noise ratio and sampling rate. Once all voice commands and ambient noise had been collected in parking, driving, and windows open/closed conditions, convolving the sound waves helped create data sets that simulated a real environment. With DataForce’s solution, the client developed and tested an efficient ASR engine capable of understanding voice commands in several languages across different scenarios.
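To make the simulation step more concrete, here is a minimal Python sketch of one common way to combine a clean voice command with recorded ambient noise. It is an illustration under stated assumptions, not DataForce's actual pipeline: the function names (`rms`, `snr_db`, `simulate_environment`), the optional impulse response, and the choice of RMS-based signal-to-noise ratio are all hypothetical. The sketch optionally convolves the speech with an impulse response for the acoustic space, then adds the noise scaled to a target signal-to-noise ratio.

```python
import numpy as np


def rms(x):
    """Root-mean-square level of a signal."""
    return float(np.sqrt(np.mean(np.square(x))))


def snr_db(speech, noise):
    """Signal-to-noise ratio in decibels, based on RMS levels."""
    return 20.0 * np.log10(rms(speech) / rms(noise))


def simulate_environment(speech, noise, target_snr_db, impulse_response=None):
    """Mix speech with ambient noise at a target SNR (hypothetical helper).

    If an impulse response is given, the speech is first convolved with it
    to approximate the acoustics of the space (for example, a car cabin).
    """
    speech = np.asarray(speech, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    if impulse_response is not None:
        # Convolve, then truncate so the output keeps the original length.
        speech = np.convolve(speech, impulse_response)[: len(speech)]

    if rms(speech) == 0.0 or rms(noise) == 0.0:
        raise ValueError("speech and noise must both be non-silent")

    # Loop or trim the noise recording so it covers the whole utterance.
    repeats = int(np.ceil(len(speech) / len(noise)))
    noise = np.tile(noise, repeats)[: len(speech)]

    # Scale the noise so that speech RMS / noise RMS matches the target SNR.
    gain = rms(speech) / (rms(noise) * 10.0 ** (target_snr_db / 20.0))
    return speech + gain * noise


# Example with synthetic stand-ins for real recordings (assumed 16 kHz).
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
command = np.sin(2 * np.pi * 220.0 * t)                  # stand-in for a voice command
ambient = np.random.default_rng(0).normal(size=4000)     # stand-in for driving noise
mixture = simulate_environment(command, ambient, target_snr_db=10.0)
```

Sweeping `target_snr_db` over a range of values, and swapping in noise recorded under each condition, is one plausible way to produce many simulated environments from a single clean recording.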
