Résumé:
Voice activity detection (VAD) is considered one of the most important
techniques for many speech applications. It is an important method in speech processing,
as it detects the presence or absence of speech. Previously VAD performance was based on
methods that depended on signal processing signal processing, but did not perform
satisfactorily in high-noise environments, so deep learning became an alternative. A , we
adopted in the experimental study three structures for deep learning deep learning,
namely Convolutional Neural Networks (CNN) and a DenseNet network, and we also used
the three databases for speech and noise, namely LibriSpeech, TidiGets and Chimie5 in
succession. We measured accuracy in low-noise environments with various sensitivities
and achieved 100% accuracy.