|ARawNet: A Lightweight Solution for Leveraging Raw Waveforms in Spoof Speech Detection|
An emerging trend in audio processing is capturing low-level speech representations from raw waveforms. These representations have shown promising results on a variety of tasks, such as speech recognition and speech separation. Compared to handcrafted features, learning speech features via backpropagation can potentially give the model greater flexibility in how it represents data for different tasks. However, empirical studies show that in some tasks, such as spoof speech detection, handcrafted features still outperform learned features.
|Year of Publication|2022|
2022 26th International Conference on Pattern Recognition (ICPR)
Montreal, QC, Canada