Assessing the effectiveness of transfer learning strategies in BLSTM networks for speech denoising


Abstract

Denoising speech signals is a challenging task given the increasing number of applications and technologies currently implemented in communication and portable devices. In those applications, adverse environmental conditions such as background noise, reverberation, and other sound artifacts can degrade the quality of the signals. As a result, they also affect systems for speech recognition, speaker identification, and sound source localization, among many others. Several algorithms have been proposed over the past decades for denoising speech signals degraded by many kinds, and possibly different levels, of noise. Recent proposals based on deep learning are presented as state-of-the-art, in particular those based on Long Short-Term Memory networks (LSTM and Bidirectional LSTM, BLSTM). This work presents a comparative study of different transfer learning strategies for reducing the training time and increasing the effectiveness of this kind of network. Reducing training time is one of the most critical challenges because of the high computational cost of training LSTM and BLSTM networks. The strategies arise from the different options for initializing the networks, using clean or noisy information of several types. Results show the convenience of transferring information from a single denoising network to the rest, with a significant reduction in training time and a benefit to the denoising capabilities of the BLSTM networks.

Article Details

How to Cite
Marvin, Astryd, & Michelle. (2022). Assessing the effectiveness of transfer learning strategies in BLSTM networks for speech denoising. Tecnología En Marcha Journal, 35(8), pp. 42–49. https://doi.org/10.18845/tm.v35i8.6448
Section
Scientific article

