Abstract
The field of speech signal processing has undergone significant transformation through extensive research. There is growing interest in Speech Enhancement (SE) and Automatic Speech Recognition (ASR), with SE serving as a crucial preliminary step to enhance ASR performance. This paper addresses key challenges, particularly the need to maintain speech quality and improve intelligibility in ASR systems. Recently, deep learning techniques have emerged as powerful tools for tackling these challenges. This systematic review examines speech enhancement and recognition techniques, emphasizing denoising, acoustic modeling, and beamforming. Various deep learning architectures, such as Deep Neural Networks (DNN), Convolutional Neural Networks (CNN), Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) networks, and Hybrid Neural Networks, are reviewed to highlight their roles in enhancement and recognition. The review specifically details their usage, the features utilized in each study, the databases employed, performance, and limitations, all presented in a structured tabular format. This approach provides valuable insights into the strengths and weaknesses of each method, guiding future advancements in the field. In particular, it emphasizes that LSTM-RNN models excel in temporal signal processing, while hybrid models demonstrate superior performance in optimizing task outcomes. The paper conducts a comprehensive statistical analysis of 187 research papers that exclusively utilize deep neural networks to address the challenges of speech enhancement and recognition, presenting the latest advances in the field. The review examines publications from 2012 to 2024, shedding light on research trends and patterns, while the proposed solutions aim to bridge gaps for researchers in this evolving domain.
Official URL or Download Paper: https://www.sciencedirect.com/science/article/pii/...
Additional Metadata
| Item Type: | Article |
|---|---|
| Subject: | Engineering (all) |
| Divisions: | Faculty of Computer Science and Information Technology; Faculty of Engineering |
| DOI Number: | https://doi.org/10.1016/j.asej.2025.103405 |
| Publisher: | Ain Shams University |
| Keywords: | Acoustic modeling; Beamforming; Deep neural network; Denoising; Machine learning; Reverberation; Speech enhancement; Speech recognition; Systematic review |
| Depositing User: | Ms. Nur Faseha Mohd Kadim |
| Date Deposited: | 07 Apr 2026 02:23 |
| Last Modified: | 07 Apr 2026 02:23 |
| Altmetrics: | http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.1016/j.asej.2025.103405 |
| URI: | http://psasir.upm.edu.my/id/eprint/124114 |