UPM Institutional Repository

Comparative analysis of different parameters used for optimization in the process of speaker and speech recognition using deep neural network


Citation

Natarajan, Sureshkumar and Al-Haddad, Syed Abdul Rahman and Ahmad, Faisul Arif and Hassan, Mohd Khair and Raja Kamil and Azrad, Syaril and Yahya, Mohammed Nawfal and Macleans, June Francis and Salvekar, Pratiksha Prashant (2022) Comparative analysis of different parameters used for optimization in the process of speaker and speech recognition using deep neural network. In: 2022 International Conference on Future Trends in Smart Communities (ICFTSC), 1-2 Dec. 2022, Borneo Convention Centre Kuching, Sarawak, Malaysia. (pp. 12-17).

Abstract

Speaker recognition in a noisy and distant environment is a difficult task because the clean speech signal faces numerous challenges before it reaches the microphone. When developing a deep neural network that must function robustly in extreme conditions, it is necessary to select a compatible optimizer, loss function, and dropout. This paper presents a comparative study of the optimization process in the neural network and of how the loss function works together with the optimizer. It emphasizes the selection of the number of input nodes and hidden layers, and the time consumed by each set of selections. The study reports the accuracy obtained with different combinations of parameters for a robust deep neural network structure. The work falls under speaker and speech recognition, with a focus on improving accuracy. Experimental results show that the Adam optimizer with 150 epochs offers around 95% accuracy for speaker classification under noisy conditions at different SNR values.


Download File

Full text not available from this repository.
Official URL or Download Paper: https://ieeexplore.ieee.org/document/10040065

Additional Metadata

Item Type: Conference or Workshop Item (Paper)
Divisions: Faculty of Engineering
DOI Number: https://doi.org/10.1109/ICFTSC57269.2022.10040065
Publisher: IEEE
Keywords: Speaker identification; Speech recognition; Deep learning; Neural network; Optimizer; Dropout
Depositing User: Ms. Nuraida Ibrahim
Date Deposited: 13 Nov 2023 05:28
Last Modified: 13 Nov 2023 05:28
Altmetrics: http://www.altmetric.com/details.php?domain=psasir.upm.edu.my&doi=10.1109/ICFTSC57269.2022.10040065
URI: http://psasir.upm.edu.my/id/eprint/37829
