Speaker Identification Using Wavelet Packet Transform and Feed Forward Neural Network

Citation

Almashrgy, Mohamed Ali (2005) Speaker Identification Using Wavelet Packet Transform and Feed Forward Neural Network. Masters thesis, Universiti Putra Malaysia.

Abstract

It has long been known that speakers can be identified from their voices. In this work we introduce a speaker identification system that uses the wavelet packet transform, a form of wavelet analysis, for feature extraction and a neural network for classification. The system is applied to ten speakers. Instead of dividing the signal into frames, the wavelet packet transform is applied to the whole signal, which reduces the calculation time. The speech signal is decomposed into 24 sub-bands according to the Mel frequency scale. The log energy of each band is then taken, and finally the discrete cosine transform is applied to the resulting log-energy values. The outputs are used as features for identifying a speaker among many speakers. For classification, a feed-forward multilayer perceptron trained by backpropagation is proposed to train on and classify each speaker's feature vectors. We propose to construct a single neural network for each speaker of interest. The training and test data consist of isolated words in three cases, viz. one-, two-, and three-syllable words, recorded from laboratory colleagues using a low-cost microphone.
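The feature extraction chain described in the abstract (wavelet packet decomposition of the whole signal, log energy per band, discrete cosine transform) can be illustrated with a minimal sketch. It is not the thesis implementation: the abstract does not name the wavelet or the tree layout, so the sketch assumes a Haar filter and a uniform tree of depth 3 (8 bands) in place of the 24 Mel-spaced bands. All function names are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def haar_split(x):
    """One Haar analysis step: return the (low-pass, high-pass) halves."""
    x = x[: len(x) // 2 * 2]  # drop a trailing sample if the length is odd
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2.0), (even - odd) / np.sqrt(2.0)


def wavelet_packet_bands(signal, depth):
    """Full wavelet packet tree: every node is split at every level.

    Returns 2**depth bands in natural tree order, which is not sorted
    by frequency.
    """
    bands = [np.asarray(signal, dtype=float)]
    for _ in range(depth):
        bands = [half for band in bands for half in haar_split(band)]
    return bands


def dct2(v):
    """Unnormalised DCT-II of a 1-D vector."""
    n = len(v)
    k = np.arange(n)[:, None]  # output index
    m = np.arange(n)[None, :]  # input index
    return np.cos(np.pi * k * (2 * m + 1) / (2 * n)) @ v


def features(signal, depth=3, eps=1e-12):
    """Log energy of each band, followed by a DCT across the bands."""
    bands = wavelet_packet_bands(signal, depth)
    # eps (an assumption) guards against log(0) for silent bands
    log_energy = np.array([np.log(np.sum(b ** 2) + eps) for b in bands])
    return dct2(log_energy)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    utterance = rng.standard_normal(1024)  # stand-in for a recorded word
    print(features(utterance).shape)
```

Because the transform runs over the whole utterance, each recording yields a single feature vector, one value per band, rather than one vector per frame. A Mel-like layout would require splitting the low-frequency nodes more deeply than the high-frequency ones, which this uniform tree does not attempt.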

Additional Metadata

Item Type:	Thesis (Masters)
Subject:	Neural networks (Computer science)
Call Number:	FK 2005 36
Chairman Supervisor:	Associate Professor Adznan Bin Jantan, PhD
Divisions:	Faculty of Engineering
Date Deposited:	19 Dec 2008 12:41
Last Modified:	27 May 2013 06:51
URI:	http://psasir.upm.edu.my/id/eprint/865