Citation
Babiker, Elsadig Ahmed Mohamed
(2002)
A Hybrid Rough Sets K-Means Vector Quantization Model For Neural Networks Based Arabic Speech Recognition.
Doctoral thesis, Universiti Putra Malaysia.
Abstract
Speech is a natural, convenient and rapid means of human communication. The
ability to respond to spoken language is of special importance in computer
applications where the user cannot use his or her limbs properly, and may be
useful in office automation systems. It can also help in developing control systems
for many applications, such as telephone assistance systems.
Rough sets theory represents a mathematical approach to vagueness and uncertainty.
Data analysis, data reduction, approximate classification, machine learning, and
discovery of patterns in data are functions performed by a rough sets analysis. It was
one of the first non-statistical methodologies of data analysis. It extends classical set
theory by incorporating into the set model the notion of classification as an
indiscernibility relation.
In previous work, the application of the rough sets approach to speech recognition
was limited to the pattern matching stage. That is, training speech patterns were used
to generate classification rules that could later be used to classify input word patterns.
In this thesis the rough sets approach was used in the preprocessing stages, namely in
the vector quantization operation, in which feature vectors are quantized, or classified,
into a finite set of codebook classes. Classification rules were generated from a set of
training feature vectors, and a modified form of the standard voter classification
algorithm, which uses the rules generated by rough sets, was applied.
A vector quantization model that incorporates rough sets attribute reduction and rule
generation with a modified version of the K-means clustering algorithm was
developed, implemented and tested as part of a speech recognition framework, in
which the Learning Vector Quantization (LVQ) neural network model was used in
the pattern matching stage.
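As a rough illustration of the vector quantization step described above, the following minimal sketch trains a codebook with plain, unmodified K-means and maps a feature vector to the index of its nearest codeword. It does not reproduce the thesis's modified K-means or its rough sets rule generation; the function names (`train_codebook`, `quantize`) and the toy two-dimensional vectors are assumptions made for illustration only.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(vector, codebook):
    """Return the index of the codeword nearest to `vector`."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(vector, codebook[i]))


def train_codebook(vectors, k, iterations=20, seed=0):
    """Standard K-means (an assumption; not the thesis's modified version)."""
    rng = random.Random(seed)
    # Initialise the codebook with k distinct training vectors.
    codebook = [list(v) for v in rng.sample(vectors, k)]
    for _ in range(iterations):
        # Assignment step: group vectors by their nearest codeword.
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[quantize(v, codebook)].append(v)
        # Update step: move each codeword to the mean of its cluster.
        for i, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous codeword
                dim = len(members[0])
                codebook[i] = [sum(m[d] for m in members) / len(members)
                               for d in range(dim)]
    return codebook


# Toy "feature vectors" forming two well-separated groups (hypothetical data).
training = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
codebook = train_codebook(training, k=2)
print(quantize([0.5, 0.5], codebook), quantize([10.5, 10.5], codebook))
```

In a recognizer of the kind described, the sequence of codeword indices produced by `quantize` would stand in for the raw feature vectors passed to the pattern matching stage.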
In addition to the Arabic speech data that was used in the original experiments, for both
speaker-dependent and speaker-independent tests, further verification experiments
were conducted using the TI20 speech data.
The rough sets vector quantization model proved its usefulness in the speech
recognition framework. It can also be extended to other applications that
involve large amounts of data, such as speaker verification.