A Hybrid Rough Sets K-Means Vector Quantization Model For Neural Networks Based Arabic Speech Recognition
Babiker, Elsadig Ahmed Mohamed (2002) A Hybrid Rough Sets K-Means Vector Quantization Model For Neural Networks Based Arabic Speech Recognition. PhD thesis, Universiti Putra Malaysia.
Speech is a natural, convenient and rapid means of human communication. The ability to respond to spoken language is of special importance in computer applications where the user cannot make proper use of his or her limbs, and it may also be useful in office automation systems. It can help in developing control systems for many applications, such as telephone assistance systems.

Rough sets theory represents a mathematical approach to vagueness and uncertainty. Data analysis, data reduction, approximate classification, machine learning, and discovery of patterns in data are functions performed by rough sets analysis. It was one of the first non-statistical methodologies of data analysis. It extends classical set theory by incorporating into the set model the notion of classification as an indiscernibility relation.

In previous work, the application of the rough sets approach to speech recognition was limited to the pattern matching stage, that is, to using training speech patterns to generate classification rules that can later be used to classify input word patterns. In this thesis the rough sets approach was used in the preprocessing stages, namely in the vector quantization operation, in which feature vectors are quantized, or classified, into a finite set of codebook classes. Classification rules were generated from a set of training feature vectors, and a modified form of the standard voter classification algorithm that uses the rough sets generated rules was applied. A vector quantization model that incorporates rough sets attribute reduction and rule generation with a modified version of the K-means clustering algorithm was developed, implemented and tested as part of a speech recognition framework in which the Learning Vector Quantization (LVQ) neural network model was used in the pattern matching stage.
In addition to the Arabic speech data that were used in the original experiments, for both speaker-dependent and speaker-independent tests, further verification experiments were conducted using the TI20 speech data. The rough sets vector quantization model proved its usefulness in the speech recognition framework; however, it can be extended to other applications that involve large amounts of data, such as speaker verification.
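
As a rough illustration of the vector quantization step described in the abstract, the sketch below trains a codebook with the standard K-means algorithm and then maps each feature vector to the index of its nearest codeword. It is not the thesis's method: the rough sets attribute reduction, the rule generation and the modifications to K-means are all omitted. The function names, the toy two-dimensional feature vectors and the choice of the first k vectors as initial codewords are assumptions made only for this example.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest(codebook, vec):
    """Return the index of the codeword closest to vec."""
    return min(range(len(codebook)), key=lambda i: dist(codebook[i], vec))


def train_codebook(vectors, k, iterations=20):
    """Plain K-means: return k codewords (cluster centroids)."""
    # Assumption for this sketch: deterministic init from the first k vectors.
    codebook = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        # Assignment step: group every vector under its nearest codeword.
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest(codebook, v)].append(v)
        # Update step: move each codeword to the mean of its members.
        new_codebook = []
        for old, members in zip(codebook, clusters):
            if members:
                new_codebook.append([sum(c) / len(members) for c in zip(*members)])
            else:
                new_codebook.append(old)  # keep an empty cluster's codeword
        if new_codebook == codebook:  # converged
            break
        codebook = new_codebook
    return codebook


def quantize(codebook, vectors):
    """Replace each feature vector by the index of its codebook class."""
    return [nearest(codebook, v) for v in vectors]


# Toy feature vectors (hypothetical, not speech data).
features = [(0.0, 0.1), (5.0, 5.2), (0.2, 0.0), (5.1, 4.9), (0.1, 0.2), (4.9, 5.0)]
codebook = train_codebook(features, k=2)
labels = quantize(codebook, features)
print(labels)  # [0, 1, 0, 1, 0, 1]
```

In a recognition framework of the kind the abstract describes, the resulting sequence of codebook indices, rather than the raw feature vectors, would be passed on to the pattern matching stage.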