Citation
Salleh, Siti Salwa (2008) Motion Path Generation Using A Modified 6th Order Polynomial Function for Visual Speech Synthesis. PhD thesis, Universiti Putra Malaysia.
Abstract
A facial model with synchronized speech and lip motion can increase speech
intelligibility. A system of this kind is called a Visual Speech Synthesis (VSS) system.
Realistic visual speech synthesis normally requires manipulation of the facial mesh’s
vertices. This process is complex and requires large amounts of memory and computational
power. An alternative technique is to use a
parametric function that controls the motion of points on the lip model.
This study therefore proposes the use of a 6th order polynomial function as the lips’
motion curve. The 6th order polynomial curve, however, is wild and unstable at the
beginning and end of the curve.
The curve must be altered before it can be used as the motion curve, so a formulation is
proposed to flatten the curvy portions. The altered polynomial curve is then used to develop computational steps that compose an isolated
digit-word utterance. This technique generates visual speech
synthesis that starts and ends with a neutral lip shape. It also increases the
lip motion velocity and acceleration at the beginning of the utterance and decreases
them when the utterance is complete. Another
contribution of this study is a computational technique, with steps, for generating
continuous utterances based on the altered 6th order polynomial. This technique
focuses on lip motion between one utterance and the next. It also considers
motion velocity and acceleration when synthesizing continuous utterances. As a result, it
produces realistic and smooth continuation. The synthesized visual speech
was compared with the actual lip deformation to assess how realistic
it is. The actual motion curve was captured using optical motion capture
software. Motion similarity was measured based on the correlation coefficient
between the two curves. Also measured were the relocation of the control vertices, the lip width and height
during speech, the motion velocity and acceleration of the control vertices, and the
shape similarity between the synthesized and actual lips. Results
show that the altered 6th order polynomial function produces
good speech synthesis, with 88% to 95% similarity. In future work, these
techniques will be improved to produce higher-quality visual speech synthesis.
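
The similarity measure mentioned above can be illustrated with a minimal sketch. It assumes the Pearson correlation coefficient, which the abstract does not specify, and the function name `pearson` and the sample values are hypothetical, made up for illustration rather than taken from the thesis.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Made-up illustrative samples of one control vertex's displacement over time.
# These are NOT data from the thesis.
synthesized = [0.0, 0.2, 0.7, 1.0, 0.6, 0.2, 0.0]
captured = [0.0, 0.3, 0.6, 1.0, 0.7, 0.1, 0.0]

r = pearson(synthesized, captured)
print(f"correlation between synthesized and captured curves: {r:.3f}")
```

A value near 1 indicates that the two curves rise and fall together. Both curves must be sampled at the same time instants for the comparison to be meaningful.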