Motion Path Generation Using A Modified 6th Order Polynomial Function for Visual Speech Synthesis
Salleh, Siti Salwa (2008) Motion Path Generation Using A Modified 6th Order Polynomial Function for Visual Speech Synthesis. PhD thesis, Universiti Putra Malaysia.
A facial model with synchronized speech and lip movement can increase speech intelligibility. Such a system is called a Visual Speech Synthesis (VSS) system. Realistic visual speech synthesis normally requires manipulating the vertices of the facial mesh. This process is complex and requires large amounts of memory and computational power. An alternative is to use a parametric function that controls the motion of points on the lip model. This study therefore proposes a 6th order polynomial function as the lip motion curve. The 6th order polynomial curve, however, is wild and unstable at its beginning and end, so it must be altered before it can serve as a motion curve. A formulation is proposed to flatten these curvy portions. The altered polynomial curve is then used to develop computational steps that compose the utterance of isolated digit words. This technique generates visual speech that starts and ends with a neutral lip shape. It also increases the lip motion velocity and acceleration at the beginning of the utterance and decreases them as the utterance completes. Another contribution of this study is a computational technique, with its steps, for generating continuous utterances based on the altered 6th order polynomial. This technique focuses on the lip motion between one utterance and the next, and it also takes motion velocity and acceleration into account when synthesizing continuous utterances. As a result, it produces realistic and smooth continuation. The synthesized visual speech was compared with actual lip deformation to assess how realistic it is. The actual motion curve was captured using optical motion capture software. Motion similarity was measured using the correlation coefficient between the two curves.
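The idea of a 6th order motion curve that is flat at both ends can be illustrated with a minimal sketch. This is not the formulation proposed in the thesis, which the abstract does not give. It assumes a normalized time t in [0, 1] and the hypothetical curve p(t) = 64 * peak * t^3 * (1 - t)^3, which has zero displacement, velocity and acceleration at t = 0 and t = 1 and reaches `peak` at t = 0.5. The function names and the peak value are illustrative assumptions.

```python
def flat_ended_sextic(peak):
    """Coefficients c[0..6] of p(t) = 64 * peak * t^3 * (1 - t)^3, lowest power first.

    Hypothetical stand-in for the altered curve: a 6th order polynomial whose
    value, first derivative and second derivative all vanish at t = 0 and t = 1.
    """
    scale = 64.0 * peak
    # t^3 * (1 - t)^3 expands to t^3 - 3 t^4 + 3 t^5 - t^6
    return [0.0, 0.0, 0.0, scale, -3.0 * scale, 3.0 * scale, -scale]


def derivative(coeffs):
    """Coefficients of the derivative of a polynomial given lowest power first."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, t):
    """Evaluate the polynomial at t using Horner's scheme."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result


# Assumed example: a control vertex displaced by at most 10 units from neutral.
position = flat_ended_sextic(peak=10.0)
velocity = derivative(position)
acceleration = derivative(velocity)

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, evaluate(position, t), evaluate(velocity, t), evaluate(acceleration, t))
```

Under these assumptions the vertex leaves the neutral position smoothly, speeds up, and comes back to rest at the neutral position, which matches the qualitative behaviour the abstract describes for an isolated utterance.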
The relocation of the control vertices, the lip width and height during speech, the motion velocity and acceleration of the control vertices, and the shape similarity between the synthesized and actual lips were also measured. The results show that the altered 6th order polynomial function can produce good speech synthesis, with 88% to 95% similarity. In future work, these techniques will be improved to produce higher-quality visual speech synthesis.
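A correlation-based similarity measure between a synthesized and a captured motion curve can be sketched as follows. The abstract does not say which correlation coefficient was used, so the Pearson coefficient here is an assumption, and the two sample curves are made-up illustrative values, not data from the thesis.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equally sampled curves."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("curves must have the same length and at least 2 samples")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0.0 or var_y == 0.0:
        raise ValueError("correlation is undefined for a constant curve")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical samples of one control vertex's displacement over an utterance.
synthesized = [0.0, 1.2, 4.8, 8.9, 10.0, 8.7, 4.9, 1.1, 0.0]
captured = [0.1, 1.6, 5.3, 8.2, 9.6, 8.9, 5.6, 1.8, 0.2]

print(pearson_correlation(synthesized, captured))
```

A value close to 1 indicates that the two curves rise and fall together. How such a coefficient relates to the similarity percentages reported above is not stated in the abstract.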