Improving Phoneme to Viseme Mapping for Indonesian Language

Anung Rachman(1*), Risanuri Hidayat(2), Hanung Adi Nugroho(3)

(1) Universitas Gadjah Mada; Institut Seni Indonesia (ISI) Surakarta
(2) Universitas Gadjah Mada
(3) Universitas Gadjah Mada
(*) Corresponding Author


Lip synchronization in animation can be automated through a phoneme-to-viseme map. Because the complexity of the facial muscles causes mouth shapes to vary greatly, phoneme-to-viseme mapping remains a challenging problem. One difficulty is the treatment of vowel allophones: because they resemble one another, many researchers cluster them into a single class. This paper examines whether vowel allophones should be treated as distinct variables in the phoneme-to-viseme map. The proposed method pre-processes vowel allophones by extracting formant frequency features and comparing them with a t-test to determine whether their differences are significant. The pre-processing results then serve as reference data when building the phoneme-to-viseme map. The research was conducted on maps and allophones of the Indonesian language. The resulting map is compared with other maps using an HMM-based method, in terms of word correctness and accuracy. The results show that a viseme map preceded by allophone pre-processing performs more accurately than the other maps.
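
The comparison step described above, testing whether the formant frequencies of two vowel allophones differ significantly, can be illustrated with a minimal sketch. It makes several assumptions: the abstract does not say which t-test variant was used, so Welch's unequal-variance form is chosen here for illustration; the function name `welch_t` is hypothetical; and the F1 values are invented placeholders, not measurements from the paper.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom).

    Assumption: Welch's unequal-variance t-test; the paper's exact
    variant is not stated in the abstract.
    """
    na, nb = len(a), len(b)
    # Squared standard errors of each sample mean (sample variance / n).
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Hypothetical first-formant (F1) values in Hz for two allophones of one
# vowel. These numbers are illustrative only, not data from the paper.
f1_allophone_a = [310.0, 325.0, 298.0, 315.0, 307.0]
f1_allophone_b = [402.0, 390.0, 415.0, 398.0, 408.0]

t_stat, df = welch_t(f1_allophone_a, f1_allophone_b)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

A large absolute t value relative to the t distribution with `df` degrees of freedom would suggest that the two allophones are acoustically distinct and could justify separate entries in the map. Converting the statistic to a p-value requires the t distribution's CDF, which is available for instance through `scipy.stats.ttest_ind(a, b, equal_var=False)`.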


Phoneme-to-Viseme Mapping; Allophones; Vowels; Formant Frequencies; Lip-Reading

