Faculty of ICT, Mahidol University
 


 

THAI SYLLABLE SPEECH RECOGNITION USING HIDDEN MARKOV MODELS

 

TITLE THAI SYLLABLE SPEECH RECOGNITION USING HIDDEN MARKOV MODELS.
AUTHOR PORNCHAI POARAMSRI
DEGREE MASTER OF SCIENCE PROGRAMME IN COMPUTER SCIENCE
FACULTY FACULTY OF SCIENCE
ADVISOR SUPACHAI TANGWONGSAN
CO-ADVISOR CHOMTIP PORNPANOMCHAI
 
ABSTRACT
Spoken language is one of the most intuitive ways for humans to communicate, and speech-based interaction between humans and machines is an important part of the new era of human-computer interfacing. Speech recognition gives a computer the ability to understand human speech. English speech recognition has been very successful and is now found in many everyday applications, whereas Thai speech recognition is still far from practical use or common acceptance. This work develops a Thai speech recognition system, with the scope limited to speaker-dependent recognition of isolated syllables. To that end, the Hidden Markov Model (HMM) was studied and an HMM-based speech recognition system was developed. In addition to the Continuous-Density HMM (CDHMM) and the Semi-Continuous HMM (SCHMM), the Generalized Mixture Tying HMM (GENONE HMM), a special form of HMM, was adopted. The Viterbi training procedure used to train the HMMs was modified for the GENONE HMM, and the beam pruning technique was adapted and integrated into the system to reduce recognition time. An experiment on Thai tone recognition was also carried out with the same HMM-based system. The acoustic features used were Mel-Frequency Cepstral Coefficients (MFCC), energy and pitch frequency. The system was developed and tested on an Intel-based machine running MS Windows 2000. All programs were written in C/C++, using Microsoft Visual C++ 6.0 as the development tool.
The recognition accuracy of speaker-dependent Thai isolated-syllable recognition using HMMs was above 98%. With beam pruning at an appropriate beam width, recognition time was reduced by 65.4% with no additional errors. Thai tone recognition using HMMs yielded accuracies of 97.88%, 97.36%, 98.81%, 90.67% and 100.0% for the MID, LOW, FALLING, HIGH and RISING tones, respectively. In conclusion, the system satisfactorily meets the research objectives, especially in terms of recognition accuracy and speed.
KEYWORD THAI SPEECH RECOGNITION / HIDDEN MARKOV MODEL / GENERALIZED MIXTURE TYING / GENONE / VITERBI TRAINING / BEAM PRUNING
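
The abstract reports that beam pruning cut recognition time without adding errors. As an illustration only, the idea can be sketched with a small Viterbi decoder that, at each frame, drops every state whose log score falls more than a fixed beam width below the best state. This is not the thesis implementation (which was written in C/C++); the function name, the toy three-state left-to-right model, and the beam width of 2.0 are all assumptions made for this sketch.

```python
import math

NEG_INF = float("-inf")


def viterbi_beam(log_init, log_trans, log_emit, beam_width=None):
    """Viterbi decoding in the log domain with optional beam pruning.

    log_init[s]     : log initial probability of state s
    log_trans[r][s] : log transition probability from state r to state s
    log_emit[t][s]  : log likelihood of frame t under state s
    beam_width      : states scoring more than this below the current best
                      are not expanded; None disables pruning
    Returns (best_log_score, best_state_path).
    """
    n_states = len(log_init)
    scores = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    backptrs = []
    for t in range(1, len(log_emit)):
        best = max(scores)
        threshold = NEG_INF if beam_width is None else best - beam_width
        # Only states inside the beam are expanded at this frame.
        active = [s for s in range(n_states)
                  if scores[s] > NEG_INF and scores[s] >= threshold]
        new_scores = [NEG_INF] * n_states
        ptr = [-1] * n_states
        for r in active:
            for s in range(n_states):
                cand = scores[r] + log_trans[r][s] + log_emit[t][s]
                if cand > new_scores[s]:
                    new_scores[s] = cand
                    ptr[s] = r
        scores = new_scores
        backptrs.append(ptr)
    # Trace back from the best final state.
    last = max(range(n_states), key=lambda s: scores[s])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return scores[last], path


def _log(p):
    return math.log(p) if p > 0 else NEG_INF


# Hypothetical toy model: 3-state left-to-right HMM, 4 observation frames.
log_init = [_log(p) for p in (1.0, 0.0, 0.0)]
log_trans = [[_log(p) for p in row] for row in (
    (0.5, 0.5, 0.0),
    (0.0, 0.5, 0.5),
    (0.0, 0.0, 1.0),
)]
log_emit = [[_log(p) for p in row] for row in (
    (0.90, 0.05, 0.05),
    (0.10, 0.80, 0.10),
    (0.10, 0.80, 0.10),
    (0.05, 0.05, 0.90),
)]

full = viterbi_beam(log_init, log_trans, log_emit)
pruned = viterbi_beam(log_init, log_trans, log_emit, beam_width=2.0)
```

In this toy case `full` and `pruned` agree, since the pruned states never lie on the best path. A beam that is too narrow can discard the true best path, which is why the abstract qualifies its result with "an appropriate beam width".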

 


 
