Can Chinese Phonemes Improve Machine Transliteration?: A Comparative Study of English-to-Chinese Transliteration Models

Inspired by the success of English grapheme-to-phoneme research in speech synthesis, many researchers have proposed phoneme-based English-to-Chinese transliteration models. However, such approaches suffer severely from errors in Chinese phoneme-to-grapheme conversion. To address this issue, we propose a new English-to-Chinese transliteration model and systematically compare it with conventional models. The proposed model jointly uses Chinese phonemes together with their corresponding English graphemes and phonemes. Experiments show that the Chinese phonemes in the proposed model contribute to improved English-to-Chinese transliteration performance.
