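| Graduate student: | 石朝全 Chao Chuang Shih |
|---|---|
| Thesis title: | 使用轉移學習來改進針對命名實體音譯的樞軸語言方法 Using transfer learning to improve pivot language approach to named entity transliteration |
| Advisor: | 蔡宗翰 |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Year of publication: | 2019 |
| Academic year of graduation: | 107 |
| Language: | Chinese |
| Number of pages: | 45 |
| Keywords (Chinese): | Machine Transliteration, Machine Translation, Named Entity Transliteration, Bilingual Transliteration, Transfer Learning, Attention Mechanism, Seq2Seq Model, Pivot Language |
| Keywords (English): | Machine Transliteration, Named Entity Transliteration, Bilingual transliteration, Seq2Seq Model, Pivot language, Bridge Language |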
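Machine translation has been studied for many years. Although most sentence patterns can be translated successfully, sentences that contain named entities such as personal or place names often still cannot be rendered properly in the script of the target language. The problem is even more severe when translating between languages other than English, and named entity transliteration is one way to address it.
Transliteration is an important part of machine translation, but in practice only limited parallel data between the source and target languages is often available, and this becomes much more likely when one of the languages is a low-resource language. By contrast, if we treat a widely used language (e.g., English) as a pivot language, parallel data between the source and pivot languages, or between the pivot and target languages, may be easier to obtain. From these two corpora, we can intuitively find the pivot-language entries they share and produce trilingual parallel data covering the source, pivot, and target languages, which can then be used to solve the original bilingual transliteration problem. However, this approach wastes a large amount of hard-won data.
We therefore propose a Seq2Seq model that employs an attention mechanism and transfer learning. Beyond the trilingual parallel data, it can make effective use of the remaining data to improve the performance of named entity transliteration from the source language to the target language.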
Machine translation has been researched for a long time. Although most sentences can be translated correctly, there is still room for improvement when a sentence contains a named entity such as a personal name or a location, especially between non-English languages. Named entity transliteration is one way to address this problem.
Transliteration is a key part of machine translation. In practice, however, parallel data between the source language and the target language is often limited. In contrast, if we take a widely used language as a pivot language, it is easier to obtain source-to-pivot and pivot-to-target language pairs. It is intuitive to extract the common pivot-language entities from these corpora to generate three-language parallel data that includes the source, pivot, and target languages. We can accomplish the bilingual transliteration task using this parallel data; nevertheless, a large amount of data is wasted by this method.
We propose a modified attention-based sequence-to-sequence model that also applies transfer learning techniques. Our model effectively utilizes the remaining data, in addition to the three-language parallel data, to improve the performance of named entity transliteration.
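The pivot construction described in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: the function name `build_triples`, the choice of Japanese and Korean as source and target, and the toy name pairs are all assumptions made for illustration. The sketch joins source-pivot and pivot-target pairs on their shared pivot entry and also returns the pairs that find no match, which is the data a plain pivot approach would discard.

```python
from collections import defaultdict


def build_triples(src_pivot, pivot_tgt):
    """Join (source, pivot) and (pivot, target) pairs on the shared pivot entry.

    Returns the three-language triples plus the unmatched pairs from each corpus.
    """
    # Index the pivot-target corpus by pivot entry (one pivot may map to several targets).
    tgt_by_pivot = defaultdict(list)
    for pivot, tgt in pivot_tgt:
        tgt_by_pivot[pivot].append(tgt)

    triples = []
    leftover_src_pivot = []
    for src, pivot in src_pivot:
        if pivot in tgt_by_pivot:
            for tgt in tgt_by_pivot[pivot]:
                triples.append((src, pivot, tgt))
        else:
            # No target-side partner: unused by a plain pivot approach.
            leftover_src_pivot.append((src, pivot))

    # Pivot-target pairs whose pivot entry never appears on the source side.
    src_side_pivots = {pivot for _, pivot in src_pivot}
    leftover_pivot_tgt = [(p, t) for p, t in pivot_tgt if p not in src_side_pivots]
    return triples, leftover_src_pivot, leftover_pivot_tgt


# Hypothetical toy corpora: Japanese-English and English-Korean name pairs.
src_pivot = [("トム", "tom"), ("アンナ", "anna"), ("ポール", "paul")]
pivot_tgt = [("tom", "톰"), ("anna", "안나"), ("emma", "엠마")]

triples, left_sp, left_pt = build_triples(src_pivot, pivot_tgt)
print(triples)   # three-language parallel data
print(left_sp)   # source-pivot pairs without a target match
print(left_pt)   # pivot-target pairs without a source match
```

In this toy case only two of the three pairs in each corpus survive the join. The two leftover lists are the kind of remaining data that the proposed model aims to exploit through transfer learning.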