| Graduate student: | 江一杰 Yi-Chieh Chiang |
|---|---|
| Thesis title: | 漢字發音系統之音韻關聯規則探勘 Phonetic Association Rule Discovery for Chinese Character Pronunciation System |
| Advisor: | 蔡孟峰 Meng-Feng Tsai |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Graduation academic year: | 98 |
| Language: | English |
| Number of pages: | 63 |
| Chinese keywords: | 漢語音韻 (Chinese phonology), 關聯規則探勘 (association rule mining) |
| Foreign-language keywords: | Chinese phonetic, association rule mining |
Chinese is a difficult language to learn. Because its script is not alphabetic, even learning the pronunciation of each character is hard. We therefore aim to design a more convenient learning method for learners of Chinese and to provide a reference for those who compile teaching materials. This requires a large amount of character data, so we use the pictophonetic characters (形聲字) that carry phonetic annotations in the "Chinese Character Component Searching System of Academia Sinica" (漢字構形資料庫). Pictophonetic characters account for more than 90 percent of Chinese characters, and each one is composed of a semantic component and a phonetic component, so a beginner can use the phonetic component to infer how a character is pronounced. We look up all variant pronunciations of the characters in the database in the "Hanyu Da Cidian" (漢語大辭典) and, in collaboration with experts in Chinese philology, have the experts identify the phonetic component of each pictophonetic character. We then apply multilevel association rule mining to discover rules that relate phonetic components to pictophonetic characters. Users can freely choose the level at which to observe the rules: not only the full pronunciation, but also the initial, final, and tone levels and the phoneme level. Finally, we analyze the distribution and influence of these rules and select good rules in order to suggest characters for learners to start with.
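The idea of observing rules from a phonetic component at several levels can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the thesis: it counts support and confidence directly rather than using Apriori or an FP-tree, it writes pronunciations in Pinyin rather than the database's own annotation, and the eight records, the `LEVELS` table, and the `mine_rules` function are all hypothetical names and data chosen for the example.

```python
from collections import Counter

# Toy records (illustrative only, not taken from the thesis data):
# (character, phonetic component, initial, final, tone)
RECORDS = [
    ("清", "青", "q", "ing", 1),
    ("情", "青", "q", "ing", 2),
    ("請", "青", "q", "ing", 3),
    ("晴", "青", "q", "ing", 2),
    ("精", "青", "j", "ing", 1),
    ("睛", "青", "j", "ing", 1),
    ("靜", "青", "j", "ing", 4),
    ("猜", "青", "c", "ai", 1),
]

# Each level projects a pronunciation onto a coarser or finer feature.
LEVELS = {
    "syllable": lambda ini, fin, tone: f"{ini}{fin}{tone}",
    "initial": lambda ini, fin, tone: ini,
    "final": lambda ini, fin, tone: fin,
    "tone": lambda ini, fin, tone: str(tone),
}


def mine_rules(records, level, min_support=0.0, min_confidence=0.0):
    """Return rules (component, value, support, confidence) at one level.

    A rule "component -> value" says that characters containing the
    phonetic component tend to have that value at the chosen level.
    """
    project = LEVELS[level]
    total = len(records)
    component_counts = Counter()
    pair_counts = Counter()
    for _char, comp, ini, fin, tone in records:
        component_counts[comp] += 1
        pair_counts[(comp, project(ini, fin, tone))] += 1

    rules = []
    for (comp, value), n in pair_counts.items():
        support = n / total                     # share of all characters
        confidence = n / component_counts[comp]  # share within the component
        if support >= min_support and confidence >= min_confidence:
            rules.append((comp, value, support, confidence))
    # Strongest rules first.
    return sorted(rules, key=lambda r: -r[3])


if __name__ == "__main__":
    for level in LEVELS:
        print(level, mine_rules(RECORDS, level, min_confidence=0.5))
```

In this toy data no rule reaches 50 percent confidence at the full-syllable level, while the final-level rule 青 → ing holds for seven of the eight characters, which is the kind of difference a learner-facing tool could surface by switching levels.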
[1] Xu S., "Origin of Chinese Character," 100. (說文解字.敘, 許慎)
[2] http://cdp.sinica.edu.tw/cdphanzi/, Chinese Character Component Searching System of Academia Sinica.
[3] Hanyu da zidian weiyuanhui 漢語大字典委員會, Hanyu da zidian 漢語大字典 ("Comprehensive Chinese Character Dictionary"), 8 vols., Hubei cishu chubanshe and Sichuan cishu chubanshe, 1986-1989.
[4] Agrawal R. and Srikant R., "Fast algorithms for mining association rules," in Proc. of 1994 Int. Conf. on Very Large Data Bases (VLDB'94), pages 487-499, Santiago, Chile, Sept. 1994.
[5] Klemettinen M., Mannila H., Ronkainen P., Toivonen H., and Verkamo I., "Finding interesting rules from large sets of discovered association rules," in Proc. 3rd Int. Conf. Information and Knowledge Management (CIKM'94), pages 401-408, Gaithersburg, MD, Nov. 1994.
[6] Han J., Pei J., and Yin Y., "Mining frequent patterns without candidate generation," in Proc. 2000 ACM-SIGMOD Int. Conf. Management of Data (SIGMOD'00), pages 1-12, Dallas, TX, May 2000.
[7] Meng-Feng Tsai and Yi-Ming Lee, "Mining Self-derivable Multilevel FP-tree From a Transactional Database," National Central University, Taiwan, Master Thesis, 2006.
[8] Han J. and Fu Y., "Discovery of Multiple-Level Association Rules from Large Databases," in Proc. of 1995 Int'l Conf. on Very Large Data Bases (VLDB'95).
[9] Han J. and Kamber M., Data Mining: Concepts and Techniques, 2nd ed., pages 228-231, Morgan Kaufmann Publishers, March 2006.
[10] http://140.115.52.39/chinaweb/rule.html