| Graduate student: | 曾朝譽 Chao-Yu Tseng |
|---|---|
| Thesis title: | 運用文字探勘評估碩士論文之一致性與完整性 Using Text Mining to Evaluate the Consistency and Completeness of Master’s Thesis |
| Advisor: | 薛義誠 |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Management - Department of Information Management |
| Year of publication: | 2017 |
| Academic year of graduation: | 105 |
| Language: | Chinese |
| Number of pages: | 100 |
| Chinese keywords: | 文字探勘, 論文一致性, 論文完整性, NGD, K-Core, ABC分類法 |
| English keywords: | text mining, consistency, completeness, NGD, K-Core, ABC classification |
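A good thesis abstract helps readers quickly grasp the content of a thesis, above all its purpose and conclusions. This study therefore takes the master's theses of the Department of Information Management at National Central University from the past five years as experimental data. It examines the pairwise consistency among the abstract, purpose, and conclusion chapters, and uses the purpose and conclusion chapters to check the completeness of each thesis, thereby evaluating the consistency and completeness of master's theses.
This study treats words as nodes and builds a graph structure for each thesis from similarity and centrality calculations. It then applies ABC classification to select the theme words of each thesis, which are used to evaluate consistency and completeness between chapters.
For the consistency analysis of the abstract and purpose chapters, the abstract and conclusion chapters, and the purpose and conclusion chapters, the combination of NGD similarity, the K-Core centrality algorithm, and ABC classification proved to be the better method. For the completeness analysis of the purpose and conclusion chapters, the results computed with the four centrality algorithms (Degree, Strength, K-Core, and PageRank) showed no clear difference. For consistency and completeness by year, the theses from 2013 and 2015 were better, those from 2012 were worse, and those from 2014 and 2016 were average. As for the factors that affect consistency and completeness, a thesis scores better when its title is written precisely and each chapter is written thoroughly without being excessively long.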
A good thesis abstract helps readers quickly understand the main idea of a thesis, which consists chiefly of its purpose and conclusion. This study therefore uses the master's theses of the Department of Information Management of National Central University from the past five years as experimental data. To evaluate the writing quality of these theses, it examines the consistency between the abstract and purpose chapters, between the abstract and conclusion chapters, and between the purpose and conclusion chapters, as well as the completeness between the purpose and conclusion chapters.
In this study, each word is treated as a vertex, and a graph structure is built for each thesis by combining a similarity measurement with a centrality measurement. The ABC classification method is then used to find the thematic words in each thesis, and these words are used to evaluate the consistency and completeness of the chapters.
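The pipeline described above can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation, and it rests on several assumptions that the abstract does not state: NGD counts are taken as the number of sentences containing a word, an edge is drawn when the NGD falls below a hypothetical `max_distance`, the ABC cutoffs `a_cut` and `b_cut` are hypothetical, and words with equal scores are classified together. All function names are illustrative.

```python
import math
from itertools import combinations


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from occurrence counts (Cilibrasi & Vitanyi, 2007)."""
    if fxy == 0:
        return math.inf  # the two words never co-occur
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    denominator = math.log(n) - min(lx, ly)
    if denominator == 0:
        return 0.0  # both words occur in every unit
    return (max(lx, ly) - lxy) / denominator


def core_numbers(adj):
    """K-core number of every node, found by repeatedly peeling a minimum-degree node."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    remaining, core, k = set(adj), {}, 0
    while remaining:
        v = min(remaining, key=degree.__getitem__)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in adj[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def abc_classify(scores, a_cut=0.8, b_cut=0.95):
    """Label items A, B, or C by cumulative share of the total score.

    Assumption: items with equal scores form one group, and a group is
    labeled by the share already covered before that group starts.
    """
    total = sum(scores.values())
    classes, covered = {}, 0.0
    for value in sorted(set(scores.values()), reverse=True):
        group = [w for w, s in scores.items() if s == value]
        before = covered / total if total else 1.0
        label = "A" if before < a_cut else "B" if before < b_cut else "C"
        for w in group:
            classes[w] = label
        covered += value * len(group)
    return classes


def theme_words(sentences, max_distance=0.5):
    """Words as nodes -> NGD edges -> K-Core scores -> class A words."""
    sets = [set(s) for s in sentences]
    vocab = sorted(set().union(*sets))
    freq = {w: sum(w in s for s in sets) for w in vocab}
    adj = {w: set() for w in vocab}
    for x, y in combinations(vocab, 2):
        fxy = sum(x in s and y in s for s in sets)
        if ngd(freq[x], freq[y], fxy, len(sets)) <= max_distance:
            adj[x].add(y)
            adj[y].add(x)
    classes = abc_classify(core_numbers(adj))
    return sorted(w for w, c in classes.items() if c == "A")


sentences = [
    ["text", "mining", "thesis"],
    ["text", "mining", "thesis"],
    ["text", "mining"],
    ["thesis", "graph"],
    ["weather"], ["sports"], ["music"], ["travel"],
]
print(theme_words(sentences))  # ['mining', 'text', 'thesis']
```

In this toy input, the three words that co-occur repeatedly form a triangle with core number 2, while every other word stays isolated, so only the triangle lands in class A. Grouping tied scores avoids splitting equally central words across classes, which matters because K-Core assigns the same integer to many nodes.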
Regarding the consistency between the abstract and purpose chapters, between the abstract and conclusion chapters, and between the purpose and conclusion chapters, the combination of NGD similarity, K-Core centrality, and ABC classification was found to be the better method. As to the completeness between the purpose and conclusion chapters, there was no significant difference among Degree, Strength, K-Core, and PageRank centrality. Concerning consistency and completeness by year, the theses from 2013 and 2015 were better, those from 2012 were worse, and those from 2014 and 2016 were average. Regarding the factors that influence consistency and completeness, the title should be written accurately, and the content of each chapter should be written thoroughly without being excessive.
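To make the centrality measures compared above more concrete, the sketch below computes Degree, Strength, and PageRank on a small weighted word graph. It is illustrative only: the graph, its edge weights, the damping factor, and the iteration count are assumptions, and the PageRank variant shown (weighted, undirected, power iteration) is one common formulation, not necessarily the one used in the thesis.

```python
def degree(graph):
    """Degree: the number of neighbors of each node."""
    return {v: len(nbrs) for v, nbrs in graph.items()}


def strength(graph):
    """Strength: the sum of the weights of the edges at each node."""
    return {v: sum(nbrs.values()) for v, nbrs in graph.items()}


def pagerank(graph, damping=0.85, iterations=100):
    """Weighted PageRank by power iteration on a symmetric adjacency dict."""
    n = len(graph)
    if n == 0:
        return {}
    out = strength(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(iterations):
        # Rank held by isolated nodes is spread evenly over all nodes.
        dangling = sum(rank[v] for v in graph if out[v] == 0)
        rank = {
            v: (1 - damping) / n
            + damping * (
                sum(rank[u] * w / out[u] for u, w in graph[v].items())
                + dangling / n
            )
            for v in graph
        }
    return rank


# Hypothetical word graph; weights stand in for similarity between words.
graph = {
    "text": {"mining": 3.0, "thesis": 1.0},
    "mining": {"text": 3.0},
    "thesis": {"text": 1.0},
}
print(degree(graph))    # {'text': 2, 'mining': 1, 'thesis': 1}
print(strength(graph))  # {'text': 4.0, 'mining': 3.0, 'thesis': 1.0}
print(pagerank(graph))
```

Here "mining" and "thesis" tie on Degree but differ on Strength and PageRank, because those two measures take the edge weights into account.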