
Author: Chao-Yu Tseng (曾朝譽)
Title: Using Text Mining to Evaluate the Consistency and Completeness of Master's Thesis (運用文字探勘評估碩士論文之一致性與完整性)
Advisor: 薛義誠
Oral defense committee:
Degree: Master
Department: College of Management - Department of Information Management
Publication year: 2017
Academic year of graduation: 105 (ROC calendar)
Language: Chinese
Pages: 100
Keywords: text mining, thesis consistency, thesis completeness, NGD, K-Core, ABC classification
    A good abstract helps readers quickly grasp the content of a thesis, and its most important elements are the purpose and the conclusion. This study therefore uses master's theses from the Department of Information Management at National Central University over the past five years as experimental data. It examines the pairwise consistency among the abstract, purpose, and conclusion chapters, and uses the purpose and conclusion chapters to check completeness, in order to evaluate the consistency and completeness of master's theses.
    The study treats each word as a node and builds a graph structure for each thesis by combining a similarity measure with a centrality measure. The ABC classification method is then used to select the thematic words of each thesis, which serve as the basis for evaluating consistency and completeness between chapters.
    For the consistency analysis of the abstract and purpose chapters, the abstract and conclusion chapters, and the purpose and conclusion chapters, the combination of NGD similarity, K-Core centrality, and ABC classification was found to be the better method. For the completeness analysis of the purpose and conclusion chapters, the results computed with the four centrality algorithms (Degree, Strength, K-Core, and PageRank) showed no marked difference. Across years, the theses from 2013 and 2015 scored better, those from 2012 scored worse, and those from 2014 and 2016 were average. As for the factors that affect consistency and completeness, a thesis tends to score better when its title is written precisely and each chapter is written thoroughly without being excessively long.
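The abstract names the building blocks of the method (NGD similarity, K-Core centrality, ABC classification, and a chapter-to-chapter consistency comparison) but not their exact formulas or thresholds. The following is only a minimal sketch of how such a pipeline could look, not the thesis's implementation. The NGD formula is the standard Normalized Google Distance; everything else is an assumption made for illustration: occurrence counts are taken per sentence, an edge is kept when NGD is at most `max_distance`, class A is the top-ranked words up to a cumulative score share `a_share`, and consistency is measured as Jaccard overlap of thematic words. All function and parameter names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from occurrence counts.

    fx, fy: number of units containing each word; fxy: units containing both;
    n: total number of units. Smaller values mean more closely related words.
    """
    if fxy == 0:
        return math.inf
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    denom = math.log(n) - min(lx, ly)
    return (max(lx, ly) - lxy) / denom if denom > 0 else 0.0


def build_graph(sentences, max_distance=1.0):
    """Words are nodes; link two words when their NGD is small enough.

    Assumption: counts are per sentence and max_distance is an arbitrary cutoff.
    """
    n = len(sentences)
    freq, pair_freq = Counter(), Counter()
    for sentence in sentences:
        words = sorted(set(sentence))
        freq.update(words)
        pair_freq.update(combinations(words, 2))
    graph = {w: set() for w in freq}
    for (a, b), fab in pair_freq.items():
        if ngd(freq[a], freq[b], fab, n) <= max_distance:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def core_numbers(graph):
    """K-core number of every node, by repeatedly peeling a minimum-degree node."""
    degree = {v: len(neighbors) for v, neighbors in graph.items()}
    remaining = set(graph)
    core, k = {}, 0
    while remaining:
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])
        core[v] = k
        remaining.remove(v)
        for u in graph[v]:
            if u in remaining:
                degree[u] -= 1
    return core


def abc_class_a(scores, a_share=0.8):
    """Class-A items: highest-scoring words until their cumulative share reaches a_share.

    Assumption: the 0.8 share is a common rule of thumb, not the thesis's setting.
    """
    total = sum(scores.values())
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    chosen, running = [], 0.0
    for word, score in ranked:
        if total == 0 or running / total >= a_share:
            break
        chosen.append(word)
        running += score
    return chosen


def consistency(words_a, words_b):
    """Jaccard overlap of two chapters' thematic words (an assumed consistency score)."""
    a, b = set(words_a), set(words_b)
    return len(a & b) / len(a | b) if a | b else 0.0
```

Under these assumptions, each chapter (abstract, purpose, conclusion) would be tokenized into sentences, passed through `build_graph` and `core_numbers`, filtered with `abc_class_a`, and the resulting word lists compared pairwise with `consistency`. Swapping `core_numbers` for a degree, strength, or PageRank score would give the other centrality variants the abstract compares.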

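    Abstract (Chinese)
    Abstract (English)
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1. Introduction
      1.1 Research Background
      1.2 Research Motivation
      1.3 Research Objectives
      1.4 Research Structure
    Chapter 2. Literature Review
      2.1 Background and Definition of Document Summarization
      2.2 Consistency and Completeness
      2.3 Sentence-Feature Summarization
      2.4 Graph-Based Summarization
      2.5 ABC Classification
    Chapter 3. Research Method
      3.1 Experimental Data
      3.2 Document Preprocessing
      3.3 Similarity Computation
      3.4 Centrality Computation
      3.5 Important-Word Weighting
      3.6 Thematic-Word Selection
      3.7 Consistency Analysis
      3.8 Completeness Analysis
      3.9 Summary
    Chapter 4. Results
      4.1 Adoption Rate of Important Words
      4.2 Absolute and Relative Consistency Analysis
      4.3 Results and Distribution of the Consistency Analysis
      4.4 Results and Distribution of the Completeness Analysis
      4.5 Case Discussion
    Chapter 5. Conclusion
      5.1 Conclusions and Contributions
      5.2 Recommendations
      5.3 Research Limitations
      5.4 Future Research
    References
      English References
      Chinese References
    Appendix 1. Relative Consistency Analysis Table
    Appendix 2. Experimental Data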

