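Using Word Analysis to Explore the 3C Structure Level of Master's Thesis | National Central University Theses and Dissertations System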


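Graduate student:	Chao-Shun Zhang (張朝順)
Thesis title:	Using Word Analysis to Explore the 3C Structure Level of Master's Thesis (以詞𢑥分析探勘論文寫作之3C結構水準)
Advisor:	薛義誠
Oral examination committee:
Degree:	Master
Department:	College of Management - Department of Information Management
Year of publication:	2018
Academic year of graduation:	106 (ROC calendar, 2017-2018)
Language:	Chinese
Number of pages:	129
Chinese keywords (translated):	multi-document automatic summarization, semantic analysis, thesis consistency, thesis completeness, thesis correctness
Foreign-language keywords:	multi-document automatic summarization, Master's Thesis consistency, completeness, correctness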

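A good thesis states its purpose and conclusions clearly, and the two should be consistent and complete with respect to each other. When they are not, omissions leave the content inconsistent and lead readers to cite the work incorrectly. Readers searching for theses decide mainly from the abstract whether a thesis is worth consulting, since the abstract's role in the thesis structure is to help readers grasp the content quickly. A good abstract therefore improves the correctness of thesis searches, and within an abstract the purpose and conclusions matter most. This study uses master's theses from the Department of Information Management at National Central University over the past six years, taken from the National Digital Library of Theses and Dissertations in Taiwan (臺灣碩博論文知識加值系統), as experimental data. It extracts features from each text with the TextRank algorithm and uses ROUGE-1 as the metric for evaluating the consistency and completeness of each thesis. It then applies four multi-document extractive summarization techniques, TextRank, LexRank, Luhn, and Latent Semantic Analysis (LSA), to generate summaries with better correctness, and compares their features with those of the original abstract to evaluate correctness.
The purpose of this study is to assess the structural level of thesis writing in terms of the consistency of its logic, the completeness of its scope, and the correctness of its conclusions, and to append the automatic summary after the original abstract to help searchers improve the correctness of thesis searches. The experiments found that theses from 2013 and 2015 scored better on consistency, completeness, and correctness, that theses from 2014 and 2017 scored worse, and that the advisor is one of the factors in whether a thesis has consistency, completeness, and correctness. Specific contributions: (1) it establishes an unsupervised-learning verification system; (2) the system evaluates quickly and can immediately assess whether a thesis passes on consistency, completeness, and correctness; (3) it provides data that thesis writers can use to adjust a thesis's structure to achieve consistency, completeness, and correctness; (4) it lets advisors check theses automatically with the system, reducing manual review time; (5) it provides automatic summaries that make content easier to tell apart, helping to improve the correctness of thesis searches.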
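The ROUGE-1 measure named in the abstract can be illustrated with a minimal sketch. It assumes the texts have already been segmented into word tokens (Chinese text needs a word segmenter first), and the function name `rouge_1` is hypothetical. The record does not say how the thesis maps ROUGE-1 onto its consistency and completeness scores, so this only shows the generic unigram-overlap computation.

```python
from collections import Counter


def rouge_1(candidate_tokens, reference_tokens):
    """Return (recall, precision, f1) of unigram overlap between two token lists.

    Hypothetical helper for illustration; inputs are assumed to be
    pre-segmented word tokens.
    """
    cand = Counter(candidate_tokens)
    ref = Counter(reference_tokens)
    # Clipped overlap: a unigram counts only as often as it occurs in both texts.
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return recall, precision, f1


# Example: compare a (toy) abstract against a (toy) conclusion section.
abstract = ["thesis", "purpose", "summary"]
conclusion = ["thesis", "purpose", "conclusion", "summary"]
recall, precision, f1 = rouge_1(abstract, conclusion)
```

Recall is the share of reference unigrams that the candidate covers, and precision is the share of candidate unigrams that appear in the reference. Which of the two a given study treats as its headline score is a design choice.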

A good master's thesis clearly states its purpose and conclusion, and the two should be consistent and complete with respect to each other. When they are not, the content falls short and contradicts itself, which can mislead readers into citing the thesis incorrectly. Readers generally decide from the abstract whether a thesis is worth consulting, because the abstract helps them understand the content quickly. A good abstract therefore improves the correctness of matching a thesis to the reader's needs, and within an abstract the purpose and the conclusion matter most. This research takes master's theses of the Department of Information Management of National Central University, retrieved from the National Digital Library of Theses and Dissertations in Taiwan, as research data. It uses the TextRank algorithm to extract the features of a thesis and applies ROUGE-1 as the basis for measuring the thesis's consistency and completeness. It then uses four multi-document extractive summarization algorithms, TextRank, LexRank, Luhn, and Latent Semantic Analysis (LSA), to produce an abstract with better correctness, and compares this automatic summary with the original abstract to measure correctness.
The purpose of this research is to assess the consistency of the writing logic, the completeness of the scope, and the correctness of the conclusions of a thesis, and to append the automatic summary to the original abstract so that readers searching for theses get more accurate results. The experiments found that consistency, completeness, and correctness were better for 2013 and 2015 and worse for 2014 and 2017, and that the advising professor is strongly correlated with the 3C structure. The contributions are: (1) establishing an unsupervised verification system; (2) assessing immediately whether a thesis passes on consistency, completeness, and correctness; (3) providing data that thesis writers can use to adjust a thesis's structure to achieve consistency, completeness, and correctness; (4) letting professors check theses automatically, reducing manual review work; and (5) providing automatic summaries that help improve the accuracy of thesis searches.
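As an illustration of the extractive summarization the abstract refers to, here is a minimal TextRank-style sentence ranker. It is a sketch under stated assumptions, not the thesis's implementation: sentences are assumed to be pre-segmented token lists, the similarity is the word-overlap measure from the original TextRank paper, and the function names, damping factor, and iteration count are illustrative choices.

```python
import math


def sentence_similarity(a, b):
    """Shared distinct words, normalised by the log lengths of both sentences."""
    if not a or not b:
        return 0.0
    denom = math.log(len(a)) + math.log(len(b))
    if denom <= 0:  # both sentences have a single token
        return 0.0
    return len(set(a) & set(b)) / denom


def textrank_scores(sentences, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over the similarity graph."""
    n = len(sentences)
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            w = sentence_similarity(sentences[i], sentences[j])
            weights[i][j] = weights[j][i] = w
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores


def summarize(sentences, k=2):
    """Return the k highest-scoring sentences in their original order."""
    scores = textrank_scores(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


# Toy input: three related sentences and one unrelated sentence.
doc = [
    ["thesis", "purpose", "consistency"],
    ["thesis", "conclusion", "consistency"],
    ["thesis", "purpose", "conclusion"],
    ["weather", "sunny", "today"],
]
summary = summarize(doc, k=2)
```

A sentence that shares no words with the others receives only the baseline score `1 - damping`, so it is ranked last. LexRank, Luhn, and LSA summarizers score sentences differently but fit the same select-top-k pattern.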

Abstract (Chinese)
Abstract (English)
Acknowledgements
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction
1 Research Background
2 Research Motivation
3 Research Purpose
4 Expected Impact and Research Contributions
5 Research Process
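Chapter 2. Literature Review
1 Text Mining
1.1 Research on Chinese Word Segmentation
1.2 Feature Selection
1.3 Text Ranking
2 Completeness, Consistency, and Correctness
3 Automatic Summarization Techniques
3.1 Lexical Centrality Ranking
3.2 Luhn's Algorithm
3.3 Latent Semantic Analysis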
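Chapter 3. Experimental Method
1 Experimental Framework
2 Experimental Data
3 Evaluating Consistency and Completeness Using the Original Abstract, Purpose, and Conclusion Sections
3.1 Extracting the Purpose, Conclusion, and Abstract Sections
3.2 Chinese Word Segmentation Using a Hybrid Approach
3.3 Extracting Text Features
3.4 Evaluating Consistency Using the Original Abstract, Purpose, and Conclusion Sections
3.5 Evaluating Completeness Using the Original Abstract, Purpose, and Conclusion Sections
4 Generating Automatic Summaries to Evaluate Consistency and Completeness
4.1 Splitting the Purpose and Conclusion Sections into Sentences by Punctuation
4.2 Generating Automatic Summaries
4.3 Evaluating Consistency and Completeness Using the Automatic Summary, Purpose, and Conclusion Sections
5 Interpreting the Automatic Summary Results and Evaluating Correctness
5.1 Interpreting the Results Evaluated with the Automatic Summaries
5.2 Evaluating Correctness Using the Original Abstract and the Automatic Summary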
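Chapter 4. Experimental Results
1 Evaluating Consistency Across Years
1.1 Using the 2012 Experimental Data
1.2 Using the 2013 Experimental Data
1.3 Using the 2014 Experimental Data
1.4 Using the 2015 Experimental Data
1.5 Using the 2016 Experimental Data
1.6 Using the 2017 Experimental Data
1.7 Summary
2 Evaluating Completeness Across Years
2.1 Using the 2012 Experimental Data
2.2 Using the 2013 Experimental Data
2.3 Using the 2014 Experimental Data
2.4 Using the 2015 Experimental Data
2.5 Using the 2016 Experimental Data
2.6 Using the 2017 Experimental Data
2.7 Summary
3 Evaluating Correctness Across Years
3.1 Evaluating Correctness Using the Automatic Summaries with Better Consistency
3.1.1 Using the 2012 Experimental Data
3.1.2 Using the 2013 Experimental Data
3.1.3 Using the 2014 Experimental Data
3.1.4 Using the 2015 Experimental Data
3.1.5 Using the 2016 Experimental Data
3.1.6 Using the 2017 Experimental Data
3.2 Evaluating Correctness Using the Automatic Summaries with Better Completeness
3.2.1 Using the 2012 Experimental Data
3.2.2 Using the 2013 Experimental Data
3.2.3 Using the 2014 Experimental Data
3.2.4 Using the 2015 Experimental Data
3.2.5 Using the 2016 Experimental Data
3.2.6 Using the 2017 Experimental Data
3.3 Summary
4 Case Discussion
4.1 Consistency and Completeness Results for the Original Abstracts
4.1.1 Theses with a High Structural Level of Consistency and Completeness
4.1.2 Theses with a Low Structural Level of Consistency and Completeness
4.2 Consistency and Completeness Results for the Automatic Summaries
4.2.1 Automatic Summaries with Lower Consistency than the Original Abstract
4.2.2 Automatic Summaries with Lower Completeness than the Original Abstract
4.3 Original Abstracts with Better Correctness
4.3.1 Original Abstracts with a High Structural Level of Correctness
4.3.2 Original Abstracts with a Low Structural Level of Correctness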
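Chapter 5. Conclusion
1 Research Conclusions and Contributions
2 Research Limitations
3 Future Research
References
English References
Chinese References
Appendix 1 Experimental Measurements
Appendix 2 Experimental Data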

