| Author: | 張覺意 Chueh-I Chang |
|---|---|
| Title: | 應用自然語言處理技術提供學生電子書閱讀理解能力之智慧化評量 (Applying Natural Language Processing Techniques to Provide Intelligent Assessment of Students' E-book Reading Comprehension) |
| Advisor: | 楊鎮華 Steve Yang |
| Committee members: | |
| Degree: | 碩士 Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Year of publication: | 2020 |
| Academic year of graduation: | 108 |
| Language: | Chinese |
| Pages: | 46 |
| Keywords: | NLP, Document summarization, Question generation, Machine grading |
In recent years, educational resources have been progressively digitized, e-learning platforms have become widespread, and students' learning activities now leave digital traces. In traditional classrooms, teachers gauge students' reading comprehension through quizzes and in-class interaction; on today's e-learning platforms, how to measure students' reading comprehension is an important topic in the field of learning analytics.
With the rapid development of artificial intelligence, the field of natural language processing (NLP) has made significant breakthroughs in recent years. This thesis aims to use state-of-the-art NLP techniques to find the best way to measure students' reading comprehension. Teachers usually assess students' reading comprehension by setting and grading quizzes, but both steps cost considerable time and labor. This thesis automates these two steps with NLP techniques so that teachers can understand students' reading comprehension more quickly.
In this thesis, we estimate students' reading comprehension by comparing the consistency between the highlights students make in e-books and those made by the teacher, and we compare the proxy-measure performance of three methods: TextRank, RAKE, and BERT. In the quiz-generation phase, we use GPT-2, a state-of-the-art language generation model, to generate quiz questions from the course materials. In the grading phase, we use BERT, a pre-trained language understanding model, to grade students' answers automatically; based on the grading results, the system gives students guidance and reports the results back to the teacher, completing a highly automated framework for intelligent assessment of reading comprehension.
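The highlight-consistency proxy measure and similarity-based grading described above can be sketched in plain Python. This is a minimal illustration, not the thesis's implementation: it ranks sentences with a simplified TextRank (the word-overlap similarity of Mihalcea & Tarau plus power iteration), scores a student's highlight set against the teacher's with a Jaccard index, and substitutes a bag-of-words cosine for BERT-based answer similarity. All function names and sample inputs are hypothetical.

```python
import math
from collections import Counter

def similarity(s1, s2):
    # TextRank sentence similarity: shared words normalized by log sentence lengths
    w1, w2 = set(s1.lower().split()), set(s2.lower().split())
    if len(w1) < 2 or len(w2) < 2:
        return 0.0
    return len(w1 & w2) / (math.log(len(w1)) + math.log(len(w2)))

def textrank(sentences, d=0.85, iters=50):
    # Rank sentences by iterating the PageRank-style update over the similarity graph
    n = len(sentences)
    sim = [[similarity(a, b) if i != j else 0.0
            for j, b in enumerate(sentences)] for i, a in enumerate(sentences)]
    scores = [1.0] * n
    for _ in range(iters):
        new = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                denom = sum(sim[j])
                if sim[j][i] > 0 and denom > 0:
                    rank += sim[j][i] / denom * scores[j]
            new.append((1 - d) + d * rank)
        scores = new
    return scores

def proxy_measure(student_marks, teacher_marks):
    # Agreement between student and teacher highlight sets (Jaccard index)
    s, t = set(student_marks), set(teacher_marks)
    return len(s & t) / len(s | t) if s | t else 0.0

def grade(student_answer, reference_answer):
    # Stand-in for BERT similarity: bag-of-words cosine between the two answers
    a = Counter(student_answer.lower().split())
    b = Counter(reference_answer.lower().split())
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0
```

In this sketch, `teacher_marks` could also be approximated automatically by taking the top-k sentences by `textrank` score, which is how an unsupervised method can serve as the reference side of the proxy measure.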
Robertson, S. (2004). Understanding inverse document frequency: on theoretical arguments for IDF. Journal of documentation.
Mihalcea, R., & Tarau, P. (2004, July). Textrank: Bringing order into text. In Proceedings of the 2004 conference on empirical methods in natural language processing (pp. 404-411).
Rose, S., Engel, D., Cramer, N., & Cowley, W. (2010). Automatic keyword extraction from individual documents. Text mining: applications and theory, 1, 1-20.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998-6008).
Rush, A. M., Chopra, S., & Weston, J. (2015). A neural attention model for abstractive sentence summarization. arXiv preprint arXiv:1509.00685.
Devlin, J., Chang, M. W., Lee, K., & Toutanova, K. (2018). Bert: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805.
Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R. R., & Le, Q. V. (2019). Xlnet: Generalized autoregressive pretraining for language understanding. In Advances in neural information processing systems (pp. 5754-5764).
Liu, Y. (2019). Fine-tune BERT for extractive summarization. arXiv preprint arXiv:1903.10318.
Miller, D. (2019). Leveraging BERT for extractive text summarization on lectures. arXiv preprint arXiv:1906.04165.
Rajpurkar, P., Zhang, J., Lopyrev, K., & Liang, P. (2016). Squad: 100,000+ questions for machine comprehension of text. arXiv preprint arXiv:1606.05250.
Zhou, Q., Yang, N., Wei, F., Tan, C., Bao, H., & Zhou, M. (2017, November). Neural question generation from text: A preliminary study. In National CCF Conference on Natural Language Processing and Chinese Computing (pp. 662-671). Springer, Cham.
Heilman, M. (2011). Automatic factual question generation from text. Language Technologies Institute School of Computer Science Carnegie Mellon University, 195.
Le, N. T., Kojiri, T., & Pinkwart, N. (2014). Automatic question generation for educational applications - the state of art. In Advanced Computational Methods for Knowledge Engineering (pp. 325-338). Springer, Cham.
Du, X., Shao, J., & Cardie, C. (2017). Learning to ask: Neural question generation for reading comprehension. arXiv preprint arXiv:1705.00106.
Zhao, Y., Ni, X., Ding, Y., & Ke, Q. (2018). Paragraph-level neural question generation with maxout pointer and gated self-attention networks. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 3901-3910).
Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in neural information processing systems (pp. 3104-3112).
Kriangchaivech, K., & Wangperawong, A. (2019). Question Generation by Transformers. arXiv preprint arXiv:1909.05017.
Kim, Y., Lee, H., Shin, J., & Jung, K. (2019, July). Improving neural question generation using answer separation. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, pp. 6602-6609).
Radford, A., Wu, J., Child, R., Luan, D., Amodei, D., & Sutskever, I. (2019). Language models are unsupervised multitask learners. OpenAI Blog, 1(8), 9.
Chan, Y. H., & Fan, Y. C. (2019, November). A Recurrent BERT-based Model for Question Generation. In Proceedings of the 2nd Workshop on Machine Reading for Question Answering (pp. 154-162).
Klein, T., & Nabi, M. (2019). Learning to Answer by Learning to Ask: Getting the Best of GPT-2 and BERT Worlds. arXiv preprint arXiv:1911.02365.
Krishna, K., & Iyyer, M. (2019). Generating Question-Answer Hierarchies. arXiv preprint arXiv:1906.02622.
Page, E. B. (1967). Grading essays by computer: Progress report. In Proceedings of the Invitational Conference on Testing Problems (pp. 87-100).
Landauer, T. K., & Dumais, S. T. (1997). A solution to Plato's problem: The latent semantic analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104(2), 211-240.
Foltz, P. W., Laham, D., & Landauer, T. K. (1999). The intelligent essay assessor: Applications to educational technology. Interactive Multimedia Electronic Journal of Computer-Enhanced Learning, 1(2), 939-944.
Landauer, T. K., Laham, D., & Foltz, P. W. (2003). Automated scoring and annotation of essays with the Intelligent Essay Assessor. In M. D. Shermis & J. C. Burstein (Eds.), Automated essay scoring: A cross-disciplinary perspective (pp. 87-112).
Zhang, L., Huang, Y., Yang, X., Yu, S., & Zhuang, F. (2019). An automatic short-answer grading model for semi-open-ended questions. Interactive Learning Environments, 1-14.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural computation, 9(8), 1735-1780.
Hasanah, U., Permanasari, A. E., Kusumawardani, S. S., & Pribadi, F. S. (2019). A scoring rubric for automatic short answer grading system. Telkomnika, 17(2), 763-770.
Wang, Z., Lan, A. S., Waters, A. E., Grimaldi, P., & Baraniuk, R. G. A Meta-Learning Augmented Bidirectional Transformer Model for Automatic Short Answer Grading.
Liu, T., Ding, W., Wang, Z., Tang, J., Huang, G. Y., & Liu, Z. (2019, June). Automatic Short Answer Grading via Multiway Attention Networks. In International Conference on Artificial Intelligence in Education (pp. 169-173). Springer, Cham.
Sung, C., Dhamecha, T., Saha, S., Ma, T., Reddy, V., & Arora, R. (2019, November). Pre-Training BERT on Domain Resources for Short Answer Grading. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 6073-6077).
Papineni, K., Roukos, S., Ward, T., & Zhu, W. J. (2002, July). BLEU: a method for automatic evaluation of machine translation. In Proceedings of the 40th annual meeting on association for computational linguistics (pp. 311-318). Association for Computational Linguistics.
Torrey, L., & Shavlik, J. (2010). Transfer learning. In Handbook of research on machine learning applications and trends: algorithms, methods, and techniques (pp. 242-264). IGI Global.