| Graduate Student: | 程祥恩 Hsiang-En Cherng |
|---|---|
| Thesis Title: | 應用記憶增強機制階層式深度學習模型於短文對話之對話品質與事件偵測任務 (Dialogue Quality and Nugget Detection for Short Text Conversation based on Hierarchical Multi-Stack Model with Memory Enhance Structure) |
| Advisor: | 張嘉惠 |
| Committee Members: | |
| Degree: | Master |
| Department: | 資訊電機學院 - 資訊工程學系 (Department of Computer Science & Information Engineering, College of Electrical Engineering and Computer Science) |
| Publication Year: | 2019 |
| Graduating Academic Year: | 107 |
| Language: | English |
| Pages: | 55 |
| Keywords (Chinese): | 對話品質、事件偵測、深度學習、自然語言處理 |
| Keywords (English): | Dialogue Quality, Nugget Detection, Deep Learning, Natural Language Processing |
With advances in natural language processing, automatic dialogue systems such as Watson, Siri, and Alexa have become one of the field's most important applications. In recent years, enterprises have tried to build automatic customer service chatbots so that machines can learn to solve customers' problems, reducing the cost of customer service staff and providing uninterrupted 24-hour service.
However, current methods for evaluating chatbots rely heavily on human judgment, and there is no effective way to quickly assess how good a chatbot is. NTCIR-14 therefore proposed the Short Text Conversation 3 (STC-3) task, comprising the Dialogue Quality (DQ) and Nugget Detection (ND) subtasks, which provide effective metrics for automatically evaluating chatbots. In this work, we tackle the DQ and ND subtasks with deep learning methods, using them to analyze the quality of a dialogue.
The DQ subtask rates the quality of a whole dialogue on three measures: task accomplishment (A-score), dialogue effectiveness (E-score), and customer satisfaction (S-score). The ND subtask analyzes the dialogue act of each utterance in a dialogue, thereby examining the dialogue's structure and logic.
We address the DQ and ND subtasks with a multi-layer deep learning model that learns dialogue representations through an utterance layer, a context layer, and a memory layer, applying a gating mechanism at the utterance and context layers. We also tried BERT [9] and a multi-stack CNN for sentence representation; experiments show that BERT outperforms the multi-stack CNN. Finally, our model outperforms all participants' models and the baseline models proposed by the task organizers on both the DQ and ND subtasks.
With the development of Natural Language Processing (NLP), automatic question-answering systems such as Watson, Siri, and Alexa have become one of the most important NLP applications. Nowadays, enterprises try to build automatic customer service chatbots to save human resources and provide 24-hour customer service.
However, the evaluation of chatbots currently relies heavily on human annotation, which costs a great deal of time. Thus, Short Text Conversation 3 (STC-3) at NTCIR-14 initiated two new subtasks, Dialogue Quality (DQ) and Nugget Detection (ND), which aim to automatically evaluate dialogues generated by chatbots. In this paper, we consider the DQ and ND subtasks of STC-3 using deep learning methods.
The DQ subtask aims to judge the quality of a whole dialogue using three measures: Task Accomplishment (A-score), Dialogue Effectiveness (E-score), and Customer Satisfaction of the dialogue (S-score). The ND subtask, on the other hand, classifies whether an utterance in a dialogue contains a nugget, a problem similar to dialogue act (DA) labeling.
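The two subtasks' prediction targets can be made concrete with a toy, hand-constructed example. This is not actual STC-3 data: the nugget-type label names (CNUG0, CNUG, CNUG*, CNaN for customer turns; HNUG, HNUG*, HNaN for helpdesk turns) and the five-point -2..2 scale follow one reading of the STC-3 overview [49] and should be checked against it.

```python
# Toy illustration (not real STC-3 data) of what the two subtasks predict.
# DQ: one probability distribution per dialogue for each quality measure,
# estimated from multiple annotators' votes on a 5-point (-2..2) scale.
dq_target = {
    "A-score": {-2: 0.0, -1: 0.1, 0: 0.4, 1: 0.4, 2: 0.1},
    "E-score": {-2: 0.1, -1: 0.2, 0: 0.5, 1: 0.2, 2: 0.0},
    "S-score": {-2: 0.0, -1: 0.2, 0: 0.3, 1: 0.4, 2: 0.1},
}

# ND: a distribution over nugget types for each utterance; customer and
# helpdesk turns use different label sets.
nd_target = [
    {"sender": "customer",
     "nuggets": {"CNUG0": 0.7, "CNUG": 0.2, "CNUG*": 0.0, "CNaN": 0.1}},
    {"sender": "helpdesk",
     "nuggets": {"HNUG": 0.6, "HNUG*": 0.1, "HNaN": 0.3}},
]

# Each target is a proper probability distribution.
for measure, dist in dq_target.items():
    assert abs(sum(dist.values()) - 1.0) < 1e-9, measure
for turn in nd_target:
    assert abs(sum(turn["nuggets"].values()) - 1.0) < 1e-9, turn["sender"]
print("all target distributions sum to 1")
```

Framing both subtasks as distribution estimation rather than hard classification reflects that gold labels come from several annotators who may disagree.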
We applied a general model with an utterance layer, a context layer, and a memory layer to learn dialogue representations for both the DQ and ND subtasks, and used gating and attention mechanisms at multiple layers, including the utterance layer and the context layer. We also tried BERT and a multi-stack CNN for sentence representation. The results show that BERT produces better utterance representations than the multi-stack CNN for both the DQ and ND subtasks, and that our model outperforms the other participants' models and the baseline models proposed by the organizers on the Ubuntu customer helpdesk dialogue corpus.
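The hierarchical structure described above (an utterance layer that encodes each utterance, a gated context layer over the utterance sequence, and a memory layer that attends over the dialogue) can be sketched in a few lines of NumPy. This is a minimal illustration and not the thesis implementation: the mean-pooling utterance encoder, the particular gate equation, and all dimensions and random weights are stand-ins for the learned components (e.g. BERT or the multi-stack CNN in the utterance layer).

```python
# Minimal NumPy sketch of a hierarchical utterance/context/memory model
# with a gating mechanism. All weights are random and purely illustrative.
import numpy as np

rng = np.random.default_rng(0)
d = 8           # hidden size (assumed for illustration)
n_classes = 5   # e.g. a 5-point quality scale for the DQ subtask

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def utterance_layer(word_vecs):
    """Pool word vectors into one utterance vector (mean pooling here;
    the thesis compares BERT and multi-stack CNN encoders instead)."""
    return word_vecs.mean(axis=0)

def context_layer(utt_vecs, W, U, Wg):
    """Gated recurrent pass over utterances: a gate decides how much of
    the new utterance to mix into the running context state."""
    h = np.zeros(d)
    states = []
    for u in utt_vecs:
        g = sigmoid(Wg @ np.concatenate([h, u]))  # gating mechanism
        h = g * np.tanh(W @ u) + (1 - g) * (U @ h)
        states.append(h)
    return np.stack(states)

def memory_layer(states, q):
    """Attention over utterance states with a query vector, producing a
    single dialogue summary."""
    scores = softmax(states @ q)
    return scores @ states

# Toy dialogue: 3 utterances of 4 words each, random "embeddings".
dialogue = [rng.normal(size=(4, d)) for _ in range(3)]
W, U = rng.normal(size=(d, d)), rng.normal(size=(d, d))
Wg = rng.normal(size=(d, 2 * d))
q = rng.normal(size=d)
Wo = rng.normal(size=(n_classes, d))

utts = np.stack([utterance_layer(w) for w in dialogue])
states = context_layer(utts, W, U, Wg)
summary = memory_layer(states, q)
dq_dist = softmax(Wo @ summary)  # distribution over DQ quality labels
print(dq_dist.shape)
```

For the ND subtask the same stack would emit one nugget-type distribution per state in `states` instead of a single dialogue-level output.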
1. Axelrod, S.: Natural Language Generation in the IBM Flight Information System. ANLP/NAACL Workshop on Conversational Systems (2000)
2. Bajaj, P., Campos, D., Craswell, N., Deng, L., Gao, J., Liu, X., Majumder, R., McNamara, A., Mitra, B., Nguyen, T., Rosenberg, M., Song, X., Stoica, A., Tiwary, S., Wang, T.: MS MARCO: A Human Generated MAchine Reading COmprehension Dataset. Thirtieth Conference on Neural Information Processing Systems (2016)
3. Bird, S., Loper, E.: NLTK: The Natural Language Toolkit. Association for Computational Linguistics (2004)
4. Blunsom, P., Kalchbrenner, N.: Recurrent convolutional neural networks for discourse compositionality. Proceedings of the 2013 Workshop on Continuous Vector Space Models and their Compositionality (2013)
5. Bordes, A., Boureau, Y.-L., Weston, J.: Learning End-to-End Goal-Oriented Dialog. Proceedings of the 5th International Conference on Learning Representations (2017)
6. Chung, J., Gulcehre, C., Cho, K., Bengio, Y.: Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. arXiv preprint arXiv:1412.3555 (2014)
7. Cong, K., Lam, W.: CUIS at the NTCIR-14 STC-3 DQ Subtask. Proceedings of the 14th NTCIR Conference on Evaluation of Information Access Technologies (2019)
8. Dauphin, Y. N., Fan, A., Auli, M., Grangier, D.: Language Modeling with Gated Convolutional Networks. arXiv preprint arXiv:1612.08083 (2016)
9. Devlin, J., Chang, M.-W., Lee, K., Toutanova, K.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805 (2018)
10. Elhadad, M.: Generating coherent argumentative paragraphs. International Conference on Computational Linguistics (1992)
11. Hochreiter, S., Schmidhuber, J.: Long Short-Term Memory. Neural Computation (1997)
12. Hu, M., Peng, Y., Huang, Z., Qiu, X., Wei, F., Zhou, M.: Reinforced Mnemonic Reader for Machine Reading Comprehension. The 27th International Joint Conference on Artificial Intelligence (2018)
13. Huang, Z., Xu, W., Yu, K.: Bidirectional LSTM-CRF Models for Sequence Tagging. arXiv preprint arXiv:1508.01991 (2015)
14. Isbell, C. L., Jr., Kearns, M., Kormann, D., Singh, S., Stone, P.: Cobot in LambdaMOO: A Social Statistics Agent. The Fourteenth AAAI Conference on Artificial Intelligence (2000)
15. Jurafsky, D., Shriberg, E., Biasca, D.: Switchboard SWBD-DAMSL Shallow-Discourse-Function Annotation Coders Manual, Draft 13 (1997)
16. Kato, S., Suzuki, R., Zeng, Z., Sakai, T.: SLSTC at the NTCIR-14 STC-3 Dialogue Quality and Nugget Detection Subtasks. Proceedings of the 14th NTCIR Conference on Evaluation of Information Access Technologies (2019)
17. Kim, Y.: Convolutional Neural Networks for Sentence Classification. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (2014)
18. Kiros, R., Zhu, Y., Salakhutdinov, R., Zemel, R. S., Torralba, A., Urtasun, R., Fidler, S.: Skip-Thought Vectors. Twenty-ninth Conference on Neural Information Processing Systems (2015)
19. Kumar, H., Agarwal, A., Dasgupta, R., Joshi, S., Kumar, A.: Dialogue Act Sequence Labeling using Hierarchical encoder with CRF. The Thirty-Second AAAI Conference on Artificial Intelligence (2018)
20. Lee, J. Y., Dernoncourt, F.: Sequential Short-Text Classification with Recurrent and Convolutional Neural Networks. Proceedings of NAACL-HLT (2016)
21. Lendvai, P., Geertzen, J.: Token-based chunking of turn-internal dialogue act sequences. SIGDIAL Workshop on Discourse and Dialogue (2007)
22. Li, J., Monroe, W., Shi, T., Jean, S., Ritter, A., Jurafsky, D.: Adversarial Learning for Neural Dialogue Generation. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (2017)
23. Li, Y., Su, H., Shen, X., Li, W., Cao, Z., Niu, S.: DailyDialog: A Manually Labelled Multi-turn Dialogue Dataset. Proceedings of the Eighth International Joint Conference on Natural Language Processing (2017)
24. Liu, F., Baldwin, T., Cohn, T.: Capturing Long-range Contextual Dependencies with Memory-enhanced Conditional Random Fields. Proceedings of the Eighth International Joint Conference on Natural Language Processing (2017)
25. Liu, Y., Han, K., Tan, Z., Lei, Y.: Using Context Information for Dialog Act Classification in DNN Framework. Proceedings of the Conference on Empirical Methods in Natural Language Processing (2017)
26. Logeswaran, L., Lee, H.: An Efficient Framework for Learning Sentence Representations. Sixth International Conference on Learning Representations (2018)
27. Luo, L., Xu, J., Lin, J., Zeng, Q., Sun, X.: An Auto-Encoder Matching Model for Learning Utterance-Level Semantic Dependency in Dialogue Generation. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (2018)
28. Ma, X., Hovy, E.: End-to-End Sequence Labeling via Bi-directional LSTM-CNNs-CRF. Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (2016)
29. Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient Estimation of Word Representations in Vector Space. Proceedings of the 1st International Conference on Learning Representations (2013)
30. Nishida, K., Saito, I., Nishida, K., Shinoda, K., Otsuka, A., Asano, H., Tomita, J.: Multi-Style Generative Reading Comprehension. The 57th Annual Meeting of the Association for Computational Linguistics (2019)
31. Pennington, J., Socher, R., Manning, C. D.: GloVe: Global Vectors for Word Representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (2014)
32. Peters, M. E., Neumann, M., Iyyer, M., Gardner, M., Clark, C., Lee, K., Zettlemoyer, L.: Deep Contextualized Word Representations. Annual Conference of the North American Chapter of the Association for Computational Linguistics (2018)
33. Radford, A., Narasimhan, K., Salimans, T., Sutskever, I.: Improving Language Understanding by Generative Pre-Training (2018)
34. Rajendran, J., Ganhotra, J., Singh, S., Polymenakos, L.: Learning End-to-End Goal-Oriented Dialog with Multiple Answers. Proceedings of the Conference on Empirical Methods in Natural Language Processing (2018)
35. Rajpurkar, P., Jia, R., Liang, P.: Know What You Don't Know: Unanswerable Questions for SQuAD. arXiv preprint arXiv:1806.03822 (2018)
36. Rajpurkar, P., Zhang, J., Lopyrev, K., Liang, P.: SQuAD: 100,000+ Questions for Machine Comprehension of Text. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (2016)
37. Ritter, A., Cherry, C., Dolan, W. B.: Data-Driven Response Generation in Social Media. Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing (2011)
38. Sakai, T.: Comparing Two Binned Probability Distributions for Information Access Evaluation. The 41st International ACM SIGIR Conference on Research and Development in Information Retrieval (2018)
39. Seo, M., Min, S., Farhadi, A., Hajishirzi, H.: Query-Reduction Networks for Question Answering. Proceedings of the 5th International Conference on Learning Representations (2017)
40. Shriberg, E., Dhillon, R., Bhagat, S., Ang, J., Carvey, H.: The ICSI meeting recorder dialog act (MRDA) corpus (2004)
41. Stolcke, A., Ries, K., Coccaro, N., Shriberg, E., Bates, R., Jurafsky, D., Taylor, P., Martin, R., Van Ess-Dykema, C., Meteer, M.: Dialogue Act Modeling for Automatic Tagging and Recognition of Conversational Speech. Computational Linguistics (2000)
42. Xu, J., Ren, X., Lin, J., Sun, X.: Diversity-Promoting GAN: A Cross-Entropy Based Generative Adversarial Network for Diversified Text Generation. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (2018)
43. Mikolov, T., Grave, E., Bojanowski, P., Puhrsch, C., Joulin, A.: Advances in Pre-Training Distributed Word Representations. Proceedings of the International Conference on Language Resources and Evaluation (2018)
44. Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., Polosukhin, I.: Attention Is All You Need. arXiv preprint arXiv:1706.03762 (2017)
45. Weston, J., Bordes, A., Chopra, S., Rush, A. M., van Merriënboer, B., Joulin, A., Mikolov, T.: Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks. Proceedings of the 4th International Conference on Learning Representations (2016)
46. Weston, J., Chopra, S., Bordes, A.: Memory networks. Proceedings of the 3rd International Conference on Learning Representations (2015)
47. Yan, M., Xia, J., Wu, C., Bi, B., Zhao, Z., Zhang, J., Si, J., Wang, R., Wang, W., Chen, H.: A Deep Cascade Model for Multi-Document Reading Comprehension. The Thirty-Third AAAI Conference on Artificial Intelligence (2019)
48. Yu, S., Indurthi, S., Back, S., Lee, H.: A Multi-Stage Memory Augmented Neural Network for Machine Reading Comprehension. Proceedings of the Workshop on Machine Reading for Question Answering (2018)
49. Zeng, Z., Kato, S., Sakai, T.: Overview of the NTCIR-14 Short Text Conversation Task: Dialogue Quality and Nugget Detection Subtasks. Proceedings of the 14th NTCIR Conference on Evaluation of Information Access Technologies (2019)
50. Zimmermann, M.: Joint segmentation and classification of dialog acts using conditional random fields. 10th Annual Conference of the International Speech Communication Association (2009)
51. Zhao, T., Zhao, R., Eskenazi, M.: Learning Discourse-level Diversity for Neural Dialog Models using Conditional Variational Autoencoders. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (2017)