| Graduate Student: | 黃晨郁 Chen-Yu Huang |
|---|---|
| Thesis Title: | 應用階層可解構式注意力模型於新聞立場辨識任務 A Hierarchical Decomposable Attention Model for News Stance Detection |
| Advisor: | 張嘉惠 Chia-Hui Chang |
| Oral Defense Committee: | |
| Degree: | 碩士 Master |
| Department: | College of Electrical Engineering and Computer Science, Department of Computer Science & Information Engineering |
| Year of Publication: | 2020 |
| Academic Year of Graduation: | 108 |
| Language: | Chinese |
| Number of Pages: | 52 |
| Keywords (Chinese): | 新聞立場辨識、篇章分析、自然語言推理、注意力機制 |
| Keywords (English): | News Stance Detection, Discourse Analysis, Natural Language Inference, Attention Mechanism |
News Stance Detection helps readers clarify the stance a news article takes, and thus understand issues from different perspectives. In this task, given a query (an issue) and a news article, we must judge whether the article's stance is neutral, supportive, or opposed. The task resembles Natural Language Inference (NLI), where, given two sentences, the goal is to decide whether they are unrelated or stand in an entailment or contradiction relation. A news article is a discourse composed of multiple sentences, so analyzing the relations among its sentences (known as Discourse Analysis) may also help determine the article's stance.
However, most models in the discourse analysis and NLI literature are trained with the relations between sentences (discourse relations) already known; news stance detection instead seeks the relation between an "article" and a "query", so both the discourse relations within each article and the relation between each of its sentences and the query are unknown. Moreover, articles whose stance differs from the query make up only a small fraction of the training data, which makes model training harder; this data imbalance is another difficulty of the task.
In this thesis, drawing on the Decomposable Attention Model that Parikh et al. designed for NLI and on the hierarchical treatment of discourse data by Durmus et al., we propose a Hierarchical Decomposable Attention Model for news stance detection. Experimental results show that this architecture indeed achieves better performance. To address the data imbalance, we add training data in which the article is unrelated to the query, and verify experimentally that this improves the model's ability to recognize unrelated data.
In the News Stance Detection task, we must judge whether the stance of a news article toward a given query is neutral, supportive, or opposed. This task is similar to Natural Language Inference (NLI), which takes two sentences as input and determines whether they are unrelated or stand in an entailment or contradiction relation. A news article is a discourse composed of multiple sentences, so analyzing the relationships between the sentences in the article (called Discourse Analysis) may also help determine the news stance.
In related work on Discourse Analysis, most models are trained with the relationships between "sentences" (discourse relations) in the articles already known; News Stance Detection instead seeks the relation between a "query" and an "article", so both the discourse relations within the article and the relationship between each sentence and the query are unknown. At the same time, few news articles in our training data hold a stance different from their queries; this data imbalance makes model training more difficult and is another challenge of the task.
In this thesis, we propose a Hierarchical Decomposable Attention Model for News Stance Detection, which draws on the Decomposable Attention Model for NLI (Parikh et al., 2016) and the hierarchical treatment of discourse by Durmus et al. (2019). Experimental results show that our architecture outperforms other models. For the data imbalance problem, we added data labeled as unrelated and showed that this improves the model's ability to identify unrelated data.
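The two ingredients named above can be sketched in a small NumPy toy: the attend-compare-aggregate core of Parikh et al.'s Decomposable Attention Model, applied per (query, sentence) pair and then pooled over the article's sentences in the hierarchical spirit of Durmus et al. This is an illustrative sketch only, not the thesis's actual architecture: the learned feed-forward networks F, G, H of the original model are replaced by identity/concatenation, and mean pooling over sentences plus a single linear layer `W` are hypothetical simplifications.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def decomposable_attention(a, b):
    """Attend-compare-aggregate core (Parikh et al., 2016), simplified.
    a: (len_a, d) query token embeddings; b: (len_b, d) sentence embeddings.
    Returns a fixed-size vector summarizing the (a, b) pair."""
    # Attend: unnormalized alignment scores between every token pair.
    e = a @ b.T                        # (len_a, len_b)
    beta = softmax(e, axis=1) @ b      # subphrase of b aligned to each token of a
    alpha = softmax(e.T, axis=1) @ a   # subphrase of a aligned to each token of b
    # Compare: pair each token with its aligned subphrase.
    v1 = np.concatenate([a, beta], axis=1)   # (len_a, 2d)
    v2 = np.concatenate([b, alpha], axis=1)  # (len_b, 2d)
    # Aggregate: sum over tokens in each direction, then concatenate.
    return np.concatenate([v1.sum(axis=0), v2.sum(axis=0)])  # (4d,)

def hierarchical_stance(query, sentences, W):
    """Hierarchical use of the pair representation: encode (query, sentence)
    for every sentence, pool over the article, then classify into three
    stances (neutral / agree / disagree) with a linear map W of shape (4d, 3)."""
    pair_vecs = np.stack([decomposable_attention(query, s) for s in sentences])
    doc_vec = pair_vecs.mean(axis=0)   # article-level aggregation (simplification)
    return softmax(W.T @ doc_vec)      # stance probability distribution
```

In the full model, each of the attend/compare steps passes through a trained feed-forward network, and the sentence-level vectors would be combined by a learned aggregator rather than a plain mean.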
[1] https://aidea-web.tw/topic/b6abbf14-2d60-456c-8cbe-34fdfcd58967
[2] https://lucene.apache.org/solr/
[3] https://www.elastic.co/
[4] https://ckipnlp.readthedocs.io/en/latest/index.html
[5] http://www.fakenewschallenge.org/
[6] https://competitions.codalab.org/competitions/16843#results
[7] https://www.sfu.ca/rst/01intro/intro.html
[8] https://nlp.stanford.edu/projects/snli/
[9] https://www.kialo.com/
[10] https://www.tensorflow.org/
[11] Bahdanau, D., Cho, K., Bengio, Y.: Neural machine translation by jointly learning to align and translate. The Third International Conference on Learning Representations (2015)
[12] Bowman, S. R., Angeli, G., Potts, C., Manning, C. D.: A large annotated corpus for learning natural language inference. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (EMNLP) (2015)
[13] Carlson, L., Marcu, D., Okurovsky, M. E.: Building a Discourse-tagged Corpus in the Framework of Rhetorical Structure Theory. Proceedings of the Second SIGdial Workshop on Discourse and Dialogue (2001)
[14] Derczynski, L., Bontcheva, K., Liakata, M., Procter, R., Hoi, G. W. S., Zubiaga, A.: SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for rumours. Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017) (2017)
[15] Devlin, J., Chang, M. W., Lee, K., Toutanova, K.: BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv preprint arXiv:1810.04805 (2018)
[16] Durmus, E., Ladhak, F., Cardie, C.: Determining Relative Argument Specificity and Stance for Complex Argumentative Structures. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics (2019)
[17] Küçük, D., Can, F.: Stance Detection: A Survey. ACM Comput. Surv. 53, 1, Article 12, 37 pages. DOI: https://doi.org/10.1145/3369026 (2020)
[18] Li, Y., Feng, W., Sun, J., Kong, F., Zhou, G.: Building Chinese Discourse Corpus with Connective-driven Dependency Tree Structure. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP) (2014)
[19] Mikolov, T., Chen, K., Corrado, G., Dean, J.: Efficient Estimation of Word Representations in Vector Space. Proceedings of the 1st International Conference on Learning Representations (2013)
[20] Mohammad, S., Kiritchenko, S., Sobhani, P., Zhu, X., Cherry, C.: SemEval-2016 Task 6: Detecting Stance in Tweets. Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016) (2016)
[21] Mann, W. C., Thompson, S. A.: Rhetorical structure theory: Toward a functional theory of text organization. Text - Interdisciplinary Journal for the Study of Discourse, 8(3):243-281 (1988)
[22] Prasad, R., Webber, B., Joshi, A.: Reflections on the Penn Discourse Treebank, comparable corpora, and complementary annotation. Computational Linguistics, Volume 40, Issue 4 (2014)
[23] Parikh, A. P., Täckström, O., Das, D., Uszkoreit, J.: A Decomposable Attention Model for Natural Language Inference. Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing (EMNLP) (2016)
[24] Radford, A., Narasimhan, K., Salimans, T., Sutskever, I.: Improving language understanding by generative pre-training. Technical report, OpenAI (2018)
[25] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., Polosukhin, I.: Attention Is All You Need. arXiv preprint arXiv:1706.03762 (2017)
[26] Williams, A., Nangia, N., Bowman, S. R.: A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference. Proceedings of the North American Chapter of the Association for Computational Linguistics (2018)
[27] Wang, Y. J., Chang, C. H.: Using Attentive to improve Recursive LSTM End-to-End Chinese Discourse Parsing. ROCLING (2019)
[28] Wang, S. M., Ku, L. W.: ANTUSD: A Large Chinese Sentiment Dictionary. Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16) (2016)