| Graduate Student: | 黃晉豪 Jin-Hao Huang |
|---|---|
| Thesis Title: | 以注意力機制輔助文本分類中的資料增益 (Attention-Mechanism-Assisted Data Augmentation for Text Classification) |
| Advisor: | 林熙禎 She-Jen Lin |
| Committee Members: | |
| Degree: | 碩士 Master |
| Department: | College of Management - Department of Information Management |
| Year of Publication: | 2020 |
| Academic Year of Graduation: | 108 (ROC calendar) |
| Language: | Chinese |
| Pages: | 70 |
| Chinese Keywords: | 資料增益、文本分類、自然語言處理、注意力機制 |
| English Keywords: | Data augmentation, Attention Mechanism, Text classification, Natural language processing |
Data augmentation has been shown to improve model prediction accuracy across many research fields. In natural language processing, however, most augmentation methods either rely heavily on randomness or require substantial prior linguistic knowledge before augmentation can be performed. This study therefore proposes a data augmentation method assisted by an attention mechanism and investigates whether the attention mechanism affects data augmentation for text classification. The experimental results confirm that the proposed method effectively improves classification accuracy, raising the classifier's accuracy by 10% with a dataset of only 500 samples.
Data augmentation is a strategy for increasing the quantity of data in order to improve model performance. The strategy is widely used in natural language processing; however, current augmentation techniques in this field either involve a great deal of randomness or require many human-predefined rules. In this work, we propose a novel approach that augments data according to attention weights, which requires no human-predefined rules while avoiding randomness. Our approach increases the accuracy of the classifier and demonstrates the feasibility of using attention weights as a basis for data augmentation.
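The abstract does not spell out how attention weights drive the augmentation, but one plausible reading can be sketched as follows. This is a hypothetical illustration, not the thesis's actual algorithm: it assumes that tokens receiving the least attention from a trained classifier carry the least label information and are therefore the safest to replace, and the `synonyms` dictionary stands in for whatever substitution resource (e.g. a thesaurus or embedding neighbours) the method would actually use.

```python
import random

def augment_by_attention(tokens, attn_weights, synonyms, n_replace=1):
    """Replace the lowest-attention tokens with synonyms.

    A minimal sketch of attention-guided augmentation: tokens the
    classifier attends to least are assumed safest to alter, so the
    label-bearing (high-attention) words are preserved.
    """
    # Rank token positions by ascending attention weight.
    order = sorted(range(len(tokens)), key=lambda i: attn_weights[i])
    augmented = list(tokens)
    replaced = 0
    for i in order:
        if replaced >= n_replace:
            break
        word = augmented[i]
        if word in synonyms:  # replace only when a substitute is known
            augmented[i] = random.choice(synonyms[word])
            replaced += 1
    return augmented

# Hypothetical example: "fantastic" dominates the attention mass, so the
# low-attention function word "this" is swapped instead.
tokens = ["this", "film", "was", "fantastic"]
weights = [0.05, 0.10, 0.05, 0.80]  # attention from a trained classifier (assumed)
syn = {"film": ["movie"], "this": ["the"]}
print(augment_by_attention(tokens, weights, syn, n_replace=1))
# → ['the', 'film', 'was', 'fantastic']
```

Selecting replacement sites by attention rank rather than uniformly at random is what removes the randomness the abstract criticizes: the sentiment-bearing word is never touched, so the augmented sentence is more likely to keep its original label.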