

Author: Yan-Jin Lee (李彥瑾)
Title: Incorporating Word Ordering Information into Contrastive Learning of Sentence Embeddings
Advisor: 柯士文
Committee members:
Degree: Master
Department: Department of Information Management, College of Management
Year of publication: 2023
Graduation academic year: 111
Language: English
Number of pages: 67
Chinese keywords: 對比學習, 句子嵌入, 自然語言理解, 深度學習
English keywords: contrastive learning, sentence embeddings, natural language understanding, deep neural network
Access count: 29 views, 0 downloads
Chinese Abstract (translated): Sentence embeddings play an important role in natural language understanding (NLU). Pre-trained models such as BERT and RoBERTa transform raw sentences into sentence embeddings, and applying these embeddings yields significant performance gains on many NLU tasks; on semantic textual similarity (STS) tasks, however, their performance falls short of expectations. Prior work has found that BERT and RoBERTa are sensitive to word order. We therefore propose StructCSE, a method that combines the learning of word order and semantic information to enhance contrastive learning of sentence embeddings.
    In our experiments, we use semantic textual similarity (STS) and transfer learning tasks to validate the effectiveness of StructCSE. The results show that StructCSE is competitive with or outperforms the baseline models on most datasets; with BERT as the base model, it improves markedly on the STS tasks, and it performs notably well on the sentiment-analysis subtasks of transfer learning.


    Sentence embeddings play an important role in natural language understanding (NLU). Pre-trained models such as BERT and RoBERTa encode input sentences into embeddings that significantly enhance performance across many NLU tasks; however, they underperform on semantic textual similarity (STS) tasks. Previous research has found that BERT and RoBERTa are sensitive to word order. In response, we propose StructCSE, a method that incorporates word order and semantic information to enhance contrastive learning of sentence embeddings.
    In our experiments, we evaluate the effectiveness of StructCSE on STS and transfer learning tasks. The results demonstrate that StructCSE performs competitively with or outperforms baseline models on most datasets. In particular, when fine-tuning BERT, StructCSE achieves better performance on the STS tasks and exhibits outstanding performance on the sentiment-analysis subtasks of transfer learning.
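The abstract does not spell out the contrastive objective itself. As an illustrative sketch only, the SimCSE-style in-batch InfoNCE loss over sentence embeddings (the baseline that StructCSE enhances, per Gao et al., 2021) can be written as below; the function name and batch-as-negatives setup are assumptions drawn from the cited SimCSE paper, not a reproduction of StructCSE's word-order component.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.05):
    """SimCSE-style in-batch InfoNCE loss over sentence embeddings.

    anchors, positives: (batch, dim) arrays; row i of `positives` is the
    positive view of row i of `anchors`, and the remaining in-batch rows
    act as negatives. Illustrative sketch, not StructCSE's actual loss.
    """
    # Cosine similarities between every anchor and every candidate.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = (a @ p.T) / temperature                    # (batch, batch)
    # Cross-entropy with the matched pair (the diagonal) as the target class.
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.diag(log_probs).mean()
```

A generic way to make such an objective order-sensitive is to feed a word-shuffled copy of a sentence as a hard negative; that is a plausible direction for combining word order with contrastive learning, but the abstract does not describe StructCSE's actual mechanism.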

    Table of Contents:
    Chinese Abstract
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    1. Introduction
       1.1. Overview
       1.2. Motivation
       1.3. Objectives
       1.4. Thesis Organization
    2. Related Works
       2.1. Contextual Embeddings
            2.1.1. Word-level Representations
            2.1.2. Context-dependent Representations
            2.1.3. Pre-training Models for Contextual Embeddings
            2.1.4. Sentence Embedding Methods
       2.2. Contrastive Learning
            2.2.1. Contrastive Learning in Sentence Embedding
                   2.2.1.1. SimCSE (Gao et al., 2021)
                   2.2.1.2. DiffCSE (Chuang et al., 2022)
    3. Methodology
       3.1. Overview
       3.2. Model Architecture
       3.3. Datasets
       3.4. Experiment Process
       3.5. Experiment Design
            3.5.1. Experiment 1 - The Effectiveness of Our Model in Semantic Textual Similarity Tasks
            3.5.2. Experiment 2 - The Effectiveness of Our Model in Text Classification
       3.6. Evaluation Metrics
            3.6.1. Confusion Matrix
            3.6.2. Spearman and Pearson Correlation
    4. Experiment Results
       4.1. Experiment 1 Results - The Effectiveness of Our Model in Semantic Textual Similarity Tasks
            4.1.1. Quantitative Study
            4.1.2. Case Study
            4.1.3. Distribution of Sentence Embeddings
       4.2. Experiment 2 Results - The Effectiveness of Our Model in Text Classification
       4.3. The Review of Experiment Results
    5. Conclusion
       5.1. Overall Summary
       5.2. Contributions
       5.3. Study Limitations
       5.4. Future Works
    6. References
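Section 3.6.2 lists Spearman and Pearson correlation as evaluation metrics: STS systems are scored by how well the ranking of their predicted similarities matches the ranking of human gold scores. A minimal sketch is below; the helper names (`spearman`, `sts_eval`) are illustrative, and this version omits the tie-averaging that `scipy.stats.spearmanr` performs.

```python
import numpy as np

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks.

    Minimal version without average ranks for ties; for tied data,
    prefer scipy.stats.spearmanr.
    """
    def ranks(v):
        order = np.argsort(v)
        r = np.empty(len(v), dtype=float)
        r[order] = np.arange(len(v))
        return r
    return np.corrcoef(ranks(np.asarray(x)), ranks(np.asarray(y)))[0, 1]

def sts_eval(emb_a, emb_b, gold_scores):
    """STS-style evaluation: Spearman correlation between the cosine
    similarities of sentence-pair embeddings and the gold scores."""
    a = emb_a / np.linalg.norm(emb_a, axis=1, keepdims=True)
    b = emb_b / np.linalg.norm(emb_b, axis=1, keepdims=True)
    cosine = (a * b).sum(axis=1)
    return spearman(cosine, gold_scores)
```

Because Spearman depends only on rank order, it rewards models whose similarity scores are monotonically related to human judgments, even when the scores are not calibrated to the gold scale, which is why STS benchmarks report it alongside Pearson.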

    Agirre, E., Banea, C., Cardie, C., Cer, D., Diab, M., Gonzalez-Agirre, A., Guo, W., Lopez-Gazpio, I., Maritxalar, M., Mihalcea, R., Rigau, G., Uria, L., & Wiebe, J. (2015). SemEval-2015 Task 2: Semantic Textual Similarity, English, Spanish and Pilot on Interpretability. Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), 252–263. https://doi.org/10.18653/v1/S15-2045
    Agirre, E., Banea, C., Cardie, C., Cer, D., Diab, M., Gonzalez-Agirre, A., Guo, W., Mihalcea, R., Rigau, G., & Wiebe, J. (2014). SemEval-2014 Task 10: Multilingual Semantic Textual Similarity. Proceedings of the 8th International Workshop on Semantic Evaluation (SemEval 2014), 81–91. https://doi.org/10.3115/v1/S14-2010
    Agirre, E., Banea, C., Cer, D., Diab, M., Gonzalez-Agirre, A., Mihalcea, R., Rigau, G., & Wiebe, J. (2016). SemEval-2016 Task 1: Semantic Textual Similarity, Monolingual and Cross-Lingual Evaluation. Proceedings of the 10th International Workshop on Semantic Evaluation (SemEval-2016), 497–511. https://doi.org/10.18653/v1/S16-1081
    Agirre, E., Cer, D., Diab, M., Gonzalez-Agirre, A., & Guo, W. (2013). *SEM 2013 shared task: Semantic Textual Similarity. Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity, 32–43. https://aclanthology.org/S13-1004
    Agirre, E., Diab, M., Cer, D., & Gonzalez-Agirre, A. (2012). SemEval-2012 Task 6: A Pilot on Semantic Textual Similarity. Proceedings of the First Joint Conference on Lexical and Computational Semantics - Volume 1: Proceedings of the Main Conference and the Shared Task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation, 385–393.
    Bowman, S. R., Angeli, G., Potts, C., & Manning, C. D. (2015). A large annotated corpus for learning natural language inference. Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing, 632–642. https://doi.org/10.18653/v1/D15-1075
    Cai, X., Huang, J., Bian, Y., & Church, K. (2021). Isotropy in the Contextual Embedding Space: Clusters and Manifolds. International Conference on Learning Representations. https://openreview.net/forum?id=xYGNO86OWDH
    Cao, R., Wang, Y., Liang, Y., Gao, L., Zheng, J., Ren, J., & Wang, Z. (2022). Exploring the Impact of Negative Samples of Contrastive Learning: A Case Study of Sentence Embedding. Findings of the Association for Computational Linguistics: ACL 2022, 3138–3152. https://doi.org/10.18653/v1/2022.findings-acl.248
    Cer, D., Diab, M., Agirre, E., Lopez-Gazpio, I., & Specia, L. (2017). SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Crosslingual Focused Evaluation. Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017), 1–14. https://doi.org/10.18653/v1/S17-2001
    Chen, D., Du, J., Bing, L., & Xu, R. (2018). Hybrid Neural Attention for Agreement/Disagreement Inference in Online Debates. Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, 665–670. https://doi.org/10.18653/v1/D18-1069
    Chen, T., Kornblith, S., Norouzi, M., & Hinton, G. (2020). A Simple Framework for Contrastive Learning of Visual Representations. Proceedings of the 37th International Conference on Machine Learning, 1597–1607. https://proceedings.mlr.press/v119/chen20j.html
    Cho, K., van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning Phrase Representations using RNN Encoder–Decoder for Statistical Machine Translation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1724–1734. https://doi.org/10.3115/v1/D14-1179
    Chuang, Y.-S., Dangovski, R., Luo, H., Zhang, Y., Chang, S., Soljacic, M., Li, S.-W., Yih, S., Kim, Y., & Glass, J. (2022). DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings. Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 4207–4218. https://doi.org/10.18653/v1/2022.naacl-main.311
    Chung, J., Gulcehre, C., Cho, K., & Bengio, Y. (2014). Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling. arXiv:1412.3555 [cs]. http://arxiv.org/abs/1412.3555
    Conneau, A., & Kiela, D. (2018, May). SentEval: An Evaluation Toolkit for Universal Sentence Representations. Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018). LREC 2018, Miyazaki, Japan. https://aclanthology.org/L18-1269
    Conneau, A., Kiela, D., Schwenk, H., Barrault, L., & Bordes, A. (2017). Supervised Learning of Universal Sentence Representations from Natural Language Inference Data. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 670–680. https://doi.org/10.18653/v1/D17-1070
    Dangovski, R., Jing, L., Loh, C., Han, S., Srivastava, A., Cheung, B., Agrawal, P., & Soljačić, M. (2021). Equivariant Contrastive Learning. arXiv preprint arXiv:2111.00899.
    Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019a). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), 4171–4186. https://doi.org/10.18653/v1/N19-1423
    Dolan, B., Quirk, C., & Brockett, C. (2004). Unsupervised Construction of Large Paraphrase Corpora: Exploiting Massively Parallel News Sources. COLING 2004: Proceedings of the 20th International Conference on Computational Linguistics, 350–356. https://www.aclweb.org/anthology/C04-1051
    Ethayarajh, K. (2019). How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 55–65. https://doi.org/10.18653/v1/D19-1006
    Gao, J., He, D., Tan, X., Qin, T., Wang, L., & Liu, T. (2019). Representation Degeneration Problem in Training Natural Language Generation Models. International Conference on Learning Representations. https://openreview.net/forum?id=SkEYojRqtm
    Gao, T., Yao, X., & Chen, D. (2021). SimCSE: Simple Contrastive Learning of Sentence Embeddings. Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, 6894–6910. https://doi.org/10.18653/v1/2021.emnlp-main.552
    Grill, J.-B., Strub, F., Altché, F., Tallec, C., Richemond, P., Buchatskaya, E., Doersch, C., Avila Pires, B., Guo, Z., Gheshlaghi Azar, M., Piot, B., Kavukcuoglu, K., Munos, R., & Valko, M. (2020). Bootstrap Your Own Latent—A New Approach to Self-Supervised Learning. In H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, & H. Lin (Eds.), Advances in Neural Information Processing Systems (Vol. 33, pp. 21271–21284). Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2020/file/f3ada80d5c4ee70142b17b8192b2958e-Paper.pdf
    Hadsell, R., Chopra, S., & LeCun, Y. (2006). Dimensionality Reduction by Learning an Invariant Mapping. 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Volume 2 (CVPR’06), 2, 1735–1742. https://doi.org/10.1109/CVPR.2006.100
    He, K., Fan, H., Wu, Y., Xie, S., & Girshick, R. (2020). Momentum Contrast for Unsupervised Visual Representation Learning. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 9726–9735. https://doi.org/10.1109/CVPR42600.2020.00975
    Hermann, K. M., Kocisky, T., Grefenstette, E., Espeholt, L., Kay, W., Suleyman, M., & Blunsom, P. (2015). Teaching Machines to Read and Comprehend. In C. Cortes, N. Lawrence, D. Lee, M. Sugiyama, & R. Garnett (Eds.), Advances in Neural Information Processing Systems (Vol. 28). Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2015/file/afdec7005cc9f14302cd0474fd0f3c96-Paper.pdf
    Hessel, J., & Schofield, A. (2021). How effective is BERT without word ordering? Implications for language understanding and data privacy. Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 2: Short Papers), 204–211. https://doi.org/10.18653/v1/2021.acl-short.27
    Hill, F., Cho, K., & Korhonen, A. (2016). Learning Distributed Representations of Sentences from Unlabelled Data. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1367–1377. https://doi.org/10.18653/v1/N16-1162
    Hochreiter, S., & Schmidhuber, J. (1997). Long Short-Term Memory. Neural Computation, 9(8), 1735–1780. https://doi.org/10.1162/neco.1997.9.8.1735
    Hu, M., & Liu, B. (2004). Mining and Summarizing Customer Reviews. Proceedings of the Tenth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 168–177. https://doi.org/10.1145/1014052.1014073
    Huang, J., Tang, D., Zhong, W., Lu, S., Shou, L., Gong, M., Jiang, D., & Duan, N. (2021). WhiteningBERT: An Easy Unsupervised Sentence Embedding Approach. Findings of the Association for Computational Linguistics: EMNLP 2021, 238–244. https://doi.org/10.18653/v1/2021.findings-emnlp.23
    Jing, L., & Tian, Y. (2021). Self-Supervised Visual Feature Learning With Deep Neural Networks: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(11), 4037–4058. https://doi.org/10.1109/TPAMI.2020.2992393
    Joshi, M., Chen, D., Liu, Y., Weld, D. S., Zettlemoyer, L., & Levy, O. (2020). SpanBERT: Improving Pre-training by Representing and Predicting Spans. Transactions of the Association for Computational Linguistics, 8, 64–77. https://doi.org/10.1162/tacl_a_00300
    Kiros, R., Zhu, Y., Salakhutdinov, R. R., Zemel, R., Urtasun, R., Torralba, A., & Fidler, S. (2015). Skip-Thought Vectors. Advances in Neural Information Processing Systems, 28. https://papers.nips.cc/paper/2015/hash/f442d33fa06832082290ad8544a8da27-Abstract.html
    Lan, Z., Chen, M., Goodman, S., Gimpel, K., Sharma, P., & Soricut, R. (2020, April). ALBERT: A Lite BERT for Self-supervised Learning of Language Representations. Eighth International Conference on Learning Representations. https://iclr.cc/virtual_2020/poster_H1eA7AEtvS.html
    LeCun, Y., Bottou, L., Bengio, Y., & Haffner, P. (1998). Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11), 2278–2324. https://doi.org/10.1109/5.726791
    Li, B., Zhou, H., He, J., Wang, M., Yang, Y., & Li, L. (2020). On the Sentence Embeddings from Pre-trained Language Models. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), 9119–9130. https://doi.org/10.18653/v1/2020.emnlp-main.733
    Li, X., & Roth, D. (2002). Learning Question Classifiers. Proceedings of the 19th International Conference on Computational Linguistics - Volume 1, 1–7. https://doi.org/10.3115/1072228.1072378
    Liu, Y., Ott, M., Goyal, N., Du, J., Joshi, M., Chen, D., Levy, O., Lewis, M., Zettlemoyer, L., & Stoyanov, V. (2019). RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv preprint arXiv:1907.11692.
    Logeswaran, L., & Lee, H. (2018). An Efficient Framework for Learning Sentence Representations. arXiv:1803.02893 [cs]. http://arxiv.org/abs/1803.02893
    Luo, Z. (2021). Analyzing the Anisotropy Phenomenon in Transformer-based Masked Language Models.
    Marelli, M., Menini, S., Baroni, M., Bentivogli, L., Bernardi, R., & Zamparelli, R. (2014). A SICK cure for the evaluation of compositional distributional semantic models. Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), 216–223. http://www.lrec-conf.org/proceedings/lrec2014/pdf/363_Paper.pdf
    Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient Estimation of Word Representations in Vector Space. arXiv:1301.3781 [cs]. http://arxiv.org/abs/1301.3781
    Pang, B., & Lee, L. (2004). A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts. Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL-04), 271–278. https://doi.org/10.3115/1218955.1218990
    Pang, B., & Lee, L. (2005). Seeing Stars: Exploiting Class Relationships for Sentiment Categorization with Respect to Rating Scales. Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL’05), 115–124. https://doi.org/10.3115/1219840.1219855
    Pathak, D., Krähenbühl, P., Donahue, J., Darrell, T., & Efros, A. A. (2016). Context Encoders: Feature Learning by Inpainting. 2536–2544. https://doi.org/10.1109/CVPR.2016.278
    Pavlopoulos, J., Malakasiotis, P., & Androutsopoulos, I. (2017). Deeper Attention to Abusive User Content Moderation. Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, 1125–1135. https://doi.org/10.18653/v1/D17-1117
    Pennington, J., Socher, R., & Manning, C. (2014). GloVe: Global Vectors for Word Representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 1532–1543. https://doi.org/10.3115/v1/D14-1162
    Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks. Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), 3980–3990. https://doi.org/10.18653/v1/D19-1410
    Socher, R., Perelygin, A., Wu, J., Chuang, J., Manning, C. D., Ng, A., & Potts, C. (2013). Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank. Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, 1631–1642. https://www.aclweb.org/anthology/D13-1170
    Sukhbaatar, S., Szlam, A., Weston, J., & Fergus, R. (2015). End-To-End Memory Networks. In C. Cortes, N. Lawrence, D. Lee, M. Sugiyama, & R. Garnett (Eds.), Advances in Neural Information Processing Systems (Vol. 28). Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2015/file/8fb21ee7a2207526da55a679f0332de2-Paper.pdf
    Taylor, W. L. (1953). “Cloze Procedure”: A New Tool for Measuring Readability. Journalism Quarterly, 30(4), 415–433. https://doi.org/10.1177/107769905303000401
    Turian, J., Ratinov, L.-A., & Bengio, Y. (2010). Word Representations: A Simple and General Method for Semi-Supervised Learning. Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics, 384–394. https://www.aclweb.org/anthology/P10-1040
    Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I. (2017). Attention is All you Need. Advances in Neural Information Processing Systems, 30. https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html
    Wang, B., Zhao, D., Lioma, C., Li, Q., Zhang, P., & Simonsen, J. G. (2019, December 20). Encoding word order in complex embeddings. International Conference on Learning Representations. https://openreview.net/forum?id=Hke-WTVtwr
    Wang, T., & Isola, P. (2020). Understanding contrastive representation learning through alignment and uniformity on the hypersphere. Proceedings of the 37th International Conference on Machine Learning, 9929–9939.
    Wang, T., & Lu, W. (2022). Differentiable Data Augmentation for Contrastive Sentence Representation Learning. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing, 7640–7653. https://aclanthology.org/2022.emnlp-main.520
    Wang, W., Bi, B., Yan, M., Wu, C., Xia, J., Bao, Z., Peng, L., & Si, L. (2020, April). StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding. International Conference on Learning Representations (ICLR). ICLR 2020. https://iclr.cc/virtual_2020/poster_BJgQ4lSFPH.html
    Wiebe, J., Wilson, T., & Cardie, C. (2005). Annotating Expressions of Opinions and Emotions in Language. Language Resources and Evaluation, 39(2), 165–210. https://doi.org/10.1007/s10579-005-7880-9
    Williams, A., Nangia, N., & Bowman, S. (2018). A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference. Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers), 1112–1122. https://doi.org/10.18653/v1/N18-1101
    Wu, X., Gao, C., Su, Y., Han, J., Wang, Z., & Hu, S. (2022). Smoothed Contrastive Learning for Unsupervised Sentence Embedding. Proceedings of the 29th International Conference on Computational Linguistics, 4902–4906. https://aclanthology.org/2022.coling-1.434
    Wu, X., Gao, C., Zang, L., Han, J., Wang, Z., & Hu, S. (2022). ESimCSE: Enhanced Sample Building Method for Contrastive Learning of Unsupervised Sentence Embedding. Proceedings of the 29th International Conference on Computational Linguistics, 3898–3907. https://aclanthology.org/2022.coling-1.342
    Yang, Z., Dai, Z., Yang, Y., Carbonell, J., Salakhutdinov, R. R., & Le, Q. V. (2019). XLNet: Generalized Autoregressive Pretraining for Language Understanding. Advances in Neural Information Processing Systems, 32. https://proceedings.neurips.cc/paper/2019/hash/dc6a7e655d7e5840e66733e9ee67cc69-Abstract.html
    Zhang, R., Zhu, J.-Y., Isola, P., Geng, X., Lin, A. S., Yu, T., & Efros, A. A. (2017). Real-Time User-Guided Image Colorization with Learned Deep Priors. ACM Trans. Graph., 36(4). https://doi.org/10.1145/3072959.3073703
