
Author: Yu-Chi Lin (林佑錡)
Thesis Title: History Aware Multi-Stage Prompting for Neural Chat Translation
Advisor: Shin-Wen Ke (柯士文)
Degree: Master
Department: Department of Information Management, College of Management
Year of Publication: 2023
Academic Year of Graduation: 111
Language: English
Number of Pages: 95
Keywords (Chinese): 神經網路聊天翻譯、機器翻譯、提示調整、深度學習
Keywords (English): neural chat translation, machine translation, prompt tuning, deep learning


    Neural Chat Translation (NCT) is an emerging task in the field of machine translation. Unlike Neural Machine Translation (NMT), which translates isolated sentences, NCT must translate utterances embedded in multi-turn conversations, making it a challenging two-in-one task. Previous research has addressed it with context-aware models and auxiliary training tasks, but often at a high training cost.
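As a concrete illustration of how NCT differs from sentence-level NMT, a training instance pairs the utterance to translate with the bilingual dialogue history. The sketch below shows one possible data layout; the field names are illustrative assumptions, not the thesis's actual schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ChatTranslationExample:
    # Utterance to translate in the current turn (source language).
    source_utterance: str
    # Gold translation of that utterance (target language).
    reference: str
    # Preceding turns on the source and target sides, oldest first.
    source_history: List[str] = field(default_factory=list)
    target_history: List[str] = field(default_factory=list)

ex = ChatTranslationExample(
    source_utterance="See you tomorrow, then.",
    reference="那就明天見。",
    source_history=["Are you free tomorrow?", "Yes, after 3 pm."],
    target_history=["你明天有空嗎?", "有,下午三點以後。"],
)

# A sentence-level NMT system would see only ex.source_utterance;
# an NCT system additionally conditions on the two history fields.
```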
    As the cost of fine-tuning pre-trained language models continues to rise, prompt tuning has emerged as a promising alternative: it is parameter-efficient while achieving performance comparable to full fine-tuning. Prompt tuning has recently been applied to machine translation, but existing approaches consider only sentence-level translation and cannot exploit the conversational content that is crucial to NCT. In this study, we therefore propose a new prompt tuning method for NCT called History Aware Multi-Stage Prompting (HAMSP). By incorporating information from the chat history into the prompts, HAMSP guides the pre-trained language model to generate translations that are consistent with the conversational context.
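The idea can be sketched as follows: under prompt tuning the pre-trained model's weights stay frozen and only a small prompt generator is trained, and a history-aware variant feeds the encoded chat history into that generator so the resulting prompt vectors reflect the conversation. The NumPy sketch below is a minimal illustration under our own assumptions (toy dimensions, mean pooling, one linear projection per stage), not the thesis's exact architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, prompt_len = 16, 4   # toy sizes for illustration only

def embed(tokens):
    # Stand-in for the frozen pre-trained model's embedding layer.
    return rng.standard_normal((len(tokens), d_model))

def history_aware_prompt(history_states, W):
    """Pool the encoded chat history and project it into
    `prompt_len` continuous prompt vectors. Only W is trainable;
    the pre-trained language model itself stays frozen."""
    pooled = history_states.mean(axis=0)              # (d_model,)
    return (W @ pooled).reshape(prompt_len, d_model)  # (prompt_len, d_model)

history = embed("Are you free tomorrow ? Yes , after 3 pm .".split())

# "Multi-stage": a separate trainable projection, and hence a separate
# prompt, for each stage of the frozen model's computation.
stages = ["encoding", "re-encoding", "decoding"]
W = {s: 0.02 * rng.standard_normal((prompt_len * d_model, d_model))
     for s in stages}
prompts = {s: history_aware_prompt(history, W[s]) for s in stages}

# Each stage prompt is prepended to the utterance representation.
src = embed("See you tomorrow , then .".split())
model_input = np.concatenate([prompts["encoding"], src], axis=0)
```

Because the history enters through the prompt generator rather than the model weights, adapting to a new conversation domain means training only the small per-stage projections.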
    Our experiments show that the proposed HAMSP outperforms the baseline methods and is competitive with fine-tuning. Further intrinsic evaluation illustrates that our method is more robust and improves the dialogue coherence of translations. In addition, HAMSP improves training efficiency and reduces hardware costs, making it suitable for a wide range of real-world chat systems.

    摘要 / Abstract / Acknowledgements / Table of Contents / List of Figures / List of Tables
    1. Introduction
       1.1. Overview
       1.2. Motivation
       1.3. Objectives
       1.4. Thesis Organization
    2. Related Works
       2.1. Neural Machine Translation
          2.1.1. Sentence-level NMT
          2.1.2. Document-level NMT
          2.1.3. Neural Chat Translation
       2.2. Prompt Tuning
          2.2.1. Manual Prompt
          2.2.2. Discrete Prompt
          2.2.3. Continuous Prompt
       2.3. Multilingual Pre-trained Language Models
          2.3.1. mBART
          2.3.2. mT5
          2.3.3. mGPT
       2.4. Discussion
    3. Methodology
       3.1. Model Overview
       3.2. Model Architecture
          3.2.1. Prompt Generator
          3.2.2. Multi-Stage
       3.3. Training Phase
       3.4. Datasets
       3.5. Experiment Setting
          3.5.1. Data Preprocessing and Postprocessing
          3.5.2. Model Setting
       3.6. Flow Chart
       3.7. Experiment Design
          3.7.1. Experiment - The effectiveness of our proposed prompting method applied to NCT tasks
          3.7.2. Evaluation Metrics
    4. Experiment Results
       4.1. Experiment - The effectiveness of our proposed prompting method applied to NCT tasks
          4.1.1. Experiment Results
          4.1.2. Intrinsic Evaluation
    5. Conclusion
       5.1. Overall Summary
       5.2. Contributions
       5.3. Study Limitations
       5.4. Future Work
    References

