DLEX: A Feedback driven LLM Framework for Evolving XSS detection Models

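Graduate student:	鄭宇翔 Yu-Hsiang Cheng
Thesis title:	DLEX：強化跨站腳本攻擊載荷偵測之大型語言模型回饋框架 DLEX: A Feedback driven LLM Framework for Evolving XSS detection Models
Advisors:	曾俊元 Chin-Yang Tseng; 許富皓 Fu-Hao Hsu
Oral defense committee:
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Year of publication:	2025
Academic year of graduation:	113 (ROC calendar)
Language:	Chinese
Number of pages:	57
Chinese keywords:	跨站網頁腳本 (cross-site scripting), 大型語言模型 (large language models), 對抗生成樣本 (adversarially generated samples)
English keywords:	Large Language Models, Cross-site Scripting, Adversarial Payload Generation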

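In web security, cross-site scripting (XSS) remains one of the most common and most damaging vulnerabilities. Many deep learning-based XSS payload detection models have been proposed to counter it, but they still tend to fail against novel or mutated malicious payloads, causing defense systems to produce false positives or missed detections.
To address this problem, this study proposes a self-learning architecture built on large language models (LLMs). The architecture automatically generates adversarial XSS payloads capable of evading detection and uses them as training data for a deep learning model, continually strengthening its detection capability. Because LLMs offer semantic understanding, low setup cost, and rapid deployment, they can generate semantically diverse, evasive attack samples more efficiently than traditional reinforcement learning approaches.
The architecture combines an LLM, adversarial sample design, and an automated feedback mechanism, so that the detection model can repeatedly confront and learn from new attacks in an experimental environment, improving its robustness and generalization. Experimental results show that models trained with this method detect mutated payloads more accurately, demonstrating the method's potential for proactive defense design.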

Cross-site scripting (XSS) remains one of the most prevalent and dangerous web security threats. Although many deep learning-based models have been proposed for detecting XSS attacks, they often fail to detect novel or obfuscated payloads, resulting in false negatives and system vulnerabilities.

To address this problem, this research proposes a self-learning framework built on large language models (LLMs). The framework automatically generates XSS attack payloads that can bypass common security filters. By leveraging the semantic understanding, low training cost, and rapid deployment of LLMs, it generates diverse, hard-to-detect attack samples more efficiently than traditional reinforcement learning (RL) approaches. The generated payloads are then used to improve the robustness of deep learning-based detection models.

The proposed framework combines an LLM, adversarial sample creation, and a feedback loop to simulate a continuous attack-defense cycle. This allows the detection model to learn from new attack samples and improve its ability to handle different types of attacks. Experimental results show that a model trained with LLM-generated attack payloads is more robust against evasive attacks. This work shows that combining LLMs with self-learning systems can help build more proactive and effective cybersecurity solutions.
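
The attack-defense feedback loop described in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation, which this record does not describe in detail. Every name here (ToyDetector, stub_generator, feedback_loop) is hypothetical, the "detector" is a simple token memorizer standing in for a deep learning model, and the "generator" is a deterministic stub standing in for an LLM call that returns harmless placeholder strings rather than real payloads.

```python
from typing import Callable, List, Set, Tuple


class ToyDetector:
    """Hypothetical stand-in for the deep learning detector.

    It flags a string as malicious when it shares a whitespace-separated
    token with any malicious sample seen during training.
    """

    def __init__(self) -> None:
        self.malicious_tokens: Set[str] = set()

    def train(self, malicious_samples: List[str]) -> None:
        # "Training" here only memorizes tokens; a real model would be refit.
        for sample in malicious_samples:
            self.malicious_tokens.update(sample.lower().split())

    def is_malicious(self, sample: str) -> bool:
        return any(tok in self.malicious_tokens for tok in sample.lower().split())


# Fixed pool of placeholder names so that later rounds repeat earlier variants.
_POOL = ["alpha", "beta", "gamma", "delta"]


def stub_generator(round_index: int, batch_size: int) -> List[str]:
    """Assumed placeholder for the LLM call; returns synthetic strings."""
    return [
        f"variant-{_POOL[(round_index + i) % len(_POOL)]}"
        for i in range(batch_size)
    ]


def feedback_loop(
    detector: ToyDetector,
    generate: Callable[[int, int], List[str]],
    seed_samples: List[str],
    rounds: int,
    batch_size: int,
) -> Tuple[List[str], List[int]]:
    """Generate candidates, keep those that evade detection, then retrain."""
    training_set = list(seed_samples)
    detector.train(training_set)
    evasions_per_round: List[int] = []
    for round_index in range(rounds):
        candidates = generate(round_index, batch_size)
        # Feedback signal: which candidates did the current detector miss?
        evaded = [c for c in candidates if not detector.is_malicious(c)]
        evasions_per_round.append(len(evaded))
        if evaded:
            # Augment the training data with the misses and retrain.
            training_set.extend(evaded)
            detector.train(training_set)
    return training_set, evasions_per_round


if __name__ == "__main__":
    det = ToyDetector()
    data, evasions = feedback_loop(det, stub_generator, ["variant-seed"], 4, 2)
    print("evasions per round:", evasions)
    print("final training set size:", len(data))
```

With this stub, the number of evading candidates per round falls from 2 to 0 over four rounds, because each missed candidate is added to the training data before the next round. A real system would replace the stub with an LLM prompt and the token memorizer with a trained classifier, and would likely also need to check that generated payloads remain functional, a step this sketch omits.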

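Chinese Abstract
Abstract
Acknowledgments
Table of Contents
List of Figures
List of Tables
Chapter 1    Introduction
1.1 Research Motivation
1.2 Research Objectives
Chapter 2    Background
2.1 Cross-site scripting (XSS)
2.2 Detection Models for XSS Payloads
2.3 Large Language Models (LLMs)
2.4 Applications of LLMs in Information Security
Chapter 3    Related Work
3.1 XSS Adversarial Example Attack Framework Based on Soft Actor-Critic (SAC) Reinforcement Learning
3.2 Replication and Extension Study of the SAC XSS Payload Generation Framework of Chen et al.
3.3 XSS Adversarial Example Attack Framework Based on Generative Adversarial Networks
3.4 XSS Adversarial Example Attack Framework Using Transformers
3.5 XSS Adversarial Example Attack Framework Using an Improved DDQN
Chapter 4    System Architecture and Implementation
4.1 Design Goals
4.2 System Architecture
4.3 System Components
Chapter 5    Experimental Results and Analysis
5.1 Experimental Environment
5.2 Hyperparameters and Datasets
5.3 Evaluation Metrics
5.4 Experiment 1: LLM Generation Performance at Different Temperatures
5.5 Experiment 2: LLM Performance with Prompts in Different Languages
5.6 Experiment 3: Generation Test Based on CVE-2020-11022
5.7 Experiment 4: Retraining the Deep Detection Model with the Augmented Dataset
Chapter 6    Discussion
6.1 System Limitations
6.2 Future Work
Chapter 7    Conclusion
References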


                                

[1]. OWASP Top Ten 2021, last accessed June 28, 2025. Available: https://owasp.org/www-project-top-ten/
[2]. P. Likarish, E. Jung and I. Jo, "Obfuscated malicious javascript detection using classification techniques," 2009 4th International Conference on Malicious and Unwanted Software (MALWARE), Montreal, QC, Canada, 2009, pp. 47-54.
[3]. A. E. Nunan, E. Souto, E. M. dos Santos and E. Feitosa, "Automatic classification of cross-site scripting in web pages using document-based and URL-based features," 2012 IEEE Symposium on Computers and Communications (ISCC), Cappadocia, Turkey, 2012, pp. 702-707.
[4]. S. Rathore, P. K. Sharma, and J. H. Park, "XSSClassifier: An efficient XSS attack detection approach based on machine learning classifier on SNSs," Journal of Information Processing Systems, vol. 13, no. 4, pp. 1014-1028, 2017.
[5]. F. Mereani and J. Howe, "Detecting Cross-Site Scripting Attacks Using Machine Learning," The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2018), pp. 200-210, Jan. 2018.
[6]. Y. Fang, Y. Li, L. Liu, and C. Huang, "DeepXSS: Cross site scripting detection based on deep learning," Proc. 2018 Int. Conf. on Computing and Artificial Intelligence (ICCAI), New York, NY, USA, pp. 47-51, 2018.
[7]. F. M. M. Mokbal, W. Dan, A. Imran, L. Jiuchuan, F. Akhtar and W. Xiaoxi, "MLPXSS: An Integrated XSS-Based Attack Detection Scheme in Web Applications Using Multilayer Perceptron Technique," in IEEE Access, vol. 7, pp. 100567-100580, 2019.
[8]. A. Tekerek, "A novel architecture for web-based attack detection using convolutional neural network," Computers & Security, vol. 100, p. 102096, 2021.
[9]. T. Hu, C. Xu, S. Zhang, S. Tao, and L. Li, "Cross-site scripting detection with two-channel feature fusion embedded in self-attention mechanism," Computers & Security, vol. 124, p. 102990, 2023.
[10]. H. Xu, S. Wang, N. Li, K. Wang, Y. Zhao, K. Chen, T. Yu, Y. Liu, and H. Wang, "Large language models for cyber security: A systematic literature review," arXiv preprint arXiv:2405.04760, 2025.
[11]. B. Ahmad, S. Thakur, B. Tan, R. Karri and H. Pearce, "On Hardware Security Bug Code Fixes by Prompting Large Language Models," in IEEE Transactions on Information Forensics and Security, vol. 19, pp. 4043-4057, 2024.
[12]. F. Perrina, F. Marchiori, M. Conti, and N. V. Verde, "AGIR: Automating cyber threat intelligence reporting with natural language generation," arXiv preprint arXiv:2310.02655, 2023.
[13]. Y.-Z. Lin, M. Mamun, M. A. Chowdhury, S. Cai, M. Zhu, B. S. Latibari, K. I. Gubbi, N. N. Bavarsad, A. Caputo, A. Sasan, H. Homayoun, S. Rafatirad, P. Satam, and S. Salehi, "HW-V2W-Map: Hardware vulnerability to weakness mapping framework for root cause analysis with GPT-assisted mitigation suggestion," arXiv preprint arXiv:2312.13530, 2023.
[14]. X. Xu, Z. Zhang, Z. Su, Z. Huang, S. Feng, Y. Ye, N. Jiang, D. Xie, S. Cheng, L. Tan, and X. Zhang, "Unleashing the power of generative model in recovering variable names from stripped binary," in Network and Distributed System Security Symposium (NDSS), San Diego, CA, USA, Feb. 2025.
[15]. L. Chen, C. Tang, J. He, H. Zhao, X. Lan, and T. Li, "XSS adversarial example attacks based on deep reinforcement learning," Computers & Security, vol. 120, p. 102831, 2022.
[16]. S. Pasini, G. Maragliano, J. Kim, and P. Tonella, "XSS adversarial attacks based on deep reinforcement learning: A replication and extension study," arXiv preprint arXiv:2502.19095, 2025.
[17]. R. L. Alaoui and E. H. Nfaoui, "Generative adversarial network-based approach for automated generation of adversarial attacks against a deep-learning based XSS attack detection model," International Journal of Advanced Computer Science and Applications (IJACSA), vol. 14, no. 7, 2023.
[18]. S. Khan, "LL-XSS: End-to-End Generative Model-based XSS Payload Creation," 2024 21st Learning and Technology Conference (L&T), Jeddah, Saudi Arabia, 2024, pp. 121-126.
[19]. Y. Yao, J. He, T. Li, Y. Wang, X. Lan and Y. Li, "An Automatic XSS Attack Vector Generation Method Based on the Improved Dueling DDQN Algorithm," IEEE Transactions on Dependable and Secure Computing, vol. 21, no. 4, pp. 2852-2868, July-Aug. 2024.
[20]. Y. Deng, W. Zhang, S. Pan, and L. Bing, "Multilingual Jailbreak Challenges in Large Language Models," arXiv preprint arXiv:2310.06474, Oct. 2023.
