National Central University Electronic Theses & Dissertations System

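Graduate student:	朱席靚 Hsi-Ching Chu
Thesis title:	From Data to Action: CTI Analysis and ATT&CK Technique Correlation
Advisor:	孫敏德 Min-Te Sun
Oral defense committee:
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Year of publication:	2024
Academic year of graduation:	112
Language:	English
Number of pages:	44
Keywords (Chinese):	cyber threats, natural language processing, machine learning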

Cyber Threat Intelligence (CTI) strengthens an organization's cybersecurity defenses by providing actionable insights drawn from diverse data sources. This research studies the correlation between CTI analysis and the MITRE ATT&CK framework, focusing on how aligning the two improves threat detection and response capabilities. A key part of the study is a classifier, built on a fine-tuned BERT-based model, that maps CTI reports to specific ATT&CK techniques. Compared with the baseline SecBERT, our model achieves a 2.6% higher F1-score and a 4.2% improvement in Top-3 Accuracy. By integrating CTI with the MITRE ATT&CK framework, researchers can shift from reactive to proactive cybersecurity strategies. This integration enables swift detection of emerging threats, improves incident response effectiveness, and fortifies defensive measures against evolving cyber threats. Ultimately, the synergy between CTI and ATT&CK supports a comprehensive approach to cybersecurity management in today's dynamic threat landscape.
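The abstract reports its results as F1-score and Top-3 Accuracy. As a rough illustration of what these two metrics measure, the following is a minimal sketch in plain Python, not the thesis's evaluation code. The function names are hypothetical, class indices stand in for ATT&CK techniques, and macro averaging for F1 is an assumption, since the abstract does not state which averaging was used.

```python
from typing import Sequence


def top_k_accuracy(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], k: int = 3) -> float:
    """Fraction of samples whose true class is among the k highest-scoring classes."""
    if not labels or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equally long")
    hits = 0
    for row, true_label in zip(scores, labels):
        # Rank class indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if true_label in ranked[:k]:
            hits += 1
    return hits / len(labels)


def macro_f1(predicted: Sequence[int], labels: Sequence[int]) -> float:
    """Unweighted mean of per-class F1 (macro averaging is an assumption here)."""
    if not labels or len(predicted) != len(labels):
        raise ValueError("predicted and labels must be non-empty and equally long")
    classes = sorted(set(labels) | set(predicted))
    per_class = []
    for c in classes:
        tp = sum(p == c and t == c for p, t in zip(predicted, labels))
        fp = sum(p == c and t != c for p, t in zip(predicted, labels))
        fn = sum(p != c and t == c for p, t in zip(predicted, labels))
        denom = 2 * tp + fp + fn
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


if __name__ == "__main__":
    # Toy scores for four reports over four hypothetical technique classes.
    scores = [
        [0.70, 0.10, 0.15, 0.05],
        [0.10, 0.20, 0.30, 0.40],
        [0.40, 0.30, 0.20, 0.10],
        [0.05, 0.15, 0.60, 0.20],
    ]
    labels = [0, 1, 3, 2]
    top1 = [max(range(len(row)), key=lambda c: row[c]) for row in scores]
    print("Top-3 accuracy:", top_k_accuracy(scores, labels, k=3))
    print("Macro F1:", macro_f1(top1, labels))
```

Top-3 Accuracy counts a prediction as correct when the true technique appears anywhere among the three highest-ranked candidates. It is therefore always at least as high as ordinary (Top-1) accuracy, and it suits a setting where an analyst reviews a short list of suggested techniques.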

Introduction
Related Work
1 Machine-Learning based CTI Analysis
2 Deep-Learning based CTI Analysis
Preliminary
1 Natural Language ToolKit
2 spaCy
3 Hugging Face Tokenizer
4 Long Short-Term Memory
5 Gated Recurrent Unit
6 Structured Threat Information eXpression
7 Self-Attention Mechanisms
8 Transformer
8.1 Bidirectional Encoder Representations
8.2 SecBERT
8.3 Sentence Transformer Fine-tuning
Design
1 Motivation
2 Problem Statement
3 Research Challenges
4 Overview
5 Dataset Construction
5.1 Data Parsing
5.2 Data Enrichment
5.3 Data Cleaning
5.4 Text Tokenization
5.5 Dataset Preparation
6 Pre-processing
7 Pre-trained Model Fine-tuning
8 Deployment
Performance
1 Experimental Environment
2 Dataset Information
3 Evaluation Metrics
4 Model Training
5 Experiment Results and Analysis
5.1 Evaluation Results of the CAPEC Dataset
5.2 Evaluation Results of the Hoang Dataset
6 Ablation Studies
Conclusions
Reference