
Author: Chien-Hui Hung (洪千惠)
Thesis title: Combining system call sequence relationship with local feature calculation in a mobile malware detection method (結合系統呼叫序列關係與局部特徵計算之行動惡意程式檢測方法)
Advisor: Yi-Ming Chen (陳奕明)
Committee members:
Degree: Master
Department: College of Management, Department of Information Management
Year of publication: 2022
Academic year of graduation: 110
Language: Chinese
Pages: 67
Keywords: Android malware analysis, dynamic analysis, system call sequence, sequence relationship, deep learning
  • In an era of rapid advances in information technology, Android holds the largest market share of any operating system, and its open-source nature has made it a target for attackers, which in turn threatens user privacy. In malware analysis, dynamic analysis is unaffected by obfuscation and dynamic loading attacks and also reveals how a program behaves at run time. System calls in particular capture the actual communication between an application and the kernel, so this research takes a dynamic detection approach and uses system calls as features to represent application behavior. TF-IDF feature processing can weight system call features by how often each call occurs and by its relationship to the data as a whole, but it treats each system call as an independent unit, so the calculation ignores the order in which calls appear. In system call sequences, that order matters. This research therefore combines the n-gram concept with a local TF-IDF so that sequential data yields features that carry both the ordering relationship and the relative importance of the calls. Because deep learning has achieved strong classification results in malware detection, the dynamic sequence features are converted into vectors with the proposed method, and Android applications are then analyzed with a deep learning model. The experiments show that the method improves accuracy by more than 3% in multiclass classification of applications and by 11% on an unseen 2019 dataset.
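The idea of weighting system call n-grams with TF-IDF can be illustrated with a minimal sketch. The abstract does not define how the thesis computes its "local" TF-IDF, so the code below uses plain TF-IDF over n-grams of each trace and should not be read as the author's method. The function names `ngrams` and `ngram_tfidf` and the three toy traces are hypothetical and exist only for this illustration.

```python
import math
from collections import Counter


def ngrams(seq, n):
    """Return the contiguous n-grams (as tuples) of a sequence."""
    return [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]


def ngram_tfidf(traces, n=2):
    """Weight each n-gram of each trace with plain TF-IDF.

    Assumption: this is standard TF-IDF applied to n-grams, not the
    "local" variant proposed in the thesis, whose definition is not
    given in the abstract.
    """
    grams_per_trace = [ngrams(trace, n) for trace in traces]

    # Document frequency: in how many traces does each n-gram occur?
    df = Counter()
    for grams in grams_per_trace:
        df.update(set(grams))

    num_traces = len(traces)
    weights = []
    for grams in grams_per_trace:
        counts = Counter(grams)
        total = len(grams)
        # TF is the relative frequency inside the trace;
        # IDF is log(number of traces / document frequency).
        weights.append({
            gram: (count / total) * math.log(num_traces / df[gram])
            for gram, count in counts.items()
        })
    return weights


# Toy system call traces (hypothetical data).
traces = [
    ["open", "read", "write", "close"],
    ["open", "read", "read", "close"],
    ["socket", "connect", "sendto", "close"],
]
weights = ngram_tfidf(traces, n=2)
```

With `n=2`, the pair `("open", "read")` appears in two of the three traces and therefore receives a lower weight than `("read", "write")`, which appears in only one. A unigram TF-IDF cannot draw this distinction, because it discards the order of the calls.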

    Abstract (Chinese)
    Abstract (English)
    1. Introduction
      1-1 Research Background
      1-2 Research Motivation
      1-3 Research Objectives and Contributions
      1-4 Chapter Organization
    2. Related Work
      2-1 Malware Analysis
      2-2 Dynamic Feature Extraction
      2-3 System Call Features
      2-4 Sequence Models
      2-5 Summary of Related Work
    3. Methodology
      3-1 System Architecture
        3-1-1 System Call Extraction Module
        3-1-2 System Call Conversion Module
        3-1-3 Vector Conversion Module
        3-1-4 Classification Module
      3-2 Evaluation Metrics
      3-3 System Workflow
    4. Experiments and Discussion
      4-1 Experimental Environment and Datasets
        4-1-1 Experimental Environment
        4-1-2 Datasets
      4-2 Comparison of n-gram Parameters
        4-2-1 Experiment 1: n-gram parameter comparison for binary classification
        4-2-2 Experiment 2: n-gram parameter comparison for multiclass classification
      4-3 Effectiveness of the Proposed Method
        4-3-1 Experiment 3: Binary classification
        4-3-2 Experiment 4: Multiclass classification
      4-4 Model Adaptability
        4-4-1 Experiment 5: Testing on unknown malicious samples
        4-4-2 Experiment 6: Testing on unknown malicious samples with newer test samples
      4-5 Experimental Results and Discussion
    5. Conclusion
      5-1 Conclusions and Contributions
      5-2 Limitations and Future Work
        5-2-1 Research Limitations
        5-2-2 Future Work
    References

