
Graduate student: 劉育祺 (Yu-Chi Liu)
Thesis title: 使用於行動惡意程式偵測之局部權重系統呼叫序列壓縮方法
Local weight system calls sequence compression method used in mobile malware detection
Advisor: 陳奕明 (Yi-Ming Chen)
Oral examination committee:
Degree: Master
Department: College of Management, Department of Information Management
Year of publication: 2021
Academic year of graduation: 109
Language: Chinese
Number of pages: 74
Keywords (Chinese): Android malware analysis, dynamic analysis, system call sequence, deep learning
Keywords (English): Android malware detection, dynamic analysis, system call sequence, long sequence compression method, deep learning
  • In recent years the number of mobile devices has grown steadily, and more and more users store private personal information on them. Android is the most popular mobile platform, but the openness of the platform has made it a primary target for attackers, which seriously threatens users' privacy and security. In Android malware detection, dynamic analysis executes an application and records the execution so that its actual runtime behavior can be analyzed; system calls are a commonly used feature. Meanwhile, the rapid development of deep learning in recent years has improved results in malware detection. However, an extracted system call sequence is the execution record of an application running over a period of time, so it is a long sequence, and long-sequence features lead to poorly trained deep learning models and excessive training time. This research therefore proposes a system call sequence compression method based on local weights. The method preserves the sequential relationships in the compressed sequence and uses local weights to strengthen the compressed sequence features, so that the deep learning model trains quickly and achieves high accuracy. In binary classification, the method reaches an Accuracy of 95.32% and an F1-Score of 95.31%.
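    The abstract does not spell out how the local weights are computed, so the following is only a minimal sketch of one plausible reading, not the thesis's actual algorithm. It assumes the trace is split into a fixed number of contiguous subsequences and that each subsequence is summarized as a TF-IDF-style weight vector, with the inverse document frequency computed only over the subsequences of the same trace (the "local" part). The function names, the smoothing formula, and the sample trace are all hypothetical.

```python
import math
from collections import Counter


def split_into_subsequences(calls, k):
    """Split a trace into k contiguous chunks of near-equal length, keeping order."""
    n = len(calls)
    bounds = [i * n // k for i in range(k + 1)]
    return [calls[bounds[i]:bounds[i + 1]] for i in range(k)]


def compress_trace(calls, k, vocab):
    """Return a k x len(vocab) matrix of locally weighted call frequencies.

    Assumption: weights are TF-IDF-like, with document frequency counted
    over the k chunks of this one trace rather than over a whole corpus.
    """
    chunks = split_into_subsequences(calls, k)

    # Number of chunks in which each system call appears at least once.
    df = Counter()
    for chunk in chunks:
        df.update(set(chunk))

    compressed = []
    for chunk in chunks:
        tf = Counter(chunk)
        length = max(len(chunk), 1)  # guard against empty chunks
        row = []
        for call in vocab:
            if tf[call] == 0:
                row.append(0.0)
            else:
                # Smoothed IDF: calls that are rare within the trace get more weight.
                idf = math.log((1 + k) / (1 + df[call])) + 1.0
                row.append(tf[call] / length * idf)
        compressed.append(row)
    return compressed


# Hypothetical trace of system call names, for illustration only.
trace = ["open", "read", "read", "write", "open", "sendto", "sendto", "close"]
vocab = sorted(set(trace))
matrix = compress_trace(trace, k=4, vocab=vocab)
```

    Under these assumptions an eight-call trace becomes four rows, one per subsequence and in the original order, so a sequence model sees a much shorter input that still reflects when each kind of call occurred.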

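    Abstract (Chinese)
    Abstract
    Acknowledgements
    1. Introduction
        1-1 Research Motivation
        1-2 Research Contributions
        1-3 Thesis Organization
    2. Related Work
        2-1 Dynamic Feature Extraction Methods
        2-2 Related Work on Long-Sequence Features
        2-3 Related Work on Sequential Deep Learning Models
        2-4 Summary
    3. System Architecture
        3-1 System Architecture
            3-1-1 Dynamic Feature Extraction Module
            3-1-2 System Calls Sequence Conversion Module
            3-1-3 Vector Conversion Module
            3-1-4 Transformer-based Classification Module
        3-2 Evaluation Metrics
        3-3 System Workflow
    4. Experimental Results
        4-1 Experimental Environment and Datasets
            4-1-1 Experimental Environment
            4-1-2 Datasets Used
        4-2 Comparison of Different Numbers of Subsequences
            4-2-1 Experiment 1
        4-3 Comparison of Weight Calculation Methods
            4-3-1 Experiment 2: Binary Classification
            4-3-2 Experiment 3: Malware Category Classification and Malware Family Classification
        4-4 Comparison with Other Subsequence Compression Methods
            4-4-1 Experiment 4
        4-5 Testing on an Unseen Dataset
            4-5-1 Experiment 5
        4-6 Results and Discussion
    5. Conclusion and Future Work
        5-1 Conclusions and Contributions
        5-2 Limitations and Future Work
    References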

