Graduate student: Chi-Fei Yeh (葉季霏)
Thesis title: A multi-feature RGB image representation combined with deep learning Android Malware detection research (一種多特徵 RGB 圖像表示法結合深度學習之 Android 惡意軟體偵測研究)
Advisor: 陳奕明
Oral defense committee:
Degree: Master
Department: College of Management - Department of Information Management
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar)
Language: Chinese
Number of pages: 92
Keywords: Android, Malware detection, Static analysis, Malware image method, Deep learning
  • Nearly everyone carries a mobile device today, and these devices usually store large amounts of user data, including private personal information. Because convenient mobile payment also puts credit card and bank card data on the device, the security of the devices people use every day has become more important in recent years. This thesis performs security analysis on Android, the mobile operating system with the highest market share. Using static analysis, it converts the raw data in an APK file into image information so that malware can be identified quickly. Each image combines the permissions and opcodes used by the APK, which are then composed into an RGB color image. The downstream classifier is a Convolutional Neural Network (CNN), a deep learning model that performs well on image classification, built from an Autoencoder combined with the EfficientNet architecture. Experiments on two datasets, CIC2020 and the Android Malware Dataset, cover four-category malware classification and malware family classification. Accuracy reaches 97% in both tasks, with an F1-Score of 96%.
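The abstract says that permissions and opcodes are combined into one RGB image, but it does not say how the channels are laid out. The sketch below is only an illustration under assumed choices: the channel assignment (red for opcode bytes, green for differences between consecutive opcodes, blue for permission flags), the tiny `PERMISSION_VOCAB` list, and the function name `build_rgb_image` are all hypothetical and are not taken from the thesis.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]

# Hypothetical permission vocabulary (assumption). A real system would
# use a much longer, fixed list of Android permissions.
PERMISSION_VOCAB = [
    "android.permission.INTERNET",
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.CAMERA",
]


def _fit(values: Sequence[int], length: int) -> List[int]:
    """Truncate or zero-pad values to exactly `length` entries."""
    clipped = list(values)[:length]
    return clipped + [0] * (length - len(clipped))


def build_rgb_image(
    opcodes: Sequence[int], permissions: Sequence[str], side: int = 4
) -> List[List[Pixel]]:
    """Pack opcode and permission features into a side x side RGB grid.

    Assumed layout (not from the thesis):
      red   = opcode byte values
      green = difference between consecutive opcodes, wrapped to one byte
      blue  = 255 where the permission at that vocabulary index is requested
    """
    n = side * side
    red = _fit([op & 0xFF for op in opcodes], n)
    green = _fit([(b - a) & 0xFF for a, b in zip(opcodes, opcodes[1:])], n)
    requested = set(permissions)
    blue = _fit([255 if p in requested else 0 for p in PERMISSION_VOCAB], n)
    pixels = list(zip(red, green, blue))
    # Reshape the flat pixel list into rows of `side` pixels each.
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


if __name__ == "__main__":
    # Example opcode bytes and permissions, as static analysis might yield.
    image = build_rgb_image(
        [0x12, 0x6E, 0x0E],
        ["android.permission.INTERNET", "android.permission.CAMERA"],
        side=2,
    )
    for row in image:
        print(row)
```

A grid like this could then be resized and fed to an image classifier. Padding and truncating to a fixed size is one simple way to give every APK the same input shape, although it discards opcodes beyond the first `side * side`.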

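    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
        1-1 Research Background
        1-2 Research Motivation
        1-3 Research Objectives and Contributions
    Chapter 2 Related Work
        2-1 Background Techniques and Knowledge for Android Malware Detection
            2-1-1 Dynamic and Static Analysis and Opcode Research
            2-1-2 Research Using Android Permissions as Features
        2-2 Research Directly Related to This Thesis
            2-2-1 Multi-Feature Malware Detection
            2-2-2 Malware Image Representation
            2-2-3 Deep Learning Applied to Malware Detection
    Chapter 3 Methodology
        3-1 System Architecture
            3-1-1 Data Preprocessing
            3-1-2 Malware Classification Module
            3-1-3 Evaluation Metrics
        3-2 Workflow
    Chapter 4 Experimental Results
        4-1 Experimental Environment and Datasets
            4-1-1 Software and Hardware Setup
            4-1-2 Datasets
        4-2 Malware Classification Effectiveness Experiments
            4-2-1 Experiment 1
            4-2-2 Experiment 2
        4-3 Ablation Experiments
            4-3-1 Experiment 3
            4-3-2 Experiment 4
        4-4 Comparison with Similar Studies
            4-4-1 Experiment 5
        4-5 Robustness Experiment
            4-5-1 Experiment 6
        4-6 Results and Discussion
    Chapter 5 Conclusion
        5-1 Conclusion and Contributions
        5-2 Research Limitations
        5-3 Future Work
    References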

