| Graduate student: | 吳昭慶 CHAO CHING WU |
|---|---|
| Thesis title: | Combining Dual Feature and Taint Analysis Anti-obfuscation Android Malware research (結合雙特徵與污點分析對抗混淆Android惡意程式之研究) |
| Advisor: | 陳奕明 YI MING CHEN |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Management - Department of Information Management |
| Year of publication: | 2023 |
| Academic year of graduation: | 111 |
| Language: | Chinese |
| Number of pages: | 61 |
| Keywords: | Machine learning, Android malware detection, static analysis, taint analysis |
In recent years, as smartphones have become widespread, Android malware has become an increasingly serious problem, leaking many users' private information and in turn causing real financial loss. To address this, researchers have adopted a variety of techniques to identify and classify malware, including static analysis, dynamic analysis, and machine learning. However, many obfuscated malware samples have appeared on the market; they can often bypass existing detection methods and lower detection rates. Many researchers have therefore turned to dynamic analysis to handle obfuscated malware. Dynamic analysis, however, requires actually executing the application to capture dynamic features, and preprocessing becomes very time-consuming when the dataset is large. Static analysis, by contrast, does not execute the application and needs far less preprocessing time, but commonly used features such as API_CALL are easily affected by obfuscation, which reduces model accuracy. To overcome this problem, this study proposes a special preprocessing method that applies a vector transformation to static features so that the effect of obfuscation on them is minimized. The study also incorporates taint analysis to improve the accuracy and efficiency of Android malware detection.
The method achieves 99% accuracy on the unobfuscated dataset and 97.8% accuracy on the obfuscated dataset, and reduces preprocessing time by a factor of nearly 20 compared with dynamic analysis.
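For readers unfamiliar with taint analysis, the idea can be sketched in a few lines: mark data returned by sensitive "source" APIs as tainted, propagate that mark through assignments, and report when tainted data reaches a "sink" API. The following is a toy illustration only, not the thesis's implementation; the statement format, the `SOURCES` and `SINKS` sets, and the `find_leaks` helper are all hypothetical, and real static taint analyzers additionally handle control flow, fields, aliasing, and the Android lifecycle.

```python
# Toy static taint propagation over straight-line statements.
# SOURCES, SINKS and the statement format are assumptions made for illustration.
SOURCES = {"getDeviceId", "getLastKnownLocation"}
SINKS = {"sendTextMessage", "openConnection"}


def find_leaks(statements):
    """Return (source, sink) pairs where data from a source reaches a sink.

    Each statement is a tuple (target, callee, args), read as
    target = callee(*args). target is None when the result is discarded;
    callee is None for a plain copy such as b = a.
    """
    taint = {}   # variable name -> set of source APIs whose data it may carry
    leaks = []
    for target, callee, args in statements:
        # Taint flowing in from the arguments.
        incoming = set()
        for arg in args:
            incoming |= taint.get(arg, set())
        # A source call introduces fresh taint.
        if callee in SOURCES:
            incoming.add(callee)
        # A sink call that receives tainted data is reported as a leak.
        if callee in SINKS:
            leaks.extend((src, callee) for src in sorted(incoming))
        # Assignment overwrites whatever the target carried before.
        if target is not None:
            taint[target] = incoming
    return leaks


program = [
    ("a", "getDeviceId", []),                    # a = getDeviceId()
    ("b", None, ["a"]),                          # b = a
    ("c", "concat", ["b", "prefix"]),            # c = concat(b, prefix)
    (None, "sendTextMessage", ["number", "c"]),  # device id reaches a sink
    (None, "openConnection", ["url"]),           # untainted argument, no leak
]
leaks = find_leaks(program)
print(leaks)
```

In this sketch the only reported flow is from `getDeviceId` to `sendTextMessage`. Such source-to-sink pairs depend on framework APIs, not on developer-chosen identifiers, which is one plausible reason they might be less sensitive to renaming-style obfuscation; whether the thesis uses them in this way is not stated in the abstract.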