| Graduate Student: | 廖建凱 Chien-Kai Liao |
|---|---|
| Thesis Title: | 基於深度洞察與深度學習之信用卡詐欺偵測 Credit Card Fraud Detection Based on DeepInsight and Deep Learning |
| Advisor: | 江振瑞 Jehn-Ruey Jiang |
| Oral Examination Committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Year of Publication: | 2021 |
| Academic Year of Graduation: | 109 |
| Language: | Chinese |
| Number of Pages: | 59 |
| Keywords: | adaptive synthetic sampling, convolutional neural network, credit card fraud detection, deep learning, DeepInsight, Matthews correlation coefficient |
With innovations in e-commerce technology and the spread of mobile devices, shopping has become more diverse and convenient. This has drawn more consumers to online channels and increased the demand for credit cards. However, this convenience also conceals illegal activity: as the volume of credit card transactions grows, fraudulent transactions become more widespread, causing heavy losses for banks and merchants. Card issuers therefore want an effective method for detecting fraudulent credit card transactions.
This thesis proposes a credit card fraud detection method. The method first applies adaptive synthetic (ADASYN) sampling to oversample the data, adding minority-class samples that are harder to learn. It then uses the DeepInsight method to transform the credit card transaction data, which is non-image data, into well-organized images. These images are fed into a convolutional neural network (CNN) deep learning model, which extracts critical features hidden in the raw data to improve the accuracy of credit card fraud detection.
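The first two stages described above can be sketched in a few lines of NumPy. This is a simplified illustration, not the thesis implementation: the function names `adasyn` and `deepinsight_images`, the parameters `k`, `beta`, and `size`, and the toy data are all assumptions. The image mapping uses plain PCA on the transposed feature matrix in place of the t-SNE or kernel PCA step of DeepInsight and skips its convex-hull rotation step. The CNN stage is omitted.

```python
import numpy as np

def adasyn(X, y, minority=1, k=5, beta=1.0, seed=0):
    """ADASYN-style oversampling sketch (toy sizes only; assumes no duplicate rows)."""
    rng = np.random.default_rng(seed)
    X_min = X[y == minority]
    n_min, n_maj = len(X_min), len(X) - len(X_min)
    G = int((n_maj - n_min) * beta)            # total synthetic samples wanted
    # r_i: share of majority points among the k nearest neighbours of x_i
    d_all = np.linalg.norm(X_min[:, None, :] - X[None, :, :], axis=2)
    nn_all = np.argsort(d_all, axis=1)[:, 1:k + 1]   # column 0 is the point itself
    r = (y[nn_all] != minority).mean(axis=1)
    if G <= 0 or r.sum() == 0:
        return X, y
    g = np.rint(r / r.sum() * G).astype(int)   # harder samples get more synthetics
    # Interpolate each minority sample towards random minority-class neighbours
    d_min = np.linalg.norm(X_min[:, None, :] - X_min[None, :, :], axis=2)
    nn_min = np.argsort(d_min, axis=1)[:, 1:k + 1]
    synth = []
    for i, gi in enumerate(g):
        for _ in range(gi):
            z = X_min[rng.choice(nn_min[i])]
            synth.append(X_min[i] + rng.random() * (z - X_min[i]))
    if not synth:
        return X, y
    X_new = np.vstack([X, np.array(synth)])
    y_new = np.concatenate([y, np.full(len(synth), minority)])
    return X_new, y_new

def deepinsight_images(X, size=8):
    """Map each row of X (samples x features) to a size x size image."""
    # Locate features, not samples, in 2-D: PCA on the transposed matrix
    F = X.T - X.T.mean(axis=0)                 # one row per feature
    U, S, _ = np.linalg.svd(F, full_matrices=False)
    coords = U[:, :2] * S[:2]                  # 2-D position of every feature
    # Rescale the positions to integer pixel indices
    lo, hi = coords.min(axis=0), coords.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)
    pix = np.rint((coords - lo) / span * (size - 1)).astype(int)
    # Min-max normalise feature values; average features that share a pixel
    rng_x = np.ptp(X, axis=0)
    Xn = (X - X.min(axis=0)) / np.where(rng_x > 0, rng_x, 1.0)
    imgs = np.zeros((len(X), size, size))
    counts = np.zeros((size, size))
    for f, (row, col) in enumerate(pix):
        imgs[:, row, col] += Xn[:, f]
        counts[row, col] += 1
    return imgs / np.maximum(counts, 1)

# Toy imbalanced data: 200 "legitimate" rows and 10 "fraud" rows, 10 features
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 10)), rng.normal(1, 1, (10, 10))])
y = np.array([0] * 200 + [1] * 10)
X_bal, y_bal = adasyn(X, y)
images = deepinsight_images(X_bal)
print(X_bal.shape, images.shape)
```

Each resulting image has one channel with values in [0, 1], so the array could be passed to a CNN after adding a channel axis. The pairwise-distance matrices here are built densely for brevity, which would not scale to the full transaction dataset.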
This study uses the European credit card transaction data available on the Kaggle competition platform to evaluate the proposed method and compares the results with those of related methods. The comparisons show that the proposed method performs better than the related methods in terms of accuracy, true positive rate, true negative rate, and Matthews correlation coefficient.
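The four evaluation criteria named above can all be computed from the confusion matrix. The sketch below uses only the Python standard library; the function name `fraud_metrics` and the toy labels are assumptions for illustration, with 1 marking a fraudulent transaction and 0 a legitimate one.

```python
import math

def fraud_metrics(y_true, y_pred):
    """Accuracy, TPR, TNR and MCC for binary labels (1 = fraud, 0 = legitimate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / len(pairs),
        "tpr": tp / (tp + fn) if tp + fn else 0.0,   # share of fraud caught
        "tnr": tn / (tn + fp) if tn + fp else 0.0,   # share of legitimate kept
        # MCC is conventionally set to 0 when any marginal count is zero
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
detector = fraud_metrics(y_true, [1, 0, 0, 0, 0, 0, 0, 0, 0, 1])
always_legit = fraud_metrics(y_true, [0] * 10)
print(detector)      # accuracy 0.8, tpr 0.5, tnr 0.875, mcc 0.375
print(always_legit)  # accuracy 0.8, tpr 0.0, tnr 1.0, mcc 0.0
```

Both toy predictors reach the same accuracy, but the one that never flags fraud has a Matthews correlation coefficient of 0. This is why accuracy alone is a weak criterion on heavily imbalanced transaction data.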