
Graduate Student: 李思瑤 (Pakkapat Banditsingha)
Thesis Title: ADVERSARIAL IMAGES DETECTION BASED ON AUTOENCODER WITH DOUBLE CHECK METHODS (基於自動編碼器和雙重檢查方法的對抗性圖像檢測)
Advisor: 施國琛 (Timothy K. Shih)
Degree: Master
Department: Department of Computer Science & Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2023
Graduation Academic Year: 111
Language: English
Number of Pages: 42
Keywords (Chinese): 對抗性檢測, 自動編碼器, 深度學習, Kullback-Leibler 散度
Keywords (English): Adversarial Detection, Autoencoder, Deep Learning, Kullback-Leibler Divergence
Adversarial examples are designed to attack a target model by subtly manipulating the input data before it is fed into the classification model. This manipulation introduces small, carefully crafted perturbations that the human eye cannot notice but the classification model can. To defend against adversarial attacks, many adversarial detection methods have been proposed. One popular technique uses an autoencoder trained on non-attacked input data to force the model to remove the perturbation values from adversarial examples. Results show that this method can reconstruct the input well, but sometimes the autoencoder also reconstructs the adversarial values. In previous work [9], the authors proposed an adversarial detector that trains a vanilla autoencoder on normal images and then trains an outlier-detection network on a combined vector consisting of the Mean Square Error between the input and the reconstructed output and the classification model's prediction probabilities; if the anomaly value is large, the input is an adversarial example. However, that work cannot guarantee that the autoencoder removes the perturbation values. To address this problem, I propose an adversarial detection method called adversarial-detection-based autoencoder loss racing. It can correctly detect adversarial examples by constructing an autoencoder loss function in which two models race to predict the input data; the best model is then selected to compute the loss between the input data and the true data, which is combined with the model-based reconstruction loss. The results show that with the proposed loss, the reconstructed outputs of adversarial samples are correctly classified as the original images. For the classification models used in the proposed loss function, I use the base model and the VGG16 model to make the predictions accurate.


    Adversarial examples are one kind of attack against deep neural networks (DNNs); they can perturb a deep neural model, especially an image classification model. An adversarial example is designed to attack the target model by subtly manipulating the input data before feeding it into the classification model. This manipulation introduces small, carefully crafted perturbations to the input data that cannot be noticed by human eyes but can be noticed by the classification model. To prevent adversarial attacks, many adversarial detection methods have been proposed. One popular technique uses an autoencoder trained on non-attacked input data to force the model to remove the perturbation values from adversarial examples. Results show that this method can reconstruct the input well, but in some cases the autoencoder model reconstructs the adversarial values as well. In previous work [9], the authors proposed an adversarial detector that trains a vanilla autoencoder on normal images and then trains an outlier-detection network on a combined vector consisting of the Mean Square Error between the input and the reconstructed output and the classification model's prediction probabilities. If the anomaly value is large, the input is an adversarial example. However, that work cannot guarantee that the autoencoder removes the perturbation values. This issue leads me to propose an adversarial detection method called adversarial-detection-based autoencoder loss racing, which correctly detects adversarial examples through an autoencoder loss function in which two models race to predict the input data; the best model is then selected to compute the loss between the input data and the true data, which is combined with the model-based reconstruction loss.
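The racing loss just described can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the thesis's implementation: it assumes cross-entropy as the classification loss and a weighting factor `alpha` for combining the two terms, and the random `model_a`/`model_b` callables are toy stand-ins for the base model and VGG16.

```python
import numpy as np

def cross_entropy(probs, label):
    """Classification loss on a probability vector (assumed form)."""
    return -np.log(probs[label] + 1e-12)

def racing_loss(x, x_hat, y_true, model_a, model_b, alpha=1.0):
    """Two classifiers race to predict the reconstruction x_hat; the one
    with the lower loss wins, and its loss is combined with the
    pixel-wise reconstruction loss between x and x_hat."""
    loss_a = cross_entropy(model_a(x_hat), y_true)
    loss_b = cross_entropy(model_b(x_hat), y_true)
    cls_loss = min(loss_a, loss_b)               # best-model ("racing") term
    rec_loss = float(np.mean((x - x_hat) ** 2))  # reconstruction term
    return rec_loss + alpha * cls_loss

# Toy demonstration with random stand-in models (3 classes).
rng = np.random.default_rng(0)
softmax = lambda z: np.exp(z - z.max()) / np.exp(z - z.max()).sum()
model_a = lambda img: softmax(rng.normal(size=3))
model_b = lambda img: softmax(rng.normal(size=3))

x = rng.random((8, 8))
x_hat = np.clip(x + rng.normal(0, 0.05, x.shape), 0.0, 1.0)
loss = racing_loss(x, x_hat, 0, model_a, model_b)
```

In training, minimizing this combined loss pushes the autoencoder toward reconstructions that at least one of the two classifiers maps to the true label, rather than merely matching pixels.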
The results show that using the proposed loss makes the reconstructed output of an adversarial sample correctly classified as the original image. For the classification models used in the proposed loss function, I utilize the base model and the VGG16 model to make the prediction results accurate.
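By contrast, the baseline detector of [9] described earlier builds the input to its outlier-detection network from the reconstruction error and the classifier's probabilities. A minimal sketch of that combined feature vector follows; the random `reconstruct`/`classify` callables are toy stand-ins for the trained autoencoder and classifier, which are not reproduced here.

```python
import numpy as np

def detection_features(x, reconstruct, classify):
    """Combined vector fed to the outlier-detection network:
    [MSE(input, reconstruction)] concatenated with prediction probabilities."""
    x_hat = reconstruct(x)
    mse = np.mean((x - x_hat) ** 2)
    probs = classify(x)
    return np.concatenate(([mse], probs))

# Toy stand-ins for the trained autoencoder and classifier (10 classes).
rng = np.random.default_rng(0)
reconstruct = lambda x: np.clip(x + rng.normal(0, 0.01, x.shape), 0.0, 1.0)
def classify(x):
    logits = rng.normal(size=10)
    e = np.exp(logits - logits.max())
    return e / e.sum()

x = rng.random((32, 32, 3))
v = detection_features(x, reconstruct, classify)
# A large anomaly score from the outlier network flags x as adversarial.
```

The weakness the thesis targets is visible here: if the autoencoder reconstructs the perturbation along with the image, the MSE term stays small and the detector can miss the attack.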

    摘要 i
    ABSTRACT ii
    ACKNOWLEDGEMENT iii
    Table of Contents iv
    List of figures vi
    List of tables vii
    INTRODUCTION 1
      1.1 Background 1
      1.2 Problem statement 2
      1.3 Significance of the Study 3
    LITERATURE REVIEW 4
      2.1 Adversarial Concept 4
      2.2 Preliminary 5
        Fast Gradient Sign Method 5
        Project Gradient Descent 6
        Deepfool 6
        Carlini & Wagner 6
        Autoencoder 7
        MobileNetV2 model 8
        VGG16 model 9
        Kullback Leibler Divergence 9
      2.3 Related Work 9
      2.4 Images Comparing Method 10
        Structural Similarity Index Measure 11
        Peak Signal-to-Noise Ratio 11
    RESEARCH METHOD 12
      3.1 Dataset 14
        Animals-10 15
        Cat and Dog 15
        Animals Detection Images Dataset (ADID) 15
        ImageNet 16
      3.2 Data Preprocessing 16
        Preprocessing for the base model and loss model 16
        Preprocessing for the autoencoder model 17
      3.3 Adversarial Method 17
      3.4 Experimental Design 18
        The input data 18
        Autoencoder loss adjustment 18
        Quality of image after reconstruction 18
      3.5 Evaluation Matrix 19
    RESULT AND DISCUSSION 20
      4.1 Experimental Result 20
        Experiment on reconstructed mix adversarial example 20
        Quality of image after reconstruction 23
        Loss Adjustment 25
      4.2 Experimental Evaluation 26
        Confusion Matrix 26
        Bhattacharyya distance 27
        Result Comparison with related work 29
      4.3 Discussion 30
    CONCLUSION AND SUGGESTION 31
      5.1 Conclusion 31
      5.2 Suggestions and Future Work 31
    Reference 32

    [1] Goodfellow, Ian J., Jonathon Shlens, and Christian Szegedy. "Explaining and Harnessing Adversarial Examples." International Conference on Learning Representations (ICLR), 2015.
    [2] Madry, Aleksander, et al. "Towards Deep Learning Models Resistant to Adversarial Attacks." International Conference on Learning Representations (ICLR), 2018.
    [3] Moosavi-Dezfooli, Seyed-Mohsen, Alhussein Fawzi, and Pascal Frossard. "DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE, 2016, pp. 2574-2582.
    [4] Carlini, Nicholas, and David Wagner. "Towards Evaluating the Robustness of Neural Networks." Proceedings of the IEEE Symposium on Security and Privacy (S&P), IEEE, 2017, pp. 39-57.
    [5] Papernot, Nicolas, Patrick McDaniel, and Ananthram Swami. "Transferability in Machine Learning: From Phenomena to Black-Box Attacks Using Adversarial Samples." ACM Conference on Computer and Communications Security (CCS), 2017.
    [6] Liu, Hui, Bo Zhao, Yuefeng Peng, Weidong Li, and Peng Liu. "Towards Understanding and Harnessing the Effect of Image Transformation in Adversarial Detection." arXiv preprint arXiv:2201.01080, 2022.
    [7] Vacanti, G., and A. Van Looveren. "Adversarial Detection and Correction by Matching Prediction Distributions." arXiv preprint arXiv:2002.09364, 2020.
    [8] theblackmamba31. "Autoencoder - Grayscale to Color Image." Kaggle, https://www.kaggle.com/code/theblackmamba31/autoencoder-grayscale-to-color-image.
    [9] Liu, Hui, Bo Zhao, Kehuan Zhang, and Peng Liu. "Nowhere to Hide: A Lightweight Unsupervised Detector against Adversarial Examples." arXiv preprint arXiv:2210.08579, 2022.
