| Graduate student: | 林彥廷 Yan-Ting Lin |
|---|---|
| Thesis title: | 深度注意力殘差網路之工業控制系統基於流的異常分類 Flow-based Anomaly Classification with Deep Attention Residual Network in Industrial Control System |
| Advisor: | 江振瑞 Jehn-Ruey Jiang |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Year of publication: | 2022 |
| Academic year of graduation: | 110 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 54 |
| Chinese keywords: | 工業控制系統、異常分類、基於流、多重注意力區塊、殘差區塊、Electra Modbus 資料集 |
| Foreign-language keywords: | industrial control system, anomaly classification, flow-based, multi-attention block, residual block, Electra Modbus dataset |
Industrial control systems (ICSs) combine information technology (IT) and operational technology (OT) to monitor, control, and manage large-scale production systems or critical infrastructure over a network. An attack on an ICS can, at best, degrade performance or disable functions and, at worst, cause environmental pollution, economic losses, casualties, or even threats to national security. It is therefore important to develop intrusion detection systems and intrusion classification systems that detect and classify the anomalies caused by such attacks.
This thesis proposes a flow-based anomaly classification method that combines multi-attention blocks and residual blocks into a deep neural network (DNN) for intrusion classification in ICSs. The method first aggregates packets belonging to the same data flow to obtain more features. It then uses multi-attention blocks to extract features in different dimensions, and residual blocks to derive the residual between input and output, which removes the portions that are identical in the main body and highlights small changes. To make training more robust, Ranger (RAdam + LookAhead) is chosen as the optimizer to reduce the variance of the gradient, and Focal Loss is chosen as the loss function to give each sample a corresponding loss weight, which strengthens the network's ability to handle imbalanced data.
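The per-sample weighting that Focal Loss applies can be illustrated with a small sketch. This follows the general form of the focal loss, -alpha * (1 - p_t)^gamma * log(p_t), and is not the thesis's implementation; the function names, the values `gamma=2.0` and `alpha=1.0`, and the example probabilities are assumptions chosen for illustration.

```python
import math


def cross_entropy(probs, target):
    """Plain cross-entropy for one sample: -log(p_t)."""
    p_t = max(probs[target], 1e-12)  # clamp to avoid log(0)
    return -math.log(p_t)


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample: -alpha * (1 - p_t)**gamma * log(p_t).

    gamma and alpha are illustrative defaults, not values from the thesis.
    """
    p_t = min(max(probs[target], 1e-12), 1.0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical softmax outputs for two samples whose true class is index 0.
easy = [0.95, 0.03, 0.02]  # confidently correct, typical of a majority class
hard = [0.30, 0.50, 0.20]  # misclassified, typical of a rare class

for name, probs in (("easy", easy), ("hard", hard)):
    ce = cross_entropy(probs, 0)
    fl = focal_loss(probs, 0)
    print(f"{name}: cross-entropy={ce:.4f} focal={fl:.4f} weight={fl / ce:.4f}")
```

With `gamma=0` the focal loss reduces to cross-entropy. With a larger `gamma`, well-classified samples contribute far less to the total loss, so the rarer, harder samples dominate the gradient.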
The Electra Modbus dataset is used to evaluate the proposed method. Different combinations of its mechanisms are compared with one another, and the method is also compared with other related methods. The results show that the proposed method achieves the best precision, recall, and F1 score for intrusion classification.
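For reference, the three reported metrics can be computed per class and then averaged, as in the minimal sketch below. The class labels, the toy predictions, and the choice of macro averaging are assumptions for illustration only; the abstract does not say which averaging scheme the thesis uses.

```python
def per_class_scores(y_true, y_pred, labels):
    """Return {label: (precision, recall, f1)} computed one-vs-rest."""
    scores = {}
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[c] = (precision, recall, f1)
    return scores


def macro_average(scores):
    """Unweighted mean of each metric across classes."""
    n = len(scores)
    return tuple(sum(s[i] for s in scores.values()) / n for i in range(3))


# Hypothetical labels and predictions, not taken from the Electra dataset.
labels = ["normal", "attack_a", "attack_b"]
y_true = ["normal"] * 4 + ["attack_a"] * 2 + ["attack_b"] * 2
y_pred = ["normal", "normal", "normal", "attack_a",
          "attack_a", "attack_a", "attack_b", "normal"]

scores = per_class_scores(y_true, y_pred, labels)
macro_p, macro_r, macro_f1 = macro_average(scores)
print(f"macro precision={macro_p:.4f} recall={macro_r:.4f} f1={macro_f1:.4f}")
```

Macro averaging weights every class equally, which makes weak performance on rare attack classes visible even when normal traffic dominates the data.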