| Author: | 張欣茹 Xin-Ru Zhang |
|---|---|
| Thesis title: | 結合時空資料的半監督模型並應用於PM2.5空污感測器的異常偵測 Semi-Supervised Model with Spatio-Temporal Data and Applied in PM2.5 sensor anomaly detection |
| Advisor: | 陳弘軒 Hung-Hsuan Chen |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Year of publication: | 2021 |
| Academic year of graduation: | 109 (ROC calendar) |
| Language: | Chinese |
| Number of pages: | 53 |
| Keywords (Chinese): | PM2.5、異常偵測、半監督模型、時空資料結合 |
| Keywords (English): | PM2.5, anomaly detection, semi-supervised model, spatio-temporal data integration |
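In recent years, PM2.5 air pollution has drawn growing attention in Taiwan,
and many relatively inexpensive sensors have been installed. However, these
sensors are easily affected by environmental factors, which causes large
measurement errors. In addition, the sheer number of sensors means that each
one is maintained infrequently, so the values reported by the sensors in a
single area are less reliable than those from national monitoring stations.
This thesis compares how well supervised, unsupervised, and semi-supervised
algorithms detect anomalous sensors. To incorporate the sensors'
spatio-temporal information, we prepared the training data by converting the
monitored values into image data, integrated data, and integrated data
combined with time-series data. We obtained each sensor's status (normal or
abnormal) from inspection records provided by the Industrial Technology
Research Institute, and examined how the proportion of labeled data affects
the predictive performance of the semi-supervised models. Experimental
results show that the methods we studied outperform the current random
inspection mechanism.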
The PM2.5 issue has drawn much attention in Taiwan, and many
inexpensive sensors have been deployed in recent years. However, these
sensors are fragile and susceptible to environmental factors. In addition,
the large number of sensors results in low maintenance frequency, so the
monitored values returned by a single sensor are unreliable.
This thesis compares supervised, unsupervised, and semi-supervised
methods for identifying problematic sensors. We prepared the training
data by converting monitored values into images, integrated data, and
sequential data to incorporate the spatio-temporal information of the
sensors. We obtained each sensor's status (normal or abnormal) from the
inspection records provided by the Industrial Technology Research
Institute. We explored how the ratio of labeled data to unlabeled data
influences the performance of the semi-supervised models. Experimental
results show that the methods we studied outperform the current
inspection strategy (random inspection).
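
The two experimental ideas mentioned in the abstract, varying the share of labeled data and comparing against random inspection, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the helper names (`mask_labels`, `precision_at_k`, `random_inspection_precision`), the toy labels, and the anomaly scores are hypothetical, and precision among the top-k inspected sensors is only one plausible way to compare against random inspection, not necessarily the metric used in the thesis.

```python
import random
from typing import List, Optional, Sequence


def mask_labels(labels: Sequence[int], labeled_ratio: float,
                seed: int = 0) -> List[Optional[int]]:
    """Keep about `labeled_ratio` of the labels; mark the rest unlabeled (None)."""
    if not 0.0 <= labeled_ratio <= 1.0:
        raise ValueError("labeled_ratio must be in [0, 1]")
    rng = random.Random(seed)
    n_keep = round(len(labels) * labeled_ratio)
    keep = set(rng.sample(range(len(labels)), n_keep))
    return [y if i in keep else None for i, y in enumerate(labels)]


def precision_at_k(scores: Sequence[float], labels: Sequence[int], k: int) -> float:
    """Fraction of truly abnormal sensors among the k highest-scoring ones."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sum(labels[i] for i in ranked) / k


def random_inspection_precision(labels: Sequence[int]) -> float:
    """Expected hit rate when sensors are picked uniformly at random."""
    return sum(labels) / len(labels)


# Toy data (hypothetical): 1 = abnormal sensor, 0 = normal sensor.
labels = [0, 0, 1, 0, 1, 0, 0, 0, 0, 1]
# Hypothetical anomaly scores, as some detector might produce them.
scores = [0.10, 0.20, 0.90, 0.30, 0.80, 0.10, 0.40, 0.20, 0.10, 0.35]

partially_labeled = mask_labels(labels, labeled_ratio=0.3)
top3_precision = precision_at_k(scores, labels, k=3)
baseline = random_inspection_precision(labels)
```

With this toy data, three of the ten labels stay visible to a semi-supervised learner, and inspecting the three highest-scoring sensors finds two abnormal ones, against an expected hit rate of 0.3 for random inspection.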