| Graduate Student: | Hsiang-Yuan Liu (柳翔元) |
|---|---|
| Thesis Title: | Automatic Incorrect Scene Detection System for Large Scale Location Database (地點資料庫中錯誤場景影像自動偵測系統) |
| Advisor: | Hsu-Yung Cheng (鄭旭詠) |
| Committee Members: | |
| Degree: | Master |
| Department: | Department of Computer Science & Information Engineering, College of Electrical Engineering and Computer Science |
| Year of Publication: | 2019 |
| Graduating Academic Year: | 107 |
| Language: | English |
| Pages: | 38 |
| Chinese Keywords: | Deep Learning, Convolutional Neural Network, Scene Recognition (深度學習、卷積神經網路、場景識別) |
| Foreign Keywords: | Deep Learning, CNN, Scene Recognition |
In recent years, deep learning networks have developed rapidly and have been widely applied in computer vision and pattern recognition. With the progress of artificial intelligence, building intelligent systems can effectively help humans handle simple, highly repetitive tasks. This thesis implements an automatic incorrect scene detection system that can effectively replace the previous manual process of verifying the correctness of a landmark image database, thereby saving labor costs.

Within the automatic incorrect scene detection system, we propose an incorrect scene detection algorithm to solve the problem of detecting incorrect scenes. Based on this system architecture, we propose a Multiple Level Extractor, which improves the feature extraction capability of the ResNet50 architecture by extracting features at different levels of the scene image, and a Multiple Scale Distance measurement, which, given a feature extractor, sums the feature distances over several different scales to further improve the system's performance.

Finally, based on this system architecture, we conducted experiments to evaluate how different feature extractors and distance measurement methods affect system performance.
In recent years, with the rapid development of deep learning, deep neural networks have been widely used in computer vision and pattern recognition. Thanks to the progress of artificial intelligence, building intelligent systems can effectively help humans handle simple and highly repetitive tasks.

We propose an automated incorrect scene detection system, which can effectively replace the process of manually verifying the correctness of a landmark image database and thereby save labor costs.

Within the automatic incorrect scene detection system, we propose an incorrect scene detection algorithm to solve the problem of detecting incorrect scenes. Based on this system architecture, we propose a Multiple Level Extractor (MLE), which improves the feature extraction capability of the ResNet50 architecture by extracting features from different levels of the scene image. In addition, we propose a Multiple Scale Distance (MSD) measurement, which sums the feature distances at various scales under a given feature extractor; MSD further improves the performance of the system. Finally, based on this architecture, we experimented with different feature extractors and distance measurement methods to evaluate how each choice affects system performance.
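As a rough illustration, the ideas above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the thesis's exact method: the `extractors` callables stand in for ResNet50 intermediate stages, the scale set `(1.0, 0.5, 0.25)`, the L2 distance, and the nearest-neighbour `rescale` helper are all illustrative assumptions, and `is_incorrect_scene` is a hypothetical name for the thresholded detection step.

```python
import numpy as np

def rescale(img, s):
    """Crude nearest-neighbour rescale; stands in for proper image resizing."""
    h, w = img.shape[:2]
    nh, nw = max(1, int(h * s)), max(1, int(w * s))
    rows = np.arange(nh) * h // nh
    cols = np.arange(nw) * w // nw
    return img[rows][:, cols]

def extract_multilevel(image, extractors):
    """Multiple Level Extractor (MLE): concatenate features taken from
    several levels of a backbone (here: placeholder callables standing in
    for ResNet50 intermediate stages)."""
    return np.concatenate([np.ravel(f(image)) for f in extractors])

def multiscale_distance(img_a, img_b, extractors, scales=(1.0, 0.5, 0.25)):
    """Multiple Scale Distance (MSD): sum feature distances over several
    image scales under a fixed feature extractor."""
    total = 0.0
    for s in scales:
        fa = extract_multilevel(rescale(img_a, s), extractors)
        fb = extract_multilevel(rescale(img_b, s), extractors)
        total += np.linalg.norm(fa - fb)  # L2 distance at this scale
    return total

def is_incorrect_scene(query, references, extractors, threshold):
    """Flag a database image as an incorrect scene when even its closest
    reference image of the landmark is farther than the threshold."""
    return min(multiscale_distance(query, r, extractors) for r in references) > threshold
```

In this sketch, the MLE corresponds to concatenating features from several backbone levels, and the MSD to accumulating one distance per image scale; the detection algorithm then reduces to a nearest-reference distance test against a tuned threshold.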