
Author: 林宛儀 (Wan Yi Lin)
Thesis Title (Chinese): Multi-Proxy Loss:基於度量學習提出之損失函數用於細粒度圖像檢索
Thesis Title (English): Multi-Proxy Loss: For Deep Metric Learning on Fine-grained Image Retrieval
Advisors: 范國清 (Kuo-Chin Fan), 韓欽銓 (Chin-Chuan Han)
Oral Defense Committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science & Information Engineering
Publication Year: 2022
Graduation Academic Year: 110
Language: Chinese
Pages: 46
Chinese Keywords: 度量學習, 距離學習, 圖像檢索, 細粒度圖像, 卷積神經網路
English Keywords: Deep Metric Learning, Distance Metric Learning, Image Retrieval, Fine-grained, Convolutional Neural Network
  • This thesis proposes a new loss function for the image retrieval task. Building on Proxy-NCA and Proxy-Anchor, the method assigns multiple proxies per class to enrich the positive samples, so that a small batch size can match the performance of a larger one. A SoftMax function weights the intra-class proxies so that the more important proxies receive more of the learning signal. Beyond the loss function, the backbone is also modified: only the first three stages of ResNet50 are used for feature extraction, the downsampling in the third stage is removed, and an attention module replaces the original fourth stage. The attention weights the feature map with a SoftPlus function, making important features more prominent while reducing attention on unimportant ones; this yields better results than conventional attention based on SoftMax. Both the proposed loss function and the modified ResNet50 substantially improve Recall@1 over the original methods.
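The SoftMax weighting of intra-class proxies described above can be sketched as follows. The record does not give the thesis's exact formulation, so the function names, the choice of cosine similarity, and the temperature parameter are all illustrative assumptions:

```python
import math

def class_similarity(embedding, proxies, temperature=1.0):
    """Similarity between one embedding and a class represented by several
    proxies: each proxy's cosine similarity is weighted by a SoftMax over
    that class's proxy similarities, so proxies lying closer to the sample
    (the 'important' ones) receive more weight and more learning signal."""

    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm_a = math.sqrt(sum(x * x for x in a))
        norm_b = math.sqrt(sum(x * x for x in b))
        return dot / (norm_a * norm_b)

    sims = [cosine(embedding, p) for p in proxies]
    exps = [math.exp(s / temperature) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]  # SoftMax over intra-class proxies
    return sum(w * s for w, s in zip(weights, sims))
```

For an embedding `[1, 0]` and two proxies `[1, 0]` and `[0, 1]`, the SoftMax-weighted similarity (about 0.73) exceeds the plain mean of the two similarities (0.5), because the matching proxy dominates the weighting.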


    In this paper, we propose a new loss function for the image retrieval task. The new loss function improves on Proxy-NCA and Proxy-Anchor loss by adopting multiple proxies per class to promote positive-sample variety; it outperforms Proxy-Anchor loss even at small batch sizes. In addition, we weight the intra-class proxies with a SoftMax function so that important samples receive a larger gradient during training. We also modify ResNet50 by using only its first three stages and adding a new attention module that uses the SoftPlus function in place of SoftMax. With these changes, our method obtains strong Recall@1 results.
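The SoftPlus-based attention mentioned above can be illustrated with a minimal element-wise sketch. The record does not describe the module's exact form, so the function name and the element-wise formulation here are assumptions, not the thesis implementation:

```python
import math

def softplus_attention(feature_map):
    """Reweight feature activations with SoftPlus instead of SoftMax.
    SoftPlus weights are positive but do not compete for a fixed budget
    (they need not sum to one), so a strong feature is amplified without
    forcing every other feature toward zero -- the behaviour the abstract
    attributes to the modified attention module."""
    # softplus(v) = ln(1 + e^v); log1p keeps the computation stable near 0
    weights = [math.log1p(math.exp(v)) for v in feature_map]
    return [v * w for v, w in zip(feature_map, weights)]
```

Activations above roughly 1.27 (where softplus exceeds 1) are amplified, while weak activations are attenuated, which sharpens the contrast between important and unimportant features.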

    Abstract (Chinese) I
    Abstract (English) II
    Table of Contents III
    List of Figures V
    List of Tables VI
    Chapter 1: Introduction 1
    Chapter 2: Related Work 3
      2-1 Deep Metric Learning 3
        2-1-1 Triplet 3
        2-1-2 Proxy-NCA 5
        2-1-3 SoftTriple 7
        2-1-4 Multi-Similarity 7
        2-1-5 Proxy-Anchor 9
      2-2 Convolutional Neural Networks 11
        2-2-1 BN-Inception 11
        2-2-2 ResNet50 12
        2-2-3 Ensemble Learning 13
    Chapter 3: Method 14
      3-1 Loss Function Design 15
      3-2 Network Architecture 19
    Chapter 4: Experimental Results 22
      4-1 Experimental Setup 22
      4-2 Datasets 22
        4-2-1 Stanford Cars Dataset (CARS196) 22
        4-2-2 Caltech-UCSD Birds-200-2011 (CUB-200-2011) 23
      4-3 Evaluation Metrics 24
      4-4 Results 25
    Chapter 5: Conclusion and Future Work 32
    References 33

