
Author: Ting-Shao Chang (張庭韶)
Thesis Title: Improving Object Detection Accuracy in Satellite Imagery through Super-Resolution Methods (運用超解析技術提升衛星影像物件偵測準確率)
Advisor: Fuan Tsai (蔡富安)
Oral Defense Committee:
Degree: Master
Department: Department of Civil Engineering, College of Engineering (工學院 - 土木工程學系)
Year of Publication: 2024
Academic Year of Graduation: 113
Language: English
Number of Pages: 99
Keywords: Super-Resolution, Object Detection, Deep Learning


    Satellite image interpretation is crucial in a wide range of applications. To identify and analyze targets or events accurately, analysts rely not only on their expertise and understanding of the background, but also on quickly obtaining high-resolution satellite images that clearly present target details. However, owing to budget constraints, large off-nadir angles, or adverse weather conditions during acquisition, the spatial resolution of an image may fall short of expectations, leaving object details unclear. Analysts nevertheless require high-resolution images to monitor specific targets, such as illegal smuggling vessels, and to assess the situation accurately. To address this issue, this study enhances image spatial resolution with two super-resolution (SR) models, Real-ESRGAN and the Efficient Super-Resolution Transformer (ESRT), making the images visually sharper, and evaluates the strengths and weaknesses of the two methods. Experimental results show that images processed with Real-ESRGAN become visually sharper, especially in high-resolution and very high-resolution imagery; however, some details cannot be fully reconstructed, and certain features may be distorted or their textures poorly preserved. In contrast, the super-resolution images produced by the ESRT model are less sharp in terms of edge clarity but reconstruct image texture details more faithfully.
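    This record does not include the thesis's evaluation code. As an illustration of how SR outputs such as these are commonly scored against a ground-truth image, the following is a minimal peak signal-to-noise ratio (PSNR) computation in NumPy; PSNR and SSIM are the standard fidelity metrics for SR (see the PSNR-vs-SSIM entry in the reference list). The `psnr` helper and the toy 8x8 images are illustrative assumptions, not the thesis's actual test data.

```python
import numpy as np

def psnr(reference: np.ndarray, estimate: np.ndarray, max_val: float = 255.0) -> float:
    """Peak signal-to-noise ratio between a ground-truth image and an SR estimate."""
    mse = np.mean((reference.astype(np.float64) - estimate.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Toy 8x8 grayscale "images": an estimate with one pixel off by 10 grey levels.
ref = np.full((8, 8), 128, dtype=np.uint8)
est = ref.copy()
est[0, 0] = 138
print(round(psnr(ref, est), 2))  # → 46.19; higher means a more faithful reconstruction
```

    In practice the reference would be the original high-resolution image and the estimate the SR model's output on its downsampled counterpart.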
    In addition, with the rapid development of satellite constellations, the number of satellite images has increased significantly. Analysts must examine every detail of an image during interpretation, making the process time-consuming and labor-intensive. Another objective of this study is therefore to apply object detection technology to quickly identify critical targets within the images, helping analysts obtain important information more efficiently and improving overall workflow. However, because high-resolution images are expensive, users with limited budgets may only be able to afford lower-resolution images, whose lack of fine detail can reduce object detection accuracy and make rapid interpretation difficult. To address this issue, this study first enhances the spatial features of the original images using super-resolution methods and then feeds the enhanced images into object detection models. The experimental results indicate that, compared with the original-resolution images, object detection accuracy improves significantly when super-resolution images are used. The enhanced spatial features also help analysts interpret targets more accurately, enabling more comprehensive analysis.
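    The two-stage workflow described above (super-resolve first, then detect) can be sketched as follows. This is only a minimal mock-up under stated assumptions: `upscale_x4` stands in for a real SR model such as Real-ESRGAN or ESRT using plain nearest-neighbour 4x upsampling, and `detect` stands in for a trained detector such as YOLOv8 using a simple brightness threshold. Neither is the thesis's implementation.

```python
import numpy as np

def upscale_x4(img: np.ndarray) -> np.ndarray:
    """Stand-in for an SR model (Real-ESRGAN / ESRT in the thesis):
    nearest-neighbour 4x upsampling via a Kronecker product."""
    return np.kron(img, np.ones((4, 4), dtype=img.dtype))

def detect(img: np.ndarray, thresh: int = 200):
    """Toy stand-in for an object detector (YOLOv8 in the thesis): returns the
    bounding box (x_min, y_min, x_max, y_max) of above-threshold pixels, or None."""
    ys, xs = np.nonzero(img > thresh)
    if ys.size == 0:
        return None
    return (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))

# A low-resolution 8x8 frame with a single bright "vessel" pixel.
lr = np.zeros((8, 8), dtype=np.uint8)
lr[3, 4] = 255

sr = upscale_x4(lr)   # 32x32 after the 4x "super-resolution" step
box = detect(sr)      # the target now spans a 4x4 patch: (16, 12, 19, 15)
print(sr.shape, box)
```

    The accuracy comparison reported in the abstract then amounts to running the same detector on both the original-resolution and the super-resolved imagery and comparing detection metrics.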

    Abstract (Chinese)
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1. Introduction
    1-1 Research Origin and Background
    1-2 Objectives
    1-3 Thesis Structure
    Chapter 2. Literature Review
    2-1 Super-Resolution
    2-1-1 Traditional methods
    2-1-2 Deep Learning methods
    2-2 Object Detection
    2-3 SR for Object Detection
    Chapter 3. Image Dataset
    3-1 Training Data for Super-Resolution Model
    3-2 Testing Data for Super-Resolution Model
    3-2-1 Very High-Resolution Satellite Image
    3-2-2 High-Resolution Satellite Image
    3-3 Dataset for Object Detection Model
    Chapter 4. Methodology
    4-1 Super-Resolution using the Real-ESRGAN model
    4-1-1 Image Pre-processing
    4-1-2 Model architecture
    4-1-3 Training process and parameter settings
    4-2 Super-Resolution using the ESRT model
    4-2-1 Image Pre-processing
    4-2-2 Model architecture and operational mechanism
    4-3 Object detection methods
    Chapter 5. Results
    5-1 Super-Resolution Results
    5-2 Object Detection Results
    Chapter 6. Discussions
    6-1 Comparison of the Real-ESRGAN and ESRT Models
    6-2 The Impact of Object Edge Sharpness on Object Detection Accuracy
    6-3 Super-resolution for object detection
    Chapter 7. Conclusion and Future Work
    References

    Aiello, M., Vezzoli, R., & Gianinetto, M., 2019. Object-based image analysis approach for vessel detection on optical and radar images. Journal of Applied Remote Sensing, Vol. 13, Issue 1, 014502.
    Horé, A., & Ziou, D., 2010. Image quality metrics: PSNR vs. SSIM. In 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, pp. 2366-2369.
    Boain, R. J., 2004. A-B-Cs of Sun-Synchronous Orbit Mission Design. AAS/AIAA Space Flight Mechanics Conference Proceedings, AAS04-108, Hawaii, USA, 8-12 February, 2004.
    Dong, C., Loy, C. C., He, K., & Tang, X., 2015. Image super-resolution using deep convolutional networks. IEEE transactions on pattern analysis and machine intelligence, 38(2), 295-307.
    Feng, F., Hu, Y., Li, W., & Yang, F., 2024. Improved YOLOv8 algorithms for small object detection in aerial imagery. Journal of King Saud University-Computer and Information Sciences, 36(6), 102113.
    Girshick, R., 2015. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1440-1448).
    Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y., 2014. Generative Adversarial Nets. In Advances in Neural Information Processing Systems 27 (NIPS 2014), pp. 2672-2680.
    Jia, X., Chen, S., Xu, C., Liu, J., Tian, Q., Han, Z., Dai, Z., & Ren, X., 2020. Unsupervised image super-resolution with an indirect supervised path. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (pp. 468-469).
    Jocher, G. “YOLOv8 by Ultralytics”, https://github.com/ultralytics/ultralytics, 2023. Accessed: June 8, 2023.
    Khaledyan, D., Amirany, A., Jafari, K., Moaiyeri, M. H., Khuzani, A. Z., & Mashhadi, N., 2020. Low-cost implementation of bilinear and bicubic image interpolation for real-time image super-resolution. In 2020 IEEE Global Humanitarian Technology Conference (GHTC) (pp. 1-5). IEEE.
    Kim, J., Lee, J. K., & Lee, K. M., 2016. Accurate Image Super-Resolution Using Very Deep Convolutional Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 1646-1654.
    Lam, D., Kuzma, R., McGee, K., Dooley, S., Laielli, M., Klaric, M., Bulatov, Y., & McCord, B., 2018. xView: Objects in context in overhead imagery. arXiv preprint arXiv:1802.07856.
    Ledig, C., Theis, L., Huszár, F., Caballero, J., Cunningham, A., Acosta, A., Aitken, A., Tejani, A., Totz, J., Wang, Z., & Shi, W., 2017. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21-26 July 2017, pp. 105-114.
    Li, Y., Zhang, K., Liang, J., Cao, J., Liu, C., Gong, R., Zhang, Y., Tang, H., Liu, Y., Demandolx, D., Ranjan, R., Timofte, R., & Van Gool, L., 2023. LSDIR: A large scale dataset for image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 1775-1787).
    Liang, J., Cao, J., Sun, G., Zhang, K., Van Gool, L., & Timofte, R., 2021. Swinir: Image restoration using swin transformer. In Proceedings of the IEEE/CVF international conference on computer vision (pp. 1833-1844).
    Lin, T. Y., Dollár, P., Girshick, R., He, K., Hariharan, B., & Belongie, S., 2017. Feature pyramid networks for object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 2117-2125).
    Liu, S., Qi, L., Qin, H., Shi, J., & Jia, J., 2018. Path aggregation network for instance segmentation. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 8759-8768).
    Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C., 2016. SSD: Single shot multibox detector. In Computer Vision - ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part I (pp. 21-37). Springer International Publishing.
    Lim, B., Son, S., Kim, H., Nah, S., & Lee, K. M., 2017. Enhanced Deep Residual Networks for Single Image Super-Resolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 136-144.
    Liu, Z., Lin, Y., Cao, Y., Hu, H., Wei, Y., Zhang, Z., Lin, S., & Guo, B., 2021. Swin Transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 10012-10022).
    Lu, Z., Li, J., Liu, H., Huang, C., Zhang, L., & Zeng, T., 2022. Transformer for single image super-resolution. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition (pp. 457-466).
    Redmon, J., Divvala, S., Girshick, R., & Farhadi, A., 2016. You only look once: Unified, real-time object detection. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 779-788).
    Ren, S., He, K., Girshick, R., & Sun, J., 2016. Faster R-CNN: Towards real-time object detection with region proposal networks. IEEE transactions on pattern analysis and machine intelligence, 39(6), 1137-1149.
    Rukundo, O., & Cao, H., 2012. Nearest neighbor value interpolation. arXiv preprint arXiv:1211.1768.
    Shang, T., Dai, Q., Zhu, S., Yang, T., & Guo, Y., 2020. Perceptual extreme super-resolution network with receptive field block. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition workshops (pp. 440-441).
    Shermeyer, J., & Van Etten, A., 2019. The effects of super-resolution on object detection performance in satellite imagery. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (pp. 0-0).
    Su, X., Zhang, J., Ma, Z., Dong, Y., Zi, J., Xu, N., Zhang, H., Xu, F., & Chen, F., 2024. Identification of Rare Wildlife in the Field Environment Based on the Improved YOLOv5 Model. Remote Sensing, 16(9), 1535.
    Tang, C., Feng, Y., Yang, X., Zheng, C., & Zhou, Y., 2017. The object detection based on deep learning. In 2017 4th international conference on information science and control engineering (ICISCE) (pp. 723-728). IEEE.
    Terven, J., Córdova-Esparza, D. M., & Romero-González, J. A., 2023. A comprehensive review of YOLO architectures in computer vision: From YOLOv1 to YOLOv8 and YOLO-NAS. Machine Learning and Knowledge Extraction, 5(4), 1680-1716.
    Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., & Polosukhin, I., 2017. Attention is all you need. In Proceedings of the 31st International Conference on Neural Information Processing Systems, pp. 6000-6010.
    Wang, X., & Song, J., 2021. ICIoU: Improved loss based on complete intersection over union for bounding box regression. IEEE Access, 9, 105686-105695.
    Wang, X., Yu, K., Wu, S., Gu, J., Liu, Y., Dong, C., Qiao, Y., Loy, C. C., & Tang, X., 2018. ESRGAN: Enhanced super-resolution generative adversarial networks. In European Conference on Computer Vision Workshops, pp. 63-79.
    Wang, X., Xie, L., Dong, C., & Shan, Y., 2021. Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops, pp. 1905-1914.
    Wang, X., Yu, K., Dong, C., & Loy, C. C., 2018. Recovering realistic texture in image super-resolution by deep spatial feature transform. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 606-615).
    Zhang, J., Lei, J., Xie, W., Fang, Z., Li, Y., & Du, Q., 2023. SuperYOLO: Super resolution assisted object detection in multimodal remote sensing imagery. IEEE Transactions on Geoscience and Remote Sensing, 61, 1-15.
    Zhang, Y., Guo, Z., Wu, J., Tian, Y., Tang, H., & Guo, X., 2022. Real-time vehicle detection based on improved YOLO v5. Sustainability, 14(19), 12274.
    陳芷家, 2023. 基於間隔密集連接策略的Swin Transformer影像超解析度處理及其應用 (Swin Transformer image super-resolution based on an interval dense connection strategy and its applications). Master's thesis, Department of Electrical Engineering, Tamkang University, New Taipei City. Retrieved from https://hdl.handle.net/11296/8f2u92
