
Author: Ying-Chu Wang (王映筑)
Thesis Title: Using OpenPose To Improve GAN-based Virtual Try-on System (運用Openpose改善以GAN為基礎之虛擬試衣視覺效果)
Advisor: Kuo-Chin Fan (范國清)
Committee Members:
Degree: Master
Department: Department of Computer Science & Information Engineering, College of Electrical Engineering and Computer Science
Year of Publication: 2021
Graduation Academic Year: 109
Language: Chinese
Number of Pages: 71
Chinese Keywords: generative adversarial network, spectral normalization GAN, virtual try-on, visual try-on, human skeleton detection
English Keywords: GAN, SNGAN, virtual try-on, visual try-on, OpenPose
    The rapid growth of online shopping has driven physical stores to transform and to actively develop online sales platforms. To increase sales through electronic channels, retailers want to provide consumers with more product information and build anticipation for their products. Clothing is one of the main product categories in online shopping, so clothing brands have launched virtual try-on services that give consumers a reference for a garment without trying it on in person.
    Computer vision is a key component of virtual try-on, and generative adversarial networks (GANs) are widely used in this field. In this thesis, we use three spectral normalization GANs (SNGANs) together with OpenPose human keypoint detection to generate try-on results that look as natural as possible. The three SNGANs respectively generate the target person's arms, the target clothing warped onto the target person, and the final try-on image. Because no 3D model is built before fitting, we refer to this approach as visual try-on. The overall architecture consists of three modules: a semantic generation module, a clothing warping module, and a content fusion module. A spatial transformer network preserves clothing details, bringing the generated results closer to a real fitting, and we compare our results with other visual try-on methods.
    This thesis addresses only upper-body try-on, evaluates clothing and hands separately, and compares GAN training results under different loss functions. In the future, the models described here could also be applied to other tasks, such as hairstyle design, lower-body try-on, and clothing design simulation.

    Keywords: generative adversarial network, spectral normalization GAN, virtual try-on, visual try-on, human skeleton detection
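Spectral normalization, the technique that distinguishes the SNGANs used above from plain GANs, divides each discriminator weight matrix by its largest singular value so that its spectral norm stays near 1, which stabilizes adversarial training. A minimal NumPy sketch of the underlying power-iteration estimate (the function name and constants are illustrative, not taken from the thesis):

```python
import numpy as np

def spectral_normalize(W, n_iters=20):
    """Estimate the largest singular value of W by power iteration
    and return W divided by it, so its spectral norm is ~1."""
    u = np.random.RandomState(0).randn(W.shape[0])
    for _ in range(n_iters):
        v = W.T @ u
        v /= np.linalg.norm(v)          # right singular vector estimate
        u = W @ v
        u /= np.linalg.norm(u)          # left singular vector estimate
    sigma = u @ W @ v                   # spectral norm estimate (>= 0)
    return W / sigma

W = np.array([[3.0, 0.0],
              [0.0, 1.0]])             # largest singular value is 3
W_sn = spectral_normalize(W)           # largest singular value is now ~1
```

In practice a deep-learning framework would re-estimate sigma with one power-iteration step per training update rather than 20, reusing the previous `u` as a warm start.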


    The vigorous development of online shopping has driven the transformation of physical stores, which actively develop online sales platforms to increase their sales through electronic channels. To do so, retailers provide consumers with more product information and stimulate their desire for the products. Clothing is one of the main categories sold online, so various clothing stores have launched virtual try-on systems: consumers no longer need to try on clothes in person to get a reference for how a garment looks on them.
    Computer vision is an important part of a virtual try-on system, and Generative Adversarial Networks (GANs) have been widely used in this field. In this paper, we adopt three Spectral Normalized Generative Adversarial Networks (SNGANs) and OpenPose for human body keypoint detection. The three SNGANs are used respectively to generate the arms of the target person, the warped clothes, and the final try-on result. We call this visual try-on because it depends only on images. The proposed method is composed of three modules: a semantic generation module, a clothing warping module, and a content fusion module. Clothing details are retained through a spatial transformer network, so that the generated results are as close to reality as possible. This paper addresses only upper-body try-on; we evaluate clothes and hands separately, under different loss functions. In the future, the methods presented here can also be applied to various other tasks, such as hair styling, lower-body try-on, and clothing design simulation.
    Keywords: Generative Adversarial Network (GAN), Spectral Normalized Generative Adversarial Network (SNGAN), virtual try-on, visual try-on, OpenPose
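OpenPose, used above to detect body keypoints, writes its detections as JSON in which each person's 2D pose is a flat list of (x, y, confidence) triples, one per keypoint. A small sketch of reading such output (the sample values and the `parse_keypoints` helper are illustrative, not taken from the thesis):

```python
import json

# Hypothetical OpenPose JSON output for one person: pose_keypoints_2d
# is a flat list of x, y, confidence triples, one per body keypoint.
sample = '''{"people": [{"pose_keypoints_2d":
    [120.5, 80.2, 0.95, 118.0, 110.7, 0.90, 90.3, 112.1, 0.88]}]}'''

def parse_keypoints(raw):
    """Group the flat [x, y, c, x, y, c, ...] list into (x, y, c) triples."""
    flat = json.loads(raw)["people"][0]["pose_keypoints_2d"]
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]

points = parse_keypoints(sample)
# points[0] is (120.5, 80.2, 0.95): the first keypoint's position and confidence
```

A try-on pipeline like the one described here would typically rasterize such keypoints into heatmap channels that condition the generators on body pose.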

    Table of Contents

    Abstract (Chinese)
    Abstract (English)
    Acknowledgements
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1  Introduction
      1.1  Background and Motivation
      1.2  Research Objectives
      1.3  Thesis Organization
    Chapter 2  Literature Review
      2.1  Human Skeleton Detection
      2.2  Generative Adversarial Networks
        2.2.1  Generative Adversarial Network
        2.2.2  Conditional Generative Adversarial Network
        2.2.3  SN Generative Adversarial Network
      2.3  Visual Try-on
        2.3.1  VITON
        2.3.2  CP-VTON
        2.3.3  ACGPN
    Chapter 3  Methodology
      3.1  Model Architecture
      3.2  Experimental Pipeline
      3.3  Dataset
    Chapter 4  Experimental Results
      4.1  Development Environment and Tools
      4.2  Image Quality Evaluation Metrics
      4.3  Results and Demonstration
    Chapter 5  Conclusions and Future Work
    References

