| Field | Value |
|---|---|
| Student | 鄭文喻 (Wen-Yu Cheng) |
| Thesis Title | 深度學習結合擴增實境之繪圖場景建構系統 (A Drawing Scene Construction System based on the Integration of Augmented Reality and Deep Learning) |
| Advisor | 蘇木春 (Mu-Chun Su) |
| Committee Members | |
| Degree | Master |
| Department | College of Electrical Engineering and Computer Science, Department of Computer Science & Information Engineering |
| Year of Publication | 2019 |
| Academic Year of Graduation | 107 (AY 2018/19) |
| Language | Chinese |
| Number of Pages | 105 |
| Keywords | augmented reality, Unity3D, object detection, generative adversarial network, drawing |
For children, drawing is an enjoyable means of self-expression. It is not only fun but also improves hand-eye coordination and fosters visual thinking, creativity, and self-confidence. This thesis proposes a drawing scene construction system that combines drawing with augmented reality (AR) technology and uses low-cost, commonly available drawing tools to give children a multi-layered experience in both drawing and visualization.

The proposed system uses a mobile device as its human-machine interface and offers simple operations. By combining the drawing scene with AR technology, it turns a 2-D painting into a concrete 3-D scene, which helps children learn abstract concepts and increases their enjoyment of drawing. The system consists of four modules: (1) a drawing object recognition module, (2) a rotation angle analysis module, (3) a 3-D model texture generation module, and (4) an augmented reality scene rendering module.

Eight kinds of drawing object models have been implemented in the system so far. Experiments reported in the thesis show that the average recognition rate over the eight categories reaches 89%, and field tests show that the recognition stability of the constructed AR scenes is about 95%. These results demonstrate that the system provides accurate drawing recognition and a good scene rendering effect.
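The abstract describes the rotation angle analysis module only at a high level. As a hedged illustration (an assumption for this sketch, not necessarily the thesis's actual method), the orientation of a hand-drawn object can be estimated with principal component analysis (PCA): the angle of the first principal axis of the drawing's contour points. The function name below is hypothetical.

```python
import numpy as np

def drawing_orientation_deg(points):
    """Estimate the dominant orientation of a 2-D point set via PCA.

    Returns the angle (in degrees, folded into [0, 180)) of the
    eigenvector of the covariance matrix with the largest eigenvalue,
    i.e. the first principal axis of the drawing's contour points.
    """
    pts = np.asarray(points, dtype=float)
    centered = pts - pts.mean(axis=0)          # remove the centroid
    cov = np.cov(centered, rowvar=False)       # 2x2 covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh: ascending eigenvalues
    major = eigvecs[:, np.argmax(eigvals)]     # first principal axis
    angle = np.degrees(np.arctan2(major[1], major[0]))
    return angle % 180.0                       # an axis has no sign

# Contour points lying roughly on a line tilted at 45 degrees
pts = [(t, t) for t in range(50)]
print(round(drawing_orientation_deg(pts)))  # → 45
```

In a full pipeline, the contour points would come from the segmented drawing (e.g. an edge map of the detected object), and the resulting angle would drive the placement of the corresponding 3-D model in the AR scene.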