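Graduate student: 楊恩慈 (En-Cih Yang)
Thesis title: 基於粒子群演算法之三維手寫文字辨識
Stereo-based 3D Space Handwriting Recognition Tracking by Particle Swarm Optimization
Advisor: 范國清 (Kuo-Chin Fan)
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science, Department of Computer Science & Information Engineering
Year of publication: 2017
Academic year of graduation: 105
Language: Chinese
Number of pages: 84
Keywords (translated from Chinese): stereo vision, probability density function, particle swarm optimization, 3D space handwriting, multilayer perceptron, digit recognition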
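  • In recent years, low-power microprocessors have developed rapidly, and big data analysis and artificial intelligence have benefited from improved hardware performance and the spread of the internet, allowing them to be applied in diverse fields. Against this background, computer vision has likewise benefited from stronger hardware computing power and from artificial intelligence, and can solve problems effectively, improve accuracy, and support the development of intelligent and automated systems.
    This thesis focuses on determining the distance between the finger and the camera, tracking the finger, and recognizing handwritten digits in 3D space. Such 3D handwriting recognition systems, however, usually rely on image recorders with infrared sensing. Outdoors, or when the target is far away, the infrared reflection cannot be received effectively, which makes the subsequent fingertip detection, tracking, and trajectory determination difficult.
    To address these problems, this thesis uses stereo vision to generate depth images that assist finger tracking, and then detects the finger, tracks it, and determines its trajectory step by step. Depth information gives the distance between the target and the camera, so objects and background outside that distance range can be excluded from the image.
    The proposed method does not rely on preset parameters to obtain the depth range of the region of interest. Instead, it uses a probability density function (PDF) to estimate that depth range automatically, and uses particle swarm optimization (PSO) to track the target (the palm). Once the palm region is obtained, the fingertip position is located from the grayscale image, which compensates for the inability of depth images to detect fine, distant objects. The system then records the finger position and movement path, distinguishes real strokes (character strokes) from virtual strokes (movement that is not part of a character), and corrects the trajectory. Finally, a multilayer perceptron (MLP) is trained on the MNIST dataset, and the written trajectory is fed to the MLP network to recognize the digit.
    The experiments demonstrate a reliable 3D space handwriting recognition system that tracks automatically and is not limited to a particular environment.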
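    The automatic depth range estimation mentioned above can be illustrated with a small Parzen window (kernel density) sketch. This is not the thesis's implementation: the Gaussian kernel, the bandwidth, the `rel_floor` cutoff, the millimetre units, and the function names are all assumptions chosen for illustration. The idea is to estimate the density of depth values, take the highest peak as the region of interest, and grow the range outward until the density falls below a fraction of the peak.

```python
import numpy as np

def parzen_density(samples, grid, bandwidth):
    """Gaussian Parzen-window density estimate of `samples`, evaluated on `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Scaled distance from every grid point to every sample.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / bandwidth

def depth_range_around_peak(depths, bandwidth=20.0, rel_floor=0.1, num_points=256):
    """Return (low, high) depth bounds around the highest density peak.

    Hypothetical helper: the range grows from the peak while the density
    stays at or above `rel_floor` times the peak density.
    """
    depths = np.asarray(depths, dtype=float)
    grid = np.linspace(depths.min(), depths.max(), num_points)
    density = parzen_density(depths, grid, bandwidth)
    peak = int(np.argmax(density))
    floor = rel_floor * density[peak]
    lo = peak
    while lo > 0 and density[lo - 1] >= floor:
        lo -= 1
    hi = peak
    while hi < num_points - 1 and density[hi + 1] >= floor:
        hi += 1
    return grid[lo], grid[hi]

# Synthetic example (assumed values): a hand near 600 mm, background near 2000 mm.
rng = np.random.default_rng(0)
hand = rng.normal(600.0, 15.0, size=500)
background = rng.normal(2000.0, 50.0, size=200)
low, high = depth_range_around_peak(np.concatenate([hand, background]))
hand_fraction = np.mean((hand >= low) & (hand <= high))
```

    Pixels whose depth falls outside `[low, high]` would then be masked out. For a full-resolution depth image, the depth values would likely need to be subsampled first, since this sketch builds a grid-by-sample matrix in memory.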


    Recently, improvements in hardware performance and the popularity of the internet have allowed big data analysis and artificial intelligence to be applied successfully in a wide range of applications. Computer vision has likewise benefited from more powerful hardware and from artificial intelligence, so it can solve problems more efficiently and accurately and advance the development of automation.
    In this thesis, we aim to measure the distance between the finger and the camera and to track the finger, in order to build a stereo vision-based handwriting recognition system in three-dimensional (3D) space. Traditionally, researchers have used infrared sensors to recognize human hands. However, infrared-based hand tracking is still challenged by widely varying lighting, limited working distance, and outdoor conditions.
    To address these problems, this thesis generates depth information from stereo vision to improve finger tracking. Using the depth information, we detect and track the fingers step by step, and separate the tracking target from other objects and the background.
    In this thesis, a probability density function is applied to obtain the threshold value, so the region of interest is found automatically rather than manually. The proposed system then uses Particle Swarm Optimization for hand tracking. After the hand (palm) position is obtained in each frame, the grayscale image is used to analyze the fingers. Finally, a multilayer perceptron trained on the MNIST dataset is used for handwritten digit recognition.
    The experimental results demonstrate that the proposed system can recognize handwritten digits in 3D space with high accuracy, without additional constraints or a restricted environment.
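    As a rough illustration of how Particle Swarm Optimization can search for a target position, the sketch below implements a common global-best PSO variant with an inertia weight. It is not the thesis's tracker: the cost function (squared distance to a known point standing in for an appearance-matching score), the 320x240 search window, and all parameter values are assumptions.

```python
import numpy as np

def pso_minimize(cost, lower, upper, num_particles=30, iterations=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `cost` inside the box [lower, upper] with global-best PSO."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size

    # Random initial positions, zero initial velocities.
    pos = rng.uniform(lower, upper, size=(num_particles, dim))
    vel = np.zeros_like(pos)

    # Each particle remembers its own best position; the swarm shares one global best.
    best_pos = pos.copy()
    best_cost = np.array([cost(p) for p in pos])
    g = int(np.argmin(best_cost))
    gbest_pos, gbest_cost = best_pos[g].copy(), best_cost[g]

    for _ in range(iterations):
        r1 = rng.random((num_particles, dim))
        r2 = rng.random((num_particles, dim))
        # Velocity update: inertia + pull toward personal best + pull toward global best.
        vel = (inertia * vel
               + c1 * r1 * (best_pos - pos)
               + c2 * r2 * (gbest_pos - pos))
        pos = np.clip(pos + vel, lower, upper)

        costs = np.array([cost(p) for p in pos])
        improved = costs < best_cost
        best_pos[improved] = pos[improved]
        best_cost[improved] = costs[improved]

        g = int(np.argmin(best_cost))
        if best_cost[g] < gbest_cost:
            gbest_pos, gbest_cost = best_pos[g].copy(), best_cost[g]

    return gbest_pos, gbest_cost

# Toy usage (assumed setup): locate a "palm" at pixel (120, 80) in a 320x240 window.
target = np.array([120.0, 80.0])
found, found_cost = pso_minimize(lambda p: float(np.sum((p - target) ** 2)),
                                 lower=[0.0, 0.0], upper=[320.0, 240.0])
```

    In a real tracker the cost would compare image features at the candidate position against a model of the target, and the swarm would typically be re-seeded around the previous frame's estimate.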

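    Abstract (Chinese)
    Abstract
    List of Figures
    List of Tables
    Chapter 1 Introduction
      1.1 Research Motivation
      1.2 Related Work
      1.3 System Flow and Thesis Organization
        1.3.1 System Flow
        1.3.2 Thesis Organization
    Chapter 2 Preprocessing and Depth Information Acquisition
      2.1 Camera Parameters and Image Characteristics
        2.1.1 Dual-Lens Camera
        2.1.2 Image Distortion and Camera Parameters
      2.2 Camera Calibration and Image Rectification
        2.2.1 Camera Calibration and Camera Parameter Matrices
        2.2.2 Image Rectification
      2.3 Feature Matching and Depth Information
    Chapter 3 Tracking Target Extraction and Determination
      3.1 Moving Object Extraction
        3.1.1 Feature Extraction
        3.1.2 Morphological Correction
      3.2 Target Determination
      3.3 Target Correction
        3.3.1 Parzen Window Density Estimation
        3.3.2 Palm Center Determination
    Chapter 4 Target Tracking and Digit Determination
      4.1 Fingertip Tracking
        4.1.1 Palm Tracking
        4.1.2 Fingertip Determination
      4.2 Trajectory Extraction and Correction
      4.3 Trajectory Determination
        4.3.1 Multilayer Perceptron
    Chapter 5 Tracking Target Extraction and Determination
      5.1 Experimental Equipment and Samples
        5.1.1 Experimental Equipment
        5.1.2 Experimental Samples
        5.1.3 Evaluation Metrics
      5.2 Initial Target Extraction and Target Tracking
        5.2.1 Initial Target Extraction
        5.2.2 Target Tracking
      5.3 Trajectory Recognition
        5.3.1 Multilayer Perceptron
        5.3.2 Digit Recognition
    Chapter 6 Conclusions and Future Work
    References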

