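| Graduate student: | 王薏婷 YI-TING WANG |
|---|---|
| Thesis title: | 遞歸神經網絡在語音辨識上之表現分析 (Performance Analysis of Recurrent Neural Networks in Speech Recognition) |
| Advisor: | 洪盟凱 |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Science - Department of Mathematics |
| Year of publication: | 2019 |
| Academic year of graduation: | 107 |
| Language: | Chinese |
| Number of pages: | 54 |
| Chinese keywords: | 語音辨識 (speech recognition) |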
Speech recognition is a closely watched area of artificial intelligence,
but because performance is affected by different environments, no system
can yet recognize speech as clearly as humans do. This study examines the
role of Mel-frequency cepstral coefficients (MFCCs) and connectionist
temporal classification (CTC) in speech recognition systems.
The study uses a noise-free corpus available on GitHub to build recurrent
neural network models with different processing methods, and selects
several factors as the objects of discussion and comparison.
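
For readers unfamiliar with CTC, the following is a minimal sketch of the collapsing rule that CTC applies to a frame-level label path: consecutive repeated labels are merged, then blank symbols are dropped. It is an illustration only and is not taken from the thesis; the blank symbol `_`, the function name `ctc_collapse`, and the example path are all assumptions made here.

```python
from itertools import groupby

# Hypothetical blank symbol; real systems usually reserve a label index instead.
BLANK = "_"

def ctc_collapse(path, blank=BLANK):
    """Collapse a frame-level label path into an output string.

    Step 1: merge runs of identical consecutive labels.
    Step 2: remove every blank symbol.
    """
    merged = [label for label, _ in groupby(path)]
    return "".join(label for label in merged if label != blank)

# One label per audio frame; the blank between the two "l" runs keeps the double letter.
print(ctc_collapse("hh_e_ll_lloo"))  # hello
```

The blank is what lets a path encode a genuinely repeated character: without the `_` between the two runs of `l`, both runs would merge into a single `l`.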