| Graduate Student: | 劉孟儒 Meng-Ru Liu |
|---|---|
| Thesis Title: | Classification of RS-fMRI component maps using Visual Geometry Group network (運用VGG網絡對靜息態功能性磁振造影成分圖進行區分) |
| Advisors: | 段正仁 Jeng-Ren Duann; 張智宏 Chih-Hung Chang |
| Committee Members: | |
| Degree: | Master |
| Department: | College of Biomedical Science and Engineering, Graduate Institute of Cognitive and Neuroscience |
| Year of Publication: | 2023 |
| Graduation Academic Year: | 111 |
| Language: | Chinese |
| Pages: | 77 |
| Keywords: | functional magnetic resonance imaging; VGG network; independent component analysis; image recognition |
Independent component analysis (ICA), one of many approaches to analyzing resting-state functional magnetic resonance imaging, is widely used across a variety of studies. However, the component maps it produces do not all originate from brain activation; many arise instead from scanner noise, head motion, or cardiac pulsation. To separate component maps into brain-activation and non-brain-activation classes, manual inspection is currently the most common approach, but as technology advances we should pursue objective classification methods. This study therefore takes the architecture of VGG (Visual Geometry Group), a classic convolutional neural network model, as its reference and, through supervised learning, trains a best-fitting model to help physicians perform a first-pass screening of large numbers of component maps and pick out those that reflect brain activation.

In this thesis, we tested four parameters that are critical before model construction — the number of epochs, the number of model layers, the learning rate, and the convolutional kernel size — to determine how each should be set to optimize model performance. In addition, because of limited hardware, we had to lower the resolution of the input component maps; we therefore trained models at two reduced resolutions, 180x180 and 50x50, and compared their performance.

The data used in this study are a secondary use of data collected in our laboratory's earlier experiments: the component maps obtained by independent component analysis from 6-minute resting-state fMRI scans of 10 healthy subjects. During preprocessing, the maps were spatially normalized, aligned, and overlaid on an inflated standard brain atlas, and all component maps were labeled by an expert before being fed into the model for training. We found that, with optimized model parameters, the VGG model trained at 180x180 achieved a significantly higher Test AUC than the model trained at 50x50. Moreover, when we magnified the misclassified component maps, we observed feature loss and blurring at both 50x50 and 180x180, with the 50x50 maps affected more severely. Lowering image resolution therefore does impair the model's judgment, and whenever hardware permits, the full-resolution images should be used as input to optimize model performance.
Independent Component Analysis (ICA) is widely used as one of the methods for analyzing resting-state functional magnetic resonance imaging (rs-fMRI) data in various research studies. However, the component maps generated by ICA do not solely originate from brain activation; many are driven by instrument noise, head motion, or cardiac activity. To distinguish brain-activation component maps from non-brain-activation ones, manual inspection is commonly employed, but with the advancement of technology there is a need for objective discrimination methods. In this study, we therefore adopted the architecture of VGG (Visual Geometry Group), one of the classic Convolutional Neural Network (CNN) models, as a classification model. Through supervised learning, we trained the VGG model to sort the brain-activation independent components out of a large number of component maps.
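As background, a VGG-style network stacks same-padded 3x3 convolutions in blocks, with 2x2 max-pooling between blocks halving the feature maps at each stage. The sketch below (hypothetical helper names; the block count is illustrative, not the configuration used in this thesis) traces how the spatial size of a square input evolves through such a stack:

```python
def conv2d_out(size, kernel, stride=1, padding=0):
    """Spatial output side length of a single convolution layer."""
    return (size - kernel + 2 * padding) // stride + 1

def vgg_feature_size(size, blocks, kernel=3):
    """Trace the feature-map side length through VGG-style blocks:
    each block ends with a 2x2 max-pool of stride 2."""
    for _ in range(blocks):
        # same-padded conv keeps the size; the pooling step halves it
        size = conv2d_out(size, kernel, padding=kernel // 2)
        size = size // 2
    return size

# e.g. a 180x180 input after 5 conv/pool stages -> 5x5 feature maps,
# while a 50x50 input collapses to 1x1 after the same 5 stages
print(vgg_feature_size(180, 5))  # 180 -> 90 -> 45 -> 22 -> 11 -> 5
print(vgg_feature_size(50, 5))   # 50 -> 25 -> 12 -> 6 -> 3 -> 1
```

This arithmetic is one reason input resolution matters: a lower-resolution input leaves far fewer spatial positions in the deepest feature maps.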
In this work, we conducted tests on four crucial parameters for constructing
the models, including the number of epochs, the number of model layers,
learning rate, and convolutional kernel size, to determine the optimal settings for
achieving the best model performance. The tested images were constructed by
combining the four views (left and right lateral views and left and right medial
views) of the component maps, spatially normalized and overlaid on the inflated
Montreal Neurological Institute (MNI) standard brain. In addition, due to
hardware limitations, we had to reduce the resolution of the tested images of the
component maps. As a result, we trained the model on tested images at two reduced resolutions, 180x180 and 50x50, down from the original 520x370, and examined the model performance at each.
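Reducing resolution in this way discards fine spatial detail. As a toy illustration (pure Python; this is not the resizing routine used in the thesis), nearest-neighbour downsampling can drop a small feature entirely when its source pixel is never sampled:

```python
def resize_nearest(img, new_h, new_w):
    """Nearest-neighbour downsampling: each target pixel samples the
    closest source pixel, so narrow features can vanish entirely."""
    h, w = len(img), len(img[0])
    return [[img[i * h // new_h][j * w // new_w] for j in range(new_w)]
            for i in range(new_h)]

# a 6x6 map with a one-pixel "feature" at row 2, column 3
src = [[0] * 6 for _ in range(6)]
src[2][3] = 1
small = resize_nearest(src, 3, 3)  # halve the resolution
print(sum(map(sum, small)))  # prints 0 — column 3 is never sampled
```

The same mechanism, at a larger scale, is consistent with the feature loss observed in the misclassified 50x50 component maps.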
The data used in this experiment were obtained as secondary usage from
previously conducted experiments in the laboratory. Component maps from
resting-state fMRI scans of 10 healthy subjects, collected over a 6-minute period,
were preprocessed, classified, and utilized for training the model. The results
show that, under optimized model parameters, the VGG model trained with
180x180 resolution significantly outperforms the one trained with 50x50
resolution in terms of Test AUC. Additionally, when we magnify the misclassified
component maps, we observe feature loss and blurriness in both the 50x50 and
180x180 maps, with the 50x50 resolution exhibiting more severe issues. This
indicates that reducing image resolution does affect the model's judgment,
suggesting that, whenever possible within the constraints of hardware resources,
inputting the complete images would optimize the model's performance.
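Test AUC, the comparison metric used above, can be computed directly from the rank order of the model's scores via the Mann-Whitney identity: AUC is the probability that a randomly chosen positive scores above a randomly chosen negative. A minimal sketch with hypothetical labels and scores:

```python
def auc(labels, scores):
    """Area under the ROC curve via the rank-sum identity:
    AUC = P(score of a random positive > score of a random negative),
    counting ties as half a win."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.7, 0.6, 0.2, 0.8, 0.4]
print(auc(labels, scores))  # 1.0 — every positive outranks every negative
```

Unlike accuracy, this metric is insensitive to the score threshold, which makes it a reasonable basis for comparing the 180x180 and 50x50 models.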