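Author: Chih-Kang Lai (賴之康)
Thesis title: Detecting Abnormal Website Pages by Convolutional Neural Networks (運用卷積神經網路偵測網站頁面異常研究)
Advisor: Chih-Fong Tsai (蔡志豐)
Oral defense committee:
Degree: Master
Department: College of Management - Department of Information Management
Year of publication: 2020
Academic year of graduation: 108
Language: Chinese
Number of pages: 78
Keywords (Chinese): broken web page layout (網頁跑版), image recognition, deep learning, convolutional neural network
Keywords (English): web page, Abnormal Website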
  • Browsing the web on a computer or mobile device is now a daily habit, and the
    sites people visit range from general content sites and social networking sites to
    video and media sites and e-commerce sites. Most users have at some point opened a
    site and found a page error or incorrectly rendered content: an image is missing,
    images and text are misaligned, or a content block has moved to a place where it
    should not appear. In the industry this situation is called a "broken layout" (跑版).
    Development teams are very reluctant to let users see a broken layout, because it
    can cost the site traffic and, more importantly, lead users to doubt the quality of
    the site and damage its reputation.

    This research explores the use of image recognition to determine whether a web
    page, once converted to an image, has a broken layout. The experiments use the
    convolutional neural network (CNN), the deep learning algorithm regarded as
    performing best in image recognition, and train it under different settings of
    several factors: the number of training images, the image size, the number of
    training epochs, and the number of convolutional layers. The experimental results
    show that, when these factors are adjusted appropriately, both the accuracy and the
    classification correctness shown in the confusion matrix improve considerably.
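
The abstract reports two evaluation measures, accuracy and the confusion matrix. As a rough illustration of how the two relate for a two-class "normal" versus "broken" page classifier, here is a minimal pure-Python sketch. The class names, the helper functions, and the sample labels are all hypothetical and are not taken from the thesis.

```python
from collections import Counter

# Hypothetical class names; the thesis does not state the labels it used.
LABELS = ("normal", "broken")


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Return nested counts as matrix[true_label][predicted_label]."""
    counts = Counter(zip(y_true, y_pred))
    return {t: {p: counts[(t, p)] for p in labels} for t in labels}


def accuracy(matrix):
    """Fraction of samples on the diagonal (predicted label equals true label)."""
    correct = sum(matrix[label][label] for label in matrix)
    total = sum(sum(row.values()) for row in matrix.values())
    return correct / total if total else 0.0


def recall(matrix, label):
    """Fraction of samples whose true class is `label` that were predicted as `label`."""
    row_total = sum(matrix[label].values())
    return matrix[label][label] / row_total if row_total else 0.0


# Made-up labels for eight page screenshots, purely for illustration.
y_true = ["normal", "normal", "normal", "broken", "broken", "normal", "broken", "normal"]
y_pred = ["normal", "normal", "broken", "broken", "normal", "normal", "broken", "normal"]

matrix = confusion_matrix(y_true, y_pred)
acc = accuracy(matrix)
broken_recall = recall(matrix, "broken")

print(matrix)         # normal row: 4 correct, 1 wrong; broken row: 1 wrong, 2 correct
print(acc)            # 0.75
print(broken_recall)  # 2 of 3 broken pages detected
```

Accuracy alone can look good when one class dominates the data, which is why the per-class rows of the confusion matrix are worth reading alongside it.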

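    Table of contents
    Abstract (Chinese)
    Abstract (English)
    Acknowledgments
    Table of Contents
    List of Figures
    List of Tables
    Chapter 1 Introduction
        Section 1 Research Background
        Section 2 Research Motivation
        Section 3 Research Objectives
    Chapter 2 Literature Review
        Section 1 Elements of a Web Page
        Section 2 Deep Learning
        Section 3 Convolutional Neural Networks
    Chapter 3 Research Method
        Section 1 Research Design
        Section 2 Data Collection
        Section 3 Data Preprocessing
        Section 4 Experimental Design
    Chapter 4 Experimental Results
        Section 1 Imbalanced Class Sizes
        Section 2 Results with Balanced Class Sizes
        Section 3 Different Image Sizes
        Section 4 Different Numbers of Training Epochs
        Section 5 Different Numbers of Convolutional Layers
        Section 6 Combined Test Experiment
    Chapter 5 Conclusions and Future Work
        Section 1 Research Conclusions
        Section 2 Research Contributions
        Section 3 Future Research Directions and Suggestions
    References
    Appendix 1
    Appendix 2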

