
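Graduate student: Yen-Hsun Peng (彭彥勛)
Thesis title: 主題模型應用:透過新聞資料分析電動車產業 (Topic Model Application: Analyzing the Electric Vehicle Industry through News Data)
Advisor: Ying-Chieh Yeh (葉英傑)
Oral defense committee:
Degree: Master
Department: College of Management - Graduate Institute of Industrial Management
Year of publication: 2023
Academic year of graduation: 111 (ROC calendar)
Language: Chinese
Number of pages: 51
Chinese keywords: 電動車, 主題模型, 隱含狄利克雷分佈, 非監督式學習, 斯皮爾曼等級相關係數
English keywords: Electric vehicle, Topic modeling, Latent Dirichlet Allocation, Unsupervised machine learning, Spearman’s rank correlation coefficient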
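  • Text mining techniques are now mature. As a novel management tool, text mining brings more insight to qualitative research. Whereas earlier qualitative research relied on the researcher's background knowledge and subjective judgment, text mining takes a large body of data as its object of study and can serve as a bridge between qualitative and quantitative research, offering management studies findings and insights from different perspectives.
    This thesis applies topic modeling to online news about electric vehicles. We crawled news websites to collect articles about electric vehicles from roughly the last ten years, and we also gathered electric vehicle sales figures. We modeled the crawled news with Latent Dirichlet Allocation (LDA) to obtain each document's topic distribution and compared how the topic distributions of two regions changed over time. We also computed Spearman's rank correlation coefficient between the topic distributions and the share of electric vehicles among new cars, to identify how the latent topics relate to electric vehicle sales.
    The results show that the topic distributions for policy, charging, and autonomous driving are significantly negatively correlated with the share of electric vehicles among new cars. The experiments also found no significant relationship between the pandemic and the development of electric vehicles, while semiconductors show a strong positive correlation with the share of electric vehicles among new cars, suggesting that Taiwan's semiconductor industry will play a role in the electric vehicle industry that cannot be ignored.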


    Text mining techniques are now mature and widely applied in management. As an advanced management tool, text mining gives us more insight. Unlike earlier qualitative research, which relies on the background knowledge and subjective judgment of researchers, text mining takes a large amount of data as its research object and can serve as a bridge between qualitative and quantitative research, providing management with different perspectives and results.
    We use topic modeling to analyze online news about electric vehicles (EVs). We crawl news about EVs from the last ten years. We model the text data with Latent Dirichlet Allocation (LDA) to obtain the topic distribution of each document, and compare the topic distributions over time in Taiwan and the USA. We also perform Spearman’s rank correlation coefficient analysis on the topic distributions and EV sales to find the relationship between latent topics and EV sales.
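To make the phrase "topic distribution of the document" concrete, the following is a minimal sketch of LDA fitted by collapsed Gibbs sampling, one common inference scheme. This record does not state which implementation or inference method the thesis used, so the function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, the iteration count, and the toy documents are all assumptions made for illustration.

```python
import random


def lda_gibbs(docs, num_topics, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch, not the thesis code).

    docs is a list of token lists. Returns (doc_topic, vocab, topic_word_counts).
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    vocab_size = len(vocab)

    n_dk = [[0] * num_topics for _ in docs]               # topic counts per document
    n_kw = [[0] * vocab_size for _ in range(num_topics)]  # word counts per topic
    n_k = [0] * num_topics                                # total tokens per topic
    z = []                                                # topic assigned to each token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][word_id[w]] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v, k = word_id[w], z[d][i]
                # Remove the token's current assignment from the counts.
                n_dk[d][k] -= 1
                n_kw[k][v] -= 1
                n_k[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (n_dk[d][t] + alpha) * (n_kw[t][v] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][v] += 1
                n_k[k] += 1

    # Smoothed per-document topic proportions; each row sums to 1.
    doc_topic = [
        [(n_dk[d][t] + alpha) / (len(doc) + num_topics * alpha)
         for t in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    return doc_topic, vocab, n_kw


# Hypothetical toy corpus, not data from the thesis.
docs = [
    ["battery", "charging", "station", "charging"],
    ["subsidy", "policy", "tax", "policy"],
    ["charging", "battery", "station"],
    ["policy", "subsidy", "tax"],
]
theta, vocab, counts = lda_gibbs(docs, num_topics=2)
for row in theta:
    print([round(p, 3) for p in row])
```

Each row of `theta` is one document's topic distribution. Averaging such rows by region and time period would give the kind of per-period topic share the abstract compares, although the exact aggregation used in the thesis is not described here.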
    The results show that three topic frames, namely policy, autonomous driving, and charging, are significantly negatively correlated with the proportion of electric vehicles among new cars. We also find other topics related to the EV industry: Covid-19 shows no significant relationship with EV sales, and semiconductor topics are highly positively correlated with the proportion of EV sales.
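The Spearman's rank correlation analysis mentioned in the abstract can be sketched as follows. Spearman's rho is the Pearson correlation of the two rank vectors, with tied values given the average of their ranks. The helper names and the two short series are hypothetical and are not data from the thesis. The sketch returns only the coefficient and does not compute a significance test.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one series is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical illustration values, not data from the thesis:
# a topic's share of coverage per period and the share of EVs among new cars.
topic_share = [0.30, 0.27, 0.22, 0.18, 0.15, 0.11]
new_ev_ratio = [0.01, 0.02, 0.04, 0.05, 0.09, 0.12]
rho = spearman_rho(topic_share, new_ev_ratio)
print(rho)
```

In this made-up example the topic share falls in every period while the EV ratio rises, so the ranks are perfectly reversed and rho is at its minimum of -1. Because only ranks matter, the coefficient captures any monotonic relationship, not just a linear one.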

    Chinese Abstract
    Abstract
    Table of Contents
    Chapter 1 Introduction
      1-1 Research Background and Motivation
      1-2 Research Objectives
      1-3 Research Structure
    Chapter 2 Literature Review
      2-1 Development and Current State of Electric Vehicles
      2-2 Natural Language Processing and Data Preprocessing
      2-3 Topic Models
      2-4 Latent Dirichlet Allocation
        2-4-1 The Dirichlet Distribution
        2-4-2 Predecessor Models of Latent Dirichlet Allocation
        2-4-3 Principles of Latent Dirichlet Allocation
        2-4-4 Parameter Selection
      2-5 Applications of Topic Models
    Chapter 3 Research Method
      3-1 Experimental Procedure
      3-2 Data Sources
      3-3 Data Preprocessing
        3-3-1 English Preprocessing
        3-3-2 Chinese Preprocessing
      3-4 LDA Model
        3-4-1 Modeling Parameters and Choosing the Number of Topics
        3-4-2 Generating Topic Distributions
        3-4-3 Topic Frames
      3-5 Spearman's Rank Correlation Coefficient
    Chapter 4 Experimental Results
      4-1 Number of Topics
      4-2 Topic Frames
      4-3 Topic Comparison between the Two Regions
      4-4 Relationship between Topics and New Electric Vehicle Registrations
        4-4-1 Electric Vehicle Topic Frames
        4-4-2 Other Topics
      4-5 Summary
    Chapter 5 Conclusion
      5-1 Conclusions
      5-2 Limitations and Future Work
    Chapter 6 References
      Foreign-Language Sources
      Chinese-Language Sources
    Appendix 1
    Appendix 2

