
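Graduate student: 張鈺鴻
Yu-Hung Chang
Thesis title: 應用文字探勘技術建構預測客訴問題類別機器學習模型 (Applying Text Mining Techniques to Build a Machine Learning Model for Predicting Customer Complaint Categories)
Advisor: 胡雅涵
Ya-Han Hu
Oral defense committee:
Degree: Master
Department: College of Management, Executive Master of Information Management
Year of publication: 2021
Academic year of graduation: 109 (ROC calendar, 2020-2021)
Language: Chinese
Number of pages: 122
Chinese keywords: text mining, classification prediction, supervised machine learning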
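  • As technology advances, customers and consumers can publish or share their views on a product's quality and a service's strengths and weaknesses through many different channels. When a negative review appears, many other internet users respond to it, and the issue can trigger a ripple effect that draws public attention. Such negative reviews can be called customer complaints. At present, service companies mostly handle customer complaints on social platforms by having customer service center staff manually collect the complaint messages and then process them, which is often too slow to be timely. Complaint messages usually carry a large amount of usable, extractable information. They typically express dissatisfaction or a wish for the product to improve, so analyzing them is important to an organization.
    We used review messages obtained from the Google Play platform as the dataset for this study. The dataset covers January 1, 2014 to April 30, 2020 and contains 31,401 records. These unstructured complaint messages were processed with a supervised machine learning approach: text mining, feature extraction (with feature words analyzed in the Orange mining tool), construction of a keyword lexicon (bag of words), topic modeling, and labeling. Four algorithms commonly used in classification and prediction research, Naïve Bayes (NB), k-nearest neighbors (KNN), Random Forest (RF), and support vector machine (SVM), were then applied to analyze the complaints and predict their problem category. The models are organized mainly into six topic models. The study found that, for six-class (multi-class) classification, corpora built from compound parts of speech give better prediction accuracy than single-part-of-speech corpora, whereas for binary classification the single-part-of-speech corpus of action transitive verbs gives the better accuracy. This confirms that the approach can effectively predict customer complaint categories and save the time spent classifying complaints manually.
    Keywords: text mining, classification prediction, supervised machine learning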
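
    The bag-of-words and Naïve Bayes steps named in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation (the thesis reports using the Orange tool): the function names `train_naive_bayes` and `predict`, the English toy tokens, and the two category labels are all assumptions made for illustration, and the real data would be segmented Chinese reviews with six categories.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Fit a multinomial Naive Bayes model on tokenized documents.

    docs   : list of token lists (a bag-of-words view of each review)
    labels : list of category labels, one per document
    """
    class_counts = Counter(labels)          # documents per category
    word_counts = defaultdict(Counter)      # category -> token -> count
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Return the category with the highest log posterior for the tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)            # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue                                  # ignore unseen tokens
            # Laplace (add-one) smoothing avoids log(0) for unseen pairs.
            likelihood = (word_counts[label][tok] + 1) / (total_words + len(vocab))
            score += math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data standing in for segmented complaint reviews.
train_docs = [
    ["app", "crash", "login"],
    ["crash", "after", "update"],
    ["cannot", "login", "password"],
    ["login", "failed", "password"],
]
train_labels = ["stability", "stability", "account", "account"]

model = train_naive_bayes(train_docs, train_labels)
print(predict(model, ["crash", "update"]))
print(predict(model, ["password", "login"]))
```

    The same tokenized input could be passed to KNN, Random Forest, or SVM classifiers for the comparison the abstract describes. Only Naïve Bayes is sketched here because it needs nothing beyond the standard library.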


    With the advancement of technology, customers and consumers can publish or
    share the pros and cons of a product or service through a variety of
    channels. When negative opinions or evaluations appear, many internet users
    respond in turn, and an issue can sometimes cause a ripple effect that
    attracts public attention. These negative comments can be called customer
    complaints. At present, service companies handle customer complaints on
    social platforms mostly by having customer service center personnel manually
    collect the complaints and then process them, which is often too slow to be
    timely. Complaint messages usually contain a high degree of useful,
    extractable information. They typically express dissatisfaction and hopes
    for improvement, so analyzing them is very important for the organization.
    The data set used in this research was gathered from user reviews posted
    on the Google Play platform between January 1, 2014 and April 30, 2020,
    31,401 records in total. In this thesis, customer complaint analysis and
    problem category prediction are carried out with supervised machine learning
    methods such as Naive Bayes. Feature words were extracted from the
    unstructured user complaints and analyzed with the Orange data mining tool,
    and a keyword vocabulary was then built, modelled, and labelled; the
    resulting models cover six main dimensions. The research shows that
    multi-class classification achieves higher prediction accuracy on the
    compound keyword database, whereas binary classification achieves higher
    accuracy on the keyword database restricted to a single part of speech,
    transitive verbs. The results also indicate that customer complaints can be
    classified efficiently, saving the time spent on manual classification.
    Keywords: Text Mining, classification prediction, supervised learning
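
    The comparison between single and compound part-of-speech keyword databases, and between multi-class and binary labels, can be sketched as follows. This is an illustrative sketch only: the helper names `filter_by_pos`, `to_binary`, and `accuracy`, the tag strings "N", "VI", and "VT", and the sample tokens and labels are hypothetical and are not taken from the thesis.

```python
def filter_by_pos(tagged_tokens, keep_tags):
    """Keep only the words whose part-of-speech tag is in keep_tags."""
    return [word for word, tag in tagged_tokens if tag in keep_tags]


def to_binary(labels, positive):
    """Collapse multi-class labels into 1 (the chosen category) or 0 (the rest)."""
    return [1 if label == positive else 0 for label in labels]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical tagged review: (word, POS tag) pairs from a segmenter.
tagged = [("app", "N"), ("crashes", "VI"), ("deleted", "VT"), ("photos", "N")]

single_pos = filter_by_pos(tagged, {"VT"})          # transitive verbs only
compound_pos = filter_by_pos(tagged, {"N", "VT"})   # nouns plus transitive verbs
print(single_pos)
print(compound_pos)

# Hypothetical category labels reduced to a binary task for one category.
multi_labels = ["login", "payment", "login"]
binary_labels = to_binary(multi_labels, positive="login")
print(binary_labels)
print(accuracy(binary_labels, [1, 1, 1]))
```

    Each filtered token list would then feed a classifier, so that accuracy can be compared across the two keyword databases and the two labeling schemes.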

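    List of Figures
    List of Tables
    Chapter 1 Introduction
    1.1 Research Background
    1.2 Research Motivation
    1.3 Research Objectives
    Chapter 2 Literature Review
    2.1 Text Mining and Natural Language Processing (NLP)
    2.2 Literature on Machine Learning Methods for Classifying and Predicting Customer Complaint Messages
    2.3 Literature on Customer Complaint Issues and Text Analysis of Complaints
    2.4 Summary
    Chapter 3 Research Method
    3.1 Dataset Sources
    3.2 Dataset Preprocessing
    3.3 Chinese Word Segmentation and Feature Selection
    3.4 Text Vectors
    3.5 Text Mining
    3.6 Research Variables
    3.7 Analysis Methods and Tool Packages
    3.8 Experimental Design and Evaluation
    Chapter 4 Analysis of Empirical Results
    4.1 Experimental Data
    4.2 Experimental Results
    4.3 Variable Importance Ranking and Ranks
    4.4 General Discussion
    Chapter 5 Conclusions and Recommendations
    5.1 Research Conclusions and Contributions
    5.2 Research Limitations
    References
    Chinese-language references
    English-language references
    Appendix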

