Graduate student: 魏晨珊
Chen-Shan Wei
Thesis title: 適用於多領域虛假評論之判斷模型
Devising a cross-domain model to detect deceptive review comments
Advisor: 許秉瑜
Oral defense committee:
Degree: Master
Department: College of Management - Department of Business Administration
Year of publication: 2020
Academic year of graduation: 108
Language: Chinese
Number of pages: 53
Chinese keywords: deceptive review detection, Stimuli-Organism-Response (S-O-R) framework, word2vec
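  • Reviews in online shopping have come to exert a major influence on consumers and on
    stores' sales strategies; positive reviews in particular encourage consumers to buy. To
    boost sales, many stores therefore recruit writers to compose fake positive reviews that
    muddy the information available to consumers and promote their products. In current
    research on telling genuine from fake reviews, methods that extract features of specific
    reviews through linguistic categories perform well on the original data, but their
    accuracy drops sharply when they are tested on a different dataset.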
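    Related research has gradually moved from detecting fake reviews within a single domain
    to detecting them across domains, for example Li, Ott, Cardie, and Hovy (2014), Ren and
    Ji (2017), and W. Liu, Jing, and Li (2019). Whether the detection model is built on
    linguistic features of the reviews or on combined methods such as neural networks,
    accuracy declines, and these studies do not clearly explain why the words can be applied
    to cross-domain prediction.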
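    This thesis uses the truthful and fake review data collected by Ott et al. (2011) and
    Li, Ott, Cardie, and Hovy (2014) in three domains (hotel, restaurant, doctor). It builds
    a classification model for cross-domain use based on a psychological theory, the Stimuli
    Organism Response (S-O-R) framework, combined with LIWC (Linguistic Inquiry and Word
    Count), and adds frequent features extracted from word2vec word vectors, to overcome the
    sharp drop in cross-domain detection accuracy seen in previous work.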
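    In the experiments, method one, which applies classification algorithms to the feature
    weights of SOR and the reviews, reaches an accuracy of 63.6% with its best-performing
    classifier, DNN. Method two, which applies classification algorithms to the frequent
    word-vector features, reaches an accuracy of 73.75% with its best-performing classifier,
    random forest.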


    Online reviews have a large impact not only on consumers' shopping behavior but also on
    online stores' marketing strategies. Positive reviews positively influence consumers'
    buying decisions. To boost sales volume, some sellers therefore hire spammers to write
    undeserved positive reviews that promote their products. Some current studies detect
    fake reviews from text features, and their models reach high accuracy. However, when the
    same model is tested on another dataset, its accuracy decreases sharply.
    Relevant studies have gradually turned to identifying fake reviews across domains, for
    example Li, Ott, Cardie, and Hovy (2014), Ren and Ji (2017), and W. Liu, Jing, and Li
    (2019). Whether the model is built from text features or from combined methods such as
    neural networks, accuracy decreases. Moreover, these methods do not explain why the
    model can be applied to cross-domain prediction.
    In this research, we use the fake and truthful reviews from Ott et al. (2011) and Li,
    Ott, Cardie, and Hovy (2014) in three domains (hotel, restaurant, doctor). The
    cross-domain detection model is based on the Stimuli Organism Response (S-O-R) framework
    combined with LIWC (Linguistic Inquiry and Word Count), with added frequent features
    extracted from word2vec word vectors, to overcome the decrease in accuracy.
    According to the experimental results, method one, which classifies on the feature
    weights of SOR and the reviews, reaches an accuracy of 63.6% with the DNN classifier.
    Method two, which classifies on frequent features of word vectors, reaches an accuracy
    of 73.75% with the random forest classifier.
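
    The cross-domain protocol in the abstract (fit on reviews from one domain, test on
    another) can be illustrated with a minimal sketch. Everything in it is an assumption
    made for illustration: the small category lexicon is a hypothetical stand-in for LIWC,
    the four toy reviews are invented, and a nearest-centroid classifier replaces the DNN
    and random forest reported in the thesis.

```python
from collections import Counter
import math
import re

# Hypothetical stand-in for LIWC categories; not the real LIWC dictionary.
LEXICON = {
    "first_person": {"i", "my", "me", "we", "our"},
    "positive_emotion": {"great", "amazing", "wonderful", "best", "love"},
    "spatial": {"room", "floor", "table", "office", "lobby", "street"},
}
CATEGORIES = sorted(LEXICON)


def featurize(text):
    """Return, per category, the share of tokens that belong to it."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for tok in tokens:
        for cat in CATEGORIES:
            if tok in LEXICON[cat]:
                counts[cat] += 1
    total = max(len(tokens), 1)  # avoid division by zero on empty text
    return [counts[cat] / total for cat in CATEGORIES]


def fit_centroids(samples):
    """samples: list of (text, label). Returns label -> mean feature vector."""
    sums, sizes = {}, Counter()
    for text, label in samples:
        acc = sums.setdefault(label, [0.0] * len(CATEGORIES))
        for i, value in enumerate(featurize(text)):
            acc[i] += value
        sizes[label] += 1
    return {lab: [v / sizes[lab] for v in acc] for lab, acc in sums.items()}


def predict(centroids, text):
    """Assign the label whose centroid is closest in Euclidean distance."""
    vec = featurize(text)
    return min(centroids, key=lambda lab: math.dist(vec, centroids[lab]))


def cross_domain_accuracy(train, test):
    """Fit on one domain, score on another."""
    centroids = fit_centroids(train)
    hits = sum(predict(centroids, text) == label for text, label in test)
    return hits / len(test)


# Invented toy reviews, for illustration only.
HOTEL = [
    ("My wife and I love this amazing hotel, the best we ever had", "deceptive"),
    ("The room on the third floor faced the street and the lobby was small", "truthful"),
]
RESTAURANT = [
    ("I love it, my best and most wonderful dinner", "deceptive"),
    ("Our table was near the street, service slow", "truthful"),
]

print(cross_domain_accuracy(HOTEL, RESTAURANT))
```

    The score on four invented reviews says nothing about real performance. The sketch only
    shows the evaluation shape: features are computed the same way in every domain, and no
    review from the test domain is seen during fitting.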

    Table of Contents
    Chinese Abstract
    Abstract
    Table of Contents
    List of Figures
    List of Tables
    1. Introduction
      1-1 Research Background and Motivation
      1-2 Research Methods and Objectives
      1-3 Benefits and Contributions
      1-4 Research Structure
    2. Literature Review
      2-1 Research on Online Reviews
      2-2 Research on Detecting Fake Reviews
      2-3 Stimulus-Organism-Response (S-O-R) Framework
    3. Research Methods and Design
      3-1 Research Framework and Steps
      3-2 SOR Category Selection Method
      3-3 Method One: Feature Weights of SOR and Reviews
      3-4 Method Two: Frequent Word-Vector Features
    4. Experiments
      4-1 Experimental Data
        4-1-1 Data Preprocessing
      4-2 SOR Category Data
      4-3 Feature Weights of Reviews and SOR Words
      4-4 Experiment One: Feature Weights of SOR and Reviews
      4-5 Experiment Two: Frequent Word-Vector Features
    5. Conclusions and Suggestions
      5-1 Research Results
      5-2 Future Suggestions
    References
    Appendix 1
    Appendix 2
    Appendix 3

    References
    [1]. Online resource: iSURVEY 東方線上 (Eastern Online). Whom do consumers trust? How
    online reviews influence consumers' online/offline purchase decisions. October 21, 2018.
    Retrieved from https://www.smartm.com.tw/article/35343537cea3
    [2]. Online resource: BuzzFeed News. Her Amazon Purchases Are Real. The Reviews Are
    Fake. November 20, 2019. Retrieved from
    https://www.buzzfeednews.com/article/nicolenguyen/her-amazon-purchases-are-real-the-reviews-are-fake
    [3]. Online resource: Boston 25 News. Shopping on Amazon, how to tell if reviews are
    fake. November 23, 2019. Retrieved from
    https://www.boston25news.com/news/consumer/shopping-on-amazon-how-to-tell-if-reviews-are-fake-1/694913717
    [4]. Adelaar, T., Chang, S., Lancendorfer, K. M., Lee, B., & Morimoto, M. (2003). Effects of
    Media Formats on Emotions and Impulse Buying Intent. Journal of Information
    Technology, 18(4), 247-266. doi:10.1080/0268396032000150799
    [5]. Aslam, U., Jayabalan, M., Ilyas, H., & Suhail, A. (2019). A survey on opinion spam
    detection methods. International Journal of Scientific and Technology Research, 8(9).
    [6]. Banerjee, S., & Chua, A. Y. (2014). Applauses in hotel reviews: Genuine or deceptive?
    Paper presented at the 2014 Science and Information Conference.
    [7]. Björk, P., Bosnjak, M., & Osti, L. (2010). Atmospherics on tour operators’ websites:
    Website features that stimulate emotional response. Journal of Vacation Marketing,
    16(4), 283-296. doi:10.1177/1356766710372243
    [8]. Boujbel, L., & d'Astous, A. (2015). Exploring the Feelings and Thoughts that
    Accompany the Experience of Consumption Desires. Psychology & Marketing, 32(2),
    219-231. doi:10.1002/mar.20774
    [9]. Cagnina, L. C., & Rosso, P. (2017). Detecting deceptive opinions: intra and
    cross-domain classification using an efficient representation. International Journal of
    Uncertainty, Fuzziness and Knowledge-Based Systems, 25(Suppl. 2), 151-174.
    [10]. Chang, H.-J., Eckman, M., & Yan, R.-N. (2011). Application of the
    Stimulus-Organism-Response model to the retail environment: the role of hedonic motivation in impulse
    buying behavior. The International Review of Retail, Distribution and Consumer
    Research, 21(3), 233-249.
    [11]. Chevalier, J. A., & Mayzlin, D. (2006). The effect of word of mouth on sales: Online
    book reviews. Journal of marketing research, 43(3), 345-354.
    [12]. Cui, G., Lui, H.-K., & Guo, X. (2012). The effect of online consumer reviews on new
    product sales. International Journal of Electronic Commerce, 17(1), 39-58.
    [13]. Eroglu, S. A., Machleit, K. A., & Davis, L. M. (2001). Atmospheric qualities of online
    retailing: A conceptual model and implications. Journal of Business Research, 54(2),
    177-184.
    [14]. Ettis, S. A. (2017). Examining the relationships between online store atmospheric color,
    flow experience and consumer behavior. Journal of Retailing and Consumer Services,
    37, 43-55.
    [15]. Feng, S., Banerjee, R., & Choi, Y. (2012). Syntactic stylometry for deception detection.
    Paper presented at the Proceedings of the 50th Annual Meeting of the Association for
    Computational Linguistics: Short Papers-Volume 2.
    [16]. Gatautis, R., & Vaiciukynaite, E. (2013). Website atmosphere: Towards revisited
    taxonomy of website elements. Economics & Management, 18(3).
    [17]. Hennig-Thurau, T., Gwinner, K. P., Walsh, G., & Gremler, D. D. (2004). Electronic
    word-of-mouth via consumer-opinion platforms: what motivates consumers to articulate
    themselves on the internet? Journal of interactive marketing, 18(1), 38-52.
    [18]. Jindal, N., & Liu, B. (2007). Analyzing and Detecting Review Spam. Paper presented
    at the Seventh IEEE International Conference on Data Mining (ICDM 2007), 547-552.
    doi:10.1109/icdm.2007.68
    [19]. Kavanagh, D. J., Andrade, J., & May, J. (2005). Imaginary relish and exquisite torture:
    the elaborated intrusion theory of desire. Psychological review, 112(2), 446.
    [20]. Kim, S., Lee, S., Park, D., & Kang, J. (2017). Constructing and evaluating a novel
    crowdsourcing-based paraphrased opinion spam dataset. Paper presented at the
    Proceedings of the 26th International Conference on World Wide Web.
    [21]. Klaus, T., & Changchit, C. (2017). Toward an Understanding of Consumer Attitudes on
    Online Review Usage. Journal of Computer Information Systems, 59(3), 277-286.
    doi:10.1080/08874417.2017.1348916
    [22]. Li, J., Ott, M., Cardie, C., & Hovy, E. (2014). Towards a general rule for identifying
    deceptive opinion spam. Paper presented at the Proceedings of the 52nd Annual Meeting
    of the Association for Computational Linguistics (Volume 1: Long Papers).
    [23]. Liu, P., Xu, Z., Ai, J., & Wang, F. (2017). Identifying Indicators of Fake Reviews Based
    on Spammer's Behavior Features. Paper presented at the 2017 IEEE International
    Conference on Software Quality, Reliability and Security Companion (QRS-C).
    [24]. Liu, W., Jing, W., & Li, Y. (2019). Incorporating feature representation into BiLSTM for
    deceptive review detection. Computing, 1-15.
    [25]. Menon, S., & Kahn, B. (2002). Cross-category effects of induced arousal and pleasure on
    the Internet shopping experience. Journal of retailing, 78(1), 31-40.
    [26]. Mukherjee, A., Venkataraman, V., Liu, B., & Glance, N. (2013). What yelp fake review
    filter might be doing? Paper presented at the Seventh international AAAI conference on
    weblogs and social media.
    [27]. Mummalaneni, V. (2005). An empirical investigation of Web site characteristics,
    consumer emotional states and on-line shopping behaviors. Journal of Business
    Research, 58(4), 526-532. doi:10.1016/s0148-2963(03)00143-7
    [28]. Oh, J., Fiorito, S. S., Cho, H., & Hofacker, C. F. (2008). Effects of design factors on store
    image and expectation of merchandise quality in web-based stores. Journal of Retailing
    and Consumer Services, 15(4), 237-249. doi:10.1016/j.jretconser.2007.03.004
    [29]. Ott, M., Cardie, C., & Hancock, J. T. (2013). Negative deceptive opinion spam. Paper
    presented at the Proceedings of the 2013 conference of the north american chapter of the
    association for computational linguistics: human language technologies.
    [30]. Ott, M., Choi, Y., Cardie, C., & Hancock, J. T. (2011). Finding deceptive opinion spam
    by any stretch of the imagination. Paper presented at the Proceedings of the 49th annual
    meeting of the association for computational linguistics: Human language technologies
    volume 1.
    [31]. Ren, Y., & Ji, D. (2017). Neural networks for deceptive opinion spam detection: An
    empirical study. Information Sciences, 385-386, 213-224. doi:10.1016/j.ins.2017.01.015
    [32]. Savage, D., Zhang, X., Yu, X., Chou, P., & Wang, Q. (2015). Detection of opinion spam
    based on anomalous rating deviation. Expert Systems with Applications, 42(22),
    8650-8657.
    [33]. Vermeulen, I. E., & Seegers, D. (2009). Tried and tested: The impact of online hotel
    reviews on consumer consideration. Tourism management, 30(1), 123-127.
