| Graduate student: | 陳正揚 Cheng-Yang Chen |
|---|---|
| Thesis title: | 網頁意見特徵評價化之應用於旅館比較 Feature Appraisal for Hotel Comparison |
| Advisor: | 張嘉惠 Chia-Hui Chang |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Academic year of graduation: | 96 |
| Language: | English |
| Number of pages: | 39 |
| Chinese keywords: | 資料探勘, 意見擷取, 語意分析 (data mining, opinion extraction, semantic analysis) |
| Keywords: | data mining, sentiment classification, opinion extraction |
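In recent years, a large body of research has focused on web content, and the analysis of opinions posted on websites has become increasingly important. People's online evaluations of a product are an important reference when we shop online or make decisions. For example, a user choosing which hotel to stay at will usually consult the opinions on a portal site about the hotels under consideration. In this thesis, we therefore use techniques from natural language processing and data mining to analyze pages from a portal site and extract the features of the hotels those pages describe, and we compute a score for each feature by means of a search engine. We also designed an online system that lets users compare hotels through our interface. We believe that this approach lets users compare the large volume of opinions on the web more clearly, simply, and intuitively, and reach a decision more quickly.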
In recent years, a considerable number of studies have examined the effects of comments on the web. People's appraisals are a reliable expression of their behavior and views, so appraisals on the web are significant information for customers making decisions. For instance, people can select the best hotel according to the existing appraisals on the web. In this thesis, we are concerned with the differences between reviews of products. An unsupervised learning approach is well suited to modeling this information. We combined data mining with natural language processing techniques to extract features from a large number of appraisals. For each feature, we generated a corresponding appraisal using a probabilistic measure, and we designed a web interface that lets users compare hotels online. We used hotel information collected from Yahoo as our dataset. Lastly, our comparative results are presented clearly in visual form.