| Graduate student: | 林圓皓 Yuan-Hau Lin |
|---|---|
| Thesis title: | Facebook活動事件擷取系統 Facebook Activity Event Extraction System |
| Advisor: | 張嘉惠 Chia-Hui Chang |
| Oral defense committee: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Year of publication: | 2016 |
| Academic year of graduation: | 104 |
| Language: | Chinese |
| Number of pages: | 38 |
| Chinese keywords: | 活動事件擷取、命名實體辨識、社群媒體事件 |
| English keywords: | Activity Event Extraction, Named Entity Recognition, Social Media Event |
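The widespread adoption of social networks has led many people to use Facebook as a medium for promoting activities. The goal of this thesis is therefore to build a Facebook activity event extraction system that helps users grasp activity information quickly. We improve the Web NER Model Generation tool of Huang et al. to build extraction models for activity names and locations, and then use sequential pattern mining to find the start and end dates of activities. In addition, we try to improve location recognition accuracy by using a large collection of Facebook check-in places. The experiments test 1,300 manually labeled posts to assess both activity event extraction and named entity recognition performance. The extracted activity locations are also projected onto latitude and longitude coordinates to evaluate how accurately the system predicts where an activity actually takes place. The results show F_1-scores of 0.727 and 0.694 for activity name and location extraction, and 0.865 and 0.72 for start and end date extraction. The overall activity event recognition rate is 0.708, which indicates that using this system to consolidate activity events on Facebook and to locate where they take place is quite feasible.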
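The F_1-scores reported above combine precision and recall into a single number. The following is a minimal sketch of how such a score could be computed under exact matching. The function name, the exact-match criterion, and the toy data are assumptions made for illustration and are not taken from the thesis, whose matching rules are not stated in this record.

```python
def precision_recall_f1(predicted, gold):
    """Exact-match precision, recall and F1 over two collections of items."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty inputs so the function never divides by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical toy data: (post id, extracted location) pairs.
gold = {(1, "Taipei Arena"), (2, "Huashan 1914"), (3, "NCU")}
predicted = {(1, "Taipei Arena"), (2, "Huashan"), (3, "NCU")}
p, r, f1 = precision_recall_f1(predicted, gold)
```

In this toy case two of the three predictions match the labeled answers exactly, so precision, recall, and F_1 are all 2/3. A partial-match criterion would score the second item differently.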
The popularity of social networks has made them a natural medium for promoting activities and advertising campaigns, and many people use Facebook pages to announce such campaigns. The purpose of this study is to extract activity events by constructing two named entity recognition models, one for activity names and one for locations, with a Web NER model generation tool [1]. We enhance the tool by improving its tokenizer and alignment technique. In addition, we use a large database of Facebook check-in places to improve location name recognition. For entity relation extraction, we apply sequential pattern mining to find rules for coupling start dates, end dates, and locations. We use 1,300 Facebook posts to test activity event extraction performance. The experimental results show F_1-scores of 0.727 and 0.694 for activity name and location recognition, and 0.865 and 0.72 for start and end date extraction. The overall performance of activity event extraction is 0.708.
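The date extraction step can be pictured as applying patterns over the text surrounding date expressions. The sketch below is only an illustration: the thesis mines its patterns from data with sequential pattern mining, whereas the two patterns here are hand-written stand-ins. The slash-style date format, the English example text, and all names (`DATE`, `RANGE_PATTERNS`, `extract_dates`) are assumptions.

```python
import re

# Hypothetical date token: optional year, then month/day, e.g. "2016/3/5" or "3/12".
DATE = r"(?:\d{4}/)?\d{1,2}/\d{1,2}"

# Hand-written stand-ins for mined patterns (assumption: the real system
# learns such patterns; these two are invented for illustration).
RANGE_PATTERNS = [
    # "<date> - <date>", "<date> ~ <date>", "<date> to <date>"
    re.compile(rf"({DATE})\s*(?:~|-|to|until)\s*({DATE})", re.IGNORECASE),
    # "starts <date> ... ends <date>"
    re.compile(rf"starts?\D*({DATE}).*?ends?\D*({DATE})", re.IGNORECASE),
]
SINGLE_DATE = re.compile(DATE)


def extract_dates(text):
    """Return (start, end) date strings; end is None if only one date is found."""
    for pattern in RANGE_PATTERNS:
        match = pattern.search(text)
        if match:
            return match.group(1), match.group(2)
    # Fall back to treating the first date in the text as the start date.
    match = SINGLE_DATE.search(text)
    if match:
        return match.group(0), None
    return None, None
```

For example, `extract_dates("Exhibition 2016/3/5 - 3/12 at Huashan")` returns `("2016/3/5", "3/12")`. Posts written in Chinese or using other date formats would need different patterns, which is the motivation for mining them rather than writing them by hand.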
[1] Y. Y. Huang and C.H. Chung, "A Tool for Web NER Model Generation Based on Google Snippets", National Central University graduate thesis, 2015.
[2] A. Ritter, O. Etzioni, and S. Clark. Open domain event extraction from Twitter. In Proc. SIGKDD, pages 1104–1112, 2012.
[3] Wang, W.: Chinese news event 5w1h semantic elements extraction for event ontology population. In: Proceedings of the 21st International Conference Companion on World Wide Web, pp. 197–202. ACM (2012)
[4] N. Kanhabua, S. Romano, and A. Stewart. Identifying relevant temporal expressions for real-world events. In Proceedings of the SIGIR 2012 Workshop on Time-aware Information Access (TAIA ’12), 2012.
[5] Suthasinee Kuptabut and Ponrudee Netisopakul. Event Extraction using Ontology Directed Semantic Grammar. Journal of Information Science and Engineering 32, 79–96 (2016)
[6] Wallach, H.M. (2004) Conditional Random Fields: An Introduction. University of Pennsylvania CIS Technical Report MS-CIS-04-21.
[7] N. Dalvi, M. Olteanu, M. Raghavan, and P. Bohannon. Deduplicating a places database. In Proceedings of the 23rd international conference on World wide web, pages 409–418. International World Wide Web Conferences Steering Committee, 2014.
[8] Feng, Y., Huang, R., Sun, L.: Two Step Chinese Named Entity Recognition Based on Conditional Random Fields Models. In: Sixth SIGHAN Workshop on CLP, pp. 120–123. ACL Press, Hyderabad.
[9] T.-S. Chen, M.-C. Chen, C.-H. Chang, "基於頁面層級之快速網頁資料擷取與綱要驗證" [Page-Level Fast Web Data Extraction and Schema Verification], Conference on Technologies and Applications of Artificial Intelligence, 2014.
[10] Y.-S. Su, Associated Information Extraction for Enabling Entity Search on Electronic Map, National Central University, 2012.
[11] J. Strötgen and M. Gertz. Heideltime: High quality rule-based extraction and normalization of temporal expressions. In Proceedings of the 5th International Workshop on Semantic Evaluation, 2010.
[12] Yu-Yang Lin, Chia-Hui Chang, "網頁商家名稱擷取與地址配對之研究" [A Study on Store Name Extraction from Web Pages and Address Matching] (ROCLING 2014), Chung-li, Taiwan, September 91-93, 2014.