| Graduate student: | 沈敬軒 Ching-Hsuan Shen |
|---|---|
| Thesis title: | 利用階層式權重字尾樹找出在天文觀測紀錄中變化相似的序列 Mining Similar Astronomical Sequence Pattern with Hierarchical Weighted Suffix Tree |
| Advisor: | 蔡孟峰 Meng-Feng Tsai |
| Committee members: | |
| Degree: | 碩士 Master |
| Department: | 資訊電機學院 - 資訊工程學系 Department of Computer Science & Information Engineering |
| Academic year of graduation: | 99 |
| Language: | Chinese |
| Pages: | 54 |
| Chinese keywords: | 泛星計畫, 關聯式規則, 資料探勘, 概念階層, 權重字尾樹 |
| English keywords: | Pan-STARRS, Weighted Suffix Tree, Concept Hierarchy, Data Mining, Association Rule |
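With advances in technology and falling storage costs, observation data from the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS) can now be stored in large volume and in detail. Astronomers, however, generally rely on manual effort for data preprocessing and analysis, so it now takes them longer than before to pick out objects whose brightness varies in similar ways. Traditional methods are therefore not sufficient for today's large and complex data. This thesis pursues the following goals:
1. Build an automated data preprocessing system:
Because observation records of celestial objects are voluminous and heterogeneous, the more relevant portions must first be selected, and problems such as erroneously recorded observation times and excessive noise must be resolved. We therefore built an automated data preprocessing mechanism to support later applications.
2. Introduce association rule algorithms:
In astronomy, classifying objects by their similar or dissimilar features is an essential step. We combine the idea of a concept hierarchy with the weighted suffix tree so that objects with similar variation gather on the same branch. Finally, we provide users with a variety of search methods to support subsequent analysis.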
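The preprocessing described in goal 1 could look roughly like the sketch below. The abstract does not give the thesis's actual procedure, so everything here is an assumption made for illustration: the function name `preprocess`, the `(time, brightness)` record layout, the rule that negative or non-finite times count as recording errors, and the median-absolute-deviation noise filter with threshold `max_dev`.

```python
import math
from statistics import median


def preprocess(records, max_dev=3.0):
    """Clean one object's light curve given as (time, brightness) pairs.

    Hypothetical helper: the cleaning rules are illustrative assumptions,
    not the procedure used in the thesis.
    """
    # Drop records whose time or brightness is missing or non-finite,
    # and records with a negative time (assumed to be recording errors).
    valid = [
        (t, b) for t, b in records
        if t is not None and b is not None
        and math.isfinite(t) and math.isfinite(b) and t >= 0
    ]
    # Sort by time (stable sort) and keep the first record per time value.
    valid.sort(key=lambda r: r[0])
    deduped = []
    for t, b in valid:
        if not deduped or t != deduped[-1][0]:
            deduped.append((t, b))
    if len(deduped) < 3:
        return deduped
    # Treat points far from the median, measured in units of the
    # median absolute deviation (MAD), as noise and remove them.
    values = [b for _, b in deduped]
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return deduped
    return [(t, b) for t, b in deduped if abs(b - med) / mad <= max_dev]


raw = [
    (2.0, 15.1), (1.0, 15.0), (-5.0, 15.0), (3.0, float("nan")),
    (2.0, 99.0), (4.0, 15.2), (5.0, 30.0), (6.0, 15.1),
]
cleaned = preprocess(raw)
```

A median-based filter is used here only because a few extreme readings barely shift the median; a real pipeline would choose its noise model from the instrument's error characteristics.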
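Running the automated program simplifies data analysis, reduces the manual effort spent on data processing, and clearly improves efficiency, giving researchers an effective way to handle large volumes of observation data in the future.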
Astronomical researchers have traditionally registered and maintained observation data by hand for their various analyses. With the ongoing construction of observatories for the Pan-STARRS project, however, the volume of observation data has exploded, and manually processing so much data each day has become impractical. Meeting this challenge requires a large-scale information management system as well as an efficient methodology for data analysis. This project has the following goals:
1. Construct an automatic information preparation system:
Because of the motion of the Earth and of astronomical objects, a complete set of observation records requires gathering data from observatories worldwide. Observations are limited by factors such as hardware, weather, time, and temperature, so the heterogeneous data sources must also be calibrated and cleaned before integration. Given the rapidly growing data size, data preparation has to run automatically and efficiently. We will implement the preparation system with computer network access and have it perform the necessary calibration or transformation based on features of historical data. The cleaned data can then be integrated for further analysis and research.
2. Develop astronomical time-series pattern mining and association rule mining methodologies:
Discovering similarities between astronomical objects, and classifying the objects accordingly, is an important process in much astronomical research. We integrate a concept hierarchy with a weighted suffix tree so that objects with similar variation trends gather on the same branch of the tree. We also implement several search functions to help users find the objects they are interested in.
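As a rough illustration of goal 2, the sketch below encodes brightness changes as symbols at two levels of a concept hierarchy and stores every suffix of each symbol string in a trie whose nodes record which objects pass through them. It is a simplified stand-in, not the thesis's hierarchical weighted suffix tree: the symbol alphabet, the thresholds `small` and `large`, the uncompressed trie, the use of object counts as node weights, and all names (`symbolize`, `Node`, `build_trie`, `find`) are assumptions.

```python
def symbolize(values, level, small=0.1, large=1.0):
    """Encode consecutive brightness changes as symbols.

    Level 0 (coarse): 'U' rise, 'D' drop, 'S' steady.
    Level 1 (fine): 'u'/'d' small rise or drop, 'U'/'D' large rise or drop.
    The thresholds are illustrative assumptions.
    """
    symbols = []
    for prev, cur in zip(values, values[1:]):
        delta = cur - prev
        if abs(delta) < small:
            symbols.append("S")
        elif level == 0 or abs(delta) >= large:
            symbols.append("U" if delta > 0 else "D")
        else:
            symbols.append("u" if delta > 0 else "d")
    return "".join(symbols)


class Node:
    def __init__(self):
        self.children = {}
        self.objects = set()  # ids of objects whose suffixes reach this node

    @property
    def weight(self):
        # Weight here means the number of distinct objects sharing the branch.
        return len(self.objects)


def build_trie(sequences):
    """Insert every suffix of every object's symbol string into one trie."""
    root = Node()
    for obj_id, seq in sequences.items():
        for start in range(len(seq)):
            node = root
            for sym in seq[start:]:
                node = node.children.setdefault(sym, Node())
                node.objects.add(obj_id)
    return root


def find(root, pattern):
    """Return the ids of objects whose symbol string contains `pattern`."""
    node = root
    for sym in pattern:
        node = node.children.get(sym)
        if node is None:
            return set()
    return set(node.objects)


curves = {
    "A": [10.0, 10.5, 12.0, 11.0],
    "B": [5.0, 5.3, 7.0, 6.5],
    "C": [8.0, 7.0, 7.05, 9.0],
}
coarse = build_trie({k: symbolize(v, level=0) for k, v in curves.items()})
fine = build_trie({k: symbolize(v, level=1) for k, v in curves.items()})
```

In this toy data, objects A and B share a branch at the coarse level and separate at the fine level, which is the kind of drill-down a concept hierarchy is meant to allow.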
Running the automatic program simplifies the observation data. This not only reduces the workload of data analysis but also improves its efficiency, giving researchers a better way to handle large data sets in the future.