| Graduate student: | Li-Wei Lu (呂理維) |
|---|---|
| Thesis title: | Performance Improvements on RSDB and Integration of Repetitive Elements with Genes |
| Advisor: | Jorng-Tzong Horng (洪炯宗) |
| Committee members: | |
| Degree: | Master |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Academic year of graduation: | 89 (ROC calendar, i.e. 2000-2001) |
| Language: | Chinese |
| Pages: | 90 |
| Keywords: | RSDB, repetitive element, repeat |
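
Repetitive sequences make up a large proportion of genomic sequences, and biologists have identified many regulatory mechanisms within them. Analyzing repetitive sequences gives further insight into how chromosome structure is organized and into the relationship between genes and species evolution. The repetitive sequence database (RSDB) stores more than a hundred million repeat records, covering direct, bi-directional, palindromic, interspersed, and tandem repeats. By using index-organized tables, key compression, pipelined data loading, data warehousing, caching, and suffix arrays, RSDB can access this volume of data more efficiently. RSDB also provides statistics on all repetitive sequences and integrates gene data, with the aim of helping biologists discover more, and more significant, information.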
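
As a rough illustration of how a suffix array can expose repeats in a sequence, the sketch below sorts all suffixes of a string and compares neighbouring suffixes to find the longest substring that occurs at least twice. It is not the thesis's implementation: the function names (`build_suffix_array`, `common_prefix_length`, `longest_repeat`) and the toy sequence are assumptions made for this example, and the naive construction is only suitable for short inputs.

```python
def build_suffix_array(text):
    """Return the start positions of all suffixes of `text`, sorted lexicographically.

    Naive construction by direct sorting; chosen for clarity, not speed.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def common_prefix_length(text, i, j):
    """Length of the longest common prefix of the suffixes starting at i and j."""
    n = len(text)
    k = 0
    while i + k < n and j + k < n and text[i + k] == text[j + k]:
        k += 1
    return k


def longest_repeat(text):
    """Return (substring, (pos_a, pos_b)) for the longest substring occurring twice.

    The two suffixes sharing the longest common prefix are always adjacent in
    the suffix array, so only neighbouring pairs need to be compared.
    Returns ("", ()) when no character repeats.
    """
    sa = build_suffix_array(text)
    best_len, best_pair = 0, ()
    for a, b in zip(sa, sa[1:]):
        k = common_prefix_length(text, a, b)
        if k > best_len:
            best_len, best_pair = k, tuple(sorted((a, b)))
    start = best_pair[0] if best_pair else 0
    return text[start:start + best_len], best_pair


if __name__ == "__main__":
    # Hypothetical toy sequence containing one direct repeat.
    print(longest_repeat("GATTACAGATTACA"))  # ('GATTACA', (0, 7))
```

Production systems that handle genome-scale data would normally use a linear-time or near-linear-time suffix array construction and a precomputed longest-common-prefix array instead of the direct sorting shown here.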