Protein Repeats Finder (蛋白質重複序列分析工具) | National Central University Master's and Doctoral Thesis System

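Graduate student:	鄧仕祥 Shr-Shiang Deng
Thesis title:	Protein Repeats Finder (蛋白質重複序列分析工具)
Advisor:	洪炯宗 Jorng-Tzong Horng
Oral defense committee:
Degree:	Master
Department:	College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Academic year of graduation:	91
Language:	English
Pages:	58
Chinese keywords:	蛋白質 (protein), 重複序列 (repeats)
English keywords:	repeats, protein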

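Understanding the repeated sequences in proteins helps in analyzing protein spatial arrangement, function, and structure, as well as the interactions among proteins. This study attempts to develop an analysis tool for protein sequences. The tool finds tandem repeats and periodically repeated amino acids and, most importantly, analyzes approximate repeats that are dispersed along a protein sequence. The tool, called Protein Repeats Finder (PRF), can be used directly through a web page. We also analyzed the 120,000 protein sequences in SWISS-PROT and built the results into a database, so that users can look up the information they need online. For details on PRF, see URL: http://140.115.155.94.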

Repeated sequence patterns in proteins may be a mechanism that provides regular arrays of spatial and functional groups, useful for structural packing or for one-to-one interactions with target molecules. Many large proteins have evolved by internal duplication, and many internal sequence repeats correspond to functional and structural units.
In this study, we propose a protein repeat analysis tool that finds several kinds of protein repeats: tandem repeats, periodically conserved single amino acid repeats, and approximate repeats. We also provide a variety of statistics on the repeats found. Protein Repeats Finder (PRF) finds these kinds of repeats in a given protein sequence, and users can run the tool online to query the protein repeats they need. The web site is available at URL: http://140.115.155.94.
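
To make the simplest of the repeat types named above concrete, the following is a minimal sketch of exact tandem repeat detection. It is not the PRF algorithm (the thesis outline lists seeds, extension steps, and a relation matrix, none of which are reproduced here). The function name `find_tandem_repeats`, its parameters, and the example sequence are all hypothetical, and the scan is a brute-force one chosen for readability rather than speed.

```python
def find_tandem_repeats(seq, min_unit=1, max_unit=10, min_copies=2):
    """Return (start, unit, copies) tuples for exact tandem repeats in seq.

    Illustrative brute-force scan; all names and defaults are assumptions,
    not taken from the thesis.
    """
    results = []
    n = len(seq)
    i = 0
    while i < n:
        best = None
        # A unit longer than (n - i) // min_copies cannot repeat often enough.
        longest = min(max_unit, (n - i) // min_copies)
        for unit_len in range(min_unit, longest + 1):
            unit = seq[i:i + unit_len]
            copies = 1
            # Count back-to-back copies of the unit starting at position i.
            while seq[i + copies * unit_len:i + (copies + 1) * unit_len] == unit:
                copies += 1
            if copies >= min_copies:
                span = copies * unit_len
                # Keep the repeat covering the most residues; ties keep the shorter unit.
                if best is None or span > best[2] * len(best[1]):
                    best = (i, unit, copies)
        if best is not None:
            results.append(best)
            i += best[2] * len(best[1])  # skip past the reported repeat
        else:
            i += 1
    return results


# Made-up sequence: "AQ" occurs three times in a row starting at index 2.
print(find_tandem_repeats("MKAQAQAQLLPGG", min_unit=2))  # [(2, 'AQ', 3)]
```

Positions are 0-based. Approximate and dispersed repeats, which the abstract identifies as the tool's main target, need a substitution-tolerant comparison and are outside the scope of this sketch.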

Contents
Chapter 1 Introduction
1.1 Motivation
1.2 Goal
Chapter 2 Related Work
2.1 PAM (Percent/Point Accepted Mutation)
2.2 Swiss-Prot
2.3 CLUSTAL W
2.4 Suffix Array
2.5 TRIPS
2.6 Radar
2.7 Dotlet
Chapter 3 Materials and Methods
3.1 Data Sets
3.2 Development Environment
3.3 Main Ideas
3.4 Approach
(i) Exact repeats
(ii) Maximal repeat
(iii) Seeds
(iv) L-extend
(v) R-extend
(vi) LR-extend
(vii) Relation Matrix
3.5 Algorithm
3.6 Database Schema
Chapter 4 Results
4.1 Statistics of Different Protein Repeats
A. Protein Length Distribution
B. Approximate Repeats Distribution
C. Approximate Repeats Length Distribution
D. Ratio of (Repeats/Protein Length)
E. Ratio of (Length of Repeats/Protein Length)
4.2 A Database of Repetitive Elements in Proteins
4.3 Query Interface
Chapter 5 Discussion
5.1 Relations of Structure
5.2 A Comparison of Different Tools
5.3 Summary
5.4 Future Work
References

                                

