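Graduate student: Tzong-Yi Li (李宗夷)
Thesis title: A System to Identify Proteins Associated with Protein-Protein Interaction (蛋白質群組之辯識與交互關係分析系統)
Advisors: Jorng-Tzong Horng (洪炯宗), Hsien-Da Huang (黃憲達)
Oral defense committee:
Degree: Master
Department: College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering
Academic year of graduation: 92 (ROC calendar)
Language: English
Number of pages: 54
Chinese keywords: 蛋白質與蛋白質的交互關係 (protein-protein interaction), 蛋白質辨識 (protein identification)
English keywords: protein-protein interaction, protein identification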
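  • Protein identification is an important research task in proteomics. In recent years, mass spectrometry has been widely applied in proteomics research, particularly for protein identification. Proteomics has become a major trend in biotechnology, mainly because it shifts the focus from studying a single protein in isolation to studying the relationships among many proteins at once, since proteins usually interact with one another to carry out their functions.
    Although many protein identification tools (such as Mascot) already achieve a considerable identification rate, no current tool lets the user submit multiple MS spectra. This study therefore designs a system called MultiProtIdent, which runs the identification of multiple peptide mass fingerprints (PMFs) simultaneously and uses protein-protein interaction and functional association information to analyze whether the proteins matched to each PMF are related. MultiProtIdent helps protein researchers carry out interaction analysis conveniently. In tests on several protein complex experiments, MultiProtIdent gave very good results.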


    Protein identification is an important task in proteomics. Proteomic analyses based on mass spectrometry [1] are now key methods for determining the components of protein complexes. Proteins "work together" by binding to form multi-component complexes that carry out specific functions. Although several protein identification tools, such as Mascot, identify proteins with high accuracy, they cannot identify multiple proteins simultaneously with the assistance of protein-protein interaction or functional association information. In this study, we develop a novel tool, MultiProtIdent, which identifies proteins using additional information on protein-protein interactions and protein functional associations. MultiProtIdent takes multiple peptide mass fingerprints (PMFs) as input and reports the identified proteins together with possible relationships among them. Experiments using protein complexes as input show that MultiProtIdent is promising.
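The idea of combining per-spectrum PMF scores with interaction information can be illustrated with a minimal Python sketch. It is not the scoring scheme of MultiProtIdent, which the abstract does not specify; the function name `pick_consistent_set`, the additive `bonus` weight, and the toy protein IDs and scores are all assumptions made for illustration.

```python
from itertools import product


def pick_consistent_set(candidates, interactions, bonus=10.0):
    """Choose one candidate protein per PMF, rewarding known interactions.

    candidates   -- list with one dict per PMF: {protein_id: pmf_score}
    interactions -- set of frozenset({id_a, id_b}) pairs (hypothetical data)
    bonus        -- assumed additive reward per interacting pair
    Returns (list_of_protein_ids, combined_score).
    """
    best_ids, best_total = None, float("-inf")
    # Exhaustive search over all combinations: fine for a toy example,
    # exponential in the number of spectra in general.
    for combo in product(*(c.items() for c in candidates)):
        ids = [protein_id for protein_id, _ in combo]
        total = sum(score for _, score in combo)
        # Count known interacting pairs among the chosen proteins.
        pairs = sum(
            1
            for i in range(len(ids))
            for j in range(i + 1, len(ids))
            if frozenset((ids[i], ids[j])) in interactions
        )
        total += bonus * pairs
        if total > best_total:
            best_ids, best_total = ids, total
    return best_ids, best_total


# Toy input: two PMFs, each with two candidate proteins (made-up scores).
candidates = [
    {"P1": 50.0, "P2": 48.0},
    {"P3": 60.0, "P4": 59.0},
]
interactions = {frozenset(("P2", "P4"))}

best_ids, best_total = pick_consistent_set(candidates, interactions)
print(best_ids, best_total)  # ['P2', 'P4'] 117.0
```

In this toy case the top-scoring candidates taken independently would be P1 and P3, but the known P2-P4 interaction tips the combined score toward the pair that is plausibly part of the same complex.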

    Chapter 1 Introduction
      1.1 Background
        1.1.1 Protein Identification
        1.1.2 Peptide Mass Fingerprinting
        1.1.3 Scoring Function
      1.2 Motivation
      1.3 Goal
    Chapter 2 Related Works
      2.1 Protein Identification Tools using MS spectra
      2.2 Protein Identification Using MS/MS spectra
      2.3 Identifying the Components of Protein Complex
    Chapter 3 Materials and Methods
      3.1 Materials
        3.1.1 Protein Sequence Database
        3.1.2 Protein-Protein Interaction Databases
        3.1.3 Gene and Protein Function Databases
      3.2 Methods
        3.2.1 System Flow of MultiProtIdent
        3.2.2 Stage 1 of MultiProtIdent
        3.2.3 Stage 2 of MultiProtIdent
        3.2.4 Mining Associated Proteins
    Chapter 4 Implementation
      4.1 Data Preprocessing and Storage of Knowledgebase
      4.2 Agent-based System
      4.3 Web Interface
    Chapter 5 Results and Case Study
      5.1 MS Spectra
        5.1.1 Result of Test_1
        5.1.2 Result of Test_2
      5.2 Known Cellular Complex
        5.2.1 Simulation of Peptide Mass Fingerprint
        5.2.2 Result of Second Test Data
    Chapter 6 Summary
      6.1 Discussion
      6.2 Future Work
    References
    Appendix

    1. Bader, G.D., D. Betel, and C.W. Hogue, BIND: the Biomolecular Interaction Network Database. Nucleic Acids Res, 2003. 31(1): p. 248-50.
    2. Wilkins, M.R. and K.L. Williams, Cross-species protein identification using amino acid composition, peptide mass fingerprinting, isoelectric point and molecular mass: a theoretical evaluation. J Theor Biol, 1997. 186(1): p. 7-15.
    3. Fenyo, D., Identifying the proteome: software tools. Curr Opin Biotechnol, 2000. 11(4): p. 391-5.
    4. Liebler, D.C., Introduction to Proteomics: Tools for the New Biology. 2002, Totowa, New Jersey: Humana Press Inc.
    5. Apweiler, R., et al., UniProt: the Universal Protein knowledgebase. Nucleic Acids Res, 2004. 32 Database issue: p. D115-9.
    6. Ding, Q., et al., Unmatched masses in peptide mass fingerprints caused by cross-contamination: an updated statistical result. Proteomics, 2003. 3(7): p. 1313-7.
    7. Clauser, K.R., P. Baker, and A.L. Burlingame, Role of accurate mass measurement (+/- 10 ppm) in protein identification strategies employing MS or MS/MS and database searching. Anal Chem, 1999. 71(14): p. 2871-82.
    8. Beavis, R.C. and D. Fenyo, Database searching with mass-spectrometric information. Proteomics, 2000.
    9. Mann, M., P. Hojrup, and P. Roepstorff, Use of mass spectrometric molecular weight information to identify proteins in sequence databases. Biol Mass Spectrom, 1993. 22(6): p. 338-45.
    10. Fenyo, D., J. Qin, and B.T. Chait, Protein identification using mass spectrometric information. Electrophoresis, 1998. 19(6): p. 998-1005.
    11. Pappin, D.J.C., P. Hojrup, and A.J. Bleasby, Rapid identification of proteins by peptide-mass fingerprinting. Curr Biol, 1993. 3: p. 327-32.
    12. Perkins, D.N., et al., Probability-based protein identification by searching sequence databases using mass spectrometry data. Electrophoresis, 1999. 20(18): p. 3551-67.
    13. Ho, Y., et al., Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature, 2002. 415(6868): p. 180-3.
    14. Gavin, A.C., et al., Functional organization of the yeast proteome by systematic analysis of protein complexes. Nature, 2002. 415(6868): p. 141-7.
    15. Xenarios, I., et al., DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Res, 2002. 30(1): p. 303-5.
    16. von Mering, C., et al., STRING: a database of predicted functional associations between proteins. Nucleic Acids Res, 2003. 31(1): p. 258-61.
    17. Zanzoni, A., et al., MINT: a Molecular INTeraction database. FEBS Lett, 2002. 513(1): p. 135-40.
    18. Tatusov, R.L., et al., The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res, 2001. 29(1): p. 22-8.
    19. Tatusov, R.L., et al., The COG database: an updated version includes eukaryotes. BMC Bioinformatics, 2003. 4(1): p. 41.
    20. Han, K. and B.H. Ju, A fast layout algorithm for protein interaction networks. Bioinformatics, 2003. 19(15): p. 1882-8.
    21. Gras, R., et al., Improving protein identification from peptide mass fingerprinting through a parameterized multi-level scoring algorithm and an optimized peak detection. Electrophoresis, 1999. 20(18): p. 3535-50.
    22. Mewes, H.W., et al., MIPS: analysis and annotation of proteins from whole genomes. Nucleic Acids Res, 2004. 32(1): p. D41-4.
    23. Masselon, C., et al., Identification of tryptic peptides from large databases using multiplexed tandem mass spectrometry: simulations and experimental results. Proteomics, 2003. 3: p. 1279-86.
    24. Kanehisa, M., et al., The KEGG databases at GenomeNet. Nucleic Acids Res, 2002. 30(1): p. 42-6.
    25. Becker, M.Y. and I. Rojas, A graph layout algorithm for drawing metabolic pathways. Bioinformatics, 2001. 17(5): p. 461-7.
