| Graduate student: | 謝瑞民 Dray-ming Shien |
|---|---|
| Thesis title: | Investigation and Identification of Protein Methylation Sites (蛋白質甲基化位置之研究與辨識) |
| Advisor: | 洪炯宗 Jorng-tzong Horng |
| Oral defense committee: | |
| Degree: | Doctor |
| Department: | College of Electrical Engineering and Computer Science - Department of Computer Science & Information Engineering |
| Academic year of graduation: | 99 (ROC calendar) |
| Language: | English |
| Number of pages: | 57 |
| Keywords (Chinese): | 水溶液親合的表面積, 蛋白質甲基化, 氨基酸 |
| Keywords (English): | solvent accessible surface area (ASA), protein methylation, amino acid |
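In recent years, many studies have found that methylation of histones and other proteins participates in the regulation of gene transcription. For lysine and arginine, several bioinformatics methods have been developed to identify potential protein methylation sites. Studies of protein tertiary structure have confirmed that methylation sites tend to occur in surface regions of the protein structure that are readily accessible to other molecules. However, previous methods did not consider the solvent-accessible surface area (ASA) around methylation sites. This study therefore proposes a method, MASA, that targets four methylated amino acids (lysine, arginine, glutamate and asparagine) and integrates a support vector machine (SVM) with protein sequence and structural features to identify methylation sites. Most proteins with experimentally verified methylation sites currently have no corresponding tertiary structure in the PDB; for these, software tools can be used to predict the ASA values of amino acids. Evaluation by cross-validation shows that models using ASA values improve prediction accuracy. In addition, an independent test shows accuracies of 80.8% and 85.0% for methylation on lysine and arginine, respectively. Finally, the method is implemented as a web system at http://MASA.mbc.nctu.edu.tw, whose web server interface helps biologists identify protein methylation sites.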
Studies over the last few years have identified protein methylation on histones and other proteins that are involved in the regulation of gene transcription. Several works have developed computational approaches to identify potential methylation sites on lysine and arginine. Studies of protein tertiary structure have demonstrated that protein methylation sites lie preferentially in regions that are easily accessible. However, previous studies have not taken into account the solvent-accessible surface area (ASA) surrounding the methylation sites. This work presents a method named MASA that combines a support vector machine (SVM) with the sequence and structural characteristics of proteins to identify methylation sites on lysine, arginine, glutamate and asparagine. Since most experimentally verified methylation sites have no corresponding protein tertiary structure in the Protein Data Bank (PDB), effective solvent-accessibility prediction tools were adopted to estimate the ASA values of amino acids in proteins. Evaluation of predictive performance by cross-validation indicates that including the ASA values around the methylation sites improves prediction accuracy. Additionally, an independent test shows prediction accuracies of 80.8% and 85.0% for methylated lysine and arginine, respectively. Finally, the proposed method is implemented as a web server for identifying protein methylation sites, freely available at http://MASA.mbc.nctu.edu.tw/.
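To make the idea of combining sequence and ASA features concrete, the following is a minimal sketch of how one candidate site might be turned into a numeric vector for an SVM. It is not the encoding used by MASA, which the abstract does not specify: the one-hot residue encoding, the window size `flank`, the function names `encode_site` and `candidate_sites`, and the toy sequence and ASA values are all assumptions made for illustration.

```python
from typing import List, Tuple

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def encode_site(sequence: str, asa: List[float], center: int,
                flank: int = 3) -> List[float]:
    """Encode the window around `center` as one-hot residues plus ASA values.

    Hypothetical encoding: each window position contributes 20 one-hot
    values and 1 ASA value. Positions outside the sequence are all zeros.
    """
    features: List[float] = []
    for pos in range(center - flank, center + flank + 1):
        inside = 0 <= pos < len(sequence)
        residue = sequence[pos] if inside else None
        # 20-dimensional one-hot vector for the residue at this position
        features.extend(1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS)
        # one extra dimension: the (predicted) ASA of that residue
        features.append(asa[pos] if inside else 0.0)
    return features


def candidate_sites(sequence: str, asa: List[float], target: str = "K",
                    flank: int = 3) -> List[Tuple[int, List[float]]]:
    """Return (position, feature vector) for every `target` residue."""
    if len(sequence) != len(asa):
        raise ValueError("sequence and ASA list must have equal length")
    return [(i, encode_site(sequence, asa, i, flank))
            for i, aa in enumerate(sequence) if aa == target]


# Toy input: the sequence and ASA values below are invented for illustration.
toy_sequence = "MKRAKE"
toy_asa = [0.10, 0.80, 0.65, 0.20, 0.90, 0.55]
sites = candidate_sites(toy_sequence, toy_asa, target="K", flank=2)
for position, vector in sites:
    print(position, len(vector))
```

Each vector could then be passed, together with a methylated or non-methylated label, to an SVM implementation and evaluated by cross-validation. Dropping the ASA dimension from every window position would give the sequence-only baseline against which the abstract's reported improvement is measured.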