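| Graduate student: | 范文郎 Wen-lang Fan |
|---|---|
| Thesis title: | 基因體的普適性之成長模型 Modeling Genome Growth Based on Universal Properties of Whole Genome |
| Advisor: | 李弘謙 Hoong-chien Lee |
| Oral defense committee: | |
| Degree: | Doctor |
| Department: | College of Science - Department of Physics |
| Graduation academic year: | 97 (ROC calendar; 2008-2009) |
| Language: | Chinese |
| Number of pages: | 51 |
| Chinese keywords: | 片段複製 (segmental duplication), 基因體成長模型 (genome growth model), 長程關聯 (long-range correlation), 倒對稱 (inverse symmetry), 等價長度 (equivalent length) |
| Foreign-language keywords: | long-range variation, inverse symmetry, segmental duplication, genome growth model, equivalent length |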
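Segmental duplication is regarded as the main driving force of genome evolution and growth, but the explicit relation between biological complexity and genome length remains unclear. After analyzing 865 complete chromosomes with the oligonucleotide frequency method, we conclude that whole-genome sequences share three universal features: equivalent length, inverse symmetry, and long-range correlation. Guided by these universal statistical properties, we propose a growth mechanism based on segmental duplication and build a model that reproduces the universal features of genomes. In the model, a randomly chosen segment is duplicated and the copy is added to the sequence by random insertion, inverse-string insertion, or adjacent (connected) insertion; single-point mutation is included to complete the growth mechanism. While searching for parameter values that match the universal properties of genomes, we found that the properties of the model sequences show no clear dependence on the length of the duplicated segments but depend more strongly on point mutation. With suitably chosen parameters, the model sequences have the same statistical properties as whole-genome sequences, which supports the view that the growth mechanism of the model may be the main biochemical mechanism of genome growth.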
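To make the oligonucleotide frequency analysis above more concrete, the following is a minimal sketch of counting k-letter words in a sequence and scoring inverse symmetry. It assumes that inverse symmetry means a word and its reverse complement occur about equally often; the function names and the normalized score `inverse_asymmetry` are illustrative assumptions and are not the measures defined in the thesis.

```python
from collections import Counter

# Translation table for complementing the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping word of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def inverse_asymmetry(seq: str, k: int) -> float:
    """Hypothetical score in [0, 1]: 0 when every k-mer is exactly as
    frequent as its reverse complement, 1 when no k-mer's reverse
    complement occurs at all. This is an assumed illustration, not the
    thesis's own definition."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Include words that are absent but whose reverse complement is present.
    words = set(counts) | {reverse_complement(w) for w in counts}
    diff = sum(abs(counts[w] - counts[reverse_complement(w)]) for w in words)
    return diff / (2 * total)
```

For example, `inverse_asymmetry("AAAA", 2)` gives the maximal score because `TT` never occurs, while a sequence followed by its own reverse complement, such as `"AAAATTTT"`, scores zero for `k = 2`.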
Segmental duplication has long been considered an important driving force in genome growth and evolution, but a quantitative description of the duplication process and its relation to the complexity of genome structure has been lacking. We use word frequency to analyze complete genomes, and we use three non-trivial universal statistical properties of genomes (equivalent length, inverse symmetry, and long-range variation) as clues for specifying the nature of the segmental duplication process. We use a minimal genome growth model based on random segmental duplication (RSD) to generate genome-length sequences and compare their statistical properties with those of real genomes. With a few biologically meaningful universal parameters, the RSD model describes well most of the prominent, non-trivial statistical properties of genomes, including the universality of their equivalent lengths and their patterns of long-range variation and inverse symmetry. Neutral and mostly random segmental duplication is a dominant characteristic of genome growth, with duplicated segments (DS) typically 500 to 5000 nucleotides long. About 70% of the duplication events are “tandem”, meaning the DS is placed proximal to its origin, and about 30% are inverse, meaning the DS is copied from one strand to the other. Occasionally a whole genome is inversely duplicated.
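The growth process described in the abstract can be sketched as a short simulation. This is only an illustration under stated assumptions, not the thesis's implementation: the function `grow_rsd`, its parameter names, the random seed sequence, and the choice to treat the tandem and inverse fractions as independent probabilities are all hypothetical. Point mutation, which the abstract names as part of the model, is applied here as a fixed number of substitutions per duplication event, and whole-genome inverse duplication is left out.

```python
import random

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def grow_rsd(target_len, seed_len=200, seg_min=500, seg_max=5000,
             p_tandem=0.7, p_inverse=0.3, mutations_per_event=1, seed=0):
    """Grow a sequence by random segmental duplication (illustrative sketch).

    All parameter names and defaults are assumptions; the defaults only
    echo the segment lengths and fractions quoted in the abstract.
    """
    rng = random.Random(seed)
    # Assumed starting point: a short uniformly random sequence.
    genome = [rng.choice(BASES) for _ in range(seed_len)]
    while len(genome) < target_len:
        # Pick a random segment, capped at the current genome length.
        seg_len = min(rng.randint(seg_min, seg_max), len(genome))
        start = rng.randrange(len(genome) - seg_len + 1)
        segment = genome[start:start + seg_len]
        # Inverse duplication: copy onto the other strand.
        if rng.random() < p_inverse:
            segment = list(reverse_complement("".join(segment)))
        # Tandem duplication: insert right after the source segment;
        # otherwise insert at a uniformly random position.
        if rng.random() < p_tandem:
            pos = start + seg_len
        else:
            pos = rng.randrange(len(genome) + 1)
        genome[pos:pos] = segment
        # Point mutations: substitute a different base at random sites.
        for _ in range(mutations_per_event):
            i = rng.randrange(len(genome))
            genome[i] = rng.choice([b for b in BASES if b != genome[i]])
    # Truncate so the result has exactly target_len bases.
    return "".join(genome[:target_len])
```

A call such as `grow_rsd(20000, seed=1)` returns a 20000-base string over `ACGT`. Its word-frequency statistics could then be compared with those of a real chromosome, which is the kind of comparison the abstract describes.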