Comparison of Real Frequencies of Strings vs. the Expected Ones Reveals the Information Capacity of Macromoleculae
Michael G Sadovsky. J Biol Phys. 2003 Mar.
The information capacity of nucleotide sequences is defined through the specific entropy of their frequency dictionary. The specific entropy of the frequency dictionary is calculated against the reconstructed dictionary, which bears the most probable continuations of the shorter strings. This measure makes it possible to distinguish the sequences both from random ones and from those with a high level of (rather simple) order. Some implications of the developed methodology for genetics, bioinformatics, and molecular biology are discussed.
Keywords: Markov model; dictionary; entropy; information capacity; ordered sequence; random sequence; reconstructed dictionary; specific entropy.
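The abstract gives no formulas, so the following is only a minimal sketch of how such a measure might be computed, under stated assumptions. It assumes that the sequence is closed into a ring, that the reconstructed frequency of a string of length q is built from the two overlapping strings of length q-1 it contains (a Markov-type estimate: f(prefix) * f(suffix) / f(overlap)), and that the specific entropy is the sum of f * ln(f / f_reconstructed) over the observed strings. The function names `frequency_dictionary` and `specific_entropy` are hypothetical and do not come from the paper.

```python
from collections import Counter
from math import log


def frequency_dictionary(seq, q):
    """Relative frequencies of all strings of length q in seq.

    Assumption: the sequence is closed into a ring, so every position
    starts exactly one string and dictionaries of different lengths
    stay consistent with each other. Requires len(seq) >= q.
    """
    if q == 0:
        return {"": 1.0}  # the empty string occurs with frequency 1
    ring = seq + seq[:q - 1]
    n = len(seq)
    counts = Counter(ring[i:i + q] for i in range(n))
    return {word: count / n for word, count in counts.items()}


def specific_entropy(seq, q):
    """Entropy of the real length-q dictionary against the one
    reconstructed from the length-(q-1) dictionary (q >= 2).

    Assumed reconstruction: f(prefix) * f(suffix) / f(overlap).
    """
    real = frequency_dictionary(seq, q)
    shorter = frequency_dictionary(seq, q - 1)
    overlap = frequency_dictionary(seq, q - 2)
    total = 0.0
    for word, f in real.items():
        expected = shorter[word[:-1]] * shorter[word[1:]] / overlap[word[1:-1]]
        total += f * log(f / expected)
    return total


if __name__ == "__main__":
    periodic = "ACGT" * 50
    for q in (2, 3):
        print(q, specific_entropy(periodic, q))
```

Under these assumptions, the simply ordered sequence made of repeated "ACGT" gives ln 4 at q = 2 (pairs are far from what single-letter frequencies predict) and zero at q = 3 (triplets are fully predicted by pairs), which illustrates how a low value at larger q can signal simple order.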
Similar articles
Genes, information and sense: complexity and knowledge retrieval.
Sadovsky MG, Putintseva JA, Shchepanovsky AS. Theory Biosci. 2008 Jun;127(2):69-78. doi: 10.1007/s12064-008-0032-1. Epub 2008 Apr 29. PMID: 18443840. Review.
Information capacity of nucleotide sequences and its applications.
Sadovsky MG. Bull Math Biol. 2006 May;68(4):785-806. doi: 10.1007/s11538-005-9017-0. Epub 2006 Apr 7. PMID: 16802083.
The method to compare nucleotide sequences based on the minimum entropy principle.
Sadovsky MG. Bull Math Biol. 2003 Mar;65(2):309-22. doi: 10.1016/S0092-8240(02)00107-6. PMID: 12675334.
[Information capacity of the nucleotide sequences and their fragments].
Bugaenko NN, Gorban' AN, Sadovskiĭ MG. Biofizika. 1997 Sep-Oct;42(5):1047-53. PMID: 9410032. Russian.
Complexity and information in regular and random phyllotactic patterns.
Barabé D, Jeune B. Riv Biol. 2006 Jan-Apr;99(1):85-102. PMID: 16791792. Review.
Cited by
Self-organization and entropy reduction in a living cell.
Davies PC, Rieper E, Tuszynski JA. Biosystems. 2013 Jan;111(1):1-10. doi: 10.1016/j.biosystems.2012.10.005. Epub 2012 Nov 15. PMID: 23159919. Free PMC article.
Jun SR, Sims GE, Wu GA, Kim SH. Proc Natl Acad Sci U S A. 2010 Jan 5;107(1):133-8. doi: 10.1073/pnas.0913033107. Epub 2009 Dec 14. PMID: 20018669. Free PMC article.
Genes, information and sense: complexity and knowledge retrieval.
Sadovsky MG, Putintseva JA, Shchepanovsky AS. Theory Biosci. 2008 Jun;127(2):69-78. doi: 10.1007/s12064-008-0032-1. Epub 2008 Apr 29. PMID: 18443840. Review.
Whole-proteome phylogeny of large dsDNA virus families by an alignment-free method.
Wu GA, Jun SR, Sims GE, Kim SH. Proc Natl Acad Sci U S A. 2009 Aug 4;106(31):12826-31. doi: 10.1073/pnas.0905115106. Epub 2009 Jun 24. PMID: 19553209. Free PMC article.
Zhang Q, Jun SR, Leuze M, Ussery D, Nookaew I. Sci Rep. 2017 Jan 19;7:40712. doi: 10.1038/srep40712. PMID: 28102365. Free PMC article.