CAST: an iterative algorithm for the complexity analysis of sequence tracts. Complexity analysis of sequence tracts
V J Promponas et al. Bioinformatics. 2000 Oct.
Abstract
Motivation: Sensitive detection and masking of low-complexity regions in protein sequences. Filtered sequences can be used in sequence comparison without the risk of matching compositionally biased regions. The main advantage of the method over similar approaches is the selective masking of single residue types without affecting other, possibly important, regions.
Results: A novel algorithm for low-complexity region detection and selective masking. The algorithm is based on multiple-pass Smith-Waterman comparison of the query sequence against twenty homopolymers with infinite gap penalties. The output of the algorithm is both the masked query sequence for further analysis (e.g. database searches) and the regions of low complexity. The detection of low-complexity regions is highly specific for single residue types. It is shown that this approach is sufficient for masking database query sequences without generating false positives. The algorithm is benchmarked against widely available algorithms using the 210 genes of Plasmodium falciparum chromosome 2, a dataset known to contain a large number of low-complexity regions.
Availability: CAST (version 1.0) executable binaries are available to academic users free of charge under license. Web site entry point, server and additional material: http://www.ebi.ac.uk/research/cgg/services/cast/
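The multiple-pass scheme described under Results can be illustrated with a minimal Python sketch. It is not the CAST implementation. It relies on one observation: when gaps are forbidden (infinite gap penalties), the best local alignment of a query against a sufficiently long homopolymer is simply the highest-scoring contiguous segment of the query scored against that one residue type. Everything else is an assumption made for illustration: the toy match/mismatch scores (the abstract does not name the scoring matrix), the threshold value, the rule of masking with "X", the choice to process the highest-scoring residue type first in each pass, and the function names themselves.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the twenty homopolymer residue types


def score(a, b, match=2, mismatch=-1):
    """Toy substitution score (assumption; stands in for a real matrix)."""
    return match if a == b else mismatch


def best_segment(seq, residue):
    """Best gapless local alignment of seq against a homopolymer of `residue`.

    Returns (score, start, end) with `end` exclusive, or (0, 0, 0) if no
    segment scores above zero.
    """
    best = (0, 0, 0)
    run_score, run_start = 0, 0
    for i, ch in enumerate(seq):
        if run_score <= 0:  # local alignment: restart instead of going negative
            run_score, run_start = 0, i
        run_score += score(ch, residue)
        if run_score > best[0]:
            best = (run_score, run_start, i + 1)
    return best


def mask_low_complexity(seq, threshold=10):
    """Iteratively mask single-residue-type low-complexity regions.

    Each pass scores the current (partly masked) sequence against all twenty
    homopolymers, takes the top-scoring segment, and masks only the residues
    of the matching type inside it. Stops when no segment reaches `threshold`
    (a hypothetical value chosen for the toy scores above).
    """
    masked = list(seq)
    regions = []
    while True:
        seg_score, start, end, residue = max(
            (best_segment(masked, r) + (r,) for r in AMINO_ACIDS),
            key=lambda t: t[0],
        )
        if seg_score < threshold:
            break
        for i in range(start, end):
            if masked[i] == residue:  # selective masking: other residues stay
                masked[i] = "X"
        regions.append((residue, start, end, seg_score))
    return "".join(masked), regions


if __name__ == "__main__":
    print(mask_low_complexity("NNNNNKNNNNN"))
    # ('XXXXXKXXXXX', [('N', 0, 11, 19)])
```

In the example, the asparagine-rich segment is detected as a single region, yet the embedded lysine is left unmasked, which is the selective, single-residue-type behaviour the abstract describes. The loop terminates because any segment that reaches a positive threshold contains at least one residue of the matching type, so every pass masks at least one position.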