CGKB: an annotation knowledge base for cowpea (Vigna unguiculata L.) methylation filtered genomic genespace sequences - PubMed
- ️Mon Jan 01 2007
Xianfeng Chen et al. BMC Bioinformatics. 2007.
Abstract
Background: Cowpea [Vigna unguiculata (L.) Walp.] is one of the most important food and forage legumes in the semi-arid tropics because of its ability to tolerate drought and grow on poor soils. It is cultivated mostly by poor farmers in developing countries, with 80% of production taking place in the dry savannah of tropical West and Central Africa. Cowpea is largely an underexploited crop with relatively little genomic information available for use in applied plant breeding. The goal of the Cowpea Genomics Initiative (CGI), funded by the Kirkhouse Trust, a UK-based charitable organization, is to leverage modern molecular genetic tools for gene discovery and cowpea improvement. One aspect of the initiative is to sequence the gene-rich region of the cowpea genome (termed the genespace), recovered using methylation filtration technology, and to provide annotation and analysis of the sequence data.
Description: CGKB (Cowpea Genespace/Genomics Knowledge Base) is an annotation knowledge base developed under the CGI. The database is based on information derived from 298,848 cowpea genespace sequences (GSS) isolated by methylation filtering of genomic DNA. The CGKB consists of three knowledge bases: a GSS annotation and comparative genomics knowledge base, a GSS enzyme and metabolic pathway knowledge base, and a GSS simple sequence repeats (SSRs) knowledge base for molecular marker discovery. A homology-based approach was applied for annotation of the GSS, mainly using BLASTX against four public FASTA-formatted protein databases (NCBI GenBank Proteins, UniProtKB-Swiss-Prot, UniProtKB-PIR (Protein Information Resource), and UniProtKB-TrEMBL). Comparative genome analysis was done by BLASTX searches of the cowpea GSS against four plant proteomes from Arabidopsis thaliana, Oryza sativa, Medicago truncatula, and Populus trichocarpa. The possible exons and introns on each cowpea GSS were predicted using the HMM-based Genscan gene prediction program, and the potential domains on annotated GSS were analyzed using the HMMER package against the Pfam database. The annotated GSS were also assigned Gene Ontology annotation terms and integrated with 228 curated plant metabolic pathways from the Arabidopsis Information Resource (TAIR) knowledge base. The UniProtKB-Swiss-Prot ENZYME database was used to assign putative enzymatic function to each GSS. Each GSS was also analyzed with the Tandem Repeats Finder (TRF) program in order to identify potential SSRs for molecular marker discovery. The raw sequence data, processed annotation, and SSR results were stored in relational tables designed in key-value pair fashion using a PostgreSQL relational database management system. The biological knowledge derived from the sequence data and processed results is represented as views or materialized views in the relational database management system. All materialized views are indexed for quick data access and retrieval. Data processing and analysis pipelines were implemented using the Perl programming language. The web interface was implemented in JavaScript and Perl CGI running on an Apache web server. The CPU-intensive data processing and analysis pipelines were run on a computer cluster of more than 30 dual-processor Apple XServes. A job management system called Vela was created as a robust way to submit large numbers of jobs to the Portable Batch System (PBS).
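The storage design described above (key-value annotation tables, with views built on top and indexed for retrieval) can be illustrated with a minimal sketch. It is not the CGKB schema: the table, column, and view names and the sample rows are invented for illustration, and Python's built-in sqlite3 module stands in for PostgreSQL. SQLite has no materialized views, so the sketch approximates one by copying the view's result into an indexed table.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")

# Hypothetical key-value table: one row per (sequence, attribute) pair.
conn.executescript("""
CREATE TABLE gss_annotation (
    gss_id     TEXT NOT NULL,
    attr_key   TEXT NOT NULL,
    attr_value TEXT NOT NULL
);
CREATE INDEX idx_gss_annotation ON gss_annotation (gss_id, attr_key);
""")

# Invented sample rows; these are not real CGKB annotations.
rows = [
    ("GSS_0001", "blastx_best_hit", "hypothetical protein A"),
    ("GSS_0001", "go_term", "GO:0003677"),
    ("GSS_0001", "pfam_domain", "PF00847"),
    ("GSS_0002", "blastx_best_hit", "hypothetical protein B"),
    ("GSS_0002", "ssr_motif", "AT"),
]
conn.executemany("INSERT INTO gss_annotation VALUES (?, ?, ?)", rows)

# A view pivots the key-value rows into one summary row per sequence.
conn.execute("""
CREATE VIEW gss_summary AS
SELECT gss_id,
       MAX(CASE WHEN attr_key = 'blastx_best_hit' THEN attr_value END) AS best_hit,
       MAX(CASE WHEN attr_key = 'go_term'         THEN attr_value END) AS go_term,
       MAX(CASE WHEN attr_key = 'pfam_domain'     THEN attr_value END) AS pfam_domain,
       MAX(CASE WHEN attr_key = 'ssr_motif'       THEN attr_value END) AS ssr_motif
FROM gss_annotation
GROUP BY gss_id
""")

# Approximate a materialized view: store the view's result and index it.
conn.execute("CREATE TABLE gss_summary_mat AS SELECT * FROM gss_summary")
conn.execute("CREATE INDEX idx_gss_summary_mat ON gss_summary_mat (gss_id)")

summary = conn.execute(
    "SELECT * FROM gss_summary_mat ORDER BY gss_id"
).fetchall()
for record in summary:
    print(record)
```

The key-value layout lets new annotation types be added as rows rather than as schema changes, while the pivoted, precomputed summary keeps per-sequence lookups cheap. In PostgreSQL the stored copy would need to be refreshed whenever the underlying annotation rows change.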
Conclusion: CGKB is an integrated and annotated resource for cowpea GSS with features of homology-based and HMM-based annotations, enzyme and pathway annotations, GO term annotation, toolkits, and a large number of other facilities to perform complex queries. The cowpea GSS, chloroplast sequences, mitochondrial sequences, retroelements, and SSR sequences are available as FASTA-formatted files that can be downloaded from CGKB. This database and web interface are publicly accessible at http://cowpeagenomics.med.virginia.edu/CGKB/.
Figures

Screenshot of the cowpea genespace knowledge base.

The relational database architecture of the data management system for cowpea methylation filtered genomic genespace sequences.

Snapshots of annotation database based on cowpea methylation filtered genomic genespace sequences.

Snapshots of the metabolic pathway database based on cowpea methylation filtered genomic genespace sequences.