The Pfam protein families database
Nucleic Acids Res. 2010 Jan;38(Database issue):D211-22.
doi: 10.1093/nar/gkp985. Epub 2009 Nov 17.
Robert D Finn, Jaina Mistry, John Tate, Penny Coggill, Andreas Heger, Joanne E Pollington, O Luke Gavin, Prasad Gunasekaran, Goran Ceric, Kristoffer Forslund, Liisa Holm, Erik L L Sonnhammer, Sean R Eddy, Alex Bateman
- PMID: 19920124
- PMCID: PMC2808889
- DOI: 10.1093/nar/gkp985
Robert D Finn et al. Nucleic Acids Res. 2010 Jan.
Abstract
Pfam is a widely used database of protein families and domains. This article describes a set of major updates that we have implemented in the latest release (version 24.0). The most important change is that we now use HMMER3, the latest version of the popular profile hidden Markov model package. This software is approximately 100 times faster than HMMER2 and is more sensitive due to the routine use of the forward algorithm. The move to HMMER3 has necessitated numerous changes to Pfam that are described in detail. Pfam release 24.0 contains 11,912 families, of which a large number have been significantly updated during the past two years. Pfam is available via servers in the UK (http://pfam.sanger.ac.uk/), the USA (http://pfam.janelia.org/) and Sweden (http://pfam.sbc.su.se/).
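The sensitivity gain attributed to the forward algorithm can be illustrated with a toy example. The sketch below is not HMMER3 code and does not use a profile HMM: it assumes a hypothetical two-state model with made-up probabilities (`TOY_STATES`, `TOY_START`, `TOY_TRANS`, `TOY_EMIT`) and works in plain probability space. It only shows the general idea that the forward score sums over every state path, whereas a Viterbi score keeps the single best path.

```python
from itertools import product

# Hypothetical toy model: "M" emits mostly "A", "B" is a uniform background.
TOY_STATES = ("M", "B")
TOY_START = {"M": 0.5, "B": 0.5}
TOY_TRANS = {"M": {"M": 0.8, "B": 0.2}, "B": {"M": 0.3, "B": 0.7}}
TOY_EMIT = {"M": {"A": 0.9, "G": 0.1}, "B": {"A": 0.5, "G": 0.5}}


def forward_and_viterbi(obs, states, start, trans, emit):
    """Return (forward_prob, viterbi_prob) for a non-empty observation sequence.

    forward_prob sums the probability of obs over all state paths;
    viterbi_prob is the probability of the single most likely path.
    """
    if len(obs) == 0:
        raise ValueError("observation sequence must not be empty")
    # Initialise both recursions with the first symbol.
    fwd = {s: start[s] * emit[s][obs[0]] for s in states}
    vit = dict(fwd)
    for symbol in obs[1:]:
        # Forward: sum over predecessor states.
        fwd = {
            s: emit[s][symbol] * sum(fwd[p] * trans[p][s] for p in states)
            for s in states
        }
        # Viterbi: keep only the best predecessor state.
        vit = {
            s: emit[s][symbol] * max(vit[p] * trans[p][s] for p in states)
            for s in states
        }
    return sum(fwd.values()), max(vit.values())


if __name__ == "__main__":
    f, v = forward_and_viterbi("AAGA", TOY_STATES, TOY_START, TOY_TRANS, TOY_EMIT)
    print(f"forward={f:.6f} viterbi={v:.6f}")
```

Because the forward value includes the best path plus all alternatives, it is never smaller than the Viterbi value for the same sequence. Real implementations typically work in log or scaled space to avoid underflow on long sequences.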
Figures

Sequence search results page. Results page for a single-sequence search. At the top is a graphic of the domains matched along the length of the query sequence, with any active-site or metal-binding residues marked where present. Below it are listed, in order, the significant matches to Pfam-A families, the insignificant matches to Pfam-A families and the significant matches to Pfam-B families. At the bottom are the expanded match results: in the #HMM line, residues identical to those in the query are coloured cyan and similar residues dark blue, and a #PP (posterior probability) line gives the posterior probability at each position, with the #SEQ (query) line colour-coded accordingly.

New Pfam display of a protein domain architecture. Pfam-A families of type ‘family’ and ‘domain’ are represented by a lozenge shape, and families of type ‘repeat’ or ‘motif’ by rectangles. The alignment co-ordinates are depicted in a solid colour, and the envelope co-ordinates in a lighter shade of the same colour. Where the profile HMM match for a domain or family is only of partial length, the curved end of the lozenge/rectangle is replaced by a jagged edge. Active-site residues are marked with a lollipop with a diamond-shaped head. An example tooltip showing the domain description, co-ordinates and source is shown for the fourth domain. Note the overlapping envelopes between the fourth and fifth domains.

New alignment confidence display. The colour of each residue reflects the alignment uncertainty and is based on the posterior probability calculated by HMMER3. A green residue indicates a high posterior probability, meaning that the alignment of the amino acid to the match/insert state in the profile HMM is very likely to be correct. Where the posterior probability is lower, and the alignment is therefore less certain, the colour moves closer to red. This allows users to quickly identify regions of the alignment where some sequences are aligned with less certainty.
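The colour scheme described in this caption could be sketched as follows. The actual palette and interpolation used by the Pfam website are not given here, so this is only an assumption: a linear blend from pure red (posterior probability 0) to pure green (posterior probability 1), with a hypothetical helper name `posterior_to_hex`.

```python
def posterior_to_hex(p):
    """Map a posterior probability in [0, 1] to an RGB hex colour.

    Assumed scheme (not taken from Pfam): linear blend from red at
    p = 0 to green at p = 1, with the blue channel fixed at 0.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("posterior probability must lie in [0, 1]")
    red = round(255 * (1.0 - p))
    green = round(255 * p)
    return f"#{red:02x}{green:02x}00"


if __name__ == "__main__":
    # Colour a short, made-up run of per-residue posterior probabilities.
    for residue, p in zip("MKVL", (1.0, 0.9, 0.4, 0.0)):
        print(residue, p, posterior_to_hex(p))
```

A confidently aligned residue would then render in green and a poorly supported one in red, with intermediate probabilities shading between the two.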

New BioLit/TOPSAN views. Left: using the webservices provided by BioLit, we display the abstract, figures and figure legends from the publication associated with a particular PDB entry (only where articles are published in open access journals). In this case, we have retrieved open access articles that reference the PDB entry 1dan. Right: using the webservices provided by TOPSAN, we display images and text from the TOPSAN wiki, and a link so that users can contribute to the TOPSAN wiki. In this example, we show the information contained in TOPSAN describing PDB entry 1kq3.