A blind deconvolution approach to high-resolution mapping of transcription factor binding sites from ChIP-seq data

Mon Oct 12 2009

Desmond S Lun et al. Genome Biol. 2009.

Abstract

We present CSDeconv, a computational method that determines locations of transcription factor binding from ChIP-seq data. CSDeconv differs from prior methods in that it uses a blind deconvolution approach that allows closely spaced binding sites to be called accurately. We apply CSDeconv to novel ChIP-seq data for DosR binding in Mycobacterium tuberculosis and to existing data for GABP in humans and show that it can discriminate binding sites separated by as few as 40 bp.

Figures

Figure 1

Overview of CSDeconv. After an initial stage in which enriched regions are identified and probability density functions associated with ChIP and control read locations are derived, we obtain enrichment profiles that describe the enrichment level throughout each enriched site for both forward and reverse reads. From the enrichment profiles, an initial estimate is made of the shape of an enrichment peak. The peak shape is used to deconvolve the enrichment profiles, deriving binding-site locations and magnitudes, which are then used to reestimate the peak shape, and this iterative cycle is repeated until convergence.
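The iterative cycle described in this caption can be illustrated with a toy alternating scheme. This is a minimal sketch and not the CSDeconv implementation: it assumes a single noiseless 1D profile, uses NMF-style multiplicative least-squares updates as a stand-in for whatever estimator the authors use, and every name in it (`conv_matrix`, `blind_deconvolve`, the Gaussian test shape) is hypothetical.

```python
import numpy as np

def conv_matrix(kernel, n):
    """Return C such that C @ v == np.convolve(v, kernel) when len(v) == n."""
    k = len(kernel)
    C = np.zeros((n + k - 1, n))
    for j in range(n):
        C[j:j + k, j] = kernel
    return C

def blind_deconvolve(profile, n_sites, shape_init, n_iter=200, eps=1e-12):
    """Alternate between updating site magnitudes x and peak shape h.

    Assumes len(profile) == n_sites + len(shape_init) - 1 and nonnegative data.
    """
    x = np.full(n_sites, profile.sum() / n_sites)  # flat initial magnitudes
    h = shape_init / shape_init.sum()              # initial peak-shape estimate
    for _ in range(n_iter):
        # Step 1: with the shape fixed, update binding-site magnitudes.
        H = conv_matrix(h, n_sites)
        x *= (H.T @ profile) / (H.T @ H @ x + eps)
        # Step 2: with the magnitudes fixed, re-estimate the peak shape.
        X = conv_matrix(x, len(h))
        h *= (X.T @ profile) / (X.T @ X @ h + eps)
        # Fix the scale ambiguity: keep the shape normalized to unit sum.
        s = h.sum()
        h, x = h / s, x * s
    return x, h

# Synthetic example (made-up data): two sites 6 positions apart.
offsets = np.arange(-15, 16)
h_true = np.exp(-0.5 * (offsets / 4.0) ** 2)
h_true /= h_true.sum()
x_true = np.zeros(60)
x_true[20], x_true[26] = 5.0, 3.0
profile = np.convolve(x_true, h_true)

h_init = np.ones(31)
x0 = np.full(60, profile.sum() / 60)
initial_err = np.sum((np.convolve(x0, h_init / h_init.sum()) - profile) ** 2)

x_est, h_est = blind_deconvolve(profile, 60, h_init)
final_err = np.sum((np.convolve(x_est, h_est) - profile) ** 2)
```

Blind deconvolution is ambiguous without further constraints, so this sketch only shows the alternation reducing the fit error; it makes no claim about recovering the true site positions.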

Figure 2

Empirical FDR of CSDeconv. (a, b) The empirical FDR for enriched regions as a function of LLR threshold is shown for the (a) DosR and (b) GABP datasets. (c, d) With the LLR threshold fixed, the empirical FDR for binding sites as a function of the regularization factor α is shown for the (c) DosR dataset (LLR threshold at 18.75) and the (d) GABP dataset (LLR threshold at 38.5).
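The caption does not say how the empirical FDR is computed. One common convention in ChIP-seq peak calling, assumed here purely for illustration, is to repeat the analysis with ChIP and control swapped and divide the number of swapped calls by the number of real calls at the same threshold. The function name and the LLR values below are made up.

```python
import numpy as np

def empirical_fdr(chip_llr, swapped_llr, threshold):
    """Ratio of calls in the swapped (control-as-ChIP) run to calls in the real run.

    This is an assumed definition, not one taken from the source.
    """
    n_chip = int(np.sum(np.asarray(chip_llr) >= threshold))
    n_swapped = int(np.sum(np.asarray(swapped_llr) >= threshold))
    return n_swapped / n_chip if n_chip else float("nan")

# Hypothetical LLR scores for enriched regions.
chip_llr = [50.0, 40.0, 30.0, 20.0, 10.0]
swapped_llr = [25.0, 5.0]

fdr_low = empirical_fdr(chip_llr, swapped_llr, 18.75)   # 1 swapped / 4 real
fdr_high = empirical_fdr(chip_llr, swapped_llr, 38.5)   # 0 swapped / 2 real
```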

Figure 3

Sequence logos of binding motifs. The sequence logo of the binding motifs found through CSDeconv analysis is shown for (a) DosR and (b) GABP.

Figure 4

Illustration of the results obtained by CSDeconv for DosR binding upstream of Rv2031c. (a) The forward and reverse enrichment profiles obtained after kernel density estimation of the read distributions are shown in black and shaded in gray. Colored lines display various fits arising from estimated binding. Note that no distinct peaks are evident in the enrichment profiles and, in particular, there are no dips. (b) Both forward and reverse reads are associated with fits: the forward fit 3 is the sum of the forward enrichment peaks 1 and 2, whereas the reverse fit 3' is the sum of the reverse enrichment peaks 1' and 2'. (c) The combined forward and reverse enrichment peaks arise from two binding sites, which are peaks 15 and 16 in Table 1. Motif logos overlay the actual sequence of the intergenic region, truncated for brevity, showing the two binding sites, which are separated by only 57 bp. Enrichment is plotted as the fold magnitude of the ChIP read density over the control read density.
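A small numeric sketch can show why a fit that is the sum of two enrichment peaks may have no dip. It assumes Gaussian peak shapes, which the source does not state, and the widths and magnitudes are made up; only the 57 bp separation comes from the caption.

```python
import numpy as np

def peak(positions, center, magnitude, width):
    """Hypothetical Gaussian-shaped enrichment peak (an assumption, not the real shape)."""
    return magnitude * np.exp(-0.5 * ((positions - center) / width) ** 2)

def count_local_maxima(y):
    """Count strict interior local maxima of a sampled curve."""
    interior = y[1:-1]
    return int(np.sum((interior > y[:-2]) & (interior > y[2:])))

positions = np.arange(-300, 301)  # bp relative to the first site

# Broad peaks: the summed fit has one mode and no dip between the sites.
broad_fit = peak(positions, 0.0, 1.0, 40.0) + peak(positions, 57.0, 0.8, 40.0)

# Narrow peaks at the same separation: the two sites are visibly distinct.
narrow_fit = peak(positions, 0.0, 1.0, 15.0) + peak(positions, 57.0, 0.8, 15.0)

n_broad = count_local_maxima(broad_fit)
n_narrow = count_local_maxima(narrow_fit)
```

With the broad shape the sum is unimodal even though two sites contribute, which is the situation where simply looking for separate maxima cannot resolve them.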

Figure 5

Comparison of CSDeconv, MACS, and SISSRs by motif analysis. The percentage of predicted binding sites with associated motifs within 50 bp is shown as a function of the number of predicted binding sites with CSDeconv, MACS, and SISSRs for (a) DosR and (b) GABP. For MACS and SISSRs, we take the predicted binding-site location to be the peak center.
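The metric plotted here, the percentage of the top-ranked predicted sites that have a motif within 50 bp, could be computed along the following lines. This is a sketch under stated assumptions: all positions lie on a single sequence, each prediction carries a score used for ranking, and the function name and example coordinates are hypothetical.

```python
import numpy as np

def motif_support_curve(site_positions, site_scores, motif_positions, max_dist=50):
    """For the top-k predictions (k = 1..N), percent with a motif within max_dist bp."""
    order = np.argsort(site_scores)[::-1]           # rank predictions, best first
    sites = np.asarray(site_positions)[order]
    motifs = np.sort(np.asarray(motif_positions))   # must be non-empty
    idx = np.searchsorted(motifs, sites)
    # Distance to the nearest motif on each side (indices clipped at the ends).
    left = np.abs(sites - motifs[np.clip(idx - 1, 0, len(motifs) - 1)])
    right = np.abs(motifs[np.clip(idx, 0, len(motifs) - 1)] - sites)
    has_motif = np.minimum(left, right) <= max_dist
    return 100.0 * np.cumsum(has_motif) / np.arange(1, len(sites) + 1)

# Made-up example: three predictions, two motif occurrences.
curve = motif_support_curve([100, 500, 900], [3.0, 1.0, 2.0], [120, 1000])
```

In this example only the top-scoring prediction (position 100) has a motif within 50 bp, so the curve falls from 100% to 50% and then to about 33%.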
