Wed Jan 01 2020

Genome Dashboards: Framework and Examples

Zilong Li et al. Biophys J. 2020.

Abstract

Genomics is a sequence-based informatics science and a three-dimensional-structure-based material science. However, in practice, most genomics researchers utilize sequence-based informatics approaches or three-dimensional-structure-based material science techniques, not both. This division is, at least in part, the result of historical developments rather than a fundamental necessity. The underlying computational tools, experimental techniques, and theoretical models were developed independently. The primary result presented here is a framework for the unification of informatics- and physics-based data associated with DNA, nucleosomes, and chromatin. The framework is based on the mathematical representation of geometrically exact rods and the generalization of DNA basepair step parameters. Data unification enables researchers to integrate computational, experimental, and theoretical approaches for the study of chromatin biology. The framework can be implemented using model-view-controller design principles, existing genome browsers, and existing molecular visualization tools. We developed a minimal, web-based genome dashboard, G-Dash-min, and applied it to two simple examples to demonstrate the usefulness of data unification and proof of concept. Genome dashboards developed using the framework and design principles presented here are extensible and customizable and are therefore more broadly applicable than the examples presented. We expect a number of purpose-specific genome dashboards to emerge as a novel means of investigating structure-function relationships for genomes that range from basepairs to entire chromosomes and for generating, validating, and testing mechanistic hypotheses.

Copyright © 2020 Biophysical Society. Published by Elsevier Inc. All rights reserved.

Figures

Figure 1

Unification is the process of merging data from different sources. Physical structures and informatics data are unified by mathematical representations of an oriented space curve in laboratory [r→(s), D(s)] and material [Γ→(s),Ω→(s)] reference frames. The conformation of a physical structure C(s) is associated with the laboratory frame, and informatics track data T(s) is associated with the material frame. Masks M(s) alter the material properties of DNA and may be expressed in either representation. Exchanging data between laboratory and material frames unifies the physical structure and informatics.
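
The exchange between frames described in this caption can be pictured as a step-by-step accumulation: given material-frame parameters for each step, the laboratory-frame position r and directors D follow by integration along the curve. The sketch below is a minimal illustration under stated assumptions and is not the G-Dash-min implementation. The function names, the rotation-vector form of Ω, the "translate, then rotate" update order, and the B-DNA-like values in the usage note (3.4 Å rise, 36° twist) are all assumptions; standard basepair step parameter conventions use a mid-step frame instead.

```python
import math

def rotation_from_vector(omega):
    """Rodrigues formula: 3x3 rotation matrix for a rotation vector (radians)."""
    angle = math.sqrt(sum(w * w for w in omega))
    if angle < 1e-12:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    x, y, z = (w / angle for w in omega)
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + t * x * x, t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, c + t * y * y, t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, c + t * z * z],
    ]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def material_to_lab(steps):
    """Accumulate laboratory-frame (r, D) pairs from material-frame steps.

    steps: list of (gamma, omega) pairs, both expressed in the local frame.
    The curve starts at the origin with D equal to the identity (assumption).
    """
    r = [0.0, 0.0, 0.0]
    D = rotation_from_vector((0.0, 0.0, 0.0))
    frames = [(r, D)]
    for gamma, omega in steps:
        shift = matvec(D, gamma)  # local translation expressed in the lab frame
        r = [r[i] + shift[i] for i in range(3)]
        D = matmul(D, rotation_from_vector(omega))  # rotate about local axes
        frames.append((r, D))
    return frames
```

With ten identical steps of gamma = (0, 0, 3.4) and omega = (0, 0, radians(36)), the final position lies 34 Å along z and the directors return to their starting orientation, as expected for one full helical turn under these assumed values.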

Figure 2

Model-view-controller (MVC) design. Model: laboratory frame [r→(s), D(s)] and material frame [Γ→(s),Ω→(s)] descriptions of DNA as the common thread, an inventory of masks M(si, si + ni), and procedures for converting between representations. View: a molecular viewer (MV) displays C(s), a genome browser (GB) displays T(s), and a control panel (CP) provides a graphical interface to the controller. G-Dash-min uses JSmol and Biodalliance for the MV and GB components, respectively. An off-the-shelf (OTS) approach enables a genome dashboard to use any desired MVs and GBs. Controller: manages the exchange of data between the model and the views.
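
The division of labor in this caption can be sketched in a few lines. The example below is a hypothetical illustration, not G-Dash-min code: the class names, the `render` method, and the (start, length, label) mask tuple are all assumptions, and the stand-in view only records what it is given rather than driving JSmol or Biodalliance.

```python
class Model:
    """Holds the common thread (DNA length) and an inventory of masks."""

    def __init__(self, n_basepairs):
        self.n_basepairs = n_basepairs
        self.masks = []  # each mask: (start, length, label)

    def add_mask(self, start, length, label):
        if start < 0 or length <= 0 or start + length > self.n_basepairs:
            raise ValueError("mask must lie within the DNA")
        self.masks.append((start, length, label))

    def track(self):
        """Masks as half-open (start, end, label) intervals for a genome browser."""
        return [(s, s + n, label) for s, n, label in self.masks]

    def occupancy(self):
        """Per-basepair labels that a molecular viewer could use for styling."""
        labels = [None] * self.n_basepairs
        for s, n, label in self.masks:
            for i in range(s, s + n):
                labels[i] = label
        return labels


class RecordingView:
    """Stand-in for a real view; it only remembers what it was asked to show."""

    def __init__(self):
        self.shown = None

    def render(self, data):
        self.shown = data


class Controller:
    """Applies user actions to the model and pushes fresh data to both views."""

    def __init__(self, model, molecular_viewer, genome_browser):
        self.model = model
        self.molecular_viewer = molecular_viewer
        self.genome_browser = genome_browser

    def add_mask(self, start, length, label):
        self.model.add_mask(start, length, label)
        self.refresh()

    def refresh(self):
        self.molecular_viewer.render(self.model.occupancy())
        self.genome_browser.render(self.model.track())
```

Because the views depend only on a `render` method in this sketch, either one could be swapped for a different viewer or browser without touching the model, which is the point of the off-the-shelf approach.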

Figure 3

Colored boxes: C(s) and T(s) representations of two allowed states indicated by red and blue boxes, respectively. Upper boxes are T(s) representations of nucleosome positions (blue bars) and an ERE (red bar). Lower boxes are C(s) representations (small beads represent five basepairs; large beads represent histone octamers). Colored ellipses are the corresponding all-atom structures with the estrogen receptor DNA-binding domain docked to the DNA as in PDB: 1HCQ. (a) The ERE is located within a nucleosome, with the major groove facing inward. The receptor is prohibited from binding. (b) The ERE is located in a nucleosome-free region. Docking PDB: 1HCQ indicates that the ERE is physically accessible.
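
At the level of track data T(s), the distinction between panels (a) and (b) reduces to an interval-overlap test between the ERE and the nucleosome positions. The sketch below is a simplified illustration under assumptions: the function name, the half-open coordinate convention, and the 147-bp nucleosome footprint are not taken from the source. It checks translational overlap only and does not capture the rotational setting (major groove facing inward or outward) that the docked structure reveals.

```python
def site_is_nucleosome_free(site_start, site_end, nucleosome_starts, footprint=147):
    """Return True if [site_start, site_end) overlaps no nucleosome footprint.

    nucleosome_starts: first basepair of each nucleosome (assumed convention).
    footprint: basepairs wrapped per nucleosome (assumed to be 147).
    """
    for start in nucleosome_starts:
        # Two half-open intervals overlap when each starts before the other ends.
        if site_start < start + footprint and start < site_end:
            return False
    return True
```

For example, a 15-bp site at positions 100 to 115 is reported as occluded by a nucleosome starting at position 50, and as free when the nearest nucleosomes start at 200 and 400.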

Figure 4

(a) HOXC coarse-grained model of chromatin containing ∼55,000 basepairs of DNA and 284 nucleosomes. Uploading the HOXC model to G-Dash-min generates (b) a two-angle representation of the HOXC model, in which the color bar indicates the nucleosome index, from red to blue; (c) a distance-distance matrix based on nucleosome centers of mass, in which the color bar indicates the distance between nucleosomes (darker is closer, and black marks distances of less than 10 nm); and (d) structural informatics data. “Generalized Helical Parameter” (“Twist” and “Rise”) and nucleosome position (“Nucleosomes”) data are displayed alongside experimentally determined nucleosome positions (“Nuc-Pos”) and other informatics data (“Gencode”).
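
The distance-distance matrix in panel (c) can be sketched as pairwise Euclidean distances between nucleosome centers, with a cutoff picking out close pairs. This is a minimal illustration rather than the G-Dash-min code: the function names and the toy coordinates in the usage note are assumptions, and only the 10 nm threshold comes from the caption.

```python
import math

def distance_matrix(centers):
    """Pairwise Euclidean distances between nucleosome centers (same units as input)."""
    return [[math.dist(a, b) for b in centers] for a in centers]

def close_pairs(matrix, cutoff=10.0):
    """Index pairs (i, j), i < j, whose distance is below the cutoff (nm assumed)."""
    n = len(matrix)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if matrix[i][j] < cutoff]
```

With hypothetical centers at (0, 0, 0), (3, 4, 0), and (30, 0, 0) nm, the first two nucleosomes are 5 nm apart and form the only pair below the cutoff.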
