DNase-seq; ENCODE; footprinting; gene regulation; motifs; psychiatric genetics; transcription factors
Characterizing the tissue-specific binding sites of transcription factors (TFs) is essential for reconstructing gene regulatory networks and predicting the functions of non-coding genetic variation. DNase-seq footprinting enables the prediction of genome-wide binding sites for hundreds of TFs simultaneously. Despite the public availability of high-quality DNase-seq data from hundreds of samples, a comprehensive, up-to-date resource for the locations of genomic footprints is lacking. Here, we develop a scalable footprinting workflow using two state-of-the-art algorithms: Wellington and HINT. We apply our workflow to detect footprints in 192 ENCODE DNase-seq experiments and predict the genomic occupancy of 1,515 human TFs in 27 human tissues. We validate that these footprints overlap true-positive TF binding sites from ChIP-seq. We demonstrate that the locations, depth, and tissue specificity of footprints predict effects of genetic variants on gene expression and capture a substantial proportion of genetic risk for complex traits.
Institute for Systems Biology
Funk, Cory C; Casella, Alex M; Jung, Segun; Richards, Matthew A; Rodriguez, Alex; Shannon, Paul; Donovan-Maiye, Rory; Heavner, Ben; Chard, Kyle; Xiao, Yukai; Glusman, Gustavo; Ertekin-Taner, Nilufer; Golde, Todd E; Toga, Arthur; Hood, Leroy; Van Horn, John D; Kesselman, Carl; Foster, Ian; Madduri, Ravi; Price, Nathan D; and Ament, Seth A, "Atlas of Transcription Factor Binding Sites from ENCODE DNase Hypersensitivity Data across 27 Tissue Types." (2020). Articles, Abstracts, and Reports. 3499.