In:
Bioinformatics, Oxford University Press (OUP), Vol. 33, No. 15 (2017-08-01), p. 2258-2265
Abstract:
Chromatin immunoprecipitation followed by deep sequencing (ChIP-Seq) is a widely used approach to study protein–DNA interactions. Often, the quantities of interest are the differential occupancies relative to controls, between genetic backgrounds, treatments, or combinations thereof. Current methods for differential occupancy of ChIP-Seq data, however, rely on binning or sliding window techniques, for which the choice of window and bin sizes is subjective. Results: Here, we present GenoGAM (Genome-wide Generalized Additive Model), which brings the well-established and flexible generalized additive models framework to genomic applications using a data parallelism strategy. We model ChIP-Seq read count frequencies as products of smooth functions along chromosomes. Smoothing parameters are objectively estimated from the data by cross-validation, eliminating the ad hoc binning and windowing needed by current approaches. GenoGAM provides base-level and region-level significance testing for full factorial designs. Application to a ChIP-Seq dataset in yeast showed increased sensitivity over existing differential occupancy methods while controlling the type I error rate. By analyzing a set of DNA methylation data and illustrating an extension to a peak caller, we further demonstrate the potential of GenoGAM as a generic statistical modeling tool for genome-wide assays. Availability and Implementation: Software is available from Bioconductor: https://www.bioconductor.org/packages/release/bioc/html/GenoGAM.html. Supplementary information: Supplementary information is available at Bioinformatics online.
Type of Medium:
Online Resource
ISSN:
1367-4803, 1367-4811
DOI:
10.1093/bioinformatics/btx150
Language:
English
Publisher:
Oxford University Press (OUP)
Publication Date:
2017
ZDB-ID:
1468345-3
SSG:
12