ChIP-PaM: an algorithm to identify protein-DNA interaction using ChIP-Seq data



Open Access

RESEARCH

Song Wu*1, Jianmin Wang2, Wei Zhao1, Stanley Pounds1 and Cheng Cheng1

* Correspondence: [email protected]
1 Department of Biostatistics, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Memphis, TN 38105, USA
Full list of author information is available at the end of the article

Abstract

Background: ChIP-Seq is a powerful tool for identifying the interactions between genomic regulators and the DNA they bind, especially for locating transcription factor binding sites. However, its high cost and the high false discovery rate among transcription factor binding sites identified from ChIP-Seq data significantly limit its application.

Results: Here we report a new algorithm, ChIP-PaM, for identifying transcription factor target regions in ChIP-Seq datasets. This algorithm makes full use of the protein-DNA binding pattern by capitalizing on three lines of evidence: 1) tag count modelling at the peak position, 2) pattern matching of a specific tag count distribution, and 3) motif searching along the genome. A novel data-based two-step eFDR procedure is proposed to integrate the three lines of evidence and determine significantly enriched regions. Our algorithm requires no technical controls and efficiently discriminates falsely enriched regions from regions enriched by true transcription factor (TF) binding on the basis of ChIP-Seq data only. An analysis of real genomic data is presented to demonstrate our method.

Conclusions: In a comparison with other existing methods, we found that our algorithm provides more accurate binding site discovery while maintaining comparable statistical power.

Background

Understanding the mechanisms of transcriptional regulation is of fundamental importance to the study of biological processes such as development, drug response and disease pathogenesis [1]. The differentiation and function of cells are tightly controlled through the modulation of gene expression patterns. Switching the expression of specific genes on and off is one of the main modulating mechanisms, and it operates mainly through the association and dissociation of transcription factors (TFs) with their target gene promoters. Therefore, revealing the mechanism by which transcription factors regulate their target genes is essential to understanding many important biological processes.

Several methods have been developed to identify TF-target gene interactions and to investigate how and why cells respond to different signals. One such method, chromatin immunoprecipitation (ChIP) on a chip (ChIP-chip), is based on a tiling-array platform in which genomic DNA oligomers from gene promoters are pre-fixed. The DNA fragments immunoprecipitated from cell lysate by a TF antibody hybridize with the ChIP-chip array, and TF-binding regions are identified by their high-intensity signals. Like all other array-based methods, however, this method can detect only targets included on the array.

© 2010 Wu et al; licensee BioMed Central Ltd. This