Attentive gated neural networks for identifying chromatin accessibility

ORIGINAL ARTICLE

Yanbu Guo1 • Dongming Zhou1 • Weihua Li1 • Rencan Nie1,2 • Ruichao Hou1,3 • Chengli Zhou1,4

Received: 20 August 2019 / Accepted: 20 March 2020
© Springer-Verlag London Ltd., part of Springer Nature 2020

Abstract
Accessible chromatin is strongly associated with active gene regulatory regions. Enhancers and promoters commonly occur in accessible chromatin, so systematically discovering functional sites at the whole-genome level is indispensable. However, biological experiments are expensive and time-consuming, and current computational methods cannot fully learn the hidden key regulatory patterns of genomic contexts. Moreover, feature encoding methods for genetic sequences often ignore position information within sequences, although accurately identifying accessible regions depends heavily on capturing more informative sequence features. To address these issues, we first encode DNA sequences using position embeddings, which are produced by integrating the position information of the original sequences into embedding vectors. We then propose a novel deep learning framework, called attentive gated neural networks (AGNet), to automatically extract complex patterns for predicting chromatin accessibility from DNA sequences. Specifically, we combine gated neural networks (GNNs) with dual attention to extract multiple patterns and long-term associations from DNA sequences alone. Experimental results on five cell-type datasets show that AGNet outperforms published methods for accessibility prediction. Furthermore, the results not only reveal that AGNet can learn more of the regulatory patterns that underlie DNA sequences, but also validate the significance of position embeddings for accessibility prediction.

Keywords Gated convolutional networks • Gated recurrent units • Attention mechanism • Chromatin accessibility
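The idea of integrating position information into embedding vectors can be illustrated with a small sketch. This section does not specify how AGNet builds its position embeddings, so everything below is an assumption made for illustration: the per-nucleotide lookup table, the embedding size, the fixed sinusoidal position code (a technique borrowed from Transformer models, not confirmed as the authors' choice), and the hypothetical names `make_token_table`, `sinusoidal_position` and `embed_sequence`.

```python
import math
import random

NUCLEOTIDES = "ACGT"  # assumption: sequences contain only these four symbols


def sinusoidal_position(pos, dim):
    """Return a fixed sinusoidal code for one position (assumed scheme)."""
    vec = []
    for i in range(dim):
        # Pairs of dimensions share one frequency: even index -> sin, odd -> cos.
        angle = pos / (10000 ** (2 * (i // 2) / dim))
        vec.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
    return vec


def make_token_table(dim, seed=0):
    """Build a random embedding vector per nucleotide (stand-in for learned weights)."""
    rng = random.Random(seed)
    return {nt: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for nt in NUCLEOTIDES}


def embed_sequence(seq, table, dim):
    """Add the position code to the nucleotide embedding at every position."""
    embedded = []
    for pos, nt in enumerate(seq.upper()):
        token_vec = table[nt]
        pos_vec = sinusoidal_position(pos, dim)
        embedded.append([t + p for t, p in zip(token_vec, pos_vec)])
    return embedded


table = make_token_table(dim=8)
emb = embed_sequence("ACGTAC", table, dim=8)
```

In this sketch the two `A` nucleotides at positions 0 and 4 receive different vectors, which is the property a position-aware encoding is meant to provide; a position-agnostic lookup would map both to the same vector.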

Corresponding author: Dongming Zhou [email protected]

1 School of Information Science and Engineering, Yunnan University, Kunming, China
2 School of Automation, Southeast University, Nanjing, China
3 Department of Computer Science and Technology, Nanjing University, Nanjing, China
4 Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, China

1 Introduction
Chromatin accessibility plays a significant role in epigenetic gene activation and silencing, and it also offers a window for studying the genome thoroughly. Specifically, open regions of chromatin allow regulatory molecules such as transcription factors and polymerases, part of the cellular machinery involved in gene expression, to bind, while closed regions block the activity of the transcriptional machinery. Furthermore, chromatin accessibility is highly correlated with DNA methylation and histone modifications such as methylation, acetylation and phosphorylation [1]. Studies [1–3] have shown that aberrant alterations of chromatin accessibility caus