Globally learning gene regulatory networks based on hidden atomic regulators from transcriptomic big data


METHODOLOGY ARTICLE

Open Access

Ming Shi1,2†, Sheng Tan3†, Xin-Ping Xie4†, Ao Li5, Wulin Yang6, Tao Zhu2* and Hong-Qiang Wang1,6*

Abstract

Background: Genes are regulated by various types of regulators, most of which are still unknown or unobserved. Current methods for reverse engineering gene regulatory networks (GRNs) often neglect these unknown regulators and infer regulatory relationships in a local, sub-optimal manner.

Results: This paper proposes a global GRN inference framework based on dictionary learning, named dlGRN. The method aims to learn atomic regulators (ARs) from gene expression data using a modified dictionary learning (DL) algorithm, which reflects the whole gene regulatory system, and predicts the regulation between a known regulator and a target gene by global regression. The modified DL algorithm fits the scale-free property of biological networks, enabling dlGRN to intrinsically distinguish direct from indirect regulations.

Conclusions: Extensive experiments on simulated and real-world data demonstrate the effectiveness and efficiency of dlGRN in reverse engineering GRNs. A novel predicted transcriptional regulation between the TF TFAP2C and the oncogene EGFR was experimentally verified in lung cancer cells. Furthermore, the real-data application reveals the prevalence of DNA methylation regulation in the gene regulatory system. dlGRN can serve as a standalone tool for GRN inference owing to its global nature and robustness.
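The dictionary-learning idea summarized in the abstract can be illustrated with a generic sketch. The code below is not the modified DL algorithm of dlGRN, which the abstract says is adapted to the scale-free property of biological networks. It is plain L1-penalized dictionary learning, alternating ISTA sparse coding with a least-squares dictionary update. The function names (`soft_threshold`, `sparse_code`, `learn_atoms`), the parameters (`lam`, `n_atoms`, `n_iter`), and the genes-by-samples orientation of the expression matrix `X` are assumptions made for illustration only.

```python
import numpy as np


def soft_threshold(z, t):
    """Element-wise soft-thresholding, the proximal operator of t * |z|."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def sparse_code(X, D, A, lam, n_steps=50):
    """Run ISTA steps on 0.5 * ||X - A D||_F^2 + lam * ||A||_1 with D fixed."""
    L = np.linalg.norm(D @ D.T, 2)  # Lipschitz constant of the gradient in A
    if L == 0.0:
        return A
    for _ in range(n_steps):
        grad = (A @ D - X) @ D.T
        A = soft_threshold(A - grad / L, lam / L)
    return A


def learn_atoms(X, n_atoms, lam=0.1, n_iter=30, seed=0):
    """Hypothetical helper: factor X (genes x samples) as A @ D.

    Rows of D play the role of hidden "atomic regulator" profiles across
    samples; rows of A are sparse per-gene loadings on those atoms.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = X.shape
    D = rng.standard_normal((n_atoms, n_samples))
    D /= np.linalg.norm(D, axis=1, keepdims=True)
    A = np.zeros((n_genes, n_atoms))
    for _ in range(n_iter):
        # Sparse coding step: update loadings with the atoms held fixed.
        A = sparse_code(X, D, A, lam)
        # Dictionary step: least-squares fit of the atoms to the loadings.
        D = np.linalg.lstsq(A, X, rcond=None)[0]
        # Renormalize atoms to unit length; rescale A so A @ D is unchanged.
        norms = np.linalg.norm(D, axis=1, keepdims=True)
        norms[norms < 1e-12] = 1.0  # leave unused (all-zero) atoms as they are
        D /= norms
        A = A * norms.T
    # Final coding pass so the returned A matches the returned D.
    A = sparse_code(X, D, A, lam)
    return A, D
```

Calling `learn_atoms(X, n_atoms=5)` on a genes-by-samples matrix returns the loadings `A` and the atom profiles `D`. How dlGRN then scores a known regulator against a target gene is not specified in the abstract, so that step is left out here.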

* Correspondence: [email protected]; [email protected]
† Ming Shi, Sheng Tan and Xin-Ping Xie are joint First Authors
2 Current Address: MOE Key Laboratory of Bioinformatics, Division of Bioinformatics and Center for Synthetic and Systems Biology, TNLIST, Department of Automation, Tsinghua University, Beijing 100084, China
1 MICB Laboratory, Institute of Intelligent Machines, Hefei Institutes of Physical Science, CAS, 350 Shushanghu Road, Hefei, Anhui 230031, P. R. China
Full list of author information is available at the end of the article

Background

Gene regulatory networks (GRNs) play fundamental and central roles in the response to endogenous or exogenous stimuli, maintaining the viability and plasticity of cells [1, 2]. Although it is acknowledged that aberrant gene networks can be a key driver of human diseases including cancer, little is known about the GRNs of cancer, which has largely impeded the development of cancer precision medicine [3–5]. In recent years, a deluge of omics big data has been generated and accumulated worldwide, which provides an unprecedented opportunity for reverse engineering GRNs in a cost-efficient way [6, 7]. Efficient computational models for inferring GRNs from these omics data are urgently needed, both in theory and in practice. Generally, several key issues need to be carefully dealt with in inferring GRNs [7]: 1) Highly complex and heterogeneous networking. Various types of regulations, e.g., transcriptional, methylation or miRNA regulati