Machine learning based refined differential gene expression analysis of pediatric sepsis
(2020) 13:122
TECHNICAL ADVANCE
Open Access
Mostafa Abbas1 and Yasser EL-Manzalawy1,2*
Abstract
Background: Differential expression (DE) analysis of transcriptomic data enables genome-wide analysis of gene expression changes associated with biological conditions of interest. Such analysis often yields a long list of genes that are differentially expressed between two or more groups. In general, the identified differentially expressed genes (DEGs) can be subjected to further downstream analysis to obtain more biological insight, such as determining enriched functional pathways or gene ontologies. Furthermore, DEGs are treated as candidate biomarkers, and a small set of DEGs might be identified as biomarkers using either biological knowledge or data-driven approaches.
Methods: In this work, we present a novel approach for identifying biomarkers from a list of DEGs by re-ranking them according to the Minimum Redundancy Maximum Relevance (MRMR) criterion using a repeated cross-validation feature selection procedure.
Results: Using gene expression profiles for 199 children with sepsis and septic shock, we identify 108 DEGs and propose a 10-gene signature for reliably predicting pediatric sepsis mortality with an estimated area under the ROC curve (AUC) score of 0.89.
Conclusions: Machine learning based refinement of DE analysis is a promising tool for prioritizing DEGs and discovering biomarkers from gene expression profiles. Moreover, our reported 10-gene signature for pediatric sepsis mortality may facilitate the development of reliable diagnostic and prognostic biomarkers for sepsis.
Keywords: Biomarker discovery, Differential expression analysis, Refined differential gene expression analysis, Feature selection
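The re-ranking idea summarized in the Methods paragraph can be illustrated with a minimal sketch. This is not the authors' implementation, and several choices here are assumptions made for illustration only: absolute Pearson correlation stands in for the relevance and redundancy measures (MRMR is often formulated with mutual information), the greedy score is the difference "relevance minus mean redundancy", and per-fold results are aggregated by simple selection frequency. The names mrmr_rank, repeated_cv_rerank, and make_toy_data are hypothetical, and the toy data is synthetic rather than gene expression data.

```python
import numpy as np


def mrmr_rank(X, y, k):
    """Greedy MRMR sketch: return indices of k features, best first."""
    p = X.shape[1]
    k = min(k, p)
    # Assumption: absolute Pearson correlation as a stand-in for mutual information.
    corr = np.abs(np.nan_to_num(np.corrcoef(np.column_stack([X, y]), rowvar=False)))
    relevance = corr[:p, p]    # feature vs. label
    redundancy = corr[:p, :p]  # feature vs. feature

    # Start from the most relevant feature, then repeatedly add the feature
    # with the best (relevance - mean redundancy to already selected) score.
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        remaining = [j for j in range(p) if j not in selected]
        scores = [relevance[j] - redundancy[j, selected].mean() for j in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected


def repeated_cv_rerank(X, y, k=10, n_folds=5, n_repeats=10, seed=0):
    """Re-rank features by how often MRMR selects them across repeated CV splits."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p, dtype=int)
    for _ in range(n_repeats):
        # Plain (unstratified) random folds; a real analysis may prefer stratification.
        folds = np.array_split(rng.permutation(n), n_folds)
        for held_out in folds:
            train = np.setdiff1d(np.arange(n), held_out)
            for j in mrmr_rank(X[train], y[train], k):
                counts[j] += 1
    # Most frequently selected features first; ties keep original order.
    order = np.argsort(-counts, kind="stable")
    return order, counts


def make_toy_data(n=200, seed=0):
    """Synthetic data: feature 1 nearly duplicates feature 0; features 3-7 are noise."""
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n).astype(float)
    f0 = y + 0.3 * rng.standard_normal(n)
    f1 = f0 + 0.05 * rng.standard_normal(n)  # redundant copy of f0
    f2 = y + 0.6 * rng.standard_normal(n)    # independently informative
    noise = rng.standard_normal((n, 5))
    return np.column_stack([f0, f1, f2, noise]), y


if __name__ == "__main__":
    X, y = make_toy_data()
    order, counts = repeated_cv_rerank(X, y, k=2)
    print("features by selection frequency:", order.tolist())
    print("selection counts:", counts.tolist())
```

In the toy data, features 0 and 1 carry almost the same signal, so the redundancy penalty keeps them from being selected together in any single run. This is the behavior that motivates refining a DEG list with MRMR rather than ranking by relevance alone.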
* Correspondence: [email protected]
1 Department of Imaging Science and Innovation, Geisinger Health System, Danville, PA 17822, USA
2 Department of Biomedical and Translational Informatics, Geisinger Health System, Danville, PA 17822, USA
Background
Pediatric sepsis is a life-threatening condition that is considered a leading cause of morbidity and mortality in infants and children [1, 2]. Sepsis is a systemic response to infection that is characterized by a generalized pro-inflammatory cascade, which may lead to extensive tissue damage [3]. Early recognition of sepsis and septic shock helps pediatric care physicians intervene before the onset of advanced organ dysfunction and thus reduce mortality, length of stay, and post-critical-care complications [4]. However, reliable risk stratification of sepsis, especially in children, is a challenge due to significant patient heterogeneity [5] and the poor existing definitions of sepsis in pediatric populations [6]. Existing physiological scoring tools commonly used in intensive care units (ICUs), such as Acute Physiologic and Chronic Health Evaluation (APACHE) [7] and Sepsis-related Organ Failure Assessment (SOFA) [8], use clinical and laboratory measurem