Interpretable Classifiers in Precision Medicine: Feature Selection and Multi-class Categorization
1 Institute of Medical Systems Biology, Ulm University, 89069 Ulm, Germany, [email protected]
2 Institute of Number Theory and Probability Theory, Ulm University, 89069 Ulm, Germany
Abstract. Growing insight into the molecular nature of diseases leads to the definition of finer-grained diagnostic classes. While allowing for better adapted drugs and treatments, this change also alters the diagnostic task from binary to multi-categorial decisions. Keeping the corresponding multi-class architectures accurate and interpretable is currently one of the key tasks in molecular diagnostics. In this work, we specifically address the question of the extent to which biomarkers that characterize pairwise differences among classes correspond to biomarkers that discriminate one class from all remaining ones. We compare one-against-one and one-against-all architectures of feature-selecting base classifiers. They are validated for their classification performance and the stability of their feature selection.
L.-R. Schirra and F. Schmid contributed equally. H.A. Kestler and L. Lausser are joint senior authors.
© Springer International Publishing AG 2016. F. Schwenker et al. (Eds.): ANNPR 2016, LNAI 9896, pp. 105–116, 2016. DOI: 10.1007/978-3-319-46182-3_9
1 Introduction
The analysis of molecular profiles adds a new instrument to the toolbox of medical diagnosis. It allows for a deeper insight into the molecular processes of a cell or a tissue. Due to their high dimensionality, the interpretation of these profiles is often quite challenging. Comprising tens of thousands of molecular measurements, a profile is typically too large for direct visual inspection. Computer-aided classification algorithms are therefore needed for diagnostic purposes [9,18,22]. Training these models often incorporates an internal feature selection process [14,21], which essentially aims at limiting the measurements used in the final prediction [8,19]. The resulting feature signature typically optimizes heuristic criteria [12]. It is often constructed in a purely data-driven or model-driven procedure [2]. Alternatively, feature selection can also be based on (prior) domain knowledge about the subject or the measuring process of an experiment [15].
One of the most important findings from the analysis of molecular profiles is the insight that an observable phenotype or disease that was thought to be a uniform entity can be evoked by varying molecular causes [11]. These refinements of the traditional phenotypes bring up the possibility of more specific treatments and can be seen as a starting point for the field of precision medicine or personalized medicine [4]. From a diagnostic point of view, the challenge of identifying a correct phenotype has changed due to the increased number of diagnostic classes [7]. Primarily designed for binary categorization problems, many classification models cannot be directly applied to such a multi-class scenario [6,20]. Fusion architectures for combining an ensemble of binary classifiers are needed [16]. These combi
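To make the two decomposition schemes named in the abstract concrete, the following minimal Python sketch enumerates the binary subtasks of a one-against-one and a one-against-all architecture and fuses pairwise decisions by a simple majority vote. The function names, the class labels, and the choice of majority voting are assumptions made for illustration only; they are not taken from the paper and do not reproduce the authors' method.

```python
from collections import Counter
from itertools import combinations


def one_against_one_tasks(classes):
    """Return every unordered class pair; one binary base classifier per pair."""
    return list(combinations(classes, 2))


def one_against_all_tasks(classes):
    """Return one task per class: that class versus all remaining classes."""
    return [(c, tuple(o for o in classes if o != c)) for c in classes]


def majority_vote(votes):
    """Fuse base classifier outputs: the class named most often wins.

    Ties are resolved in favour of the class that was voted for first,
    which is an arbitrary choice made for this sketch.
    """
    return Counter(votes).most_common(1)[0][0]


# Hypothetical diagnostic classes, used only for illustration.
classes = ["A", "B", "C", "D"]

ovo = one_against_one_tasks(classes)  # n * (n - 1) / 2 binary tasks
ova = one_against_all_tasks(classes)  # n binary tasks

print(len(ovo), len(ova))  # 6 4
```

For n classes, the one-against-one scheme needs n(n-1)/2 base classifiers, each of which may select its own feature signature for one pair of classes, whereas the one-against-all scheme needs only n base classifiers, each with a signature separating one class from the rest.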