Learning Equations from Biological Data with Limited Time Samples

John T. Nardini, et al. [full author details at the end of the article]
Received: 19 May 2020 / Accepted: 16 August 2020
© Society for Mathematical Biology 2020

Abstract
Equation learning methods present a promising tool to aid scientists in the modeling process for biological data. Previous equation learning studies have demonstrated that these methods can infer models from rich datasets; however, their performance in the presence of common challenges from biological data has not been thoroughly explored. We present an equation learning methodology, comprising data denoising, equation learning, model selection, and post-processing steps, that infers a dynamical systems model from noisy spatiotemporal data. We thoroughly investigate the performance of this methodology in the face of several common challenges presented by biological data, namely sparse data sampling, large noise levels, and heterogeneity between datasets. We find that this methodology can accurately infer the correct underlying equation and predict unobserved system dynamics from a small number of time samples when the data are sampled over a time interval exhibiting both linear and nonlinear dynamics. Our findings suggest that equation learning methods can be used for model discovery and selection in many areas of biology when an informative dataset is used. We focus on glioblastoma multiforme modeling as a case study in this work to highlight how these results are informative for data-driven, modeling-based tumor invasion predictions.

Keywords Equation learning · Numerical differentiation · Sparse regression · Model selection · Partial differential equations · Parameter estimation · Population dynamics · Glioblastoma multiforme
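To make the equation learning idea concrete, the following is a minimal sketch and not the authors' implementation. It assumes noise-free synthetic data from logistic growth, du/dt = u - u^2, a hypothetical candidate library {1, u, u^2, u^3}, finite-difference differentiation, and sequentially thresholded least squares as the sparse regression step. The names stlsq, threshold, and max_iter are illustrative choices, and the denoising, model selection, and post-processing steps named in the abstract are omitted.

```python
import numpy as np

def stlsq(theta, target, threshold=0.1, max_iter=10):
    """Sequentially thresholded least squares (illustrative helper).

    Repeatedly zeroes coefficients smaller than `threshold` in magnitude
    and refits the remaining library terms by ordinary least squares.
    """
    xi = np.linalg.lstsq(theta, target, rcond=None)[0]
    for _ in range(max_iter):
        small = np.abs(xi) < threshold
        xi[small] = 0.0
        active = ~small
        if not active.any():
            break
        xi[active] = np.linalg.lstsq(theta[:, active], target, rcond=None)[0]
    return xi

# Synthetic, noise-free samples of logistic growth (assumed test problem):
# du/dt = u - u^2 with u(0) = 0.05 has the closed-form solution below.
t = np.linspace(0.0, 6.0, 121)
u = 1.0 / (1.0 + 19.0 * np.exp(-t))

# Numerical differentiation by finite differences.
dudt = np.gradient(u, t, edge_order=2)

# Candidate library of right-hand-side terms: 1, u, u^2, u^3.
theta = np.column_stack([np.ones_like(u), u, u**2, u**3])

# Sparse regression: which library terms explain du/dt?
xi = stlsq(theta, dudt)
for name, coef in zip(["1", "u", "u^2", "u^3"], xi):
    print(f"{name:>4}: {coef: .4f}")
```

On real, noisy measurements, differentiating the raw data directly as done here would amplify the noise, which is why a denoising step before differentiation matters in practice.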

This material was based upon work partially supported by the National Science Foundation under Grant DMS-1638521 to the Statistical and Applied Mathematical Sciences Institute and IOS-1838314 to KBF, and in part by National Institute of Aging Grant R21AG059099 to KBF. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation. BM gratefully acknowledges Ph.D. studentship funding from the UK EPSRC (reference EP/N50970X/1). AHD, LC, and KRS gratefully acknowledge funding through the NIH U01CA220378 and the James S. McDonnell Foundation 220020264.

John T. Nardini [email protected]

Extended author information available on the last page of the article

1 Introduction
Mathematical models are a crucial tool for inferring the mechanics underlying a scientific system of study (Nardini et al. 2016) or predicting future outcomes (Ferguson et al. 2020). The task of interpreting biological data in particular benefits from mathematical modeling, as models allow biologists to test multiple hypotheses in silico (Ozik et al. 2018), optimally design experiments
