Latent class distributional regression for the estimation of non-linear reference limits from contaminated data sources


METHODOLOGY ARTICLE

Open Access

Tobias Hepp1*, Jakob Zierk2, Manfred Rauh2, Markus Metzler2 and Andreas Mayr3

*Correspondence: [email protected]
1 Institut für Medizininformatik, Biometrie und Epidemiologie, Friedrich-Alexander-Universität Erlangen-Nürnberg, Waldstraße 6, 91054 Erlangen, Germany
Full list of author information is available at the end of the article

Abstract

Background: Medical decision making based on quantitative test results depends on reliable reference intervals, which represent the range of physiological test results in a healthy population. Current methods for the estimation of reference limits focus either on modelling the age-dependent dynamics of different analytes directly in a prospective setting or on the extraction of independent distributions from contaminated data sources, e.g. data with latent heterogeneity due to unlabeled pathologic cases. In this article, we propose a new method to estimate indirect reference limits with non-linear dependencies on covariates from contaminated datasets by combining the framework of mixture models and distributional regression.

Results: Simulation results based on mixtures of Gaussian and gamma distributions suggest an accurate approximation of the true quantiles that improves with increasing sample size and decreasing overlap between the mixture components. Due to the high flexibility of the framework, initialization of the algorithm requires careful consideration of appropriate starting weights. Quantiles estimated from the extracted distribution of healthy hemoglobin concentration in boys and girls provide clinically useful pediatric reference limits, similar to solutions obtained using different approaches which require more samples and are computationally more expensive.

Conclusions: Latent class distributional regression models represent the first method to estimate indirect non-linear reference limits from a single model fit, but the general scope of applications can be extended to other scenarios with latent heterogeneity.

Keywords: Latent class regression, Finite mixture models, Distributional regression, Reference limits
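As a rough illustration of the idea of extracting a healthy distribution from contaminated data, the following sketch fits a two-component Gaussian mixture by EM to simulated values and reads reference limits off the dominant component. It is not the method proposed in the article: it ignores covariates entirely (no distributional regression), and the simulated data, the function name `fit_two_gaussians`, the starting values and the choice of the 2.5% and 97.5% quantiles are all assumptions made for this example.

```python
import math
import random
from statistics import NormalDist


def fit_two_gaussians(x, n_iter=100):
    """EM for a two-component Gaussian mixture; returns (weights, means, sds)."""
    xs = sorted(x)
    n = len(xs)
    # Assumed starting values: one mean in the lower tail, one at the median.
    # This presumes the contamination sits in the lower tail of the data.
    w = [0.5, 0.5]
    mu = [xs[n // 20], xs[n // 2]]
    mean_all = sum(xs) / n
    sd_all = math.sqrt(sum((v - mean_all) ** 2 for v in xs) / n)
    sd = [sd_all, sd_all]
    for _ in range(n_iter):
        # E-step: posterior probability of each component for every observation.
        dists = [NormalDist(mu[k], sd[k]) for k in range(2)]
        resp = []
        for v in x:
            p0 = w[0] * dists[0].pdf(v)
            p1 = w[1] * dists[1].pdf(v)
            total = p0 + p1
            resp.append((p0 / total, p1 / total))
        # M-step: weighted updates of mixing weights, means and standard deviations.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / n
            mu[k] = sum(r[k] * v for r, v in zip(resp, x)) / nk
            var = sum(r[k] * (v - mu[k]) ** 2 for r, v in zip(resp, x)) / nk
            sd[k] = max(math.sqrt(var), 1e-6)
    return w, mu, sd


# Simulated contaminated sample (hypothetical values): 90% "healthy", 10% "pathologic".
random.seed(1)
data = [
    random.gauss(13.0, 1.0) if random.random() < 0.9 else random.gauss(9.0, 1.0)
    for _ in range(2000)
]

weights, means, sds = fit_two_gaussians(data)

# Treat the component with the largest weight as the healthy population (an assumption).
healthy = max(range(2), key=lambda k: weights[k])
healthy_dist = NormalDist(means[healthy], sds[healthy])
lower = healthy_dist.inv_cdf(0.025)
upper = healthy_dist.inv_cdf(0.975)
print(f"estimated reference interval: {lower:.2f} to {upper:.2f}")
```

In the article's setting the component parameters would additionally depend on covariates such as age, which this covariate-free sketch does not attempt to reproduce.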

Background

Reference intervals play an important role in clinical practice in deciding whether a particular test result measured in a patient should be considered physiological or pathologic. As a consequence, the proper determination of reference limits, i.e. the bounds that define these intervals, has been extensively discussed in recent decades, leading to the proposal of several guidelines [1, 2]. Although prospective approaches using only samples from healthy individuals from the reference population are often

© The Author(s) 2020. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s
