ORIGINAL RESEARCH
Machine Learning Approaches for Fracture Risk Assessment: A Comparative Analysis of Genomic and Phenotypic Data in 5130 Older Men
Qing Wu1,2 · Fatma Nasoz3,4 · Jongyun Jung1,2 · Bibek Bhattarai3 · Mira V. Han1,5
Received: 10 January 2020 / Accepted: 18 July 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020
* Qing Wu [email protected]
1 Nevada Institute of Personalized Medicine, University of Nevada, Las Vegas, 4505 Maryland Parkway, Las Vegas, NV 89154-4009, USA
2 Department of Epidemiology and Biostatistics, School of Public Health, University of Nevada, Las Vegas, NV, USA
3 Department of Computer Science, University of Nevada, Las Vegas, NV, USA
4 The Lincy Institute, University of Nevada, Las Vegas, NV, USA
5 School of Life Sciences, University of Nevada, Las Vegas, NV, USA

Abstract
The aims of this study were to develop fracture prediction models using machine learning approaches and genomic data, and to identify the best modeling approach for fracture prediction. Genomic data from the Osteoporotic Fractures in Men (MrOS) cohort study (n = 5130) were analyzed. After comprehensive genotype imputation, a genetic risk score (GRS) was calculated for each participant from 1103 associated single nucleotide polymorphisms (SNPs). Data were normalized and split into a training set (80%) and a validation set (20%) for analysis. Random forest, gradient boosting, neural network, and logistic regression models were each developed to predict major osteoporotic fractures, with GRS, bone density, and other risk factors as predictors. During model training, the synthetic minority oversampling technique was used to account for the low fracture rate, and tenfold cross-validation was employed for hyperparameter optimization. In testing, the area under the curve (AUC) and accuracy were used to assess model performance, and the McNemar test was used to examine the difference in accuracy between models. Gradient boosting gave the best prediction performance, with an AUC of 0.71 and an accuracy of 0.88, and the GRS ranked as the 7th most important variable in that model. Random forest and neural network also performed significantly better than logistic regression. These findings suggest that fracture prediction in older men can be improved by incorporating genetic profiling and by using the gradient boosting approach. This result should not be extrapolated to women or young individuals.

Keywords
Machine learning · Fracture · Osteoporosis · Genomics · Comparison

Abbreviations
MrOS    Osteoporotic fractures in men study
ML      Machine learning
BMD     Bone mineral density
FRAX    The fracture risk assessment tool
GRS     Genetic risk score
QUS     Quantitative ultrasound
ROC     Receiver-operating curve
AUC     Area under curve
LR      Logistic regression
RF      Random forest
GB      Gradient boosting
NN      Neural network
MOF     Major osteoporotic fracture
SNPs    Single nucleotide polymorphisms
FNBMD   Femoral neck BMD
TSBMD   Total spine BMD
THBMD   To
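The data preparation steps named in the abstract (a per-participant GRS, normalization, and an 80/20 split) can be illustrated with a minimal sketch. The abstract does not say whether the GRS is weighted, which normalization was used, or whether the split was stratified, so the weighted sum, the min-max scaling, and the simple random split below are all assumptions. The function names and the toy dosages and weights are hypothetical, not taken from the study.

```python
import random


def genetic_risk_score(dosages, weights):
    """Sum of risk-allele dosages times per-SNP weights (assumed weighting).

    Dosages are 0, 1, or 2 copies of the risk allele; imputed genotypes may
    be fractional. Setting every weight to 1.0 gives an unweighted count.
    """
    if len(dosages) != len(weights):
        raise ValueError("one weight per SNP is required")
    return sum(d * w for d, w in zip(dosages, weights))


def min_max_normalize(values):
    """Rescale values to [0, 1]; one common choice, assumed here."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def train_validation_split(n_rows, validation_fraction=0.2, seed=0):
    """Shuffle row indices and hold out a fraction for validation."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_val = round(n_rows * validation_fraction)
    return indices[n_val:], indices[:n_val]


# Toy example: three participants, three SNPs (values are made up).
toy_weights = [0.10, 0.05, 0.20]
toy_dosages = [[2, 1, 0], [0, 0, 1], [1, 2, 2]]
toy_scores = [genetic_risk_score(d, toy_weights) for d in toy_dosages]
toy_scaled = min_max_normalize(toy_scores)

# Split sizes for a cohort of 5130 participants.
train_idx, val_idx = train_validation_split(5130)
print(len(train_idx), len(val_idx))
```

In the study itself the score is built from 1103 SNPs per participant; the three-SNP example only shows the arithmetic.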
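The evaluation described in the abstract (AUC, accuracy, and a McNemar test comparing the accuracy of two models on the same validation set) can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the AUC is computed by pairwise comparison of positive and negative cases, and the McNemar statistic uses the chi-square form with continuity correction, which the abstract does not specify. All function names are hypothetical.

```python
import math


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """Probability that a random positive case scores above a random
    negative case, counting ties as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def mcnemar_test(y_true, pred_a, pred_b):
    """McNemar chi-square test with continuity correction (assumed variant).

    Only discordant cases matter: those that exactly one of the two models
    classifies correctly. Returns (statistic, p_value).
    """
    a_only = sum(1 for t, a, b in zip(y_true, pred_a, pred_b)
                 if a == t and b != t)
    b_only = sum(1 for t, a, b in zip(y_true, pred_a, pred_b)
                 if a != t and b == t)
    if a_only + b_only == 0:
        return 0.0, 1.0
    stat = (abs(a_only - b_only) - 1) ** 2 / (a_only + b_only)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Toy labels and predictions (made up for illustration).
labels = [0, 0, 1, 1]
model_scores = [0.1, 0.4, 0.35, 0.8]
print(auc(labels, model_scores))
print(accuracy(labels, [0, 1, 1, 1]))
```

Because the fracture rate is low, accuracy alone can look high for a model that rarely predicts a fracture, which is one reason to report it alongside the AUC as the abstract does.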