An ontology-based documentation of data discovery and integration process in cancer outcomes research

RESEARCH

Open Access

Hansi Zhang1, Yi Guo1,2, Mattia Prosperi3 and Jiang Bian1,2*

From The 4th International Workshop on Semantics-Powered Data Analytics, Auckland, New Zealand. 27 October 2019

Abstract

Background: To reduce cancer mortality and improve cancer outcomes, it is critical to understand the various cancer risk factors (RFs) across different domains (e.g., genetic, environmental, and behavioral risk factors) and levels (e.g., individual, interpersonal, and community levels). However, prior research on RFs of cancer outcomes has primarily focused on individual-level RFs because of the lack of integrated datasets that contain multi-level, multi-domain RFs. Further, the lack of consensus and proper guidance on how to systematically identify RFs increases the difficulty of selecting RFs from heterogeneous data sources in a multi-level integrative data analysis (mIDA) study. More importantly, because mIDA studies require integrating heterogeneous data sources, the data integration processes in the limited number of existing mIDA studies are inconsistently performed and poorly documented, which threatens transparency and reproducibility.

Methods: Informed by the National Institute on Minority Health and Health Disparities (NIMHD) research framework, we (1) reviewed existing reporting guidelines from the Enhancing the QUAlity and Transparency Of health Research (EQUATOR) network and (2) developed a theory-driven reporting guideline to guide the RF variable selection, data source selection, and data integration process. Then, we developed an ontology to standardize the documentation of the RF selection and data integration process in mIDA studies.

Results: We summarized the review results and created a reporting guideline, ATTEST, for reporting the variable selection, data source selection, and data integration process. We provided an ATTEST checklist to help researchers annotate and clearly document each step of their mIDA studies to ensure transparency and reproducibility. We used ATTEST to report two mIDA case studies and further transformed the annotation results into semantic triples, so that the relationships among variables, data sources, and integration processes are explicitly standardized and modeled using the classes and properties from OD-ATTEST.
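To make the idea of turning an annotation into semantic triples more concrete, here is a minimal sketch in Python. It is not the authors' implementation: the `ex:` prefix, the class names (`RiskFactorVariable`, `DataSource`, `IntegrationStep`), the property names (`derivedFrom`, `hasInput`, `produces`), and the sample annotation are all hypothetical placeholders, not terms taken from OD-ATTEST.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    """A subject-predicate-object statement."""
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace prefix; NOT the real OD-ATTEST namespace.
EX = "ex:"


def annotation_to_triples(annotation: Dict[str, str]) -> List[Triple]:
    """Convert one checklist-style annotation into triples.

    All class and property names below are illustrative assumptions.
    """
    var = EX + annotation["variable"]
    src = EX + annotation["data_source"]
    step = EX + annotation["integration_step"]
    return [
        # Type assertions for the three entities.
        Triple(var, "rdf:type", EX + "RiskFactorVariable"),
        Triple(src, "rdf:type", EX + "DataSource"),
        Triple(step, "rdf:type", EX + "IntegrationStep"),
        # Relationships among variable, source, and integration step.
        Triple(var, EX + "derivedFrom", src),
        Triple(step, EX + "hasInput", src),
        Triple(step, EX + "produces", var),
    ]


# Made-up annotation for demonstration only.
example = {
    "variable": "smoking_status",
    "data_source": "county_survey",
    "integration_step": "link_by_county",
}
triples = annotation_to_triples(example)
for t in triples:
    print(t.subject, t.predicate, t.obj, ".")
```

Once the relationships are explicit statements like these, they can be queried or compared across studies, which is the motivation the abstract gives for standardizing the documentation with an ontology.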

* Correspondence: [email protected]
1 Department of Health Outcomes and Biomedical Informatics, College of Medicine, University of Florida, 2197 Mowry Road, Suite 122, PO Box 100177, Gainesville, FL 32610-0177, USA
2 Cancer Informatics & eHealth Core, University of Florida Health Cancer Center, Gainesville, FL, USA
Full list of author information is available at the end of the article

© The Author(s). 2020 Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit
