One Health Outlook

REVIEW

Open Access

Optimizing open data to support one health: best practices to ensure interoperability of genomic data from bacterial pathogens

Ruth E. Timme1*, William J. Wolfgang2, Maria Balkey1, Sai Laxmi Gubbala Venkata2, Robyn Randolph3, Marc Allard1 and Errol Strain4

Abstract

The holistic approach of One Health, which sees human, animal, plant, and environmental health as a unit rather than as discrete parts, requires not only interdisciplinary cooperation but also standardized methods for communicating and archiving data, enabling participants to easily share what they have learned and allowing others to build upon their findings. Ongoing work by NCBI and the GenomeTrakr project illustrates how open data platforms can help meet the needs of federal and state regulators, public health laboratories, departments of agriculture, and universities. Here we describe how microbial pathogen surveillance can be transformed by having an open access database along with Best Practices for contributors to follow. First, we describe the open pathogen surveillance framework, hosted on the NCBI platform. We cover the current community standards for WGS quality, provide an SOP for assessing your own sequence quality, and recommend QC thresholds for all submitters to follow. We then provide an overview of NCBI data submission along with step-by-step details. Finally, we provide curation guidance and an SOP for keeping your public data current within the database. These Best Practices can be models for other open data projects, thereby advancing the One Health goals of Findable, Accessible, Interoperable and Reusable (FAIR) data.

Keywords: Genomic epidemiology, GenomeTrakr, Microbial pathogen surveillance, NCBI submission, Whole genome sequencing, QA/QC, One health

* Correspondence: [email protected]
1 U.S. Food and Drug Administration, Center for Food Safety and Applied Nutrition, 5001 Campus Drive, College Park, MD 20740, USA
Full list of author information is available at the end of the article

Background

The One Health perspective, which sees human, animal, plant, and environmental health as a unit, rather than discrete parts, requires not only interdisciplinary cooperation, but standardized methods for communicating and archiving data, so researchers can easily share what they have learned and allow others to build upon their findings. Two developments have made a great difference in our ability to support these requirements. First, the advent of whole genome sequencing (WGS) made it possible to establish genomic DNA as a standard data type and increase the resolution possible between isolates, dramatically changing how surveillance data for human pathogens could be stored, shared, and analyzed [1]. Second, storing and sharing genomic pathogen data and surveillance analyses as “open data” [2] has enabled a truly open vision for all global pathogen surveillance, as shown by the success of the open foodborne pathogen surveillance model in the United States [2–4] and in partnering countries, such as