
Journal of Cheminformatics Open Access

EDUCATIONAL

A ligand‑based computational drug repurposing pipeline using KNIME and Programmatic Data Access: case studies for rare diseases and COVID‑19

Alzbeta Tuerkova* and Barbara Zdrazil*

Abstract

Biomedical information mining is increasingly recognized as a promising technique to accelerate drug discovery and development. In particular, integrative approaches that mine data from several (open) data sources have become more attractive as the possibilities to access data programmatically through Application Programming Interfaces (APIs) have grown. The use of open data in conjunction with free, platform-independent analytic tools provides the additional advantages of flexibility, re-usability, and transparency. Here, we present a strategy for performing ligand-based in silico drug repurposing with the analytics platform KNIME. We demonstrate the usefulness of the developed workflow in two use cases: a rare disease (here: Glucose Transporter Type 1 (GLUT-1) deficiency) and a new disease (here: COVID-19). The workflow includes a targeted download of data through web services, data curation, detection of enriched structural patterns, and substructure searches in DrugBank and in a recently deposited data set of antiviral drugs provided by Chemical Abstracts Service. The developed workflows, tutorials with detailed step-by-step instructions, and the information gained from analyzing data for GLUT-1 deficiency syndrome and COVID-19 are made freely available to the scientific community. The provided framework can be reused by researchers for other in silico drug repurposing projects, and it should serve as a valuable teaching resource for conveying integrative data mining strategies.

Keywords: Drug repurposing, Data integration, Data mining, Data access, Application programming interface, Substructure search, Rare disease, KNIME workflow, COVID-19, SARS-CoV-2, GLUT-1 deficiency syndrome, ChEMBL, Open Targets Platform, DrugBank, PDB, UniProtKB, Guide-to-pharmacology, PubChem

*Correspondence: [email protected]; [email protected]
Department of Pharmaceutical Chemistry, Division of Drug Design and Medicinal Chemistry, University of Vienna, Althanstraße 14, 1090 Vienna, Austria

Background

Computer-aided mining of biomedical data is an emerging field in cheminformatics and drug design which has reshaped current drug development [1–3]. Open access to various life-science repositories, such as ChEMBL [4], PubChem [5], UniProt [6], or DrugBank [7], provides a competitive advantage when using data-driven drug discovery approaches as opposed to non-integrative approaches [8]. Furthermore, many databases enable programmatic access to the stored data through an Application Programming Interface (API). Consequently, it is important to find appropriate tools to analyze the gathered data in an automated way. The Konstanz Information Miner (KNIME) is an open-source data pipelining and analytics platform which enables the creation of (semi)automa