Intuitive, reproducible high-throughput molecular dynamics in Galaxy: a tutorial
Journal of Cheminformatics Open Access
EDUCATIONAL
Simon A. Bray¹, Tharindu Senapathi², Christopher B. Barnett²* and Björn A. Grüning¹*
*Correspondence: [email protected]; [email protected]‑freiburg.de
¹ Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, Freiburg, Germany
² Department of Chemistry and Scientific Computing Research Unit, University of Cape Town, 7700 Cape Town, South Africa

Abstract
This paper is a tutorial developed for the data analysis platform Galaxy. The purpose of Galaxy is to make high-throughput computational data analysis, such as molecular dynamics, a structured, reproducible and transparent process. In this tutorial we focus on three questions: How are protein-ligand systems parameterized for molecular dynamics simulation? What kind of analysis can be carried out on molecular trajectories? How can high-throughput MD be used to study multiple ligands? After finishing, you will have learned about force fields and MD parameterization, how to conduct MD simulation and analysis for a protein-ligand system, and how different molecular interactions contribute to the binding affinity of ligands to the Hsp90 protein.

Keywords: Galaxy, Molecular Dynamics, Reproducible

Introduction
Molecular dynamics (MD) is a commonly used method in computational chemistry and cheminformatics, in particular for studying the interactions between small molecules and large biological macromolecules such as proteins [1]. However, the barrier to entry for MD simulation is high; not only is the theory difficult to master, but commonly used MD software is technically demanding. Furthermore, generating reliable, reproducible simulation data requires the user to maintain detailed records of all parameters and files used, which again poses a challenge to newcomers to the field. One solution to the latter problem is the use of a workflow management system such as Galaxy [2], which provides a selection of tools for molecular dynamics simulation and analysis [3].

MD simulations are rarely performed singly; in recent years, the concept of high-throughput molecular dynamics (HTMD) has come to the fore [4, 5]. Galaxy lends itself well to this kind of study, as we will demonstrate in this paper, thanks to features allowing construction of complex workflows, which can then be executed on multiple inputs in parallel. This tutorial provides a detailed workflow for high-throughput molecular dynamics with Galaxy, using the N-terminal domain (NTD) of Hsp90 (heat shock protein 90) as a case study.

Galaxy [2] is a data analysis platform that provides access to thousands of tools for scientific computation. It features a web-based user interface while automatically and transparently managing underlying computation details, enabling structured and reproducible high-throughput data analysis. This tutorial provides sample data, workflows, hands-on material and references for further reading. It presumes that t