spyrmsd: symmetry-corrected RMSD calculations in Python

  • PDF / 2,068,810 Bytes
  • 7 Pages / 595.276 x 790.866 pts Page_size
  • 118 Downloads / 270 Views

DOWNLOAD

REPORT


ournal of Cheminformatics Open Access

SOFTWARE

spyrmsd: symmetry‑corrected RMSD calculations in Python Rocco Meli*  and Philip C. Biggin 

Abstract  Root mean square displacement (RMSD) calculations play a fundamental role in the comparison of different conformers of the same ligand. This is particularly important in the evaluation of protein-ligand docking, where different ligand poses are generated by docking software and their quality is usually assessed by RMSD calculations. Unfortunately, many RMSD calculation tools do not take into account the symmetry of the molecule, remain difficult to integrate flawlessly in cheminformatics and machine learning pipelines—which are often written in Python—or are shipped within large code bases. Here we present a new open-source RMSD calculation tool written in Python, designed to be extremely lightweight and easy to integrate into existing software. Keywords:  RMSD, Symmetry, Software, Python Introduction Computational structure-based drug discovery has steadily gained traction partially thanks to the constant improvements in available software, now often free and open source. Protein-ligand docking in particular is now a standard tool employed in the early stages of drug discovery pipelines in order to screen possible drugs acting on a known target of interest. Protein-ligand docking consists of the prediction of binding modes and binding affinity of a (flexible) ligand to a target of known structure. The performance of docking programs is often assessed by their ability to reproduce the crystallographic pose of the bound ligand. A common metric to evaluate the difference between the predicted binding pose and the crystallographic pose is the heavy-atoms root mean square displacement (RMSD) [1], although other metrics have been suggested [2]. RMSD calculations are also used in other contexts, for example for the evaluation of diversity in generated conformers [3]. Many simple scripts to compute RMSDs are based on the assumption of a direct one-to-one mapping between *Correspondence: [email protected] Department of Biochemistry, South Parks Rd., Oxford OX1 3QU, UK

atoms of different conformers of the same ligand. In different words, atoms are often assumed to be labelled according to their position in a coordinate file (or data structure) and they are paired according to such label. This assumption breaks down when such labels are not conserved—i.e. the order of atoms is different in the two structures being compared—and/or for symmetric molecules. In the case of symmetric molecules, different binding poses can be chemically identical but different in terms of atom-atom mapping. Since molecular connectivity is naturally represented by graphs (atoms as vertices and bonds as edges), tools from graph theory can be used to obtain the correct atom-atom mapping for two different conformers of the same molecule, thus avoiding the problems outlined above. Here we present a new Python tool, spyrmsd, for the calculation of symmetry-corrected RMSDs based on graph isomorphis