Benchmarking Methods of Protein Structure Alignment
ORIGINAL ARTICLE
Janan Sykes¹ · Barbara R. Holland¹ · Michael A. Charleston¹
Received: 27 February 2020 / Accepted: 10 July 2020
© Springer Science+Business Media, LLC, part of Springer Nature 2020
Abstract
The function of a protein is primarily determined by its structure and amino acid sequence. Many biological questions of interest rely on being able to accurately determine the group of structures to which domains of a protein belong; this can be done through alignment and comparison of protein structures. Dozens of different methods for Protein Structure Alignment (PSA) have been proposed that use a wide range of techniques. The aim of this study is to determine the ability of PSA methods to identify pairs of protein domains known to share differing levels of structural similarity, and to assess their utility for clustering domains from several different folds into known groups. We present the results of a comprehensive investigation into eighteen PSA methods, to our knowledge the largest piece of independent research on this topic. Overall, SP-AlignNS (non-sequential) was found to be the best method for classification, and among the best performing methods for clustering. Methods (where possible) were split into the algorithm used to find the optimal alignment and the score used to assess similarity. This allowed us to largely separate the algorithm from the score it maximizes and thus to assess their effectiveness independently of each other. Surprisingly, we found that some hybrids of mismatched scores and algorithms performed better than either of the native methods at classification and, in some cases, clustering as well. It is hoped that this investigation and the accompanying discussion will be useful for researchers selecting or designing methods to align protein structures.

Keywords
Protein structure alignment · Classification · Clustering · Benchmarking
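The separation of alignment algorithm from similarity score described in the abstract can be pictured with a small sketch. Everything below is hypothetical and for illustration only: the coordinates are toy values, the two "algorithms" and two "scores" are deliberately simple stand-ins rather than any of the eighteen benchmarked methods, and the structures are assumed to be already superposed in a common frame. The point is only that an algorithm returns an alignment (a list of residue pairs), a score rates any alignment, and the two can therefore be mixed freely to form hybrids.

```python
from itertools import product
from math import dist

# Toy C-alpha coordinates for two hypothetical domains (not real data),
# assumed to be already superposed in the same coordinate frame.
domain_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
domain_b = [(0.0, 0.5, 0.0), (3.8, 0.5, 0.0), (7.6, 4.0, 0.0), (11.4, 0.5, 0.0)]


# An "algorithm" proposes an alignment: a list of (i, j) residue index pairs.
def sequential_algorithm(a, b):
    """Pair residues strictly in order: 0-0, 1-1, and so on."""
    return list(zip(range(len(a)), range(len(b))))


def greedy_nearest_algorithm(a, b):
    """Pair each residue of a with its nearest unused residue of b,
    ignoring sequence order (a crude non-sequential stand-in)."""
    unused = set(range(len(b)))
    pairs = []
    for i, point in enumerate(a):
        if not unused:
            break
        j = min(unused, key=lambda k: dist(point, b[k]))
        unused.remove(j)
        pairs.append((i, j))
    return pairs


# A "score" rates any alignment, whichever algorithm produced it.
def mean_distance_score(a, b, pairs):
    """Negative mean distance between paired residues (higher is better)."""
    return -sum(dist(a[i], b[j]) for i, j in pairs) / len(pairs)


def close_pair_fraction_score(a, b, pairs, cutoff=2.0):
    """Fraction of the shorter domain paired within `cutoff` (hypothetical units)."""
    close = sum(dist(a[i], b[j]) <= cutoff for i, j in pairs)
    return close / min(len(a), len(b))


algorithms = {"sequential": sequential_algorithm, "greedy": greedy_nearest_algorithm}
scores = {"mean_distance": mean_distance_score, "close_fraction": close_pair_fraction_score}

# Evaluate every algorithm/score combination, including the mismatched hybrids.
results = {
    (algo_name, score_name): score(domain_a, domain_b, algo(domain_a, domain_b))
    for (algo_name, algo), (score_name, score) in product(algorithms.items(), scores.items())
}

for combo, value in results.items():
    print(combo, round(value, 3))
```

Because each score accepts any list of residue pairs, swapping in a different algorithm requires no change to the scoring code, which is the property that makes independent assessment of the two components possible.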
Handling editor: Arndt von Haeseler.
Electronic supplementary material: The online version of this article (https://doi.org/10.1007/s00239-020-09960-2) contains supplementary material, which is available to authorized users.
* Janan Sykes [email protected]
¹ School of Natural Sciences, University of Tasmania, Hobart, Australia

Introduction
Protein structure alignment (PSA) is a complex problem to which many different solutions have been proposed. It is extremely important, as much of our understanding of protein evolution and function depends on being able to sort proteins and protein domains into categories according to their structure. Alignment of protein structures currently relies substantially on manual curation (Csaba et al. 2009; Wang et al. 2013), which leads to obvious problems with efficiency and reproducibility. The relevance of this problem extends beyond proteins, as PSA methods can be used to align other biological macromolecules and to search a database for a motif such as a binding site for a ligand (Kaiser et al. 2015; Kleywegt and Jones 1997; Shapiro and Brut