A polynomial time algorithm for computing the area under a GDT curve

PDF / 2,125,697 Bytes
8 Pages / 595.276 x 793.701 pts Page_size
88 Downloads / 227 Views

Open Access

RESEARCH

A polynomial time algorithm for computing the area under a GDT curve Aleksandar Poleksic*

Abstract Background: Progress in the field of protein three-dimensional structure prediction depends on the development of new and improved algorithms for measuring the quality of protein models. Perhaps the best descriptor of the quality of a protein model is the GDT function that maps each distance cutoff θ to the number of atoms in the protein model that can be fit under the distance θ from the corresponding atoms in the experimentally determined structure. It has long been known that the area under the graph of this function (GDT_A) can serve as a reliable, single numerical measure of the model quality. Unfortunately, while the well-known GDT_TS metric provides a crude approximation of GDT_A, no algorithm currently exists that is capable of computing accurate estimates of GDT_A. Methods: We prove that GDT_A is well defined and that it can be approximated by the Riemann sums, using available methods for computing accurate (near-optimal) GDT function values. Results: In contrast to the GDT_TS metric, GDT_A is neither insensitive to large nor oversensitive to small changes in model’s coordinates. Moreover, the problem of computing GDT_A is tractable. More specifically, GDT_A can be computed in cubic asymptotic time in the size of the protein model. Conclusions: This paper presents the first algorithm capable of computing the near-optimal estimates of the area under the GDT function for a protein model. We believe that the techniques implemented in our algorithm will pave ways for the development of more practical and reliable procedures for estimating 3D model quality. Keywords: Protein structure, Structure modeling, Structure prediction, Model quality Background Advances in the area of protein three-dimensional structure prediction depend on the ability to accurately measure the quality of a protein model. One of the most popular and most reliable measure of the protein model quality is GDT_TS. It is defined as the average value of GDT_Pθ computed for four distance cutoffs θ = 2i, i = 0, 3, where GDT_Pθ is the percentage of model residues (represented by their Cα atoms) that can be placed under θ ångströms from the corresponding residues in the experimental structure [1, 2]. In a “high-accuracy” version of GDT_TS, denoted by GDT_HA, the distance cutoffs are cut in half (θ = 2i, i = −1, 2) [3]. In both approaches, the underlying assumption is that the experimental (crystallographic or NMR) structure is close to *Correspondence: [email protected] Department of Computer Science, University of Northern Iowa, 305 ITTC, Cedar Falls, Iowa 50613, USA

the real (native) structure (which is sometimes not true due to experimental errors). Several methods exist for computing GDT_TS. The LGA algorithm [4] can estimate GDT_TS quickly, but those estimates deviate from the true GDT_TS values in about 10 % of the cases [5]. Rigorous algorithms for computing GDT_TS have also been developed [6–9], but they are com

Data Loading...

A polynomial time algorithm for computing the area under a GDT curve

Recommend Documents

Area Under the Curve Across All Time

Area Under the Curve (AUC)

A Polynomial-Time-Delay and Polynomial-Space Algorithm for Enumeration Problems in Multi-criteria Optimization

Evaluation of a trough-only extrapolated area under the curve vancomycin dosing method on clinical outcomes

Computing the equisingularity type of a pseudo-irreducible polynomial

A real-time integrated optimization of the aircraft holding time and rerouting under risk area

A Polynomial Interior-Point Algorithm for Monotone Linear Complementarity Problems

Polynomial-time

Experimental Design for Estimating Area under the Curve by Trapezoidal Approximation in Destructive Sampling

Mycophenolic acid area under the concentration-time curve is associated with therapeutic response in childhood-onset lup

Estimation of the area under concentration-time curve of polymyxin B based on limited sampling concentrations in Chinese

A Revenue-Based Service Management Algorithm for Vehicular Cloud Computing