
Journal of Cheminformatics Open Access

RESEARCH ARTICLE

Maximum common property: a new approach for molecular similarity

Aurelio Antelo‑Collado1, Ramón Carrasco‑Velar1*, Nicolás García‑Pedrajas2 and Gonzalo Cerruela‑García2

Abstract

The maximum common property similarity (MCPhd) method is presented as a new approach that uses descriptors to determine the similarity between two chemical compounds or molecular graphs. The method uses the concept of maximum common property, which arises from the concept of maximum common substructure, and is based on the electrotopographic state index for atoms. A new algorithm to quantify the similarity values of chemical structures based on the presented maximum common property concept is also developed in this paper. To verify the validity of this approach, the similarity of a sample of compounds with antimalarial activity is calculated and compared with the results obtained by four different similarity methods: the small molecule subgraph detector (SMSD), the molecular fingerprint-based method (OBabel_FP2), ISIDA descriptors and shape-feature similarity (SHAFTS). The results obtained by the MCPhd method differ significantly from those obtained by the compared methods, improving the quantification of the similarity. A major advantage of the proposed method is that it helps to understand the analogy or proximity between physicochemical properties of the molecular fragments or subgraphs compared with the biological response or biological activity. In this new approach, more than one property can potentially be used. The method can be considered a hybrid procedure because it combines the descriptor and fragment approaches.

Keywords: Maximum common property, Electrotopographic state index, Molecular similarity, Tanimoto function, Maximum common structure

Introduction

Molecular similarity is one of the most explored and employed concepts in cheminformatics (also called chemical informatics or chemoinformatics) [1]. Moreover, it is currently one of the central subjects in medicinal chemistry research [1, 2].
*Correspondence: [email protected]
1 University of Informatics Science, Carretera San Antonio de los Baños Km. 2 1/2, Boyeros, La Habana, Cuba, Havana, Cuba
Full list of author information is available at the end of the article

Molecular similarity can be evaluated using different approaches, which can be classified into two principal categories: those based on descriptors and those based on substructures [3]. To estimate similarity among molecules, it is necessary to identify those structural or chemical/physical properties that are useful to correlate and then predict the relationships among them. Similarity calculations based on molecular descriptors use fingerprint representations [3, 4]. These representations can be encoded by either topological or topographic descriptors. Topological descriptors are the most popular because the 2D representation of molecules is computationally less demanding to work with than the 3D representation [1]. This work proposes a different approach in contrast with what