
Journal of Cheminformatics Open Access

SOFTWARE

AiZynthFinder: a fast, robust and flexible open-source software for retrosynthetic planning

Samuel Genheden1*, Amol Thakkar1,2, Veronika Chadimová1, Jean-Louis Reymond2, Ola Engkvist1 and Esben Bjerrum1*

*Correspondence: [email protected]; esben. [email protected]
1 Hit Discovery, Discovery Sciences, R&D, AstraZeneca Gothenburg, Mölndal, Sweden
Full list of author information is available at the end of the article

Abstract

We present the open-source AiZynthFinder software that can be readily used in retrosynthetic planning. The algorithm is based on a Monte Carlo tree search that recursively breaks down a molecule to purchasable precursors. The tree search is guided by an artificial neural network policy that suggests possible precursors by utilizing a library of known reaction templates. The software is fast and can typically find a solution in less than 10 s and perform a complete search in less than 1 min. Moreover, the development of the code was guided by a range of software engineering principles such as automatic testing, system design and continuous integration, leading to robust software with high maintainability. Finally, the software is well documented to make it suitable for beginners. The software is available at http://www.github.com/MolecularAI/aizynthfinder.

Keywords: Neural network, CASP, Retrosynthesis planning software, Monte Carlo tree-search, Retrosynthesis

Introduction

Synthesis planning is the process by which a chemist or a computer determines how to synthesize a specific compound. This is typically carried out by retrosynthetic analysis, where the desired compound is iteratively broken down into intermediates or smaller precursors until known or purchasable building blocks have been found. Such analysis was pioneered by Corey et al. and was traditionally carried out by hand or by using expert systems utilizing hand-encoded rules [1–3]. With the rise of deep learning in the last decade, the field of retrosynthetic software tools has undergone a swift change. Now, sophisticated and automatic algorithms have the potential to provide retrosynthetic analysis with a broader application domain and with better accuracy [4–6].
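The recursive breakdown of a target into purchasable precursors can be illustrated with a small sketch. This is not the AiZynthFinder implementation and not a Monte Carlo tree search: it is a simpler depth-limited recursive search that tries the highest-scored disconnection first. The names `STOCK`, `TEMPLATES` and `find_route` are hypothetical, the "molecules" are plain labels rather than real structures, and the numeric priors are made-up stand-ins for the scores a neural network policy would supply.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy data: labels stand in for molecules.
STOCK = {"A", "B", "C"}  # purchasable building blocks

# product -> list of (prior, precursors); the prior is an invented
# stand-in for a policy score over reaction templates.
TEMPLATES: Dict[str, List[Tuple[float, Tuple[str, ...]]]] = {
    "ABC": [(0.7, ("AB", "C")), (0.3, ("A", "BC"))],
    "AB": [(0.9, ("A", "B"))],
    "BC": [(0.8, ("B", "C"))],
}

Route = Dict[str, Tuple[str, ...]]


def find_route(target: str, max_depth: int = 5) -> Optional[Route]:
    """Return a mapping product -> precursors that ends in stock, or None."""
    if target in STOCK:
        return {}  # nothing to make: the compound can be purchased
    if max_depth == 0:
        return None  # depth budget exhausted
    candidates = sorted(
        TEMPLATES.get(target, []), key=lambda item: item[0], reverse=True
    )
    for _prior, precursors in candidates:
        route: Route = {target: precursors}
        for precursor in precursors:
            sub_route = find_route(precursor, max_depth - 1)
            if sub_route is None:
                break  # this disconnection fails; try the next one
            route.update(sub_route)
        else:
            return route  # every precursor was resolved down to stock
    return None


if __name__ == "__main__":
    print(find_route("ABC"))
```

Calling `find_route("ABC")` walks the higher-scored disconnection into `AB` and `C` first, then resolves `AB` into the stock items `A` and `B`. A tree search such as the one described in the abstract would instead balance exploration and exploitation across many candidate routes rather than committing to the first one that succeeds.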

Retrosynthesis planning algorithms can be divided into template-based and template-free approaches. In template-based approaches, reaction templates or rules that describe chemical transformations are manually encoded or derived from a database of known reactions, and subsequently applied to other compounds to create plausible reaction outcomes. Segler et al. showed that it was possible to train a neural network to prioritize templates, and subsequently use this as a policy to guide a Monte Carlo tree search algorithm that suggests synthetic pathways for a given compound [7, 8]. Template-free approaches, on the other hand, do not rely on such templates but typically treat the chemical reaction as a natural language problem, where one set of words (reactants) is transformed into another set of