AL CHEMISTRY
Comprehensive Mass Spectrometric Mapping of Chemical Compounds for the Development of Algorithms for Machine Learning and Artificial Intelligence
J. V. Burykina^a, D. A. Boiko^a,b, V. V. Ilyushenkova^a, D. B. Eremin^a, and Academician V. P. Ananikov^a,b,*
Received April 3, 2020; revised May 27, 2020; accepted June 5, 2020
Abstract: The influence of the accuracy of mass measurements on the number of possible structural compositions and on the computation time of computer-aided interpretation of mass spectrometric data has been evaluated. Experimental measurements have been performed for two model objects in the range of small and medium masses using high, ultrahigh, and extreme high resolution electrospray ionization mass spectrometers. The number of possible solutions has been examined, and the prospects of using machine learning in combination with mass spectrometry for predicting new data on reaction mechanisms and searching for hidden relationships in the chemical space have been demonstrated. It has been shown that the relationship between the number of possible molecular formulas and the mass determination error takes one of two forms depending on the ion mass: a nonlinear curve is observed for small molecules and a linear relationship is observed for large molecules.
Keywords: mass spectrometry, FT-ICR-MS, ESI-MS, machine learning, artificial intelligence
DOI: 10.1134/S0012501620050024
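The dependence of the number of possible compositions on mass accuracy can be illustrated with a minimal sketch. It is not the procedure used in this work: it assumes a brute-force search over neutral CcHhNnOo compositions only, a simple hydrogen cap (h ≤ 2c + n + 2) as the sole chemical filter, and hypothetical names (`candidate_formulas`, `max_n`, `max_o`). Ion adducts such as [M+H]+ and other elements are ignored.

```python
import math

# Monoisotopic masses (u) of the most abundant isotopes of C, H, N, O.
MASS = {"C": 12.0, "H": 1.00782503, "N": 14.00307400, "O": 15.99491462}


def candidate_formulas(target_mass, ppm, max_n=10, max_o=10):
    """Return (c, h, n, o) tuples whose neutral monoisotopic mass lies
    within +/- ppm of target_mass (illustrative brute-force search)."""
    tol = target_mass * ppm * 1e-6  # absolute tolerance in u
    hits = []
    max_c = int((target_mass + tol) // MASS["C"])
    for c in range(1, max_c + 1):
        for n in range(max_n + 1):
            for o in range(max_o + 1):
                heavy = c * MASS["C"] + n * MASS["N"] + o * MASS["O"]
                rest = target_mass - heavy  # mass left for hydrogens
                if rest < -tol:
                    break  # adding more oxygen only overshoots further
                # Hydrogen counts that bring the total inside the window.
                h_lo = max(0, math.ceil((rest - tol) / MASS["H"]))
                # Assumed plausibility cap: no more H than a saturated
                # acyclic compound could carry.
                h_hi = min(2 * c + n + 2, math.floor((rest + tol) / MASS["H"]))
                for h in range(h_lo, h_hi + 1):
                    hits.append((c, h, n, o))
    return hits


if __name__ == "__main__":
    # Caffeine, C8H10N4O2, has a monoisotopic mass of about 194.0804 u.
    for ppm in (100, 10, 1, 0.1):
        n_hits = len(candidate_formulas(194.0804, ppm))
        print(f"{ppm:>6} ppm: {n_hits} candidate formulas")
```

Running the loop shows how the candidate list shrinks as the tolerance tightens. The same function can be applied to heavier targets to compare the trend at small and large masses, although the run time of this naive search grows quickly with mass.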
and then try to select an algorithm that describes the relationship between them as accurately as possible. The number of parameters in these algorithms can be enormous: neural networks for image classification, such as ResNet, Inception, and EfficientNet, have tens of millions of parameters [8]. Since the decision rules are not specified directly, the algorithms find the hidden dependencies in the data that are needed to solve the problem. These technologies are becoming increasingly integrated into our lives, for example in face recognition [9], self-driving cars [10], and natural language processing such as text translation [11].
Today, mass spectrometry is one of the most important analytical methods for studying the composition and structure of chemical compounds [1–4]. It is distinguished by high sensitivity, down to 10⁻¹⁸ M for routine measurements [5]. The combination of versatility and high sensitivity has made it possible to identify the components of complex mixtures. The most important advantage of mass spectrometric analysis is that a large body of data on the object under study can be accumulated at a high rate: thousands to tens of thousands of individual signals are recorded in a spectrum within a short period of time [6, 7]. Thus, three key factors (versatility, high sensitivity, and high-throughput acquisition of large amounts of data) make mass spectrometry one of the most approachable experimental methods for the development of machine learning and artificial intelligence algorithms.
In recent years, machine learning has been increasingly used by researchers to gain new informa