Predicting in silico electron ionization mass spectra using quantum chemistry
Journal of Cheminformatics Open Access
RESEARCH ARTICLE
Shunyang Wang1,2, Tobias Kind1, Dean J. Tantillo2 and Oliver Fiehn1*
Abstract
Compound identification by mass spectrometry needs reference mass spectra. While there are over 102 million compounds in PubChem, fewer than 300,000 curated electron ionization (EI) mass spectra are available from the NIST or MoNA mass spectral databases. Here, we test quantum chemistry methods (QCEIMS) to generate in silico EI mass spectra (MS) by combining molecular dynamics (MD) with statistical methods. To test the accuracy of predictions, in silico mass spectra of 451 small molecules were generated and compared to experimental spectra from the NIST 17 mass spectral library. The compounds covered 43 chemical classes, ranging up to 358 Da. Organic oxygen compounds had a lower matching accuracy, while computation time increased exponentially with molecular size. The parameter space was probed to increase prediction accuracy, including initial temperatures, the number of MD trajectories and impact excess energy (IEE). Conformational flexibility was not correlated with the accuracy of predictions. Overall, QCEIMS can predict 70 eV electron ionization spectra of chemicals from first principles. Improved methods to calculate potential energy surfaces (PES) are still needed before QCEIMS mass spectra of novel molecules can be generated at large scale.

Keywords: Quantum chemistry, Similarity score, Mass spectra, QCEIMS

Introduction
Mass spectrometry is the most important analytical technique to detect and analyze small molecules. Gas chromatography coupled to mass spectrometry (GC/MS) is frequently used for such molecules and was standardized with electron ionization (EI) at 70 eV more than 50 years ago [1]. Yet, current mass spectral libraries are still insufficient in breadth and scope to identify all chemicals detected: there are only 306,622 EI-MS compound spectra in the NIST 17 mass spectral database [2], while PubChem has recorded 102 million known chemical compounds, of which 14 million are commercially available.
That means there is a large discrepancy between the number of known compounds and the number of associated reference mass spectra [3]. For example, fewer than 30% of all detected peaks can be identified in GC–MS based metabolomics [4]. To solve this problem, the size and complexity of MS libraries must be increased. Several approaches have been developed to compute 70 eV mass spectra, including machine learning [5, 6], reaction rule-based methods [7] and a method based on physical principles, the recently developed quantum chemical software Quantum Chemical Electron Ionization Mass Spectrometry (QCEIMS) [8]. While empirical and machine learning methods depend on experimental mass spectral dat

*Correspondence: [email protected]
1 West Coast Metabolomics Center, UC Davis Genome Center, University of California, 451 Health Sciences Drive, Davis, CA 95616, USA
Full list of author information is available at the end of the article
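The abstract describes comparing in silico spectra to experimental NIST 17 spectra and lists "Similarity score" as a keyword, but this excerpt does not say which score was used. As a minimal sketch, assuming a plain cosine similarity over integer m/z bins (a common choice for EI spectra, not necessarily the authors' method), such a comparison could look like the following. The function name and the toy spectra are hypothetical and are not data from the paper.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity of two spectra given as {m/z: intensity} dicts.

    Assumption: peaks are already binned to the same (e.g. integer) m/z
    values, so matching peaks share a dictionary key.
    """
    all_mz = set(spec_a) | set(spec_b)
    # A peak missing from one spectrum contributes zero to the dot product.
    dot = sum(spec_a.get(mz, 0.0) * spec_b.get(mz, 0.0) for mz in all_mz)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an empty spectrum matches nothing
    return dot / (norm_a * norm_b)


# Hypothetical toy spectra (m/z -> relative intensity), for illustration only.
predicted = {15: 20.0, 29: 100.0, 43: 60.0}
experimental = {15: 25.0, 29: 100.0, 44: 10.0}

score = cosine_similarity(predicted, experimental)
print(f"cosine similarity: {score:.3f}")
```

A score of 1 means the two spectra have identical relative intensities and 0 means they share no peaks. Library search tools often weight intensities by m/z before scoring, which this sketch deliberately leaves out.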