Predicting the absence of an unknown compound in a mass spectral database.

Eur J Mass Spectrom (Chichester)

Chemistry Department, Lomonosov Moscow State University, Moscow, Russia.

Published: December 2019

Only a small subset of known organic compounds amenable to gas chromatography/mass spectrometry is present in even the largest mass spectral databases (such as NIST or Wiley). Nevertheless, commercially available library search algorithms cannot predict that a compound is absent from the database. In the present work, we attempted to make such predictions by means of supervised classification. The training and validation sets contained 1500 and 750 compounds, respectively. Two prediction sets (containing 750 and about 3000 mass spectra) were considered. The easiest-to-use models were built with only one input variable: the match factor of the best candidate or the InLib factor (both parameters were calculated within the MS Search (NIST) software). Multivariate classification models were built by partial least squares discriminant analysis (PLS-DA), with the match factors of the top n candidates used as input variables. PLS-DA was found to be the most effective approach. The prediction efficiency depended strongly on the 'uniqueness' of the mass spectra in the test set. The PLS-DA model correctly predicted the absence of a compound from the database in 29.9% of cases for prediction set #1 and in 74.4% of cases for prediction set #2 (only 1.3% and 2.5% of the compounds actually present in the database were wrongly classified).
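
The single-variable models mentioned in the abstract can be pictured as a cutoff on the best candidate's match factor: if even the top library hit scores below the cutoff, the compound is predicted to be absent from the database. The Python sketch below is only an illustration of that idea, not the authors' procedure. The rule for choosing the cutoff, the function names `pick_threshold` and `predict_absent`, and the toy match factors are all assumptions made for this example; PLS-DA over the top n candidates is not shown.

```python
from typing import Sequence


def pick_threshold(
    best_match_factors: Sequence[float],
    in_library: Sequence[bool],
    max_false_absent: float = 0.0,
) -> float:
    """Choose a match-factor cutoff from labelled training spectra.

    A compound is later predicted "absent" when its best match factor is
    strictly below the cutoff. The cutoff is chosen so that at most
    `max_false_absent` (a fraction) of the training compounds that ARE in
    the library fall below it. This selection rule is an assumption.
    """
    present = sorted(
        mf for mf, known in zip(best_match_factors, in_library) if known
    )
    if not present:
        raise ValueError("need at least one in-library training compound")
    # Number of in-library compounds we allow to be misclassified as absent.
    allowed = int(max_false_absent * len(present))
    return present[allowed]


def predict_absent(best_match_factor: float, threshold: float) -> bool:
    """Return True when the compound is predicted to be absent from the library."""
    return best_match_factor < threshold


# Hypothetical training data: (best match factor, compound is in the library).
train_mf = [930, 880, 910, 760, 850, 640, 700, 790, 820, 560]
train_in_lib = [True, True, True, True, True, False, False, False, False, False]

threshold = pick_threshold(train_mf, train_in_lib, max_false_absent=0.0)
detected_absent = sum(
    predict_absent(mf, threshold)
    for mf, known in zip(train_mf, train_in_lib)
    if not known
)
```

With a cutoff chosen this way, no in-library training compound is flagged as absent, at the cost of missing absent compounds whose best hit still scores above the cutoff. That is the same trade-off the abstract reports between correctly predicted absences and wrongly classified in-library compounds.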

Source: http://dx.doi.org/10.1177/1469066719855503
