Machine learning to predict retention time of small molecules in nano-HPLC.

Anal Bioanal Chem

Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Nobel Str., 3, 121205, Moscow, Russia.

Published: November 2020

Retention time is an important parameter for compound identification in untargeted LC-MS screening. Precise retention time prediction facilitates annotation and is well established in proteomics. For small molecules, however, prediction accuracy was long limited by the lack of available experimental data. Recently introduced large databases of small-molecule retention times make reliable machine learning-based predictions possible across a wide diversity of compounds, and simple projections can extend these predictions to other LC systems and conditions. In this work, we describe a combined approach to predicting retention times for nano-HPLC. It applies binary and regression gradient boosting models in sequence, both trained on the METLIN small-molecule dataset, and then projects the results onto nano-HPLC separations using a small number of easily available compounds. The proposed model achieves a mean absolute error of 46 s, outperforming previous machine learning-based attempts. After transfer to nano-LC conditions, the median absolute error is below 155 s (10.8% median relative error). To illustrate the applicability of the approach, we eliminated on average 25 to 42% of false positives using a filter threshold derived from ROC curves. The proposed approach should therefore be used alongside other well-established in silico methods, and their integration may broaden the range of correctly identified molecules.
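The projection and filtering steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. It assumes the projection is a straight-line least-squares fit (the abstract only calls it "simple"), and it treats the model-scale retention times as already produced by an upstream predictor. All function names, the calibration values, and the 155-s threshold used in the usage lines are hypothetical placeholders. The paper derives its actual threshold from ROC curves.

```python
from statistics import mean, median


def fit_linear_projection(predicted, measured):
    """Least-squares line mapping model-scale RT (s) to the target LC system (s)."""
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need at least two paired calibration points")
    mx, my = mean(predicted), mean(measured)
    sxx = sum((x - mx) ** 2 for x in predicted)
    if sxx == 0:
        raise ValueError("calibration predictions must not all be equal")
    slope = sum((x - mx) * (y - my) for x, y in zip(predicted, measured)) / sxx
    return slope, my - slope * mx


def project(rt, slope, intercept):
    """Apply the fitted line to one model-scale retention time."""
    return slope * rt + intercept


def median_abs_error(projected, measured):
    """Median absolute error between projected and measured retention times."""
    return median(abs(p - m) for p, m in zip(projected, measured))


def filter_candidates(candidates, observed_rt, slope, intercept, threshold_s):
    """Keep candidate names whose projected RT lies within threshold_s of the observed RT."""
    return [
        name
        for name, predicted_rt in candidates
        if abs(project(predicted_rt, slope, intercept) - observed_rt) <= threshold_s
    ]


# Hypothetical calibration compounds: model-scale vs. measured nano-HPLC RT, in seconds.
cal_pred = [100.0, 200.0, 300.0, 400.0]
cal_meas = [250.0, 450.0, 650.0, 850.0]
slope, intercept = fit_linear_projection(cal_pred, cal_meas)

# Hypothetical annotation candidates for one feature observed at 360 s.
candidates = [("candidate_A", 150.0), ("candidate_B", 320.0)]
kept = filter_candidates(candidates, 360.0, slope, intercept, threshold_s=155.0)
```

With these made-up numbers, the candidate whose projected retention time falls far from the observed one is discarded, which is the false-positive filtering idea the abstract refers to.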

Source
http://dx.doi.org/10.1007/s00216-020-02905-0

Similar Publications

Identification of plant-based spilled oils using direct analysis in real-time-time-of-flight mass spectrometry with hydrophobic paper sampling.

Environ Monit Assess

January 2025

Science and Technology Branch, Pacific Environmental Science Centre, Environment and Climate Change Canada, Pacific and Yukon Laboratory for Environmental Testing, North Vancouver, BC, Canada.

Spilled plant-based oils behave very differently in comparison to petroleum oils and require different clean-up measures. They do not evaporate, disperse, dissolve, or emulsify to a significant degree but can polymerize and form an impermeable cap on sediment, smothering benthic media and resulting in an immediate impact on the wildlife community. The current study explored the application of rapid up-to-date direct analysis in real time (DART) with high-resolution mass spectrometry for plant-based oil typing.

Purpose: This study aims to evaluate the effects of taper angle and the number of insertion-removal cycles on the retention force of 4 mol% yttria partially stabilized zirconia (4Y-PSZ) double crowns over time.

Materials And Methods: Primary and secondary crowns were fabricated using 4Y-PSZ with taper angles of 2°, 4°, and 6° (n=15). Retention force during crown removal was measured after applying 50-N and 100-N loads.

Integrating augmented reality (AR) and virtual reality (VR) into dental surgery education and practice has significantly advanced the precision and interactivity of dental training and patient care. This narrative review summarizes findings from extensive literature searches conducted in PubMed, Cochrane Library, and Embase, highlighting AR and VR technologies transformative impact and current applications. Research shows that AR improves surgical precision by offering real-time data overlays during procedures, leading to better outcomes in operations like dental implant placements.

-Related Muscular Dystrophies, LGMD, and TMD, in an Estonian Family Caused by the Finnish Founder Variant.

Neurol Genet

December 2024

From the Institute of Clinical Medicine (K.Õ., T.R., E.Õ.-S., L.M., S. Pajusalu), Faculty of Medicine, University of Tartu; Genetics and Personalized Medicine Clinic (K.Õ., T.R., L.M., Sander Pajusalu); Children's Clinic (E.O.-S.); Pathology Department (S. Puusepp), Tartu University Hospital, Estonia; Folkhalsan Research Center (M.S., B.U.), Helsinki; and Tampere Neuromuscular Center (B.U.), Tampere, Finland.

Background And Objectives: Tibial muscular dystrophy (TMD) is an autosomal dominant, slowly progressive late-onset distal myopathy. TMD was first described in 1991 by Udd et al. in Finnish patients, who were later found to harbor a heterozygous unique 11-bp insertion/deletion in the last exon of the gene, the Finnish founder variant (FINmaj).

While gas chromatography mass spectrometry (GC-MS) has long been used to identify compounds in complex mixtures, this process is often subjective and time-consuming and leaves a large fraction of seemingly good-quality spectra unidentified. In this work, we describe a set of new mass spectral library-based methods to assist compound identification in complex mixtures. These methods employ mass spectral uniqueness and compound ubiquity of library entries alongside noise reduction and automated comparison of retention indices to library compounds.
