Background: The NIST retention index database is one of the most widely used sources of retention indices. In both untargeted analysis and machine learning studies, filtering for potential errors is limited or nonexistent. According to our estimates, about 80% of the compounds in both the NIST 17 and NIST 20 retention index databases have only one RI value per stationary phase, which makes it impossible to search for erroneous values with statistical methods. Manual inspection is also impractical because the database contains more than 300 000 entries.
Results: We suggest a two-step, machine-learning-based procedure to find potentially erroneous retention indices. The first step is to use five predictive models to obtain predicted retention index values for the whole database. The second is to compare these predicted values against the experimental ones. We consider a retention index erroneous if its prediction accuracy (judged by the difference between the predicted and experimental values) falls in the worst 5% for all five models simultaneously. Using this method, we detected 2093 outlier entries for standard and semi-standard non-polar stationary phases in the NIST 17 retention index database; 566 of those were corrected or removed by the developers in NIST 20.
Significance: This is a novel approach to find potentially erroneous entries in a large-scale database with mostly unique entries, which can be applied not only to retention indices. The procedure can help filter and report mishandled data to improve the quality of the dataset for machine learning applications and experimental use.
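The consensus rule described under Results can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `flag_consensus_outliers`, the use of absolute error as the accuracy measure, and the synthetic data are all assumptions made for illustration, and the five "models" are simulated as noisy copies of the true values rather than trained predictors.

```python
import numpy as np


def flag_consensus_outliers(experimental, predictions, worst_fraction=0.05):
    """Flag entries whose error is in the worst `worst_fraction` for every model.

    experimental: array of shape (n_entries,)
    predictions:  array of shape (n_models, n_entries)
    Absolute error is an assumed accuracy measure, chosen for illustration.
    """
    experimental = np.asarray(experimental, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    abs_err = np.abs(predictions - experimental)
    # Per-model cutoff: errors above the (1 - worst_fraction) quantile are "worst".
    cutoffs = np.quantile(abs_err, 1.0 - worst_fraction, axis=1, keepdims=True)
    in_worst = abs_err > cutoffs
    # An entry is flagged only if all models agree.
    return in_worst.all(axis=0)


# Synthetic demonstration (hypothetical data, not NIST values).
rng = np.random.default_rng(0)
n_entries, n_models = 1000, 5
true_ri = rng.uniform(600, 3000, size=n_entries)

experimental = true_ri.copy()
experimental[:10] += 500.0  # simulate 10 mishandled database entries

# Stand-ins for five predictive models: true value plus independent noise.
predictions = true_ri + rng.normal(0.0, 20.0, size=(n_models, n_entries))

flags = flag_consensus_outliers(experimental, predictions)
print("flagged entries:", np.flatnonzero(flags))
```

Requiring agreement across all models is what keeps the flagged set small: each model alone marks 5% of the entries, but an entry that is merely hard for one model to predict is unlikely to sit in the worst 5% of the other four as well.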
DOI: http://dx.doi.org/10.1016/j.aca.2024.342375
PLoS One
December 2024
Faculty of Music and Performing Arts, Universiti Pendidikan Sultan Idris, Tanjong Malim, Perak, Malaysia.
The rise of quality education has led to increased attention on music training as a vital means of enhancing personal qualities. However, with numerous music training institutions competing in the market, distinguishing oneself has become an urgent challenge. This study explores the key factors influencing customers' willingness to renew their enrollment at music training institutions through a questionnaire survey.
The ability to maintain a diverse scientific workforce is vital to promoting the US's economic and technological competitiveness. Data have shown disparities in science, mathematics, medical, and engineering programs across each level of education from high school to doctoral studies for students from underrepresented groups (URG). Research suggests that many URG students are pushed out of the biomedical track early in their academic careers, particularly during the first year.
J Vet Intern Med
December 2024
Faculty of Veterinary Medicine, Department of Small Animals, Ghent University, Merelbeke, Belgium.
Background: Although gut-derived uremic toxins are increased in azotemic chronic kidney disease (CKD) in cats and implicated in disease progression, it remains unclear if augmented formation or retention of these toxins is associated with the development of renal azotemia.
Objectives: Assess the association between gut-derived toxins (ie, indoxyl-sulfate, p-cresyl-sulfate, and trimethylamine-N-oxide [TMAO]) and the onset of azotemic CKD in cats.
Animals: Forty-eight client-owned cats.
Commun Biol
December 2024
Institute of Biomedical Sciences, Academia Sinica, Taipei, Taiwan.
Brain-derived neurotrophic factor (BDNF) plays important roles in brain development and neural function. Constitutive knockout of the splicing regulator RBM4 reduces BDNF expression in the developing brain and causes cerebellar hypoplasia, an autism-like feature. Here, we show that Rbm4 knockout induced intron 6 retention of Hsf1, leading to downregulation of HSF1 protein and its downstream target BDNF.
Environ Monit Assess
December 2024
Department of Civil and Environmental Engineering, Birla Institute of Technology, Mesra, Jharkhand, India, 835215.
Microplastics (MPs) seriously threaten soil quality and crop health, particularly in agricultural systems using plastic mulch and sewage sludge, with their abundance being strongly influenced by soil properties such as texture, structure, and chemical content. Considering this, the present study assessed MP contamination in arid agricultural soils, focusing on their abundance, morphology, composition, and association with heavy metals to evaluate environmental risks. Soil samples were collected from ten plastic-mulched fields and a control site across a 50 sq.