The rise of machine learning (ML) has created an explosion in the potential strategies for using data to make scientific predictions. For physical scientists wishing to apply ML strategies to a particular domain, it can be difficult to assess in advance what strategy to adopt within a vast space of possibilities. Here we outline the results of an online community-powered effort to swarm search the space of ML strategies and develop algorithms for predicting atomic-pairwise nuclear magnetic resonance (NMR) properties in molecules. Using an open-source dataset, we worked with Kaggle to design and host a 3-month competition which received 47,800 ML model predictions from 2,700 teams in 84 countries. Within 3 weeks, the Kaggle community produced models with comparable accuracy to our best previously published 'in-house' efforts. A meta-ensemble model constructed as a linear combination of the top predictions has a prediction accuracy which exceeds that of any individual model, 7-19x better than our previous state-of-the-art. The results highlight the potential of transformer architectures for predicting quantum mechanical (QM) molecular properties.
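The meta-ensemble idea, a linear combination of several models' predictions, can be illustrated with a minimal sketch. The abstract does not say how the combination weights were chosen, so the ordinary least-squares fit below is an assumption, as are the function names, the synthetic data, and the mean-squared-error metric; it is not the authors' procedure.

```python
import numpy as np


def fit_linear_ensemble(preds, y):
    """Return least-squares weights w such that preds @ w approximates y.

    preds: array of shape (n_samples, n_models), one column per model.
    y:     array of shape (n_samples,), the reference values.
    Assumption: plain least squares with no intercept and no constraints.
    """
    weights, *_ = np.linalg.lstsq(preds, y, rcond=None)
    return weights


def ensemble_predict(preds, weights):
    """Blend the per-model predictions with the fitted weights."""
    return preds @ weights


def mse(a, b):
    """Mean squared error between two arrays."""
    return float(np.mean((a - b) ** 2))


# Synthetic stand-in data (hypothetical): three models that each see the
# true target plus independent noise of a different size.
rng = np.random.default_rng(0)
y_true = rng.normal(size=500)
noise_scales = [0.3, 0.4, 0.5]
model_preds = np.column_stack(
    [y_true + rng.normal(scale=s, size=y_true.size) for s in noise_scales]
)

weights = fit_linear_ensemble(model_preds, y_true)
blend = ensemble_predict(model_preds, weights)

individual_mse = [mse(model_preds[:, j], y_true) for j in range(model_preds.shape[1])]
ensemble_mse = mse(blend, y_true)

print("weights:", weights)
print("individual MSE:", individual_mse)
print("ensemble MSE:", ensemble_mse)
```

On the data used for fitting, the blended error cannot exceed that of the best single model, because using one model alone is itself one possible choice of weights. In practice the weights would be fitted on held-out data so the comparison is not biased by overfitting.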
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8291653 | PMC |
http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0253612 | PLOS |
Invest Ophthalmol Vis Sci
January 2025
Institute for Applied Mathematics, University of Bonn, Bonn, Germany.
Purpose: To quantify outer retina structural changes and define novel biomarkers of inherited retinal degeneration associated with biallelic mutations in RPE65 (RPE65-IRD) in patients before and after subretinal gene augmentation therapy with voretigene neparvovec (Luxturna).
Methods: Application of advanced deep learning for automated retinal layer segmentation, specifically tailored for RPE65-IRD. Quantification of five novel biomarkers for the ellipsoid zone (EZ): thickness, granularity, reflectivity, and intensity.
Rheumatol Int
January 2025
Stroke Monitoring and Diagnostic Division, AtheroPoint™, Roseville, CA, 95661, USA.
Women are disproportionately affected by chronic autoimmune diseases (AD) like systemic lupus erythematosus (SLE), scleroderma, rheumatoid arthritis (RA), and Sjögren's syndrome. Traditional evaluations often underestimate the associated cardiovascular disease (CVD) and stroke risk in women having AD. Vitamin D deficiency increases susceptibility to these conditions.
J Clin Sleep Med
January 2025
Division of Pulmonary, Critical Care, and Sleep Medicine, UC San Diego, San Diego, CA.
Continuous positive airway pressure (CPAP) is the treatment of choice for obstructive sleep apnea (OSA); however, some people have residual respiratory events or require significantly higher CPAP pressure while on therapy. Our objective was to develop predictive models for CPAP outcomes and assess whether the inclusion of physiological traits enhances prediction. We constructed predictive models from baseline information for subsequent residual apnea-hypopnea index (AHI) and optimal CPAP pressure.
J Chem Phys
January 2025
Department of Applied Physics, Aalto University, P.O. Box 11000, FI-00076 Aalto, Finland.
Active learning (AL) has shown promise to be a particularly data-efficient machine learning approach. Yet, its performance depends on the application, and it is not clear when AL practitioners can expect computational savings. Here, we carry out a systematic AL performance assessment for three diverse molecular datasets and two common scientific tasks: compiling compact, informative datasets and targeted molecular searches.
mSphere
December 2024
Department of Bioengineering, University of California, San Diego, La Jolla, California, USA.
Thousands of complete genome sequences for strains of a species that are now available enable the advancement of pangenome analytics to a new level of sophistication. We collected 2,377 publicly available complete genomes of for detailed pangenome analysis. The core genome and accessory genomes consisted of 2,398 and 5,182 genes, respectively.