Moving beyond generalization to accurate interpretation of flexible models.

Nat Mach Intell

Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724.

Published: November 2020

Machine learning optimizes flexible models to predict data. In scientific applications, there is a rising interest in interpreting these flexible models to derive hypotheses from data. However, it is unknown whether good data prediction guarantees accurate interpretation of flexible models. Here we test this connection using a flexible, yet intrinsically interpretable framework for modelling neural dynamics. We find that many models discovered during optimization predict data equally well, yet they fail to match the correct hypothesis. We develop an alternative approach that identifies models with correct interpretation by comparing model features across data samples to separate true features from noise. We illustrate our findings using recordings of spiking activity from the visual cortex of behaving monkeys. Our results reveal that good predictions cannot substitute for accurate interpretation of flexible models and offer a principled approach to identify models with correct interpretation.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9708065
DOI: http://dx.doi.org/10.1038/s42256-020-00242-6

Publication Analysis

Top Keywords

flexible models: 20
accurate interpretation: 12
interpretation flexible: 12
models: 8
predict data: 8
models correct: 8
correct interpretation: 8
flexible: 6
interpretation: 5
data: 5

Similar Publications

Background: Esophageal squamous cell carcinoma (ESCC) exhibits a long latency period and has a significant geographical disparity in incidence, which underscores the need for models predicting the long-term absolute risk adaptable to regional disease burden.

Methods: 31,883 participants in a large-scale population-based screening trial (Hua County, China) were enrolled to develop the model. Severe dysplasia and above (SDA) identified at screening or follow-up were defined as the outcome.


We introduce Hydrogen-Exchange Experimental Structure Prediction (HX-ESP), a method that integrates hydrogen exchange (HX) data with molecular dynamics (MD) simulations to accurately predict ligand binding modes, even for targets requiring significant conformational changes. Benchmarking HX-ESP by fitting two ligands to PAK1 and four ligands to MAP4K1 (HPK1), and comparing the results to X-ray crystallography structures, demonstrated that HX-ESP successfully identified binding modes across a range of affinities, significantly outperforming flexible docking for ligands that require large conformational adjustments. By objectively guiding simulations with experimental HX data, HX-ESP overcomes the long timescales required for binding predictions using traditional MD.


Atomic coordinate models are important in the interpretation of 3D maps produced with cryoEM and sub-tomogram averaging in cryoET, or more generically, 3D electron microscopy (3DEM). In addition to visual inspection of such maps and models, quantitative metrics convey the reliability of the atomic coordinates, in particular how well the model is supported by the experimentally determined 3DEM map. A recently introduced metric, Q-score, was shown to correlate well with the reported resolution of the map for well-fitted models.


NOD-like receptors (NLRs) are intracellular immune receptors that detect pathogen-associated cues and trigger defense mechanisms, including regulated cell death. In filamentous fungi, some NLRs mediate heterokaryon incompatibility, a self/non-self recognition process that prevents the vegetative fusion of genetically distinct individuals, reducing the risk of parasitism. The and NLRs in are highly polymorphic incompatibility genes ( genes) whose products recognize different alleles of the gene via a sensor domain composed of WD40 repeats.


Targeted barcoding of variable antibody domains and individual transcriptomes of the human B-cell repertoire using Link-Seq.

PNAS Nexus

January 2025

Institute of Bioengineering, School of Engineering, École Polytechnique Fédérale de Lausanne (EPFL), 1015 Lausanne, Switzerland.

Here, we present Link-Seq, a highly efficient droplet microfluidic method for combined sequencing of antibody-encoding genes and the transcriptome of individual B cells at large scale. The method is based on 3' barcoding of the transcriptome and subsequent single-molecule PCR in droplets, which freely shift the barcode along specific gene regions, such as the antibody heavy- and light-chain genes. Using the immune repertoire of COVID-19 patients and healthy donors as a model system, we obtain up to 91.

