Batch effects (BEs) are systematic technical differences in data collection that are unrelated to biological variation; the noise they introduce has been shown to reduce the generalizability of machine learning (ML) models. Here we release CohortFinder (http://cohortfinder.com), an open-source tool that mitigates BEs via data-driven cohort partitioning. We demonstrate that CohortFinder improves ML model performance in downstream digital pathology and medical image processing tasks. CohortFinder is freely available for download at cohortfinder.com.
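
The idea of data-driven cohort partitioning can be illustrated with a minimal Python sketch. It assumes that each sample has already been assigned to a batch-effect group by some upstream step (for example, clustering image-quality metrics), and it splits every group proportionally so that both partitions cover all groups. This is an illustration under those assumptions, not CohortFinder's actual implementation: the function name `partition_by_group`, its parameters, and the sample identifiers are hypothetical and are not taken from the tool's code or API.

```python
import random
from collections import defaultdict


def partition_by_group(sample_ids, group_labels, test_fraction=0.2, seed=0):
    """Split samples so that batch-effect groups are spread over both partitions.

    group_labels is assumed to come from an upstream step (for example,
    clustering per-image quality metrics); that step is not shown here.
    """
    if len(sample_ids) != len(group_labels):
        raise ValueError("sample_ids and group_labels must have equal length")

    # Collect the samples that belong to each (hypothetical) batch-effect group.
    groups = defaultdict(list)
    for sample_id, label in zip(sample_ids, group_labels):
        groups[label].append(sample_id)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, test = [], []
    for label in sorted(groups):
        members = sorted(groups[label])
        rng.shuffle(members)
        # Send roughly test_fraction of the group to the test partition.
        n_test = round(len(members) * test_fraction)
        # Groups with two or more samples contribute to both partitions;
        # a single-sample group goes entirely to training.
        if len(members) >= 2:
            n_test = min(max(n_test, 1), len(members) - 1)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


if __name__ == "__main__":
    # Toy data: 20 hypothetical slides spread evenly over 4 groups.
    ids = [f"slide_{i:02d}" for i in range(20)]
    labels = [i % 4 for i in range(20)]
    train_ids, test_ids = partition_by_group(ids, labels, test_fraction=0.2)
    print(len(train_ids), len(test_ids))
```

Splitting within each group, rather than over the pooled samples, is the design choice that matters here: a purely random split can leave an entire group out of the training or the test partition, which is the situation a batch-effect-aware split is meant to avoid.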

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11216973
DOI: http://dx.doi.org/10.1038/s44303-024-00018-2

Similar Publications

mettannotator: a comprehensive and scalable Nextflow annotation pipeline for prokaryotic assemblies.

Bioinformatics

January 2025

European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom.

Summary: In recent years there has been a surge in prokaryotic genome assemblies, coming from both isolated organisms and environmental samples. These assemblies often include novel species that are poorly represented in reference databases, creating a need for a tool that can annotate both well-described and novel taxa and can run at scale. Here, we present mettannotator, a comprehensive, scalable Nextflow pipeline for prokaryotic genome annotation that identifies coding and non-coding regions, predicts protein functions, including antimicrobial resistance, and delineates gene clusters.

Glycans associated with biopharmaceutical drugs play crucial roles in drug safety and efficacy, and therefore, their reliable detection and quantification is essential. Our study introduces a multi-level quantification approach for glycosylation analysis in monoclonal antibodies (mAbs), focusing on minor abundant glycovariants. Mass spectrometric data is evaluated mainly employing open-source software tools.

Background: The Every Newborn Action Plan (ENAP) indicators are essential in monitoring neonatal healthcare coverage and quality. The District Health Information System (DHIS2), an open-source platform in over 80 countries, supports health data collection and analysis, enabling progress tracking at national and subnational levels. This study evaluates the availability and quality of maternal and newborn health indicators, explicitly focusing on ENAP indicators within Tanzania's DHIS2.

The ReAct project: Analysis of data from 23 different laboratories to characterise DNA recovery given two sets of activity level propositions.

Forensic Sci Int Genet

January 2025

Bundeskriminalamt, Wiesbaden, Germany; International Commission on Missing Persons, The Hague, The Netherlands.

The ReAct (Recovery, Activity) project is an ENFSI (European Network of Forensic Science Institutes) supported initiative comprising a large consortium of laboratories. Here, the results from more than 23 laboratories are presented. The primary purpose was to design experiments simulating typical casework circumstances; collect data and to implement Bayesian networks to assess the value (i.

Electron tomography is a powerful tool for the three-dimensional characterization of materials at the nano- and atomic-scales. A typical workflow for tomography involves several pre-processing steps that may include spatial binning, image registration, and tilt-axis alignment depending upon the nature of the acquired data. Here we describe the capabilities of a new, open-source software package named ETSpy that builds upon the widely used HyperSpy package.
