The characteristics of data produced by omics technologies determine the feasibility and effectiveness of the computational methods applied in downstream analyses, such as data harmonization and differential abundance analysis. Variability in these characteristics across datasets also leads to diverging outcomes in benchmarking studies, which guide the selection of appropriate analysis methods in all omics fields. In addition, downstream analysis tools are often developed and applied within specific omics communities because each omics technology is presumed to produce data with distinct characteristics. In this study, we investigate over ten thousand datasets to understand how proteomics, metabolomics, lipidomics, transcriptomics, and microbiome data vary in specific data characteristics. We show patterns of data characteristics that are specific to the investigated omics types and provide a tool that enables researchers to assess how representative a given omics dataset is of its respective discipline. Moreover, we illustrate how data characteristics can affect analyses, using the example of normalization in the presence of sample-dependent proportions of missing values. Given the variability of omics data characteristics, we encourage the systematic inspection of these characteristics in benchmark studies and for downstream analyses to prevent suboptimal method selection and unintended bias.
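
The interaction between normalization and sample-dependent missingness mentioned in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' analysis or tool: the data are simulated, missingness is assumed to remove the lowest-abundance features at a different rate in each sample, and median normalization is chosen only as a familiar example. The function names `missing_fraction_per_sample`, `median_normalize`, and `simulate` are hypothetical.

```python
import numpy as np

def missing_fraction_per_sample(x):
    """One simple data characteristic: fraction of NaN values per sample (column)."""
    return np.isnan(x).mean(axis=0)

def median_normalize(x):
    """Subtract each sample's median of observed values (log-scale data assumed)."""
    return x - np.nanmedian(x, axis=0, keepdims=True)

def simulate(missing_rates, n_features=2000, seed=0):
    """Simulated features-by-samples matrix with identical true abundances in
    every sample; low-abundance features are set to NaN at sample-specific rates."""
    rng = np.random.default_rng(seed)
    abundance = rng.normal(20.0, 2.0, size=n_features)
    data = np.tile(abundance[:, None], (1, len(missing_rates)))
    for j, rate in enumerate(missing_rates):
        cutoff = np.quantile(abundance, rate)
        data[abundance < cutoff, j] = np.nan  # abundance-dependent missingness
    return data

rates = [0.0, 0.1, 0.3, 0.5]
data = simulate(rates)
normalized = median_normalize(data)

# The most abundant feature is observed in every sample and has the same
# true value everywhere, so any spread after normalization is an artifact.
top = np.nanargmax(data[:, 0])
print("missing fraction per sample:", missing_fraction_per_sample(data))
print("sample medians:", np.nanmedian(data, axis=0))
print("top feature after normalization:", normalized[top])
```

In this sketch, samples with more missing low-abundance values have higher observed medians, so median normalization shifts them further down and a feature with no true difference appears to differ between samples. Inspecting the per-sample missing fraction first is one way to notice this situation before choosing a normalization method.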

Source
http://dx.doi.org/10.1038/s41598-025-87256-5
