Omada: robust clustering of transcriptomes through multiple testing.

Gigascience

Singapore Institute for Clinical Sciences, Agency for Science, Technology and Research (A*STAR), 30 Medical Dr, 117609, Singapore, Republic of Singapore.

Published: January 2024

Background: Cohort studies increasingly collect biosamples for molecular profiling and are observing molecular heterogeneity. High-throughput RNA sequencing is providing large datasets capable of reflecting disease mechanisms. Clustering approaches have produced a number of tools to help dissect complex heterogeneous datasets, but selecting the appropriate method and parameters to perform exploratory clustering analysis of transcriptomic data requires deep understanding of machine learning and extensive computational experimentation. Tools that assist with such decisions without prior field knowledge are nonexistent. To address this, we have developed Omada, a suite of tools aiming to automate these processes and make robust unsupervised clustering of transcriptomic data more accessible through automated machine learning-based functions.

Findings: The efficiency of each tool was tested with 7 datasets characterized by different expression signal strengths to capture a wide spectrum of RNA expression datasets. Our toolkit's decisions reflected the real number of stable partitions in datasets where the subgroups are discernible. Within datasets with less clear biological distinctions, our tools either formed stable subgroups with different expression profiles and robust clinical associations or revealed signs of problematic data such as biased measurements.

Conclusions: Omada automates robust unsupervised clustering of transcriptomic data, making advanced analysis accessible and reliable even for those without extensive machine learning expertise. An implementation of Omada is available at http://bioconductor.org/packages/omada/.
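The Findings refer to the "number of stable partitions" in a dataset. As a rough illustration of what a stability-based choice of cluster count can look like, the sketch below scores each candidate k by re-clustering random subsamples and comparing the result with a reference clustering using the adjusted Rand index. This is not Omada's API or its algorithm: the k-means routine, the farthest-point initialization, the 80% subsampling scheme, the toy 2-D data, and all function names are assumptions made for illustration only.

```python
import random
from collections import Counter
from math import comb


def sq_dist(p, q):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((x - y) ** 2 for x, y in zip(p, q))


def farthest_point_init(points, k, rng):
    """Pick one random center, then repeatedly add the point farthest from all chosen centers."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    return centers


def kmeans(points, k, rng, max_iter=100):
    """Toy Lloyd's k-means (illustrative only); returns one cluster label per point."""
    centers = farthest_point_init(points, k, rng)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        new_labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Update step: move each non-empty cluster's center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def adjusted_rand_index(a, b):
    """Chance-corrected agreement between two labelings of the same items."""
    sum_ab = sum(comb(n, 2) for n in Counter(zip(a, b)).values())
    sum_a = sum(comb(n, 2) for n in Counter(a).values())
    sum_b = sum(comb(n, 2) for n in Counter(b).values())
    expected = sum_a * sum_b / comb(len(a), 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # both partitions are trivial (one cluster, or all singletons)
    return (sum_ab - expected) / (max_index - expected)


def stability(points, k, rng, n_rounds=10, frac=0.8):
    """Mean agreement between a full-data clustering and clusterings of random subsamples."""
    reference = kmeans(points, k, rng)
    size = int(frac * len(points))
    scores = []
    for _ in range(n_rounds):
        idx = rng.sample(range(len(points)), size)
        sub_labels = kmeans([points[i] for i in idx], k, rng)
        scores.append(adjusted_rand_index([reference[i] for i in idx], sub_labels))
    return sum(scores) / len(scores)


# Toy data (an assumption, not expression data): three well-separated 2-D groups.
rng = random.Random(0)
group_centers = [(0.0, 0.0), (20.0, 0.0), (0.0, 20.0)]
points = [
    (cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1))
    for cx, cy in group_centers
    for _ in range(20)
]

# Stability score for each candidate number of clusters.
scores = {k: stability(points, k, rng) for k in range(2, 6)}
for k, score in scores.items():
    print(f"k={k}: mean ARI = {score:.3f}")
```

On this deliberately easy toy data, k = 3 recovers the same three groups in every subsample and so reaches a mean adjusted Rand index of 1.0. Real transcriptomic data would be high-dimensional and far noisier, and a tie or near-tie between candidate values of k would need a further criterion that this sketch does not attempt.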

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11238428
DOI: http://dx.doi.org/10.1093/gigascience/giae039
