With the rapid accumulation of large-scale multiomics data sets, integrating multiple omics data types to study disease from several perspectives can provide stronger support for precision treatment. However, because multiomics data are complex, feature redundancy and noise often receive too little attention when high-dimensional data are processed. Moreover, simple concatenation strategies may overlook the correlations between different omics data and fail to capture the unique information inherent in multiomics data. Meanwhile, deep neural networks often rely on complex structures and numerous parameters for training and inference, which makes their internal feature representations and decision-making processes difficult to interpret.

We propose MOCapsNet, an interpretable multiomics data integration method for cancer subtype classification based on self-attention and capsule networks. Specifically, a self-attention confidence learning module assesses the feature information within each omic and assigns weights to the embedded representations of the different omics groups, yielding more targeted integrated information. A capsule network structure is then employed for the final cancer classification task.

The model achieved strong performance on both tasks: 87.8% accuracy on the BRCA multiclass classification data set and 83.6% accuracy with an AUC of 88.8% on the LGG data set.

The proposed framework was tested extensively on omics data sets and consistently proved effective at integrating multiomics data. It improves classification accuracy while enhancing the interpretability of results by making full use of the feature information.
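
The abstract describes two ideas: per-omic embeddings are weighted by a learned confidence before they are integrated, and a capsule network performs the final classification. The sketch below illustrates both on toy numbers. It is not the MOCapsNet implementation. The omic names, the embedding values, the confidence scores, and the helper names `fuse_omics` and `squash` are all assumptions made for illustration. The squash nonlinearity is the standard one from the capsule network literature, and the abstract does not say which variant the authors use.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_omics(embeddings, confidence_scores):
    """Weight each omic's embedding by a softmax-normalized confidence, then sum.

    embeddings: dict mapping omic name -> list of floats (equal lengths)
    confidence_scores: dict mapping omic name -> float
        (hypothetical stand-ins for scores a trained module would produce)
    Returns (weights per omic, fused embedding).
    """
    names = list(embeddings)
    weights = softmax([confidence_scores[n] for n in names])
    dim = len(embeddings[names[0]])
    fused = [0.0] * dim
    for name, w in zip(names, weights):
        for i, value in enumerate(embeddings[name]):
            fused[i] += w * value
    return dict(zip(names, weights)), fused


def squash(vector, eps=1e-9):
    """Capsule 'squash': keeps the direction, maps the norm into [0, 1)."""
    sq_norm = sum(v * v for v in vector)
    norm = math.sqrt(sq_norm)
    scale = sq_norm / (1.0 + sq_norm) / (norm + eps)
    return [scale * v for v in vector]


# Toy inputs (assumed values, not taken from the paper).
embeddings = {
    "omic_1": [0.5, -1.0, 2.0],
    "omic_2": [1.0, 0.0, -0.5],
    "omic_3": [0.0, 0.3, 0.1],
}
confidence_scores = {"omic_1": 2.0, "omic_2": 1.0, "omic_3": 0.0}

weights, fused = fuse_omics(embeddings, confidence_scores)
capsule = squash(fused)
print("weights:", weights)
print("capsule:", capsule)
```

Because the weights are inspectable per omic, a scheme of this kind gives a simple handle on which data type drove a prediction, which is one way to read the interpretability claim in the abstract.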

Source
http://dx.doi.org/10.1021/acs.jcim.4c02130

Publication Analysis

Top Keywords

multiomics data: 24
data: 12
omics data: 12
data integration: 8
cancer subtype: 8
capsule networks: 8
data sets: 8
data set: 8
multiomics: 5
mocapsnet multiomics: 4

Similar Publications

Spondyloarthritis is a prevalent and persistent condition that significantly impacts the quality of life. Its intricate pathological mechanisms have led to a scarcity of animal models capable of replicating the disease progression in humans, making it a prominent area of research interest in the field. To delve into the pathological and physiological traits of spontaneous non-human primate spondyloarthritis, this study meticulously examined the disease features of this natural disease model through an array of techniques including X-ray imaging, MRI imaging, blood biochemistry, markers of bone metabolism, transcriptomics, proteomics, and metabolomics.

RNA velocities and generalizations emerge as powerful approaches for extracting time-resolved information from high-throughput snapshot single-cell data. Yet, several inherent limitations restrict applying the approaches to genes not suitable for RNA velocity inference due to complex transcriptional dynamics, low expression, or lacking splicing dynamics, or data of non-transcriptomic modality. Here, we present GraphVelo, a graph-based machine learning procedure that uses as input the RNA velocities inferred from existing methods and infers velocity vectors lying in the tangent space of the low-dimensional manifold formed by the single cell data.

Colorectal cancer (CRC) patients with microsatellite-stable (MSS) tumors are mostly treated with chemotherapy. Clinical benefits of targeted therapies depend on mutational states and tumor location. Many tumors carry mutations in KRAS proto-oncogene, GTPase (KRAS) or B-Raf proto-oncogene, serine/threonine kinase (BRAF), rendering them more resistant to therapies.

Polycystic ovary syndrome (PCOS) is among the most prevalent endocrine abnormalities in young women and poses a serious public health challenge. This literature review analyzes the large body of information available from numerous multi-omic studies and explores the relationships among genetic, proteomic, environmental, and other factors to better understand this multifactorial metabolic disorder.

Using a novel unsupervised method to integrate multi-omic data, we previously identified a breast cancer group with a poor prognosis. In the current study, we characterize the biological features of this subgroup, defined as the high-risk group, using various data sources. Assessment of three published hypoxia signatures showed that the high-risk group exhibited higher hypoxia scores (p < 0.
