Data-fusion in clustering microarray data: balancing discovery and interpretability.

IEEE/ACM Trans Comput Biol Bioinform

Department of Public Health Sciences, University of Toronto, Health Sciences Building, Toronto, Ontario, Canada.

Published: May 2010

While clustering genes remains one of the most popular exploratory tools for expression data, it often results in highly variable and biologically uninformative clusters. This paper explores a data-fusion approach to clustering microarray data. Our method, which combines expression data and Gene Ontology (GO)-derived information, is applied to a real data set to perform genome-wide clustering. A set of novel tools is proposed to validate the clustering results and pick a fair value of the infusion coefficient. These tools measure stability, biological relevance, and distance from the expression-only clustering solution. Our results indicate that data-fusion clustering leads to more stable, biologically relevant clusters that are still representative of the experimental data.
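The abstract does not give the fusion formula, so the following is only a minimal sketch under stated assumptions: the fused dissimilarity is taken to be a convex combination of an expression-based distance and a GO-based distance, weighted by a hypothetical infusion coefficient `lam`. The toy clustering routine (single linkage with a fixed cutoff) and the Rand index used to compare against the expression-only solution are illustrative stand-ins, not the paper's own tools, and the distance matrices are made-up values for four genes.

```python
from itertools import combinations


def fuse_distances(d_expr, d_go, lam):
    """Assumed fusion rule: (1 - lam) * expression distance + lam * GO distance."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    n = len(d_expr)
    return [[(1.0 - lam) * d_expr[i][j] + lam * d_go[i][j] for j in range(n)]
            for i in range(n)]


def threshold_clusters(dist, cutoff):
    """Toy single-linkage clustering: link items whose distance is <= cutoff."""
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Follow parent links to the cluster representative, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, j in combinations(range(n), 2):
        if dist[i][j] <= cutoff:
            parent[find(i)] = find(j)
    return [find(i) for i in range(n)]


def rand_index(labels_a, labels_b):
    """Fraction of item pairs on which two clusterings agree (same/different)."""
    pairs = list(combinations(range(len(labels_a)), 2))
    agree = sum((labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
                for i, j in pairs)
    return agree / len(pairs)


# Hypothetical distances for four genes (illustrative values only).
d_expr = [[0.0, 0.1, 0.9, 0.9],
          [0.1, 0.0, 0.9, 0.9],
          [0.9, 0.9, 0.0, 0.6],
          [0.9, 0.9, 0.6, 0.0]]
d_go = [[0.0, 0.2, 0.9, 0.9],
        [0.2, 0.0, 0.9, 0.9],
        [0.9, 0.9, 0.0, 0.1],
        [0.9, 0.9, 0.1, 0.0]]

fused = fuse_distances(d_expr, d_go, lam=0.5)
expr_only = threshold_clusters(d_expr, cutoff=0.5)  # lam = 0 baseline
with_go = threshold_clusters(fused, cutoff=0.5)
print("agreement with expression-only solution:", rand_index(expr_only, with_go))
```

In this toy case the GO information pulls genes 2 and 3 into one cluster that expression alone left apart, and the agreement score quantifies how far the fused solution has moved from the expression-only one. Scanning `lam` over a grid and tracking such a score alongside stability and biological relevance is one plausible way to pick a value, in the spirit of the trade-off the abstract describes.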

Source
http://dx.doi.org/10.1109/TCBB.2007.70267

Publication Analysis

Top Keywords

data-fusion clustering: 8
clustering microarray: 8
microarray data: 8
expression data: 8
data: 7
clustering: 6
data balancing: 4
balancing discovery: 4
discovery interpretability: 4
interpretability clustering: 4

Similar Publications

Breast cancer is the leading cancer among women, with a significant number experiencing recurrence and metastasis, thereby reducing survival rates. This study focuses on the role of long noncoding RNAs (lncRNAs) in breast cancer immunotherapy response. We conducted an analysis involving 1027 patients from Sun Yat-sen Memorial Hospital, Sun Yat-sen University, and The Cancer Genome Atlas, utilizing RNA sequencing and pathology whole-slide images.

Background: DNA microarrays provide informative data for transcriptional profiling and identifying gene expression signatures to help prevent progression of latent tuberculosis infection (LTBI) to active disease. However, constructing a prognostic model for distinguishing LTBI from active tuberculosis (ATB) is very challenging due to the noisy nature of data and lack of a generally stable analysis approach.

Methods: In the present study, we proposed an accurate predictive model with the help of data fusion at the decision level.

IPFMC: an iterative pathway fusion approach for enhanced multi-omics clustering in cancer research.

Brief Bioinform

September 2024

Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, No. 1 Shizishan Street, Hongshan District, Wuhan 430070, People's Republic of China.

Article Synopsis
  • Using multi-omics data for cancer subtyping is essential for precision medicine, but current methods often fall short in performance and biological relevance.
  • The Iterative Pathway Fusion approach for enhanced Multi-omics Clustering (IPFMC) improves clustering by incorporating biological pathways in two stages: first by selecting crucial pathways for data representation, and then by fusing similarity networks from multiple omics sources.
  • IPFMC shows superior performance compared to ten other methods in experiments with nine cancer datasets, demonstrating both effective clustering and meaningful biological insights from the identified pathways and genes.

[Establishment and application of infectious disease monitoring, early warning and disposal system].

Zhonghua Yu Fang Yi Xue Za Zhi

October 2024

Hangzhou Gupo Medical Technology Co., Ltd, Hangzhou 311215, China.

Using big data and artificial intelligence to establish a multi-point monitoring, early warning, and disposal system to achieve early warning and intervention of infectious disease outbreaks is an important means of controlling the spread of the epidemic. Taking Xiaoshan district as an example, this study analyzes the monitoring contents, warning methods, and application effectiveness of the infectious disease monitoring, early warning and disposal system. Based on Xiaoshan's health big data resources, the system starts with syndrome, disease diagnosis and etiology.

Article Synopsis
  • Accurate drug response prediction is critical in precision medicine, and existing models face challenges like limited data and poor adaptability to new drugs and cell lines.
  • TransCDR is introduced as a solution that uses transfer learning and self-attention to effectively combine drug and cell line data, outperforming previous models in predicting drug responses.
  • The system shows significant promise, demonstrating strong results on various datasets and revealing important biological insights related to drug sensitivity and resistance.
