Motivation: Unsupervised clustering is important in disease subtyping, among other genomic applications. As genomic data have become more multifaceted, how to cluster across data sources for more precise subtyping has become an increasingly important area of research. Many of the methods proposed so far, including iCluster and Cluster of Cluster Assignments (COCAs), make the unreasonable assumption of a common clustering across all data sources; the methods that do not are fewer and tend to be computationally intensive.
Results: We propose a Bayesian parametric model for integrative, unsupervised clustering across data sources. In our two-way latent structure model, samples are clustered in relation to each specific data source, which distinguishes the model from methods like COCAs and iCluster, but cluster labels have across-dataset meaning, allowing cluster information to be shared between data sources. A common scaling across data sources is not required. Inference is obtained by a Gibbs sampler, which we improve with a warm start strategy and modified density functions to make convergence faster and more robust. Posterior interpretation allows for inference on common clusterings occurring among subsets of data sources. An interesting statistical formulation of the model results in sampling from closed-form posteriors despite the incorporation of a complex latent structure. We fit the model with Gaussian and more general densities, a choice that influences the degree of across-dataset cluster label sharing. Uniquely among integrative clustering models, our formulation makes no nestedness assumptions of samples across data sources, so a sample missing data from one genomic source can be clustered according to its existing data sources. We apply our model to a Norwegian breast cancer cohort of ductal carcinoma in situ and invasive tumors, comprising somatic copy-number alteration (CNA), methylation and expression datasets. We find enrichment in the Her2 subtype and ductal carcinoma among those observations exhibiting greater cluster correspondence across expression and CNA data. In general, there are few pan-genomic clusterings, suggesting that models assuming a common clustering across genomic data sources might yield misleading results.
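To illustrate what sampling from closed-form posteriors with a warm start can look like, the following is a minimal sketch of a Gibbs sampler for a plain one-dimensional Gaussian mixture. It is not the two-way latent structure model and not the twl package's code: the function name `gibbs_gmm`, the known variance `sigma2`, the prior variance `tau2`, the equal mixture weights, and the quantile-based warm start are all assumptions chosen for illustration.

```python
import math
import random


def gibbs_gmm(x, k=2, n_iter=200, sigma2=1.0, tau2=100.0, seed=0):
    """Gibbs sampler for a 1-D Gaussian mixture (illustrative sketch only).

    Assumptions (not from the paper): known common variance sigma2,
    equal mixture weights, and a Normal(0, tau2) prior on each cluster mean.
    """
    rng = random.Random(seed)

    # Warm start (assumed strategy): place initial means at spread-out
    # quantiles of the data instead of drawing them at random.
    xs = sorted(x)
    mu = [xs[int((j + 0.5) * len(xs) / k)] for j in range(k)]
    z = [0] * len(x)

    for _ in range(n_iter):
        # Step 1: sample each label from its closed-form conditional,
        # which is proportional to the Gaussian density under each mean.
        for i, xi in enumerate(x):
            logw = [-(xi - m) ** 2 / (2.0 * sigma2) for m in mu]
            top = max(logw)  # subtract the max for numerical stability
            w = [math.exp(lw - top) for lw in logw]
            z[i] = rng.choices(range(k), weights=w)[0]

        # Step 2: sample each mean from its conjugate normal posterior.
        for j in range(k):
            members = [xi for xi, zi in zip(x, z) if zi == j]
            prec = len(members) / sigma2 + 1.0 / tau2
            post_mean = (sum(members) / sigma2) / prec
            mu[j] = rng.gauss(post_mean, math.sqrt(1.0 / prec))

    # Return the final draw of labels and means (one posterior sample).
    return z, mu
```

Both updates draw from standard distributions, so no accept/reject step is needed; this is the property the abstract refers to as closed-form posteriors, shown here in a far simpler setting than the authors' model.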
Availability and Implementation: The model is implemented in an R package called twl ('two-way latent'), available on CRAN. Data for analysis are available within the R package.
Supplementary Information: Supplementary data are available at Bioinformatics online.
DOI: http://dx.doi.org/10.1093/bioinformatics/btz381
Phys Rev Lett
December 2024
Center for Quantum Information, Korea Institute of Science and Technology (KIST), Seoul 02792, Korea and Division of Quantum Information Technology, KIST School, Korea University of Science and Technology, Seoul 02792, Korea.
High-dimensional multipartite entanglement plays a crucial role in quantum information science. However, existing schemes for generating such entanglement become complex and costly as the dimension of quantum units increases. In this Letter, we overcome the limitation by proposing a significantly enhanced linear optical heralded scheme that generates the d-level N-partite Greenberger-Horne-Zeilinger (GHZ) state with single-photon sources and linear operations.
JMIR Med Inform
January 2025
Sungkyunkwan University, Seoul, Republic of Korea.
Background: Mental health chatbots have emerged as a promising tool for providing accessible and convenient support to individuals in need. Building on our previous research on digital interventions for loneliness and depression among Korean college students, this study addresses the limitations identified and explores more advanced artificial intelligence-driven solutions.
Objective: This study aimed to develop and evaluate the performance of HoMemeTown Dr.
JMIR Res Protoc
January 2025
Department of Women's and Children's Health, Participatory eHealth and Health Data Research Group, Uppsala University, Uppsala, Sweden.
Background: Digital health interventions have become increasingly popular in recent years, expanding the possibilities for treatment for various patient groups. In clinical research, while the design of the intervention receives close attention, challenges with research participant engagement and retention persist. This may be partially due to the use of digital health platforms, which may lack adequacy for participants.
J Med Internet Res
January 2025
Cancer Rehabilitation and Survivorship, Department of Supportive Care, Princess Margaret Cancer Centre, Toronto, ON, Canada.
Background: Virtual follow-up (VFU) has the potential to enhance cancer survivorship care. However, a greater understanding is needed of how VFU can be optimized.
Objective: This study aims to examine how, for whom, and in what contexts VFU works for cancer survivorship care.
J Med Internet Res
January 2025
College of Public Health, The Ohio State University, Columbus, OH, United States.
Background: Young gay, bisexual, and other men who have sex with men have been referred to as a "hard-to-reach" or "hidden" community in terms of recruiting for research studies. With widespread internet use among this group and young adults in general, web-based avenues represent an important approach for reaching and recruiting members of this community. However, little is known about how participants recruited from various web-based sources may differ from one another.