Computational methods in biology can infer large molecular interaction networks from multiple data sources and at different resolutions, creating unprecedented opportunities to explore the mechanisms driving complex biological phenomena. Networks can be built to represent distinct conditions and compared to uncover graph-level differences, for example patterns of gene-gene interactions that change between biological states. Given the importance of the graph comparison problem, there is a clear and growing need for robust and scalable methods that can identify meaningful differences. We introduce node2vec2rank (n2v2r), a method for graph differential analysis that ranks nodes according to the disparities of their representations in joint latent embedding spaces. Improving upon previous bag-of-features approaches, we take advantage of recent advances in machine learning and statistics to compare graphs in higher-order structures and in a data-driven manner. Formulated as a multi-layer spectral embedding algorithm, n2v2r is computationally efficient, incorporates stability as a key feature, and can provably identify the correct ranking of differences between graphs, in an overall procedure that adheres to veridical data science principles. By better adapting to the data, n2v2r clearly outperformed the commonly used node degree in finding complex differences in simulated data. In real-world applications (breast cancer subtype characterization, analysis of the cell cycle in single-cell data, and a search for sex differences in lung adenocarcinoma), n2v2r found meaningful biological differences that enable hypothesis generation for therapeutic candidates. Software and analysis pipelines implementing n2v2r and used for the analyses presented here are publicly available.
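
The idea of ranking nodes by how far their representations move in a joint embedding space can be illustrated with a short sketch. This is not the authors' released implementation. It assumes one common form of multi-layer spectral embedding, in which the two adjacency matrices are placed side by side and a single truncated SVD yields a shared latent space. The function names, the `dim` parameter, and the toy graphs are hypothetical and used only for illustration.

```python
import numpy as np


def rank_nodes_by_embedding_shift(adj_a, adj_b, dim=2):
    """Rank nodes by the distance between their embeddings in two graphs.

    Assumption: a joint embedding is obtained from the SVD of the
    column-concatenated ("unfolded") adjacency matrices. This is an
    illustrative stand-in, not the published n2v2r code.
    """
    adj_a = np.asarray(adj_a, dtype=float)
    adj_b = np.asarray(adj_b, dtype=float)
    n = adj_a.shape[0]

    # Place both graphs side by side: shape (n, 2n).
    unfolded = np.hstack([adj_a, adj_b])

    # One SVD gives a latent space shared by both graphs.
    _, s, vt = np.linalg.svd(unfolded, full_matrices=False)

    # Right embeddings scaled by sqrt of singular values: shape (2n, dim).
    right = vt[:dim].T * np.sqrt(s[:dim])
    emb_a, emb_b = right[:n], right[n:]

    # Per-node disparity is the Euclidean distance between the two embeddings.
    disparity = np.linalg.norm(emb_a - emb_b, axis=1)
    ranking = np.argsort(-disparity)  # most changed node first
    return ranking, disparity


def two_clique_demo(m=5):
    """Toy pair of graphs (hypothetical): node 0 moves from clique 1 to clique 2."""
    n = 2 * m + 1
    base = np.zeros((n, n))
    base[1:m + 1, 1:m + 1] = 1.0   # clique 1 on nodes 1..m
    base[m + 1:, m + 1:] = 1.0     # clique 2 on nodes m+1..2m
    np.fill_diagonal(base, 0.0)    # no self loops
    adj_a, adj_b = base.copy(), base.copy()
    adj_a[0, 1:m + 1] = adj_a[1:m + 1, 0] = 1.0  # graph A: node 0 joins clique 1
    adj_b[0, m + 1:] = adj_b[m + 1:, 0] = 1.0    # graph B: node 0 joins clique 2
    return adj_a, adj_b


adj_a, adj_b = two_clique_demo(5)
ranking, disparity = rank_nodes_by_embedding_shift(adj_a, adj_b, dim=2)
print(ranking[:3], np.round(disparity[:3], 3))
```

In this toy pair, node 0 has the same degree in both graphs, so a degree-based comparison would not flag it, while its embedding shifts the most and it is ranked first.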

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11212899
DOI: http://dx.doi.org/10.1101/2024.06.16.599201

Publication Analysis

Top Keywords

graph differential: 8
differential analysis: 8
data: 5
differences: 5
node2vec2rank: 4
node2vec2rank large: 4
large scale: 4
scale stable: 4
stable graph: 4
analysis: 4

Similar Publications

Introduction: Colorectal cancer (CRC) is the third most commonly diagnosed cancer in the United States (U.S.).

Alzheimer's disease is a progressive neurological condition marked by a decline in cognitive abilities. Early diagnosis is crucial but challenging due to overlapping symptoms among impairment stages, necessitating non-invasive, reliable diagnostic tools. We applied information geometry and manifold learning to analyze grayscale MRI scans classified into No Impairment, Very Mild, Mild, and Moderate Impairment.

Hypertension is a critical risk factor and cause of mortality in cardiovascular diseases, and it remains a global public health issue. Therefore, understanding its mechanisms is essential for treating and preventing hypertension. Gene expression data is an important source for obtaining hypertension biomarkers.

Background/objectives: This research investigates brain connectivity patterns in reaction to social and non-social stimuli within a virtual reality environment, emphasizing their impact on cognitive functions, specifically working memory.

Methods: Employing the LEiDA framework with EEG data from 47 participants, I examined dynamic brain network states elicited by social avatars compared to non-social stick cues during a VR memory task. Through the integration of LEiDA with deep learning and graph theory analyses, unique connectivity patterns associated with cue type were discerned, underscoring the substantial influence of social cues on cognitive processes.

Single-cell RNA sequencing (scRNA-seq) offers remarkable insights into cellular development and differentiation by capturing the gene expression profiles of individual cells. The role of dimensionality reduction and visualization in the interpretation of scRNA-seq data has gained wide acceptance. However, current methods face several challenges, including incomplete structure-preserving strategies and high distortion in embeddings, and as a result fail to effectively model complex cell trajectories with multiple branches.
