Identifying genes causally linked to cancer from a multi-omics perspective is essential for understanding the mechanisms of cancer and improving therapeutic strategies. Traditional statistical and machine-learning methods that rely on generalized correlation approaches to identify cancer genes often produce redundant, biased predictions with limited interpretability, largely because they overlook confounding factors, selection biases, and the nonlinear activation functions in neural networks. In this study, we introduce a novel framework for identifying cancer genes across multiple omics domains, named ICGI (Integrative Causal Gene Identification), which leverages a large language model (LLM) prompted with causal contextual cues, in conjunction with data-driven causal feature selection. This approach demonstrates the effectiveness and potential of LLMs in uncovering cancer genes and comprehending disease mechanisms, particularly at the genomic level. However, our findings also highlight that current LLMs may not capture comprehensive information across all omics levels. Applied to transcriptomic datasets from six cancer types in The Cancer Genome Atlas and compared with state-of-the-art methods, the proposed causal feature selection module demonstrates superior capability in identifying cancer genes that distinguish cancerous from normal samples. Additionally, we have developed an online service platform that allows users to input a gene of interest and a specific cancer type. The platform provides automated results indicating whether the gene plays a significant role in cancer, along with clear and accessible explanations. Moreover, the platform summarizes the inference outcomes obtained from data-driven causal learning methods.
DOI: http://dx.doi.org/10.1093/bib/bbaf113
Bioinformatics
March 2025
Department of Statistics, Hunan University, Changsha, 410000, China.
Motivation: Inferring gene networks provides insights into biological pathways and functional relationships among genes. When gene expression samples exhibit heterogeneity, they may originate from unknown subtypes, prompting the use of a mixture Gaussian graphical model for simultaneous subclassification and gene network inference. However, this method overlooks the heterogeneity of network relationships across subtypes and does not sufficiently emphasize shared relationships.
J Immunol
March 2025
Department of Microbiology and Immunology, University of California, San Francisco, San Francisco, CA, United States.
Natural killer (NK) cells express activating receptors that signal through ITAM (immunoreceptor tyrosine-based activation motif)-bearing adapter proteins. The phosphorylation of each ITAM creates binding sites for SYK and ZAP70 protein tyrosine kinases to propagate downstream signaling including the induction of Ca2+ influx. While all immature and mature human NK cells coexpress SYK and ZAP70, clonally driven memory or adaptive NK cells can methylate SYK genes, and signaling is mediated exclusively using ZAP70.
Microbiology (Reading)
March 2025
School of Science and Technology, Nottingham Trent University, Nottingham, UK.
Novel treatment options are needed for the gastric pathogen Helicobacter pylori due to its increasing antibiotic resistance. The vitamin K analogue menadione has been extensively studied due to interest in its anti-bacterial and anti-cancer properties. Here, we investigated the effects of menadione on growth, viability, antibiotic resistance, motility and gene expression using clinical isolates.
Brief Bioinform
March 2025
School of Artificial Intelligence, Jilin University, 3003 Qianjin Street, Changchun 130012, Jilin Province, China.
Biochem Genet
March 2025
Department of Gynecology, People's Hospital of Jianshi, Enshi Tujia and Miao Autonomous Prefecture, Enshi City, Hubei Province, China.
Breast cancer is a prevalent and highly heterogeneous malignancy that continues to be a major global health concern. Voltage-gated sodium channels are primarily known for their role in neuronal excitability, but emerging evidence suggests their involvement in the pathogenesis of various cancers, including breast cancer. However, the effect of β-subunits on breast cancer cells has not yet been studied.