Multimodal Learning for Mapping the Genotype-Phenotype Dynamics.

Res Sq

Institute for Medical Engineering and Science, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.

Published: May 2024

How complex phenotypes emerge from intricate gene expression patterns is a fundamental question in biology. Quantitative characterization of this relationship, however, is challenging due to the vast combinatorial possibilities and dynamic interplay between genotype and phenotype landscapes. Integrating high-content genotyping approaches such as single-cell RNA sequencing with advanced learning methods such as language models offers an opportunity to dissect this complex relationship. Here, we present a computational integrated genetics framework designed to analyze and interpret the high-dimensional landscape of genotypes and their associated phenotypes simultaneously. We applied this approach to develop a multimodal foundation model to explore the genotype-phenotype relationship manifold for human transcriptomics at the cellular level. Analysis of this joint manifold refined the resolution of cellular heterogeneity, enhanced the precision of phenotype annotation, and uncovered potential cross-tissue biomarkers that are undetectable through conventional gene expression analysis alone. Moreover, our results revealed that the gene networks are characterized by scale-free patterns and show context-dependent gene-gene interactions, both of which result in significant variations in the topology of the gene network, particularly evident during aging. Finally, utilizing contextualized embeddings, we investigated gene polyfunctionality, which illustrates the multifaceted roles that genes play in different biological processes, and demonstrated it for the VWF gene in endothelial cells. Overall, this study advances our understanding of the dynamic interplay between gene expression and phenotypic manifestation and demonstrates the potential of integrated genetics in uncovering new dimensions of cellular function and complexity.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11118704
DOI: http://dx.doi.org/10.21203/rs.3.rs-4355413/v1

Publication Analysis

Top Keywords

gene expression (12)
dynamic interplay (8)
integrated genetics (8)
gene (7)
multimodal learning (4)
learning mapping (4)
mapping genotype-phenotype (4)
genotype-phenotype dynamics (4)
dynamics complex (4)
complex phenotypes (4)

Similar Publications

Acute inflammation induces acute megakaryopoiesis with impaired platelet production during fetal hematopoiesis.

Development

January 2025

Institute for Regenerative Medicine, State Key Laboratory of Cardiology and Medical Innovation Center, Shanghai East Hospital, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China.

Hematopoietic development is tightly regulated by various factors. The role of RNA m6A modification during fetal hematopoiesis, particularly in megakaryopoiesis, remains unclear. Here, we demonstrate that loss of m6A methyltransferase METTL3 induces formation of double-stranded RNAs (dsRNAs) and activates acute inflammation during fetal hematopoiesis.

Mining Silent Biosynthetic Gene Clusters for Natural Products in Filamentous Fungi.

Chem Biodivers

January 2025

Zhejiang University, Polytechnic Institute, 866 Yuhangtang Road, Hangzhou, China.

Filamentous fungi are of great interest because of their powerful metabolic capabilities and their potential to produce a wide variety of secondary metabolites as natural products (NPs), some of which have been developed into pharmaceuticals. Furthermore, high-throughput genome sequencing has revealed a large number of cryptic NPs that remain underexplored. Building on the development of in silico genome mining, various techniques have been introduced to rationally modify filamentous fungi, awakening silent biosynthetic gene clusters (BGCs) and revealing NPs that were originally cryptic.

Unlabelled: a natural inhabitant of the human body, is a promising candidate vehicle for vaccine delivery. An obstacle in developing bacterial delivery vehicles is generating a production strain that lacks antibiotic resistance genes and contains minimal foreign DNA. To overcome this obstacle, we have constructed a fine-tuned, inducible two-plasmid CRISPR/Cas9 system for chromosomal gene insertion in .

One key determinant of HIV-1 latency reversal is the activation of the viral long terminal repeat (LTR) by cellular transcription factors such as NF-κB and AP-1. Interestingly, the activity of these two transcription factors can be modulated by glucocorticoid receptors (GRs). Furthermore, the HIV-1 genome contains multiple binding sites for GRs.

Plant growth-promoting rhizobacterium Sp7 utilizes fructose efficiently via a fructose phosphotransferase system (Fru-PTS). Its genome encodes two putative Fru-PTS, each consisting of FruB (EIIA), FruK (Pfk), and FruA (EIIBC) proteins. We compared the proteomes of Sp7 grown with malate or fructose as sole carbon source, and noticed upregulation of the constituent proteins of Fru-PTS1 only on fructose.
