Genome-wide association studies (GWASs) have identified tens of thousands of disease-associated variants and provided critical insights into developing effective treatments. However, limited sample sizes have hindered the discovery of variants for uncommon and rare diseases. Here, we introduce KGWAS, a novel geometric deep learning method that leverages a massive functional knowledge graph across variants and genes to significantly improve detection power in small-cohort GWASs.
The cell is arguably the most fundamental unit of life and is central to understanding biology. Accurate modeling of cells is important for this understanding as well as for determining the root causes of disease. Recent advances in artificial intelligence (AI), combined with the ability to generate large-scale experimental data, present novel opportunities to model cells.
Drug repurposing, identifying new therapeutic uses for approved drugs, is often a serendipitous and opportunistic endeavour to expand the use of drugs for new diseases. The clinical utility of drug-repurposing artificial intelligence (AI) models remains limited because these models focus narrowly on diseases for which some drugs already exist. Here we introduce TxGNN, a graph foundation model for zero-shot drug repurposing, identifying therapeutic candidates even for diseases with limited treatment options or no existing drugs.
Powder X-ray diffraction (PXRD) is a cornerstone technique in materials characterization. However, complete structure determination from PXRD patterns alone remains time-consuming and is often intractable, especially for novel materials. Current machine learning (ML) approaches to PXRD analysis predict only a subset of the total information that comprises a crystal structure.
Optimal treatments depend on numerous factors such as drug chemical properties, disease biology, and patient characteristics to which the treatment is applied. To realize the promise of AI in healthcare, there is a need for designing systems that can capture patient heterogeneity and relevant biomedical knowledge. Here we present PlaNet, a geometric deep learning framework that reasons over population variability, disease biology, and drug chemistry by representing knowledge in the form of a massive clinical knowledge graph that can be enhanced by language models.
Analysis of single-cell datasets generated from diverse organisms offers unprecedented opportunities to unravel fundamental evolutionary processes of conservation and diversification of cell types. However, interspecies genomic differences limit the joint analysis of cross-species datasets to homologous genes. Here we present SATURN, a deep learning method for learning universal cell embeddings that encodes genes' biological properties using protein language models.
A long-standing expectation is that large, dense and cosmopolitan areas support socioeconomic mixing and exposure among diverse individuals. Assessing this hypothesis has been difficult because previous measures of socioeconomic mixing have relied on static residential housing data rather than real-life exposures among people at work, in places of leisure and in home neighbourhoods. Here we develop a measure of exposure segregation that captures the socioeconomic diversity of these everyday encounters.
Understanding cellular responses to genetic perturbation is central to numerous biomedical applications, from identifying genetic interactions involved in cancer to developing methods for regenerative medicine. However, the combinatorial explosion in the number of possible multigene perturbations severely limits experimental interrogation. Here, we present graph-enhanced gene activation and repression simulator (GEARS), a method that integrates deep learning with a knowledge graph of gene-gene relationships to predict transcriptional responses to both single and multigene perturbations using single-cell RNA-sequencing data from perturbational screens.
The intestine is a complex organ that promotes digestion, extracts nutrients, participates in immune surveillance, maintains critical symbiotic relationships with microbiota and affects overall health. The intestine has a length of over nine metres, along which there are differences in structure and function. The localization of individual cell types, cell type development trajectories and detailed cell transcriptional programs probably drive these differences in function.
Aging is characterized by a decline in tissue function, but the underlying changes at cellular resolution across the organism remain unclear. Here, we present the Aging Fly Cell Atlas, a single-nucleus transcriptomic map of the whole aging Drosophila. We characterized 163 distinct cell types and performed an in-depth analysis of changes in tissue cell composition, gene expression, and cell identities.
Objective: To determine whether graph neural network-based models of electronic health records can predict specialty consultation care needs for endocrinology and hematology more accurately than the standard-of-care checklists and other conventional medical recommendation algorithms in the literature.
Methods: Demand for medical expertise far outstrips supply, with tens of millions in the US alone with deficient access to specialty care. Rather than potentially months-long delays to initiate diagnostic workup and medical treatment with a specialist, referring primary care supported by an automated recommender algorithm could anticipate and directly initiate patient evaluation that would otherwise be needed at a subsequent specialist appointment.
The exceptionally rapid development of highly flexible, reusable artificial intelligence (AI) models is likely to usher in newfound capabilities in medicine. We propose a new paradigm for medical AI, which we refer to as generalist medical AI (GMAI). GMAI models will be capable of carrying out a diverse set of tasks using very little or no task-specific labelled data.
Analysis of single-cell datasets generated from diverse organisms offers unprecedented opportunities to unravel fundamental evolutionary processes of conservation and diversification of cell types. However, inter-species genomic differences limit the joint analysis of cross-species datasets to homologous genes. Here, we present SATURN, a deep learning method for learning universal cell embeddings that encodes genes' biological properties using protein language models.
Pac Symp Biocomput
December 2022
Biological networks are powerful representations for the discovery of molecular phenotypes. Fundamental to network analysis is the principle, rooted in social networks, that nodes that interact in the network tend to have similar properties. While this long-standing principle underlies powerful methods in biology that associate molecules with phenotypes on the basis of network proximity, interacting molecules are not necessarily similar, and molecules with similar properties do not necessarily interact.
Accurate cell-type annotation from spatially resolved single cells is crucial to understand functional spatial biology that is the basis of tissue organization. However, current computational methods for annotating spatially resolved single-cell data are typically based on techniques established for dissociated single-cell technologies and thus do not take spatial organization into account. Here we present STELLAR, a geometric deep learning method for cell-type discovery and identification in spatially resolved single-cell datasets.
Artificial intelligence (AI) is poised to advance therapeutic science. Therapeutics Data Commons is an initiative to access and evaluate AI capability across therapeutic modalities and stages of discovery, establishing the foundation of which AI methods are most suitable and why.
For more than 100 years, the fruit fly has been one of the most studied model organisms. Here, we present a single-cell atlas of the adult fly, Tabula Drosophilae, that includes 580,000 nuclei from 15 individually dissected sexed tissues as well as the entire head and body, annotated to >250 distinct cell types. We provide an in-depth analysis of cell type-related gene signatures and transcription factor markers, as well as sexual dimorphism, across the whole animal.
An unhealthy diet is a major risk factor for chronic diseases including cardiovascular disease, type 2 diabetes, and cancer. Limited access to healthy food options may contribute to unhealthy diets. Studying diets is challenging, typically restricted to small sample sizes, single locations, and non-uniform design across studies, and has led to mixed results on the impact of the food environment.
In this paper we analyze the effect of shocks in production networks. Our work is based on a rich dataset that contains information about companies from Slovenia right after the financial crisis of 2008. The processed data spans 8 years and covers the transaction history as well as performance indicators and various metadata of the companies.
Single-cell technologies are rapidly generating large amounts of data that enable us to understand biological systems at single-cell resolution. However, joint analysis of datasets generated by independent labs remains challenging due to a lack of consistent terminology to describe cell types. Here, we present OnClass, an algorithm and accompanying software for automatically classifying cells into cell types that are part of the controlled vocabulary that forms the Cell Ontology.
Proc Natl Acad Sci U S A
September 2021
Deceased public figures are often said to live on in collective memory. We quantify this phenomenon by tracking mentions of 2,362 public figures in English-language online news and social media (Twitter) 1 y before and after death. We measure the sharp spike and rapid decay of attention following death and model collective memory as a composition of communicative and cultural memory.