Optimal treatments depend on numerous factors such as drug chemical properties, disease biology, and patient characteristics to which the treatment is applied. To realize the promise of AI in healthcare, there is a need for designing systems that can capture patient heterogeneity and relevant biomedical knowledge. Here we present PlaNet, a geometric deep learning framework that reasons over population variability, disease biology, and drug chemistry by representing knowledge in the form of a massive clinical knowledge graph that can be enhanced by language models.
Analysis of single-cell datasets generated from diverse organisms offers unprecedented opportunities to unravel fundamental evolutionary processes of conservation and diversification of cell types. However, interspecies genomic differences limit the joint analysis of cross-species datasets to homologous genes. Here we present SATURN, a deep learning method for learning universal cell embeddings that encodes genes' biological properties using protein language models.
The intestine is a complex organ that promotes digestion, extracts nutrients, participates in immune surveillance, maintains critical symbiotic relationships with microbiota and affects overall health. The intestine has a length of over nine metres, along which there are differences in structure and function. The localization of individual cell types, cell type development trajectories and detailed cell transcriptional programs probably drive these differences in function.
Aging is characterized by a decline in tissue function, but the underlying changes at cellular resolution across the organism remain unclear. Here, we present the Aging Fly Cell Atlas, a single-nucleus transcriptomic map of the whole aging Drosophila. We characterized 163 distinct cell types and performed an in-depth analysis of changes in tissue cell composition, gene expression, and cell identities.
Analysis of single-cell datasets generated from diverse organisms offers unprecedented opportunities to unravel fundamental evolutionary processes of conservation and diversification of cell types. However, inter-species genomic differences limit the joint analysis of cross-species datasets to homologous genes. Here, we present SATURN, a deep learning method for learning universal cell embeddings that encodes genes' biological properties using protein language models.
Pac Symp Biocomput
December 2022
The following sections are included: Introduction, Understanding and Predicting Molecular Networks, Making Use of Family Structure, Applying Traditional Graph Algorithms to Novel Tasks, Representing Uncertainty in Networks, Conclusion, References.
Accurate cell-type annotation from spatially resolved single cells is crucial to understand functional spatial biology that is the basis of tissue organization. However, current computational methods for annotating spatially resolved single-cell data are typically based on techniques established for dissociated single-cell technologies and thus do not take spatial organization into account. Here we present STELLAR, a geometric deep learning method for cell-type discovery and identification in spatially resolved single-cell datasets.
Drosophila melanogaster olfactory neurons have long been thought to express only one chemosensory receptor gene family. There are two main olfactory receptor gene families in Drosophila, the odorant receptors (ORs) and the ionotropic receptors (IRs). The dozens of odorant-binding receptors in each family require at least one co-receptor gene in order to function: Orco for ORs, and Ir25a, Ir8a, and Ir76b for IRs.
For more than 100 years, the fruit fly has been one of the most studied model organisms. Here, we present a single-cell atlas of the adult fly, Tabula Drosophilae, that includes 580,000 nuclei from 15 individually dissected sexed tissues as well as the entire head and body, annotated to >250 distinct cell types. We provide an in-depth analysis of cell type-related gene signatures and transcription factor markers, as well as sexual dimorphism, across the whole animal.
Single-cell technologies are rapidly generating large amounts of data that enable us to understand biological systems at single-cell resolution. However, joint analysis of datasets generated by independent labs remains challenging due to a lack of consistent terminology to describe cell types. Here, we present OnClass, an algorithm and accompanying software for automatically classifying cells into cell types that are part of the controlled vocabulary that forms the Cell Ontology.
Recognition of environmental cues is essential for the survival of all organisms. Transcriptional changes occur to enable the generation and function of the neural circuits underlying sensory perception. To gain insight into these changes, we generated single-cell transcriptomes of olfactory- (ORNs), thermo-, and hygro-sensory neurons at an early developmental and adult stage using single-cell and single-nucleus RNA sequencing.
Neurons undergo substantial morphological and functional changes during development to form precise synaptic connections and acquire specific physiological properties. What are the underlying transcriptomic bases? Here, we obtained the single-cell transcriptomes of olfactory projection neurons (PNs) at four developmental stages. We decoded the identity of 21 transcriptomic clusters corresponding to 20 PN types and developed methods to match transcriptomic clusters representing the same PN type across development.
Although tremendous effort has been put into cell-type annotation, identification of previously uncharacterized cell types in heterogeneous single-cell RNA-seq data remains a challenge. Here we present MARS, a meta-learning approach for identifying and annotating known as well as new cell types. MARS overcomes the heterogeneity of cell types by transferring latent cell representations across multiple datasets.
A synthesis of phenotypic and quantitative genomic traits is provided for bacteria and archaea, in the form of a scripted, reproducible workflow that standardizes and merges 26 sources. The resulting unified dataset covers 14 phenotypic traits, 5 quantitative genomic traits, and 4 environmental characteristics for approximately 170,000 strain-level and 15,000 species-aggregated records. It spans all habitats including soils, marine and fresh waters and sediments, host-associated and thermal.
In many applications, high-dimensional data points can be well represented by low-dimensional subspaces. To identify the subspaces, it is important to capture both the global and the local structure of the data, which is achieved by imposing low-rank and sparseness constraints on the data representation matrix. In low-rank sparse subspace clustering (LRSSC), the nuclear norm and the ℓ1-norm are used to measure rank and sparsity, respectively.
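As a rough illustration of the two measures named above, the sketch below computes the nuclear norm (sum of singular values, a convex stand-in for rank) and the ℓ1-norm (sum of absolute entries, a convex stand-in for the number of nonzeros) of a small representation matrix. The function names, the weights `lam` and `tau`, and the toy matrix are assumptions made for illustration only; this is not the authors' implementation and it omits the optimization that LRSSC performs.

```python
import numpy as np

def nuclear_norm(C):
    # Sum of singular values: convex surrogate for rank(C).
    return np.linalg.svd(C, compute_uv=False).sum()

def l1_norm(C):
    # Sum of absolute entries: convex surrogate for the count of nonzeros.
    return np.abs(C).sum()

def combined_penalty(C, lam=1.0, tau=1.0):
    # Hypothetical weighted sum of the two penalties; lam and tau are
    # illustrative trade-off weights, not values taken from the paper.
    return lam * nuclear_norm(C) + tau * l1_norm(C)

# Toy rank-1 matrix with four nonzero entries (assumed example data).
C = np.outer([1.0, 2.0, 0.0], [3.0, 0.0, 4.0])

print("rank:", np.linalg.matrix_rank(C))
print("nuclear norm:", nuclear_norm(C))
print("l1 norm:", l1_norm(C))
print("combined penalty:", combined_penalty(C))
```

For a rank-1 matrix the nuclear norm equals its single nonzero singular value, so both penalties stay small when the matrix is simultaneously low-rank and sparse.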
Bacteria and Archaea display a variety of phenotypic traits and can adapt to diverse ecological niches. However, systematic annotation of prokaryotic phenotypes is lacking. We have therefore developed ProTraits, a resource containing ∼545 000 novel phenotype inferences, spanning 424 traits assigned to 3046 bacterial and archaeal species.
The amino acid composition (AAC) of proteomes differs greatly between microorganisms and is associated with the environmental niche they inhabit, suggesting that these changes may be adaptive. Similarly, the oligonucleotide composition of genomes varies and may confer advantages at the DNA/RNA level. These influences overlap in protein-coding sequences, making it difficult to gauge their relative contributions.