Artificial intelligence (AI) is revolutionizing scientific discovery because of its capability, following the neural scaling laws, to integrate and analyze large-scale datasets to mine knowledge. Foundation models, such as large language models (LLMs) and large vision models (LVMs), are among the most important foundations paving the way for general AI by pre-training on massive domain-specific datasets. Unlike the well-annotated, formatted and integrated large textual and image datasets for LLMs and LVMs, biomedical knowledge and datasets in the field of AI for Precision Health and Medicine (AI4PHM) are fragmented, with data scattered across publications and inconsistent databases that often use diverse nomenclature systems.
The triglyceride (TG)/HDL-C ratio (THR) is a surrogate predictor of hyperinsulinemia. To identify novel genetic loci for THR change over time (ΔTHR), we conducted a genome-wide association study (GWAS) and a genome-wide linkage scan (GWLS) among nondiabetic Europeans from the Long Life Family Study (n = 1,384). Subjects with diabetes or on dyslipidemia medications were excluded.
Motivation: Multi-omics data, i.e., genomics, epigenomics, transcriptomics, and proteomics, characterize complex cellular signaling systems at multiple levels and from multiple views, and provide a holistic view of complex cellular signaling pathways.
Multi-omic data can better characterize complex cellular signaling pathways from multiple views compared to individual omic data. However, integrative multi-omic data analysis to rank key disease biomarkers and infer core signaling pathways remains an open problem. In this study, our novel contributions are that we 1) developed a novel graph AI model, , for analyzing multi-omic signaling graphs (mosGraphs), 2) analyzed multi-omic mosGraph datasets of AD, and 3) identified, visualized and evaluated a set of AD-associated signaling biomarkers and network.
Generative pretrained models represent a significant advancement in natural language processing and computer vision; they can generate coherent and contextually relevant content based on pre-training on large general datasets and fine-tuning for specific tasks. Building foundation models using large-scale omic data is a promising way to decode and understand the complex signaling language patterns within cells. Unlike existing foundation models of omic data, we build a foundation model, , for multi-omic signaling (mos) graphs, in which the multi-omic data are integrated and interpreted using a multi-level signaling graph.
Studying relationships between longitudinal changes in omics variables and risks of events requires specific methodologies for joint analyses of longitudinal and time-to-event outcomes. We applied two such approaches (joint models [JM], stochastic process models [SPM]) to longitudinal metabolomics data from the Long Life Family Study focusing on understudied associations of longitudinal changes in lysophosphatidylcholines (LPC) with mortality and aging-related outcomes (23 LPC species, 5,790 measurements of each in 4,011 participants, 1,431 of whom died during follow-up). JM analyses found that higher levels of the majority of LPC species were associated with lower mortality risks, with the largest effect size observed for LPC 15:0/0:0 (hazard ratio: 0.
Background: Previous research has demonstrated potent health and survival advantages across three generations in longevity-enriched families. However, the survival advantage associated with familial longevity may manifest earlier in life than previously thought.
Methods: We conducted a matched cohort study comparing early health trajectories in third-generation grandchildren (n = 5,637) and fourth-generation great-grandchildren (n = 14,908) of longevity-enriched sibships to demographically matched births (n = 41,090) in Denmark between 1973 and 2018.
Although both short and long sleep duration are associated with elevated hypertension risk, our understanding of their interplay with biological pathways governing blood pressure remains limited. To address this, we carried out genome-wide cross-population gene-by-short-sleep and long-sleep duration interaction analyses for three blood pressure traits (systolic, diastolic, and pulse pressure) in 811,405 individuals from diverse population groups. We discover 22 novel gene-sleep duration interaction loci for blood pressure, mapped to 23 genes.
Aims/hypothesis: The triglyceride (TG)/high-density lipoprotein cholesterol (HDL-C) ratio (THR) represents a single surrogate predictor of hyperinsulinemia or insulin resistance that is associated with premature aging processes, risk of diabetes and increased mortality. To identify novel genetic loci for THR change over time (ΔTHR), we conducted a genome-wide association study (GWAS) and a genome-wide linkage scan (GWLS) among subjects of European ancestry who had complete data from two exams collected about seven years apart from the Long Life Family Study (LLFS, n=1384), a study with familial clustering of exceptional longevity in the US and Denmark.
Methods: Subjects with diabetes or using medications for dyslipidemia were excluded from this analysis.
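As a rough illustration of the phenotype described above, the sketch below computes THR as TG divided by HDL-C at each of two exams and takes ΔTHR as the simple difference between the second and first exam. The abstract does not state how ΔTHR was actually defined (for example, whether it was annualized, transformed, or covariate-adjusted), so the plain difference, the `LipidExam` type, and the function names are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class LipidExam:
    """Hypothetical container for one exam's lipid measurements."""
    triglycerides_mg_dl: float
    hdl_c_mg_dl: float


def thr(exam: LipidExam) -> float:
    """TG/HDL-C ratio (THR) for a single exam."""
    if exam.hdl_c_mg_dl <= 0:
        raise ValueError("HDL-C must be positive to form the ratio")
    return exam.triglycerides_mg_dl / exam.hdl_c_mg_dl


def delta_thr(first: LipidExam, second: LipidExam) -> float:
    """Change in THR between two exams.

    Assumption: a plain difference (second minus first); the study's
    actual definition of the change phenotype is not given in the abstract.
    """
    return thr(second) - thr(first)


# Illustrative values only, not study data.
exam_1 = LipidExam(triglycerides_mg_dl=150.0, hdl_c_mg_dl=50.0)
exam_2 = LipidExam(triglycerides_mg_dl=120.0, hdl_c_mg_dl=60.0)
print(delta_thr(exam_1, exam_2))
```

A negative value here would indicate that the ratio fell between the two exams.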
Patients with chronic kidney disease (CKD) have increased oxidative stress and chronic inflammation, which may escalate the production of advanced glycation end-products (AGEs). High soluble receptor for AGE (sRAGE) and low estimated glomerular filtration rate (eGFR) levels are associated with CKD and aging. We evaluated whether eGFR calculated from creatinine and cystatin C share pleiotropic genetic factors with sRAGE.
Recently, large-scale scRNA-seq datasets have been generated to understand the complex signaling mechanisms within the microenvironment of Alzheimer's Disease (AD), which are critical for identifying novel therapeutic targets and precision medicine. However, the background signaling networks are highly complex and interactive. It remains challenging to infer the core intra- and inter-multi-cell signaling communication networks using scRNA-seq data.
Over several years, we have developed a system for assuring the quality of whole genome sequence (WGS) data in the LLFS families. We have focused on providing data to identify germline genetic variants with the aim of releasing as many variants on as many individuals as possible. We aim to assure the quality of the individual calls.
Multi-omics data, i.e., genomics, epigenomics, transcriptomics, and proteomics, characterize complex cellular signaling systems at multiple levels and from multiple views, and provide a holistic view of complex cellular signaling pathways.
Background & Aims: Steatotic liver disease (SLD), characterized by elevated liver fat content (LFC), is influenced by genetics and diet. However, whether diet has a differential effect based on genetic risk is not well-characterized. We aimed to determine how genetic factors interact with diet to affect SLD in a large national biobank.
The Long Life Family Study (LLFS) enrolled 4,953 participants in 539 pedigrees displaying exceptional longevity. To identify genetic mechanisms that affect cardiovascular risks in the LLFS population, we developed a multi-omics integration pipeline and applied it to 11 traits associated with cardiovascular risks. Using our pipeline, we aggregated gene-level statistics from rare-variant analysis, GWAS, and gene expression-trait association by Correlated Meta-Analysis (CMA).
Although both short and long sleep duration are associated with elevated hypertension risk, our understanding of their interplay with biological pathways governing blood pressure remains limited. To address this, we carried out genome-wide cross-population gene-by-short-sleep and long-sleep duration interaction analyses for three blood pressure traits (systolic, diastolic, and pulse pressure) in 811,405 individuals from diverse population groups. We discover 22 novel gene-sleep duration interaction loci for blood pressure, mapped to genes involved in neurological, thyroidal, bone metabolism, and hematopoietic pathways.
Recently, large-scale scRNA-seq datasets have been generated to understand the complex and poorly understood signaling mechanisms within the microenvironment of Alzheimer's Disease (AD), which are critical for identifying novel therapeutic targets and precision medicine. Although a set of targets has been identified, it remains challenging to infer the core intra- and inter-multi-cell signaling communication networks using the scRNA-seq data, considering the complex and highly interactive background signaling network. Herein, we introduced a novel graph transformer model, PathFinder, to infer multi-cell intra- and inter-cellular signaling pathways and signaling communications among multi-cell types.
Patients with chronic kidney disease (CKD) have increased oxidative stress and chronic inflammation, which may escalate the production of advanced glycation end-products (AGE). High soluble receptor for AGE (sRAGE) and low estimated glomerular filtration rate (eGFR) levels are associated with CKD and aging. We evaluated whether eGFR calculated from creatinine and cystatin C share pleiotropic genetic factors with sRAGE.