Applications of generative models to genomic data have gained significant momentum in the past few years, with scopes ranging from data characterization to the generation of genomic segments and functional sequences. In our previous study, we demonstrated that generative adversarial networks (GANs) and restricted Boltzmann machines (RBMs) can be used to create novel, high-quality artificial genomes (AGs) that preserve complex characteristics of real genomes such as population structure, linkage disequilibrium and selection signals. However, a major drawback of these models is scalability, since the large feature space of genome-wide data vastly increases computational complexity. To address this issue, we implemented a novel convolutional Wasserstein GAN (WGAN) model along with a novel conditional RBM (CRBM) framework for generating AGs with a large number of SNPs. These networks implicitly learn the varying landscape of haplotypic structure in order to capture complex correlation patterns along the genome and generate a wide diversity of plausible haplotypes. We performed comparative analyses to assess both the quality of these generated haplotypes and the amount of possible privacy leakage from the training data. As genetic privacy grows in importance, so does the need for effective privacy protection measures for genomic data. We used generative neural networks to create large artificial genome segments that possess many characteristics of real genomes without substantial privacy leakage from the training dataset. In the near future, with further improvements in haplotype quality and privacy preservation, large-scale artificial genome databases could be assembled to provide easily accessible surrogates of real databases, allowing researchers to conduct studies with diverse genomic data within a safe ethical framework in terms of donor privacy.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10635570
DOI: http://dx.doi.org/10.1371/journal.pcbi.1011584
