Publications by Sunyaev S | LitMetric

Publications by authors named "Sunyaev S"

Missing Regulation Between Genetic Association and Transcriptional Abundance for Hypercholesterolemia Genes.

Aaron Hakim, Noah J Connally, Gavin R Schnitzler, Michael H Cho, Z Gordon Jiang

Genes (Basel)

January 2025

Low-density lipoprotein cholesterol (LDL-C) is a well-established risk factor for cardiovascular disease, and it plays a causal role in the development of atherosclerosis. Genome-wide association studies (GWASs) have successfully identified hundreds of genetic variants associated with LDL-C. Most of these risk loci fall in non-coding regions of the genome, and it is unclear how these non-coding variants affect circulating lipid levels.

NERINE reveals rare variant associations in gene networks across multiple phenotypes and implicates an subnetwork in Parkinson's disease.

Sumaiya Nazeen, Xinyuan Wang, Autumn Morrow, Ronya Strom, Elizabeth Ethier

bioRxiv

January 2025

Gene networks encapsulate biological knowledge, often linked to polygenic diseases. While model system experiments generate many plausible gene networks, validating their role in human phenotypes requires evidence from human genetics. Rare variants provide the most straightforward path for such validation.

Inherent instability of simple DNA repeats shapes an evolutionarily stable distribution of repeat lengths.

Ryan J McGinty, Daniel J Balick, Sergei M Mirkin, Shamil R Sunyaev

bioRxiv

January 2025

Using the Telomere-to-Telomere reference, we assembled the distribution of simple repeat lengths present in the human genome. Analyzing over two hundred mammalian genomes, we found remarkable consistency in the shape of the distribution across evolutionary epochs. All observed genomes harbor an excess of long repeats, which are prone to developing into repeat expansion disorders.

Are complex traits underpinned by polygenic molecular traits? A reflection on the complexity of gene expression.

Mohsen Hajheideri, Shamil Sunyaev, Juliette de Meaux

Plant Cell Physiol

November 2024

Quantifying constraint in the human mitochondrial genome.

Nicole J Lake, Kaiyue Ma, Wei Liu, Stephanie L Battle, Kristen M Laricchia

Nature

November 2024

Mitochondrial DNA (mtDNA) has an important yet often overlooked role in health and disease. Constraint models quantify the removal of deleterious variation from the population by selection and represent powerful tools for identifying genetic variation that underlies human phenotypes. However, nuclear constraint models are not applicable to mtDNA, owing to its distinct features.

Somatic mosaicism in schizophrenia brains reveals prenatal mutational processes.

Eduardo A Maury, Attila Jones, Vladimir Seplyarskiy, Thanh Thanh L Nguyen, Chaggai Rosenbluh

Science

October 2024

Article Synopsis

* A study using deep whole-genome sequencing of brain neurons found that schizophrenia (SCZ) cases had more somatic mutations in regions of active gene expression than controls.
* These somatic mutations, particularly at transcription factor binding sites, may affect gene expression related to SCZ and contribute to its development during brain formation.

FAVOR-GPT: a generative natural language interface to whole genome variant functional annotations.

Thomas Cheng Li, Hufeng Zhou, Vineet Verma, Xiangru Tang, Yanjun Shao

Bioinform Adv

September 2024

Motivation: Functional Annotation of genomic Variants Online Resources (FAVOR) offers multi-faceted, whole genome variant functional annotations, which is essential for Whole Genome and Exome Sequencing (WGS/WES) analysis and the functional prioritization of disease-associated variants. A versatile chatbot designed to facilitate informative interpretation and interactive, user-centric summary of the whole genome variant functional annotation data in the FAVOR database is needed.

Results: We have developed FAVOR-GPT, a generative natural language interface powered by integrating large language models (LLMs) and FAVOR.

Genetic mapping across autoimmune diseases reveals shared associations and mechanisms.

Matthew R Lincoln, Noah Connally, Pierre-Paul Axisa, Christiane Gasperi, Mitja Mitrovic

Nat Genet

May 2024

Article Synopsis

* Autoimmune and inflammatory diseases involve multiple genes and often share risk alleles, making it tough to pinpoint specific causes.
* A study analyzing over 129,000 cases and controls found that about 40% of related genetic associations come from the same genetic variants across six different diseases.
* By improving the resolution of genetic mapping, the researchers could identify more related gene expressions, suggesting that while there are common mechanisms between these diseases, there isn't just one universal cause for all autoimmune diseases.

Deep sequencing of proteotoxicity modifier genes uncovers a Presenilin-2/beta-amyloid-actin genetic risk module shared among alpha-synucleinopathies.

Sumaiya Nazeen, Xinyuan Wang, Dina Zielinski, Isabel Lam, Erinc Hallacli

bioRxiv

March 2024

Article Synopsis

* Research is exploring whether neurodegenerative diseases caused by similar protein misfolding share genetic risk factors, but traditional studies lack the power to conclusively determine this.
* By selecting patients based on their specific protein aggregation rather than just their clinical diagnosis, researchers can better identify genetic variants associated with diseases like Parkinson's and Alzheimer's.
* The study finds that genetic modifiers related to alpha-synuclein and beta-amyloid contribute to shared risk factors in neurodegenerative diseases, indicating common underlying mechanisms across different conditions.

Joint, multifaceted genomic analysis enables diagnosis of diverse, ultra-rare monogenic presentations.

Shilpa Nadimpalli Kobren, Mikhail A Moldovan, Rebecca Reimers, Daniel Traviglia, Xinyun Li

bioRxiv

August 2024

Article Synopsis

* Recent advancements in genomics for diagnosing rare diseases focus on "N-of-1" analyses, allowing for tailored studies on individual patients with ultra-rare conditions.
* The Undiagnosed Diseases Network (UDN) enables collaborative research across various U.S. clinical and research centers, which enhances the ability to analyze whole genome sequencing data from multiple patients simultaneously.
* Introducing a new software package, RaMeDiES, the team provides tools for automated comparisons of genomic data, leading to novel disease associations and improving overall understanding of genetic links to these rare diseases.

RExPRT: a machine learning tool to predict pathogenicity of tandem repeat loci.

Sarah Fazal, Matt C Danzi, Isaac Xu, Shilpa Nadimpalli Kobren, Shamil Sunyaev

Genome Biol

January 2024

Article Synopsis

* Expansions of tandem repeats (TRs) are linked to about 60 genetic diseases, and finding more pathogenic repeats could improve disease diagnosis.
* RExPRT (Repeat EXpansion Pathogenicity pRediction Tool) is a machine learning tool designed to differentiate harmful TR expansions from harmless ones.
* The tool has shown impressive results, achieving an average precision of 93% and recall of 83%, making it helpful for prioritizing which genetic candidates to study further in large-scale research.

Pervasive correlations between causal disease effects of proximal SNPs vary with functional annotations and implicate stabilizing selection.

Martin Jinye Zhang, Arun Durvasula, Colby Chiang, Evan M Koch, Benjamin J Strober

Res Sq

December 2023

Article Synopsis

* The study introduces a new method called LDSPEC to estimate the relationship between causal disease effect sizes of nearby SNPs, challenging the assumption that they are independent.
* It analyzes data from 70 diseases in the UK Biobank, discovering significant correlations in effect sizes among proximal SNP pairs, which vary based on different factors such as distance and allele frequency.
* The research finds that SNP pairs with related functions show stronger correlations extending over longer genomic distances, and it reveals that SNP-heritability estimates are lower than previously thought, indicating a discrepancy between expected and real genetic contributions to diseases.

Pervasive correlations between causal disease effects of proximal SNPs vary with functional annotations and implicate stabilizing selection.

Martin Jinye Zhang, Arun Durvasula, Colby Chiang, Evan M Koch, Benjamin J Strober

medRxiv

December 2023

Article Synopsis

* The study investigates the relationships between causal disease effect sizes of proximal SNPs (single nucleotide polymorphisms) using a new method called LDSPEC, suggesting that these SNPs are not independent as previously thought.
* By applying LDSPEC to data from 70 diseases in the UK Biobank, researchers found that the correlations in effect sizes between nearby SNPs varied based on distance, allele frequency, and linkage disequilibrium (LD), indicating complex interactions.
* The results reveal that SNP pairs with shared functions show stronger correlations over longer distances, leading to a significant discrepancy between SNP-heritability estimates and the total variance of causal effect sizes, challenging prior assumptions in genetic research.

Low-frequency inherited complement receptor variants are associated with purpura fulminans.

Pavan K Bendapudi, Sumaiya Nazeen, Justine Ryu, Onuralp Söylemez, Alissa Robbins

Blood

March 2024

Article Synopsis

* Extreme disease phenotypes, like infectious purpura fulminans (PF), can reveal important insights into common health conditions but are hard to study due to their rarity.
* Researchers utilized a new method called the rare variant trend test (RVTT) to analyze genetic risk factors associated with PF, examining both prospective patient samples and historical records from large hospital systems.
* They discovered a significant increase in low-frequency variants in the complement system among PF patients, linking these genetic changes to severe hyperinflammation in sepsis through loss and gain of function in complement receptors CR3 and CR4.

A mutation rate model at the basepair resolution identifies the mutagenic effect of polymerase III transcription.

Vladimir Seplyarskiy, Evan M Koch, Daniel J Lee, Joshua S Lichtman, Harding H Luan

Nat Genet

December 2023

De novo mutations occur at substantially different rates depending on genomic location, sequence context and DNA strand. The success of methods to estimate selection intensity, infer demographic history and map rare disease genes, depends strongly on assumptions about the local mutation rate. Here we present Roulette, a genome-wide mutation rate model at basepair resolution that incorporates known determinants of local mutation rate.

The landscape of tolerated genetic variation in humans and primates.

Hong Gao, Tobias Hamp, Jeffrey Ede, Joshua G Schraiber, Jeremy McRae

Science

June 2023

Personalized genome sequencing has revealed millions of genetic differences between individuals, but our understanding of their clinical relevance remains largely incomplete. To systematically decipher the effects of human genetic variants, we obtained whole-genome sequencing data for 809 individuals from 233 primate species and identified 4.3 million common protein-altering variants with orthologs in humans.

The landscape of tolerated genetic variation in humans and primates.

Hong Gao, Tobias Hamp, Jeffrey Ede, Joshua G Schraiber, Jeremy McRae

bioRxiv

May 2023

Personalized genome sequencing has revealed millions of genetic differences between individuals, but our understanding of their clinical relevance remains largely incomplete. To systematically decipher the effects of human genetic variants, we obtained whole genome sequencing data for 809 individuals from 233 primate species, and identified 4.3 million common protein-altering variants with orthologs in human.

Recurrent mutation in the ancestry of a rare variant.

John Wakeley, Wai-Tong Louis Fan, Evan Koch, Shamil Sunyaev

Genetics

July 2023

Recurrent mutation produces multiple copies of the same allele which may be co-segregating in a population. Yet, most analyses of allele-frequency or site-frequency spectra assume that all observed copies of an allele trace back to a single mutation. We develop a sampling theory for the number of latent mutations in the ancestry of a rare variant, specifically a variant observed in relatively small count in a large sample.

Revisiting mutagenesis at non-B DNA motifs in the human genome.

R J McGinty, S R Sunyaev

Nat Struct Mol Biol

April 2023

Non-B DNA structures formed by repetitive sequence motifs are known instigators of mutagenesis in experimental systems. Analyzing this phenomenon computationally in the human genome requires careful disentangling of intrinsic confounding factors, including overlapping and interrupted motifs and recurrent sequencing errors. Here, we show that accounting for these factors eliminates all signals of repeat-induced mutagenesis that extend beyond the motif boundary, and eliminates or dramatically shrinks the magnitude of mutagenesis within some motifs, contradicting previous reports.

Leveraging pleiotropy to discover and interpret GWAS results for sleep-associated traits.

Sung Chun, Sebastian Akle, Athanasios Teodosiadis, Brian E Cade, Heming Wang

PLoS Genet

December 2022

Genetic association studies of many heritable traits resulting from physiological testing often have modest sample sizes due to the cost and burden of the required phenotyping. This reduces statistical power and limits discovery of multiple genetic associations. We present a strategy to leverage pleiotropy between traits to both discover new loci and to provide mechanistic hypotheses of the underlying pathophysiology.

The missing link between genetic association and regulatory function.

Noah J Connally, Sumaiya Nazeen, Daniel Lee, Huwenbo Shi, John Stamatoyannopoulos

Elife

December 2022

Article Synopsis

* The genetic basis of traits is mainly polygenic and influenced by non-coding alleles, which are thought to have minor regulatory roles in gene expression.
* Despite having access to extensive gene expression and epigenomic data, few connections between genetic variants and gene activity have been established.
* A study identified 220 gene-trait pairs influenced by protein-coding variants, revealing little evidence that typical gene expression explains associations with complex traits, indicating a need for improved models to understand these complexities.

FAVOR: functional annotation of variants online resource and annotator for variation across the human genome.

Hufeng Zhou, Theodore Arapoglou, Xihao Li, Zilin Li, Xiuwen Zheng

Nucleic Acids Res

January 2023

Large biobank-scale whole genome sequencing (WGS) studies are rapidly identifying a multitude of coding and non-coding variants. They provide an unprecedented resource for illuminating the genetic basis of human diseases. Variant functional annotations play a critical role in WGS analysis, result interpretation, and prioritization of disease- or trait-associated causal variants.

Demographic and Viral-Genetic Analyses of COVID-19 Severity in Bahrain Identify Local Risk Factors and a Protective Effect of Polymerase Mutations.

Evan M Koch, Justin Du, Michelle Dressner, Hashmeya Erahim Alwasti, Zahra Al Taif

medRxiv

October 2023

A multitude of demographic, health, and genetic factors are associated with the risk of developing severe COVID-19 following infection by SARS-CoV-2. There is a need to perform studies across human societies and to investigate the full spectrum of genetic variation of the virus. Using data from 869 COVID-19 patients in Bahrain between March 2020 and March 2021, we analyzed paired viral sequencing and non-genetic host data to understand host and viral determinants of severe COVID-19.

AnFiSA: An open-source computational platform for the analysis of sequencing data for rare genetic disease.

M A Bouzinier, D Etin, S I Trifonov, V N Evdokimova, V Ulitin

J Biomed Inform

September 2022

Despite genomic sequencing rapidly transforming from being a bench-side tool to a routine procedure in a hospital, there is a noticeable lack of genomic analysis software that supports both clinical and research workflows as well as crowdsourcing. Furthermore, most existing software packages are not forward-compatible in regards to supporting ever-changing diagnostic rules adopted by the genetics community. Regular updates of genomics databases pose challenges for reproducible and traceable automated genetic diagnostics tools.

A cross-disorder dosage sensitivity map of the human genome.

Ryan L Collins, Joseph T Glessner, Eleonora Porcu, Maarja Lepamets, Rhonda Brandon

Cell

August 2022

Rare copy-number variants (rCNVs) include deletions and duplications that occur infrequently in the global human population and can confer substantial risk for disease. In this study, we aimed to quantify the properties of haploinsufficiency (i.e.
