Publications by Nilah Ioannidis

Variants in tubule epithelial regulatory elements mediate most heritable differences in human kidney function.

Gabriel B Loeb, Pooja Kathail, Richard W Shuai, Ryan Chung, Reinier J Grona, Nilah M Ioannidis

Nat Genet

October 2024

Article Synopsis

- Kidney failure significantly impacts health, prompting a large-scale study of 406,504 participants to uncover genetic factors affecting kidney function, identifying 430 key genetic loci.
- The research revealed that 56% of inherited differences in kidney function are linked to regulatory elements in kidney tubule epithelial cells, while 7% relate to podocyte cells, suggesting these are crucial for gene expression.
- Further analysis using advanced techniques like enhancer assays and CRISPRi identified specific genes (NDRG1, CCNB1, and STC1) regulated by these genetic loci, shedding light on their roles in kidney function.

Current genomic deep learning models display decreased performance in cell type-specific accessible regions.

Pooja Kathail, Richard W Shuai, Ryan Chung, Chun Jimmie Ye, Gabriel B Loeb, Nilah M Ioannidis

Genome Biol

August 2024

Article Synopsis

- Deep learning models are used to predict epigenetic features, but their performance varies, especially in cell type-specific regions crucial for gene regulation.
- The study compares general-purpose models and tissue-specific models, finding that tailored models can enhance accuracy in predicting chromatin accessibility in specific cells.
- It emphasizes the need for novel strategies to improve predictions on genetic variants, as high reference sequence accuracy does not guarantee better variant effect predictions.

Current genomic deep learning models display decreased performance in cell type specific accessible regions.

Pooja Kathail, Richard W Shuai, Ryan Chung, Chun Jimmie Ye, Gabriel B Loeb, Nilah M Ioannidis

bioRxiv

July 2024

Article Synopsis

- A variety of deep learning models are being developed to predict chromatin accessibility from DNA sequences, but evaluations often overlook cell type specific cis-regulatory elements (CREs), which are crucial for gene regulation and complex disease heritability.
- The study evaluates the accuracy of these genomic models, revealing that general purpose models like Enformer and Sei perform worse in regions that are accessible only in certain cell types.
- The research highlights that tailoring models for specific tissues and enhancing their capacity for cell type specific regulation can boost performance, but improving predictions on reference sequences doesn't necessarily translate to better predictions of variant effects, suggesting the need for new approaches in the field.

Variants in tubule epithelial regulatory elements mediate most heritable differences in human kidney function.

Gabriel B Loeb, Pooja Kathail, Richard Shuai, Ryan Chung, Reinier J Grona, Nilah Ioannidis

bioRxiv

June 2024

Article Synopsis

- Kidney disease is largely influenced by genetics, yet the specific genes and mechanisms involved are still not fully understood; a recent GWAS identified 462 genetic loci associated with kidney function.
- Researchers used single-cell ATAC-seq maps to explore chromatin accessibility in the kidney, finding that regulatory elements in kidney tubule epithelial cells accounted for the majority of genetic heritability related to kidney function.
- The study further utilized CRISPR interference to demonstrate how inherited variations in regulatory elements impact gene expression in tubule epithelial cells, ultimately linking these differences to a predisposition for kidney disease in humans.

Designing Cell-Type-Specific Promoter Sequences Using Conservative Model-Based Optimization.

Aniketh Janardhan Reddy, Xinyang Geng, Michael H Herschl, Sathvik Kolli, Aviral Kumar, Nilah M Ioannidis

bioRxiv

June 2024

Gene therapies have the potential to treat disease by delivering therapeutic genetic cargo to disease-associated cells. One limitation to their widespread use is the lack of short regulatory sequences, or promoters, that differentially induce the expression of delivered genetic cargo in target cells, minimizing side effects in other cell types. Such cell-type-specific promoters are difficult to discover using existing methods, requiring either manual curation or access to large datasets of promoter-driven expression from both targeted and untargeted cells.

Critical assessment of missense variant effect predictors on disease-relevant variant data.

Ruchir Rastogi, Ryan Chung, Sindy Li, Chang Li, Kyoungyeul Lee, Nilah M Ioannidis

bioRxiv

June 2024

Regular, systematic, and independent assessment of computational tools used to predict the pathogenicity of missense variants is necessary to evaluate their clinical and research utility and suggest directions for future improvement. Here, as part of the sixth edition of the Critical Assessment of Genome Interpretation (CAGI) challenge, we assess missense variant effect predictors (or variant impact predictors) on an evaluation dataset of rare missense variants from disease-relevant databases. Our assessment evaluates predictors submitted to the CAGI6 Annotate-All-Missense challenge, predictors commonly used by the clinical genetics community, and recently developed deep learning methods for variant effect prediction.

Characterizing uncertainty in predictions of genomic sequence-to-activity models.

Ayesha Bajwa, Ruchir Rastogi, Pooja Kathail, Richard W Shuai, Nilah M Ioannidis

bioRxiv

December 2023

Genomic sequence-to-activity models are increasingly utilized to understand gene regulatory syntax and probe the functional consequences of regulatory variation. Current models make accurate predictions of relative activity levels across the human reference genome, but their performance is more limited for predicting the effects of genetic variants, such as explaining gene expression variation across individuals. To better understand the causes of these shortcomings, we examine the uncertainty in predictions of genomic sequence-to-activity models using an ensemble of Basenji2 model replicates.
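
As a rough illustration of the ensemble idea in this abstract, the sketch below summarizes replicate predictions for a reference and an alternate allele: the mean predicted effect, its spread across replicates, and the fraction of replicates that agree on the sign of the effect. The function name, the input layout, and the numbers are hypothetical; this is not the paper's code, and it does not run Basenji2.

```python
from statistics import mean, stdev


def summarize_variant_effect(ref_preds, alt_preds):
    """Summarize a predicted variant effect across model replicates.

    ref_preds, alt_preds: one prediction per replicate (hypothetical inputs)
    for the reference and alternate allele at the same locus.
    """
    if len(ref_preds) != len(alt_preds) or len(ref_preds) < 2:
        raise ValueError("need matched predictions from at least two replicates")

    # Per-replicate predicted effect of the variant (alt minus ref).
    effects = [alt - ref for ref, alt in zip(ref_preds, alt_preds)]

    # Fraction of replicates sharing the majority sign of the effect.
    n_pos = sum(e > 0 for e in effects)
    n_neg = sum(e < 0 for e in effects)
    sign_agreement = max(n_pos, n_neg) / len(effects)

    return {
        "mean_effect": mean(effects),
        "spread": stdev(effects),  # sample standard deviation across replicates
        "sign_agreement": sign_agreement,
    }


# Made-up predictions from five replicates, for illustration only.
ref = [1.00, 1.10, 0.95, 1.05, 1.00]
alt = [1.20, 1.05, 1.15, 1.00, 1.10]
summary = summarize_variant_effect(ref, alt)
```

In this toy input, replicates disagree on the direction of the effect, which is the kind of case where a single model's prediction would look more certain than the ensemble suggests.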

Personal transcriptome variation is poorly explained by current genomic deep learning models.

Connie Huang, Richard W Shuai, Parth Baokar, Ryan Chung, Ruchir Rastogi, Nilah M Ioannidis

Nat Genet

December 2023

Genomic deep learning models can predict genome-wide epigenetic features and gene expression levels directly from DNA sequence. While current models perform well at predicting gene expression levels across genes in different cell types from the reference genome, their ability to explain expression variation between individuals due to cis-regulatory genetic variants remains largely unexplored. Here, we evaluate four state-of-the-art models on paired personal genome and transcriptome data and find limited performance when explaining variation in expression across individuals.

GUANinE v1.0: Benchmark Datasets for Genomic AI Sequence-to-Function Models.

Eyes S Robson, Nilah M Ioannidis

bioRxiv

March 2024

Computational genomics increasingly relies on machine learning methods for genome interpretation, and the recent adoption of neural sequence-to-function models highlights the need for rigorous model specification and controlled evaluation, problems familiar to other fields of AI. Research strategies that have greatly benefited other fields - including benchmarking, auditing, and algorithmic fairness - are also needed to advance the field of genomic AI and to facilitate model development. Here we propose a genomic AI benchmark, GUANinE, for evaluating model generalization across a number of distinct genomic tasks.

Cross-protein transfer learning substantially improves disease variant prediction.

Milind Jagota, Chengzhong Ye, Carlos Albors, Ruchir Rastogi, Antoine Koehl, Nilah Ioannidis

Genome Biol

August 2023

Article Synopsis

- Genetic variation in humans significantly influences disease risk, yet many missense variants remain uncharacterized; this study develops a computational model leveraging saturation mutagenesis to predict the pathogenicity of these variants.
- The model, called CPT-1, is trained on deep mutational scanning data from just five proteins and outperforms existing methods in clinical variant interpretation, particularly excelling in sensitivity and specificity for detecting disease-related variants.
- By incorporating various predictive features from protein sequences and structures, the framework is versatile for future enhancements and has released predictions for missense variants in 90% of human genes, showcasing the potential of mutational scanning data in variant analysis.

The Impact of Stability Considerations on Genetic Fine-Mapping.

Alan Aw, Lionel Chentian Jin, Nilah Ioannidis, Yun S Song

bioRxiv

April 2023

Fine-mapping methods, which aim to identify genetic variants responsible for complex traits following genetic association studies, typically assume that sufficient adjustments for confounding within the association study cohort have been made, e.g., through regressing out the top principal components (i.

Strategies for effectively modelling promoter-driven gene expression using transfer learning.

Aniketh Janardhan Reddy, Michael H Herschl, Xinyang Geng, Sathvik Kolli, Amy X Lu, Nilah M Ioannidis

bioRxiv

May 2024

The ability to deliver genetic cargo to human cells is enabling rapid progress in molecular medicine, but designing this cargo for precise expression in specific cell types is a major challenge. Expression is driven by regulatory DNA sequences within short synthetic promoters, but relatively few of these promoters are cell-type-specific. The ability to design cell-type-specific promoters using model-based optimization would be impactful for research and therapeutic applications.

Tissue-specific impacts of aging and genetics on gene expression patterns in humans.

Ryo Yamamoto, Ryan Chung, Juan Manuel Vazquez, Huanjie Sheng, Philippa L Steinberg, Nilah M Ioannidis

Nat Commun

October 2022

Age is the primary risk factor for many common human diseases. Here, we quantify the relative contributions of genetics and aging to gene expression patterns across 27 tissues from 948 humans. We show that the predictive power of expression quantitative trait loci is impacted by age in many tissues.

Two-stage Study of Familial Prostate Cancer by Whole-exome Sequencing and Custom Capture Identifies 10 Novel Genes Associated with the Risk of Prostate Cancer.

Daniel J Schaid, Shannon K McDonnell, Liesel M FitzGerald, Lissa DeRycke, Zachary Fogarty, Nilah Monnier Ioannidis

Eur Urol

March 2021

Background: Family history of prostate cancer (PCa) is a well-known risk factor, and both common and rare genetic variants are associated with the disease.

Objective: To detect new genetic variants associated with PCa, capitalizing on the role of family history and more aggressive PCa.

Design, Setting, and Participants: A two-stage design was used.

Predicting target genes of non-coding regulatory variants with IRT.

Zhenqin Wu, Nilah M Ioannidis, James Zou

Bioinformatics

August 2020

Summary: Interpreting genetic variants of unknown significance (VUS) is essential in clinical applications of genome sequencing for diagnosis and personalized care. Non-coding variants remain particularly difficult to interpret, despite making up a large majority of trait associations identified in genome-wide association study (GWAS) analyses. Predicting the regulatory effects of non-coding variants on candidate genes is a key step in evaluating their clinical significance.

Estimating prevalence for limb-girdle muscular dystrophy based on public sequencing databases.

Wei Liu, Sander Pajusalu, Nicole J Lake, Geyu Zhou, Nilah Ioannidis

Genet Med

November 2019

Purpose: Limb-girdle muscular dystrophies (LGMD) are a genetically heterogeneous category of autosomal inherited muscle diseases. Many genes causing LGMD have been identified, and clinical trials are beginning for treatment of some genetic subtypes. However, even with the gene-level mechanisms known, it is still difficult to get a robust and generalizable prevalence estimation for each subtype due to the limited amount of epidemiology data and the low incidence of LGMDs.

A Prediction Tool to Facilitate Risk-Stratified Screening for Squamous Cell Skin Cancer.

Wei Wang, Eric Jorgenson, Nilah M Ioannidis, Maryam M Asgari, Alice S Whittemore

J Invest Dermatol

December 2018

Cutaneous squamous cell cancers (cSCCs) present an under-recognized health issue among non-Hispanic whites, one that is likely to increase as populations age. cSCC risks vary considerably among non-Hispanic whites, and this heterogeneity indicates the need for risk-stratified screening strategies that are guided by patients' personal characteristics and clinical histories. Here we describe cSCCscore, a prediction tool that uses patients' covariates and clinical histories to assign them personal probabilities of developing cSCCs within 3 years after risk assessment.

Gene expression imputation identifies candidate genes and susceptibility loci associated with cutaneous squamous cell carcinoma.

Nilah M Ioannidis, Wei Wang, Nicholas A Furlotte, David A Hinds

Nat Commun

October 2018

Cutaneous squamous cell carcinoma (cSCC) is a common skin cancer with genetic susceptibility loci identified in recent genome-wide association studies (GWAS). Transcriptome-wide association studies (TWAS) using imputed gene expression levels can identify additional gene-level associations. Here we impute gene expression levels in 6,891 cSCC cases and 54,566 controls in the Kaiser Permanente Genetic Epidemiology Research in Adult Health and Aging (GERA) cohort and 25,558 self-reported cSCC cases and 673,788 controls from 23andMe.

Genetic variants in the HLA class II region associated with risk of cutaneous squamous cell carcinoma.

Wei Wang, Hanna M Ollila, Alice S Whittemore, Shadmehr Demehri, Nilah M Ioannidis

Cancer Immunol Immunother

July 2018

Background: The immune system has been implicated in the pathophysiology of cutaneous squamous cell carcinoma (cSCC) as evidenced by the substantially increased risk of cSCC in immunosuppressed individuals. Associations between cSCC risk and single nucleotide polymorphisms (SNPs) in the HLA region have been identified by genome-wide association studies (GWAS). The translation of the associated HLA SNPs to structural amino acid changes in HLA molecules has not been previously elucidated.

FIRE: functional inference of genetic variants that regulate gene expression.

Nilah M Ioannidis, Joe R Davis, Marianne K DeGorter, Nicholas B Larson, Shannon K McDonnell

Bioinformatics

December 2017

Motivation: Interpreting genetic variation in noncoding regions of the genome is an important challenge for personal genome analysis. One mechanism by which noncoding single nucleotide variants (SNVs) influence downstream phenotypes is through the regulation of gene expression. Methods to predict whether or not individual SNVs are likely to regulate gene expression would aid interpretation of variants of unknown significance identified in whole-genome sequencing studies.

Cutaneous squamous cell cancer (cSCC) risk and the human leukocyte antigen (HLA) system.

Pooja Yesantharao, Wei Wang, Nilah M Ioannidis, Shadmehr Demehri, Alice S Whittemore

Hum Immunol

April 2017

Cutaneous squamous cell carcinoma (cSCC) is the second most common cancer among Caucasians in the United States, with rising incidence over the past decade. Treatment for non-melanoma skin cancer, including cSCC, in the United States was estimated to cost $4.8 billion in 2014.

REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants.

Nilah M Ioannidis, Joseph H Rothstein, Vikas Pejaver, Sumit Middha, Shannon K McDonnell

Am J Hum Genet

October 2016

The vast majority of coding variants are rare, and assessment of the contribution of rare variants to complex traits is hampered by low statistical power and limited functional data. Improved methods for predicting the pathogenicity of rare coding variants are needed to facilitate the discovery of disease variants from exome sequencing studies. We developed REVEL (rare exome variant ensemble learner), an ensemble method for predicting the pathogenicity of missense variants on the basis of individual tools: MutPred, FATHMM, VEST, PolyPhen, SIFT, PROVEAN, MutationAssessor, MutationTaster, LRT, GERP, SiPhy, phyloP, and phastCons.

Identification of Susceptibility Loci for Cutaneous Squamous Cell Carcinoma.

Maryam M Asgari, Wei Wang, Nilah M Ioannidis, Jacqueline Itnyre, Thomas Hoffmann

J Invest Dermatol

May 2016

We report a genome-wide association study of cutaneous squamous cell carcinoma conducted among non-Hispanic white members of the Kaiser Permanente Northern California health care system. The study includes a genome-wide screen of 61,457 members (6,891 cases and 54,566 controls) genotyped on the Affymetrix Axiom European array and a replication phase involving an independent set of 6,410 additional members (810 cases and 5,600 controls). Combined analysis of screening and replication phases identified 10 loci containing single-nucleotide polymorphisms (SNPs) with P-values < 5 × 10⁻⁸.
