Motivation: Determining the relative contributions of functional genetic categories is fundamental to understanding the genetic etiology of complex human traits and diseases. Here, we present Annotation Informed-MiXeR, a likelihood-based method for estimating the number of variants influencing a phenotype and their effect sizes across different functional annotation categories of the genome using summary statistics from genome-wide association studies.

Results: Extensive simulations demonstrate that the model is valid for a broad range of genetic architectures. The model suggests that complex human phenotypes substantially differ in the number of causal variants, their localization in the genome and their effect sizes. Specifically, the exons of protein-coding genes harbor more than 90% of variants influencing type 2 diabetes and inflammatory bowel disease, making them good candidates for whole-exome studies. In contrast, <10% of the causal variants for schizophrenia, bipolar disorder and attention-deficit/hyperactivity disorder are located in protein-coding exons, indicating a more substantial role of regulatory mechanisms in the pathogenesis of these disorders.

Availability and Implementation: The software is available at: https://github.com/precimed/mixer.

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7750998
DOI: http://dx.doi.org/10.1093/bioinformatics/btaa568

Similar Publications

MRSA's resistance poses a global health challenge. This study investigates lysine succinylation in MRSA using proteomics and bioinformatics approaches to uncover metabolic and virulence mechanisms, with the goal of identifying novel therapeutic targets. Mass spectrometry and bioinformatics analyses mapped the MRSA succinylome, identifying 8 048 succinylation sites on 1 210 proteins.

Background: Individuals with cystic fibrosis (CF; a recessive disorder) have an increased risk of colorectal cancer (CRC). Evidence suggests individuals with a single CFTR variant may also have increased CRC risk.

Methods: Using population-based studies (GECCO, CORECT, CCFR, and ARIC; 53 785 CRC cases and 58 010 controls), we tested for an association between the most common CFTR variant (Phe508del) and CRC risk.

Complete genome sequence of Pseudarthrobacter sp. NIBRBAC000502770 from coal mine of Hongcheon on Republic of Korea.

BMC Genom Data

January 2025

Department of Applied Biosciences, College of Agriculture and Life Sciences, Kyungpook National University, Daegu, 41566, Republic of Korea.

Objectives: The data were collected to obtain the complete genome sequence of Pseudarthrobacter sp. NIBRBAC000502770, isolated from the rhizosphere of Sasamorpha in a heavy metal-contaminated coal mine in Hongcheon, Republic of Korea. The objective was to explore the strain's genetic potential for plant growth promotion and heavy metal resistance, particularly arsenate and copper.

Drought and heat stress significantly limit crop growth and productivity. Their simultaneous occurrence, as often observed in summer crops, leads to larger yield losses. Sorghum is well adapted to dry and hot conditions.

Decoding the m6A epitranscriptomic landscape for biotechnological applications using a direct RNA sequencing approach.

Nat Commun

January 2025

National-Local Joint Engineering Laboratory of Druggability and New Drug Evaluation, National Engineering Research Center for New Drug and Druggability (cultivation), Guangdong Province Key Laboratory of New Drug Design and Evaluation, School of Pharmaceutical Sciences, Sun Yat-Sen University, Guangzhou, 510006, China.

Epitranscriptomic modifications, particularly N6-methyladenosine (m6A), are crucial regulators of gene expression, influencing processes such as RNA stability, splicing, and translation. Traditional computational methods for detecting m6A from Nanopore direct RNA sequencing (DRS) data are constrained by their reliance on experimentally validated labels, often resulting in the underestimation of modification sites. Here, we introduce pum6a, an innovative attention-based framework that integrates positive and unlabeled multi-instance learning (MIL) to address the challenges of incomplete labeling and missing read-level annotations.
