Unified tests for fine-scale mapping and identifying sparse high-dimensional sequence associations.

Bioinformatics

Department of Biomedical Engineering, Center for Bioinformatics and Genomics, Department of Biostatistics and Bioinformatics and.

Published: February 2016

Motivation: In searching deep sequencing data for genetic variants underlying complex diseases, genomic marker sets with high-dimensional genotypic data and sparse functional variants are quite common. Existing sequence association tests are incapable of identifying such marker sets or individual causal loci, although they appear powerful for identifying small marker sets with dense functional variants. In sequence association studies of admixed individuals, cryptic relatedness and population structure are known to confound association analyses.

Method: We propose a unified marker-wise test (uFineMap) to accurately localize causal loci and a unified high-dimensional set-based test (uHDSet) to identify high-dimensional sparse associations in deep sequencing genomic data of multi-ethnic individuals with random relatedness. These two novel tests are based on scaled sparse linear mixed regressions with Lp (0 < p < 1) norm regularization. They jointly adjust for cryptic relatedness, population structure and other confounders to prevent false discoveries and improve statistical power for identifying promising individual markers and marker sets that harbor functional genetic variants of a complex trait.
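To make the Lp (0 < p < 1) penalty concrete, the following is a minimal sketch of Lp-regularized least squares solved by iteratively reweighted ridge regression. It is not the uFineMap or uHDSet implementation: it omits the scaled noise estimation and the mixed-model random effects that adjust for relatedness and population structure. The function name `lp_irls`, the parameters `lam`, `p`, `eps` and `n_iter`, and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def lp_irls(X, y, lam=2.0, p=0.5, eps=1e-8, n_iter=100):
    """Approximately minimize 0.5*||y - X b||^2 + lam * sum_j |b_j|^p, 0 < p < 1.

    Illustrative solver only (an assumption, not the published method):
    each step replaces |b_j|^p with a quadratic surrogate around the
    current estimate, which turns the update into a weighted ridge solve.
    """
    d = X.shape[1]
    XtX = X.T @ X
    Xty = X.T @ y
    # Plain ridge estimate as the starting point.
    b = np.linalg.solve(XtX + lam * np.eye(d), Xty)
    for _ in range(n_iter):
        # Surrogate weights p * |b_j|^(p-2), smoothed by eps to stay finite.
        w = p * (b ** 2 + eps) ** (p / 2.0 - 1.0)
        b = np.linalg.solve(XtX + lam * np.diag(w), Xty)
    return b


# Hypothetical simulated data: 50 markers, only the first 3 are causal.
rng = np.random.default_rng(0)
n, d = 200, 50
X = rng.standard_normal((n, d))
beta_true = np.zeros(d)
beta_true[:3] = [3.0, -2.0, 1.5]
y = X @ beta_true + 0.1 * rng.standard_normal(n)

beta_hat = lp_irls(X, y)
print("largest |estimate| among null markers:", np.abs(beta_hat[3:]).max())
print("estimates for causal markers:", beta_hat[:3])
```

In this toy setting the non-convex penalty shrinks small coefficients much harder than large ones, which is the property that makes Lp penalties with p < 1 attractive for sparse signals; the reweighting scheme is one common heuristic and only finds a local solution.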

Results: In large-scale simulations and real data analyses, the proposed tests appropriately controlled Type I error rates and appeared to be more powerful than several prominent methods. We illustrated their practical utility by applying them to DNA sequence data from the Framingham Heart Study for osteoporosis. The proposed tests identified 11 novel significant genes that were missed by the prominent methods famSKAT and GEMMA. In particular, four of the six most significant pathways identified by uHDSet but missed by famSKAT have been reported in the literature to be related to bone mineral density (BMD) or osteoporosis.

Availability And Implementation: The computational toolkit is available for academic use: https://sites.google.com/site/shaolongscode/home/uhdset

Contact: wyp@tulane.edu

Supplementary Information: Supplementary data are available at Bioinformatics online.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5006306
DOI: http://dx.doi.org/10.1093/bioinformatics/btv586
