Principal component analysis (PCA) is widely used to control for population structure in genome-wide association studies (GWAS). Top principal components (PCs) typically reflect population structure, but challenges arise in deciding how many PCs are needed and ensuring that PCs do not capture other artifacts such as regions with atypical linkage disequilibrium (LD). In response to the latter, many groups suggest performing LD pruning or excluding known high LD regions prior to PCA.
Alzheimer disease (AD) is the most common form of senile dementia, with high incidence late in life in many populations including Caribbean Hispanic (CH) populations. Such admixed populations, descended from more than one ancestral population, can present challenges for genetic studies, including limited sample sizes and unique analytical constraints. Therefore, CH populations and other admixed populations have not been well represented in studies of AD, and much of the genetic variation contributing to AD risk in these populations remains unknown.
Background: Black children have lower incidence yet worse survival than White and Latinx children with B-cell acute lymphoblastic leukemia (B-ALL). It is unclear how reported race/ethnicity (RRE) is associated with death in B-ALL after accounting for differentially expressed genes associated with genetic ancestry.
Methods: Using Phase 1 and 2 NCI TARGET B-ALL cases (N = 273; RRE-Black = 21, RRE-White = 162, RRE-Latinx = 69, RRE-Other = 9, RRE-Unknown = 12), we estimated proportions of African (AFR), European (EUR), and Amerindian (AMR) genetic ancestry.
Background: Prospective studies have suggested higher factor VIII (FVIII) levels are an independent risk factor for coronary heart disease (CHD) and stroke. However, limited information, including on genetic and epigenetic contributors to FVIII variation, is available specifically among African Americans (AAs), who have higher FVIII levels than Europeans.
Objectives: We measured FVIII levels in ~3400 AAs from the community-based Jackson Heart Study and assessed genetic, epigenetic, and epidemiological correlates of FVIII, as well as incident cardiovascular disease (CVD) associations.
Admixture mapping studies have become more common in recent years, due in part to technological advances and growing international efforts to increase the diversity of genetic studies. However, many open questions remain about appropriate implementation of admixture mapping studies, including how best to control for multiple testing, particularly in the presence of population structure. In this study, we develop a theoretical framework to characterize the correlation of local ancestry and admixture mapping test statistics in admixed populations with contributions from any number of ancestral populations and arbitrary population structure.
Polygenic risk scores (PRSs) are weighted sums of risk allele counts of single-nucleotide polymorphisms (SNPs) associated with a disease or trait. PRSs are typically constructed from published results of genome-wide association studies (GWASs), the majority of which have been performed in large populations of European ancestry (EA) individuals. Although many genotype-trait associations have generalized across populations, the optimal choice of SNPs and weights for PRSs may differ between populations due to different linkage disequilibrium (LD) and allele frequency patterns.
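The definition of a PRS as a weighted sum of risk allele counts can be illustrated with a minimal Python sketch. The SNP identifiers, weights, and genotypes below are invented for illustration and do not come from any published GWAS; the function name `polygenic_risk_score` and the choice to skip ungenotyped SNPs are likewise assumptions, not a method described in the abstract.

```python
from typing import Mapping


def polygenic_risk_score(
    allele_counts: Mapping[str, int],
    weights: Mapping[str, float],
) -> float:
    """Return the weighted sum of risk allele counts over the weighted SNPs.

    allele_counts maps SNP id -> number of risk alleles carried (0, 1, or 2).
    weights maps SNP id -> per-allele weight (e.g. a GWAS effect estimate).
    """
    score = 0.0
    for snp, weight in weights.items():
        count = allele_counts.get(snp)
        if count is None:
            # Assumption: SNPs missing from the genotype data are skipped.
            continue
        if count not in (0, 1, 2):
            raise ValueError(f"risk allele count for {snp} must be 0, 1, or 2")
        score += weight * count
    return score


# Hypothetical weights and one hypothetical individual's genotypes.
weights = {"snp_a": 0.5, "snp_b": -0.25, "snp_c": 1.0}
person = {"snp_a": 2, "snp_b": 1, "snp_c": 0}

score = polygenic_risk_score(person, weights)
print(score)
```

Because the score depends entirely on which SNPs appear in `weights` and on the values assigned to them, a weight table derived in one population may be a poor fit for another, which is the concern the abstract raises.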
To date, gene-based rare variant testing approaches have focused on aggregating information across sets of variants to maximize statistical power in identifying genes showing significant association with diseases. Beyond identifying genes that are associated with diseases, the identification of causal variant(s) in those genes and estimation of their effect is crucial for planning replication studies and characterizing the genetic architecture of the locus. However, we illustrate that straightforward single-marker association statistics can suffer from substantial bias introduced by conditioning on gene-based test significance, due to the phenomenon often referred to as "winner's curse."
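The selection bias behind the winner's curse can be seen in a small simulation. The sketch below is a generic illustration, not the gene-based setting studied in the article: it assumes a single effect estimate that is normally distributed around a true value, a one-sided z-test, and arbitrary parameter values chosen only for demonstration.

```python
import random
import statistics


def simulate_winners_curse(
    true_effect: float = 0.1,
    std_error: float = 0.1,
    z_threshold: float = 1.96,
    n_studies: int = 20000,
    seed: int = 0,
) -> tuple[float, float]:
    """Compare the mean estimate over all studies with the mean over
    only those studies that pass a one-sided significance threshold."""
    rng = random.Random(seed)
    # Each simulated study yields an unbiased but noisy effect estimate.
    estimates = [rng.gauss(true_effect, std_error) for _ in range(n_studies)]
    # Keep only the estimates whose z-score exceeds the threshold.
    significant = [b for b in estimates if b / std_error > z_threshold]
    return statistics.mean(estimates), statistics.mean(significant)


mean_all, mean_significant = simulate_winners_curse()
print(f"mean of all estimates:         {mean_all:.3f}")
print(f"mean of significant estimates: {mean_significant:.3f}")
```

Averaged over all simulated studies the estimate is close to the true effect, but every estimate that survives the threshold must exceed `z_threshold * std_error`, so the average among significant results is biased upward.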