Publications by authors named "Sharon Browning"

Principal component analysis (PCA) is widely used to control for population structure in genome-wide association studies (GWAS). Top principal components (PCs) typically reflect population structure, but challenges arise in deciding how many PCs are needed and ensuring that PCs do not capture other artifacts such as regions with atypical linkage disequilibrium (LD). In response to the latter, many groups suggest performing LD pruning or excluding known high LD regions prior to PCA.

Article Synopsis
  • Recent positive selection can lead to an accumulation of long identity-by-descent (IBD) haplotypes near specific genetic loci, which can help identify areas of adaptive evolution.
  • The proposed statistical methods aim to locate these regions, identify potential sweeping alleles, and estimate the selection coefficient (s) through innovative techniques like selection scans and parametric bootstrap for uncertainty quantification.
  • In extensive simulations, these methods outperform existing techniques, providing more accurate estimates of selection in data from European ancestry samples, demonstrating their effectiveness in studying recent adaptive evolution without needing known causal alleles or time series information.

Background: Chronic kidney disease (CKD) is highly prevalent in Central America, and genetic factors may contribute to CKD risk. To understand the influence of genetic admixture on CKD susceptibility, we conducted an admixture mapping screen of CKD traits and risk factors in US Hispanic and Latino individuals with a Central American country of origin.

Methods: We analyzed 1023 participants of HCHS/SOL (Hispanic Community Health Study/Study of Latinos) who reported 4 grandparents originating from the same Central American country.


Genotype data include errors that may influence conclusions reached by downstream statistical analyses. Previous studies have estimated genotype error rates from discrepancies in human pedigree data, such as Mendelian inconsistent genotypes or apparent phase violations. However, uncalled deletions, which generally have not been accounted for in these studies, can lead to biased error rate estimates.
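
As a rough illustration of what a Mendelian-inconsistent genotype is, and of how an uncalled deletion can masquerade as one, the sketch below checks a parent-offspring trio at a single biallelic site. It is not the method of the paper: the function names and the genotype coding (0, 1, or 2 copies of the alternate allele) are assumptions made for this example.

```python
# Hypothetical helper names; genotypes are coded as the number of
# alternate alleles (0, 1, or 2) at one biallelic site.

def transmissible_alleles(genotype: int) -> set:
    """Return the set of alleles (0 = ref, 1 = alt) a parent can transmit."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[genotype]


def is_mendelian_consistent(father: int, mother: int, child: int) -> bool:
    """True if the child's genotype can arise from one allele per parent."""
    return any(
        a + b == child
        for a in transmissible_alleles(father)
        for b in transmissible_alleles(mother)
    )


# An ordinary consistent trio: two heterozygous parents, homozygous-alt child.
print(is_mendelian_consistent(1, 1, 2))

# Uncalled-deletion scenario (illustrative): the father truly carries
# alt/deletion but is called alt/alt (2); the child inherits the deletion
# plus a ref allele from the mother and is called ref/ref (0).
# The called genotypes look like a genotype error although none occurred.
print(is_mendelian_consistent(2, 0, 0))
```

Counting such trios as genotype errors would inflate an error-rate estimate, which is the kind of bias the abstract describes.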



We present a method for efficiently identifying clusters of identical-by-descent haplotypes in biobank-scale sequence data. Our multi-individual approach enables much more computationally efficient inference of identity by descent (IBD) than approaches that infer pairwise IBD segments and provides locus-specific IBD clusters rather than IBD segments. Our method's computation time, memory requirements, and output size scale linearly with the number of individuals in the dataset.


We present a method for efficiently identifying clusters of identical-by-descent haplotypes in biobank-scale sequence data. Our multi-individual approach enables much more efficient collection and storage of identity by descent (IBD) information than approaches that detect and store pairwise IBD segments. Our method's computation time, memory requirements, and output size scale linearly with the number of individuals in the dataset.


The effective size of a population (Ne) in the recent past can be estimated through analysis of identity-by-descent (IBD) segments. Several methods have been developed for estimating Ne from autosomal IBD segments, but no such effort has been made with X chromosome IBD segments. In this work, we propose a method to estimate the X chromosome effective population size from X chromosome IBD segments.

Article Synopsis
  • This study explores how genetic variation and environmental factors influence metabolic pathways and disease risks in Hispanic/Latino individuals, particularly focusing on their distinct ancestries.
  • Researchers analyzed 640 metabolites in a large cohort, identifying significant associations between specific metabolites and ancestry-derived genomic regions using advanced mass spectrometry techniques.
  • Key discoveries include finding that Native American and African ancestry regions are linked to the quantity of certain metabolites, which could have implications for understanding common diseases affecting these populations.

Alzheimer disease (AD) is the most common form of senile dementia, with high incidence late in life in many populations including Caribbean Hispanic (CH) populations. Such admixed populations, descended from more than one ancestral population, can present challenges for genetic studies, including limited sample sizes and unique analytical constraints. Therefore, CH populations and other admixed populations have not been well represented in studies of AD, and much of the genetic variation contributing to AD risk in these populations remains unknown.


Local ancestry is the source ancestry at each point in the genome of an admixed individual. Inferred local ancestry is used for admixture mapping and population genetic analyses. We present FLARE (fast local ancestry estimation), a method for local ancestry inference.


The first release of UK Biobank whole-genome sequence data contains 150,119 genomes. We present an open-source pipeline for filtering, phasing, and indexing these genomes on the cloud-based UK Biobank Research Analysis Platform. This pipeline makes it possible to apply haplotype-based methods to UK Biobank whole-genome sequence data.


We provide a method for estimating the genome-wide mutation rate from sequence data on unrelated individuals by using segments of identity by descent (IBD). The length of an IBD segment indicates the time to shared ancestor of the segment, and mutations that have occurred since the shared ancestor result in discordances between the two IBD haplotypes. Previous methods for IBD-based estimation of mutation rate have required the use of family data for accurate phasing of the genotypes.


Haplotypes can be estimated from unphased genotype data via statistical methods. When parent-offspring trios are available for inferring the true phase from Mendelian inheritance rules, the accuracy of statistical phasing is usually measured by the switch error rate, which is the proportion of pairs of consecutive heterozygotes that are incorrectly phased. We present a method for estimating the genotype error rate from parent-offspring trios and a method for estimating the bias that occurs in the observed switch error rate as a result of genotype error.
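
The switch error rate defined above can be sketched in a few lines. This is a minimal illustration, not the estimator from the paper: it assumes the true and estimated phase are each given as the allele (0 or 1) carried on the first haplotype at every heterozygous site, in genomic order, and the function name is hypothetical.

```python
from typing import Sequence


def switch_error_rate(true_hap1: Sequence[int], est_hap1: Sequence[int]) -> float:
    """Proportion of consecutive heterozygote pairs that are phased incorrectly.

    Each argument lists the allele (0 or 1) on the first haplotype at every
    heterozygous site. Which haplotype is labelled "first" does not matter,
    because only the relative phase of neighbouring sites is compared.
    """
    if len(true_hap1) != len(est_hap1):
        raise ValueError("inputs must cover the same heterozygous sites")
    n_pairs = len(true_hap1) - 1
    if n_pairs < 1:
        raise ValueError("need at least two heterozygous sites")

    switches = 0
    for i in range(n_pairs):
        true_same = true_hap1[i] == true_hap1[i + 1]
        est_same = est_hap1[i] == est_hap1[i + 1]
        if true_same != est_same:  # relative phase of this pair is wrong
            switches += 1
    return switches / n_pairs


truth = [0, 1, 1, 0, 1]
estimate = [0, 1, 0, 1, 0]  # one switch, between the 2nd and 3rd sites
print(switch_error_rate(truth, estimate))  # 0.25
```

Note that a genotype error at a heterozygous site in the trio data changes the "true" phase used here, which is why the observed rate can be biased.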


Allele frequency estimates in admixed populations, such as Hispanics and Latinos, rely on the sample's specific admixture composition and thus may differ between two seemingly similar populations. However, ancestry-specific allele frequencies, i.e.

Article Synopsis
  • The Colombian population shows a unique genetic background due to a mix of Native American, Spanish, and African ancestries, influenced by past population bottlenecks caused by diseases.
  • Through genetic analysis of 900 individuals, including those with Alzheimer's and other neurodegenerative disorders, researchers identified how historical admixture has shaped the occurrence of disease-related mutations.
  • The study found 21 pathogenic variants related to neurodegenerative diseases, with significant variation in risk based on ancestry, highlighting the importance of demographic history in understanding genetic diseases in the Colombian population.

Haplotype phasing is the estimation of haplotypes from genotype data. We present a fast, accurate, and memory-efficient haplotype phasing method that scales to large-scale SNP array and sequence data. The method uses marker windowing and composite reference haplotypes to reduce memory usage and computation time.


The SPrime program detects the variants in current-day populations that were introgressed from an archaic source in the past. It is optimized for detecting introgression from Neanderthals and Denisovans in modern humans. We provide a protocol for detecting Neanderthal and Denisovan introgression in 1000 Genomes Project data, specifically focusing on the CHB (Han Chinese in Beijing) population.

Article Synopsis
  • The Trans-Omics for Precision Medicine (TOPMed) programme aims to understand the genetic factors behind heart, lung, blood, and sleep disorders to enhance their diagnosis, treatment, and prevention.
  • TOPMed uses whole-genome sequencing from diverse individuals, revealing over 400 million genetic variants, many of which are rare and offer insights into human evolution and disease mechanisms.
  • The programme provides tools like a variant browser and access to genomic data, improving the capability of genome-wide association studies to include rare variants that could have significant health implications.
Article Synopsis
  • The study focuses on identifying genetic factors that impact kidney function, specifically estimated glomerular filtration rate (eGFR), by using whole genome sequencing data from over 23,000 participants of various ancestries.
  • Researchers discovered three new genetic loci linked to eGFR that are primarily found in non-European populations, indicating the importance of low-frequency variants.
  • The findings underscore the need for more diverse genetic research, as many influential kidney traits may be underrepresented in studies focused on predominantly European ancestries.
Article Synopsis
  • Most existing methods for identifying identity by descent (IBD) segments only provide basic data without assessing the uncertainty of segment endpoints and lengths.
  • We introduce a new method that estimates the probability distribution of IBD segment endpoints, factoring in issues like genotype errors and mutations, and is suitable for large datasets.
  • Our findings indicate that this method improves the accuracy of identifying regions with recent natural selection, effectively filtering out false signals and aligning with previous studies on selection.

Motivation: Estimation of pairwise kinship coefficients in large datasets is computationally challenging because the number of pairs of related individuals increases quadratically with sample size.

Results: We present IBDkin, a software package written in C for estimating kinship coefficients from identity by descent (IBD) segments. We use IBDkin to estimate kinship coefficients for 7.


Recombination rates vary significantly across the genome, and estimates of recombination rates are needed for downstream analyses such as haplotype phasing and genotype imputation. Existing methods for recombination rate estimation are limited by insufficient amounts of informative genetic data or by high computational cost. We present a method and software, called IBDrecomb, for using segments of identity by descent to infer recombination rates.


Archaeological studies estimate the initial settlement of Samoa at 2,750 to 2,880 y ago and identify only limited settlement and human modification to the landscape until about 1,000 to 1,500 y ago. At this point, a complex history of migration is thought to have begun with the arrival of people sharing ancestry with Near Oceanic groups (i.e.


Segments of identity by descent (IBD) are used in many genetic analyses. We present a method for detecting identical-by-descent haplotype segments in phased genotype data. Our method, called hap-IBD, combines a compressed representation of haplotype data, the positional Burrows-Wheeler transform, and multi-threaded execution to produce very fast analysis times.
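
For readers unfamiliar with the positional Burrows-Wheeler transform (PBWT), the sketch below shows only its core sorting step: at each marker, haplotypes are kept in order of their reversed prefixes, so haplotypes that match over a long stretch ending at the current marker become neighbours. This is a generic textbook-style illustration under simplifying assumptions (biallelic 0/1 alleles, no divergence array, no segment detection); it is not the hap-IBD implementation, and the function name is hypothetical.

```python
from typing import List


def pbwt_prefix_orders(haplotypes: List[List[int]]) -> List[List[int]]:
    """Return the haplotype ordering before the first marker and after each marker.

    haplotypes: M equal-length lists of 0/1 alleles.
    orders[k] lists haplotype indices sorted by their reversed prefix over
    markers 0..k-1, so long recent matches end up adjacent in the list.
    """
    order = list(range(len(haplotypes)))
    orders = [order]
    n_markers = len(haplotypes[0]) if haplotypes else 0
    for k in range(n_markers):
        # Stable partition: haplotypes carrying allele 0 keep their relative
        # order and come first, followed by those carrying allele 1.
        zeros = [h for h in order if haplotypes[h][k] == 0]
        ones = [h for h in order if haplotypes[h][k] == 1]
        order = zeros + ones
        orders.append(order)
    return orders


haps = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 1, 1],
    [1, 0, 0],
]
print(pbwt_prefix_orders(haps)[-1])  # [3, 0, 1, 2]
```

Each marker is processed in time linear in the number of haplotypes, which is one reason PBWT-based approaches suit large phased datasets.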
