Background: Biomedical research requires sophisticated understanding and reasoning across multiple specializations. While large language models (LLMs) show promise in scientific applications, their capability to safely and accurately support complex biomedical research remains uncertain.
Methods: We present , a novel question-and-answer benchmark for evaluating LLMs in biomedical research.
Elucidating the genetic contributions to Parkinson's disease (PD) etiology across diverse ancestries is a critical priority for the development of targeted therapies in a global context. We conducted the largest sequencing characterization of potentially disease-causing, protein-altering and splicing mutations in 710 cases and 11,827 controls from genetically predicted African or African admixed ancestries. We explored copy number variants (CNVs) and runs of homozygosity (ROHs) in prioritized early onset and familial cases.
Background: Known pathogenic variants in Parkinson's disease (PD) contribute to disease development but have yet to be fully explored by arrays at scale.
Objectives: This study evaluated genotyping success of the NeuroBooster array (NBA) and determined the frequencies of pathogenic variants across ancestries.
Method: We analyzed the presence and allele frequency of 34 pathogenic variants in 28,710 PD cases, 9,614 other neurodegenerative disorder cases, and 15,821 controls across 11 ancestries within the Global Parkinson's Genetics Program dataset.
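For readers unfamiliar with the quantity being reported, the following is a minimal sketch of how an alternate-allele frequency is commonly derived from genotype counts at a biallelic site. It is not the study's pipeline; the `GenotypeCounts` class, the function name, and the example counts are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class GenotypeCounts:
    """Counts of called genotypes at one biallelic variant (hypothetical container)."""
    hom_ref: int  # individuals carrying zero copies of the alternate allele
    het: int      # individuals carrying one copy
    hom_alt: int  # individuals carrying two copies


def alt_allele_frequency(counts: GenotypeCounts) -> Optional[float]:
    """Return the alternate-allele frequency, or None if nobody was genotyped.

    Each diploid individual contributes two alleles, so the denominator is
    twice the number of individuals with a called genotype.
    """
    called = counts.hom_ref + counts.het + counts.hom_alt
    if called == 0:
        return None
    alt_alleles = counts.het + 2 * counts.hom_alt
    return alt_alleles / (2 * called)


# Illustrative counts only, not data from the study.
example = GenotypeCounts(hom_ref=90, het=9, hom_alt=1)
frequency = alt_allele_frequency(example)
```

Computing the frequency separately within each ancestry group, rather than across the pooled dataset, is what allows a per-ancestry comparison of the kind the abstract describes.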
Background: Commercial genome-wide genotyping arrays have historically neglected coverage of genetic variation across populations.
Objective: We aimed to create a multi-ancestry genome-wide array that would include a wide range of neuro-specific genetic content to facilitate genetic research in neurological disorders across multiple ancestral groups, fostering diversity and inclusivity in research studies.
Methods: We developed the Illumina NeuroBooster Array (NBA), a custom high-throughput and cost-effective platform on a backbone of 1,914,934 variants from the Infinium Global Diversity Array and added custom content comprising 95,273 variants associated with more than 70 neurological conditions or traits, and we further tested its performance on more than 2000 patient samples.
Genotyping single nucleotide polymorphisms (SNPs) is fundamental to disease research, as researchers seek to establish links between genetic variation and disease. Although significant advances in genome technology have been made with the development of bead-based SNP genotyping and Genome Studio software, some SNPs still fail to be genotyped, resulting in "no-calls" that impede downstream analyses. To recover these genotypes, we introduce Cluster Buster, a genotyping neural network and visual inspection system designed to improve the quality of neurodegenerative disease (NDD) research.
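To make the cost of "no-calls" concrete, here is a small sketch that computes a per-SNP call rate, the fraction of samples with a usable genotype. It does not reproduce Cluster Buster or any Genome Studio functionality; the `"NC"` no-call label and the function name are assumptions made for this example.

```python
from typing import Optional, Sequence

# Assumption: no-calls are encoded with this label in the input.
NO_CALL = "NC"


def call_rate(genotypes: Sequence[str], no_call: str = NO_CALL) -> Optional[float]:
    """Return the fraction of samples with a called genotype for one SNP.

    Returns None when the input is empty, since the rate is undefined.
    """
    total = len(genotypes)
    if total == 0:
        return None
    called = sum(1 for g in genotypes if g != no_call)
    return called / total


# Illustrative genotypes only: three called samples and one no-call.
rate = call_rate(["AA", "AB", "NC", "BB"])
```

A SNP whose call rate falls below a chosen quality threshold is typically excluded from downstream analysis, which is why recovering no-call genotypes can matter.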
GenoTools, a Python package, streamlines population genetics research by integrating ancestry estimation, quality control (QC), and genome-wide association studies (GWAS) capabilities into efficient pipelines. By tracking samples, variants, and quality-specific measures throughout fully customizable pipelines, users can easily manage genetics data for large and small studies. GenoTools' "Ancestry" module renders highly accurate predictions, allowing for high-quality ancestry-specific studies, and enables custom ancestry model training and serialization, specified to the user's genotyping or sequencing platform.
Genome-wide genotyping platforms have the capacity to capture genetic variation across different populations, but there have been disparities in the representation of population-dependent genetic diversity. The motivation for pursuing this endeavor was to create a comprehensive genome-wide array capable of encompassing a wide range of neuro-specific content for the Global Parkinson's Genetics Program (GP2) and the Center for Alzheimer's and Related Dementias (CARD). CARD aims to increase diversity in genetic studies, using this array as a tool to foster inclusivity.
Lancet Neurol
November 2023
Background: An understanding of the genetic mechanisms underlying diseases in ancestrally diverse populations is an important step towards development of targeted treatments. Research in African and African admixed populations can enable mapping of complex traits, because of their genetic diversity, extensive population substructure, and distinct linkage disequilibrium patterns. We aimed to do a comprehensive genome-wide assessment in African and African admixed individuals to better understand the genetic architecture of Parkinson's disease in these underserved populations.
High-dimensional data analysis starts with projecting the data to low dimensions to visualize and understand the underlying data structure. Several methods have been developed for dimensionality reduction, but they are limited to cross-sectional datasets. The recently proposed Aligned-UMAP, an extension of the uniform manifold approximation and projection (UMAP) algorithm, can visualize high-dimensional longitudinal datasets.
Background: Biallelic pathogenic variants in GBA1 are the cause of Gaucher disease (GD) type 1 (GD1), a lysosomal storage disorder resulting from deficient glucocerebrosidase. Heterozygous GBA1 variants are also a common genetic risk factor for Parkinson's disease (PD). GD manifests with considerable clinical heterogeneity and is also associated with an increased risk for PD.