Data within biobanks capture broad yet detailed indices of human variation, but biobank-wide insights can be difficult to extract due to complexity and scale. Here, using large-scale factor analysis, we distill hundreds of variables (diagnoses, assessments and survey items) into 35 latent constructs, using data from unrelated individuals with predominantly estimated European genetic ancestry in UK Biobank. These factors recapitulate known disease classifications, disentangle elements of socioeconomic status, highlight the relevance of psychiatric constructs to health and improve measurement of pro-health behaviours.
While blood gene signatures have shown promise in tuberculosis (TB) diagnosis and treatment monitoring, most signatures derived from a single cohort may be insufficient to capture TB heterogeneity in populations and individuals. Here we report a new generalized approach combining a network-based meta-analysis with machine-learning modeling to leverage the power of heterogeneity among studies. The transcriptome datasets from 57 studies (37 TB and 20 viral infections) across demographics and TB disease states were used for gene signature discovery and model training and validation.
Classical statistical genetics theory defines dominance as any deviation from a purely additive, or dosage, effect of a genotype on a trait, which is known as the dominance deviation. Dominance is well documented in plant and animal breeding. Outside of rare monogenic traits, however, evidence in humans is limited.
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
To discover novel genes underlying amyotrophic lateral sclerosis (ALS), we aggregated exomes from 3,864 cases and 7,839 ancestry-matched controls. We observed a significant excess of rare protein-truncating variants among ALS cases, and these variants were concentrated in constrained genes. Through gene level analyses, we replicated known ALS genes including SOD1, NEK1 and FUS.