Objective: Retaining participants through follow-up surveys in large longitudinal epidemiological cohort studies is vital yet challenging. All of Us developed pilot interventions to improve response rates for follow-up surveys.
Study Design And Setting: The pilot interventions occurred from April 27, 2020, to August 3, 2020.
J Am Med Inform Assoc
February 2023
Objective: The aim of this study was to analyze a publicly available sample of rule-based phenotype definitions to characterize and evaluate the variability of logical constructs used.
Materials And Methods: A sample of 33 preexisting phenotype definitions used in research that are represented using Fast Healthcare Interoperability Resources and Clinical Quality Language (CQL) was analyzed using automated analysis of the computable representation of the CQL libraries.
Results: Most of the phenotype definitions include narrative descriptions and flowcharts, while few provide pseudocode or executable artifacts.
Uterine fibroids (UF) are common, heritable pelvic tumors in women, and genome-wide association studies (GWAS) have identified ~30 loci associated with increased risk of UF. Using summary statistics from a previously published UF GWAS performed in a non-Hispanic European Ancestry (NHW) female subset from the Electronic Medical Records and Genomics (eMERGE) Network, we constructed a polygenic risk score (PRS) for UF. UF-PRS was developed using PRSice and optimized in the separate clinical population of BioVU.
Purpose: The goal of Electronic Medical Records and Genomics (eMERGE) Phase III Network was to return actionable sequence variants to 25,084 consenting participants from 10 different health care institutions across the United States. The purpose of this study was to evaluate system-based issues relating to the return of results (RoR) disclosure process for clinical grade research genomic tests to eMERGE3 participants.
Methods: RoR processes were developed and approved by each eMERGE institution's internal review board.
Increasingly, clinical phenotypes with matched genetic data from bio-bank linked electronic health records (EHRs) have been used for pleiotropy analyses. Thus far, pleiotropy analysis using individual-level EHR data has been limited to data from one site. However, it is desirable to integrate EHR data from multiple sites to improve the detection power and generalizability of the results.
Background: Abdominal aortic aneurysm (AAA) is an important cause of cardiovascular mortality; however, its genetic determinants remain incompletely defined. In total, 10 previously identified risk loci explain a small fraction of AAA heritability.
Methods: We performed a genome-wide association study in the Million Veteran Program testing ≈18 million DNA sequence variants with AAA (7642 cases and 172 172 controls) in veterans of European ancestry with independent replication in up to 4972 cases and 99 858 controls.
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
Benign prostatic hyperplasia (BPH) results in a significant public health burden due to the morbidity caused by the disease and many of the available remedies. As much as 70% of men over 70 will develop BPH. Few studies have been conducted to discover the genetic determinants of BPH risk.
Body-fat distribution is a risk factor for adverse cardiovascular health consequences. We analyzed the association of body-fat distribution, assessed by waist-to-hip ratio adjusted for body mass index, with 228,985 predicted coding and splice site variants available on exome arrays in up to 344,369 individuals from five major ancestries (discovery) and 132,177 European-ancestry individuals (validation). We identified 15 common (minor allele frequency, MAF ≥5%) and nine low-frequency or rare (MAF <5%) coding novel variants.
Background: Proteomic approaches allow measurement of thousands of proteins in a single specimen, which can accelerate biomarker discovery. However, applying these technologies to massive biobanks is not currently feasible because of the practical barriers and costs of implementing such assays at scale. To overcome these challenges, we used a "virtual proteomic" approach, linking genetically predicted protein levels to clinical diagnoses in >40 000 individuals.
The cytochrome P450 (CYP)4F2 gene is known to influence mean coumarin dose. The aim of the present study was to undertake a meta-analysis at the individual patient level to capture the possible effect of ethnicity, gene-gene interaction, or other drugs on the association and to verify whether inclusion of the CYP4F2*3 variant in dosing algorithms improves the prediction of mean coumarin dose. We asked the authors of our previous meta-analysis (30 articles) and of 38 new articles retrieved by a systematic review to send us individual patients' data.
Defining the full spectrum of human disease associated with a biomarker is necessary to advance the biomarker into clinical practice. We hypothesize that associating biomarker measurements with electronic health record (EHR) populations based on shared genetic architectures would establish the clinical epidemiology of the biomarker. We use Bayesian sparse linear mixed modeling to calculate SNP weightings for 53 biomarkers from the Atherosclerosis Risk in Communities study.
Electrocardiographic PR interval measures atrio-ventricular depolarization and conduction, and abnormal PR interval is a risk factor for atrial fibrillation and heart block. Our genome-wide association study of over 92,000 European-descent individuals identifies 44 PR interval loci (34 novel). Examination of these loci reveals known and previously not-yet-reported biological processes involved in cardiac atrial electrical activity.
Context: Mutations in alkaline phosphatase (AlkP), liver/bone/kidney (ALPL), which encodes tissue-nonspecific isozyme AlkP, cause hypophosphatasia (HPP). HPP is suspected when serum AlkP is low. We hypothesized that some patients with bone or dental disease have undiagnosed HPP, caused by ALPL variants.
Diastolic dysfunction (DD), an abnormality in cardiac left ventricular (LV) chamber compliance, is associated with increased morbidity and mortality. Although DD has been extensively studied in older populations, co-morbidity patterns are less well characterized in middle-aged subjects. We screened 156,434 subjects with transthoracic echocardiogram reports available through Vanderbilt's electronic health record and identified 6,612 subjects 40 to 55 years old with an LV ejection fraction ≥50% and diastolic function staging.
In the version of this article originally published, one of the two authors with the name Wei Zhao was omitted from the author list and the affiliations for both authors were assigned to the single Wei Zhao in the author list. In addition, the ORCID for Wei Zhao (Department of Biostatistics and Epidemiology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA) was incorrectly assigned to author Wei Zhou. The errors have been corrected in the HTML and PDF versions of the article.
In the published version of this paper, the name of author Emanuele Di Angelantonio was misspelled. This error has now been corrected in the HTML and PDF versions of the article.
Genome-wide association studies (GWAS) have identified >250 loci for body mass index (BMI), implicating pathways related to neuronal biology. Most GWAS loci represent clusters of common, noncoding variants from which pinpointing causal genes remains challenging. Here we combined data from 718,734 individuals to discover rare and low-frequency (minor allele frequency (MAF) < 5%) coding variants associated with BMI.
Objective: Birth month and climate impact lifetime disease risk, while the underlying exposures remain largely elusive. We seek to uncover distal risk factors underlying these relationships by probing the relationship between global exposure variance and disease risk variance by birth season.
Material And Methods: This study utilizes electronic health record data from 6 sites representing 10.
Electronic health records (EHRs) linked with biobanks have been recognized as valuable data sources for pharmacogenomic studies, which require identification of patients with certain adverse drug reactions (ADRs) from a large population. Since manual chart review is costly and time-consuming, automatic methods to accurately identify patients with ADRs have been called for. In this study, we developed and compared different informatics approaches to identify ADRs from EHRs, using clopidogrel-induced bleeding as our case study.
Background: One potential use for the PR interval is as a biomarker of disease risk. We hypothesized that quantifying the shared genetic architectures of the PR interval and a set of clinical phenotypes would identify genetic mechanisms contributing to PR variability and identify diseases associated with a genetic predictor of PR variability.
Methods And Results: We used ECG measurements from the ARIC study (Atherosclerosis Risk in Communities; n=6731 subjects) and 63 genetically modulated diseases from the eMERGE network (Electronic Medical Records and Genomics; n=12 978).
Background: Genome-wide association studies have so far identified 56 loci associated with risk of coronary artery disease (CAD). Many CAD loci show pleiotropy; that is, they are also associated with other diseases or traits.
Objectives: This study sought to systematically test if genetic variants identified for non-CAD diseases/traits also associate with CAD and to undertake a comprehensive analysis of the extent of pleiotropy of all CAD loci.
The major goal of precision medicine is to improve human health. A feature that unites much research in the field is the use of large datasets such as genomic data and electronic health records. Research in this field includes examination of variation in the core bases of DNA and their methylation status, through variations in metabolic and signaling molecules, all the way up to broader systems level changes in physiology and disease presentation.
Background: Continued reductions in morbidity and mortality attributable to ischemic heart disease (IHD) require an understanding of the changing epidemiology of this disease. We hypothesized that we could use genetic correlations, which quantify the shared genetic architectures of phenotype pairs, and extant risk factors from a historical prospective study to define the risk profile of a contemporary IHD phenotype.
Methods And Results: We used 37 phenotypes measured in the ARIC study (Atherosclerosis Risk in Communities; n=7716, European ancestry subjects) and clinical diagnoses from an electronic health record (EHR) data set (n=19 093).