The following sections are included: Overview, Advancing multi-ancestry genetic research, Integrating social determinants of health to enhance genetic risk models, Methods to detect and mitigate disparities, Addressing Disparities in Adverse Drug Reactions, Conclusion, Acknowledgments, References.
Background: There are known disparities in incidence and outcomes of colorectal cancer (CRC) by race and ethnicity. Some of these disparities may be mediated by molecular changes in tumors that occur at different rates across populations. Genetic ancestry is a measure complementary to race and ethnicity that can overcome missing data issues and better capture genetic similarity in admixed populations.
The incompleteness of race and ethnicity information in real-world data (RWD) hampers its utility in promoting healthcare equity. This study introduces two methods, one heuristic and the other machine learning-based, to impute race and ethnicity from genetic ancestry using tumor profiling data. Analyzing de-identified data from over 100,000 cancer patients sequenced with the Tempus xT panel, we demonstrate that both methods outperform existing geolocation and surname-based methods, with the machine learning approach achieving high recall (range: 0.
The following sections are included: Overview, Dealing with the lack of diversity in current research datasets, Development of fair machine learning algorithms, Race, genetic ancestry, and population structure, Conclusion, Acknowledgments.
Scientists have been trying to identify every gene in the human genome since the initial draft was published in 2001. In the years since, much progress has been made in identifying protein-coding genes, currently estimated to number fewer than 20,000, with an ever-expanding number of distinct protein-coding isoforms. Here we review the status of the human gene catalogue and the efforts to complete it in recent years.
Summary: RNA sequencing (RNA-seq) can be applied to diverse tasks including quantifying gene expression, discovering quantitative trait loci and identifying gene fusion events. Although RNA-seq can detect germline variants, the complexities of variable transcript abundance, target capture and amplification introduce challenging sources of error. Here, we extend DeepVariant, a deep-learning-based variant caller, to learn and account for the unique challenges presented by RNA-seq data.
Background: Endocrine-resistant HR+/HER2- breast cancer (BC) and triple-negative BC (TNBC) are of interest for molecularly informed treatment due to their aggressive natures and limited treatment profiles. Patients of African Ancestry (AA) experience higher rates of TNBC and mortality than European Ancestry (EA) patients, despite lower overall BC incidence. Here, we compare the molecular landscapes of AA and EA patients with HR+/HER2- BC and TNBC in a real-world cohort to promote equity in precision oncology by illuminating the heterogeneity of potentially druggable genomic and transcriptomic pathways.
Scientists have been trying to identify all of the genes in the human genome since the initial draft of the genome was published in 2001. Over the intervening years, much progress has been made in identifying protein-coding genes, and the estimated number has shrunk to fewer than 20,000, although the number of distinct protein-coding isoforms has expanded dramatically. The invention of high-throughput RNA sequencing and other technological breakthroughs have led to an explosion in the number of reported non-coding RNA genes, although most of them do not yet have any known function.
The following sections are included: Overview, Equitable risk prediction, Pharmacoequity, Race, genetic ancestry, and population structure, Conclusion, Acknowledgments, References.
While many genetic diseases have effective treatments, they frequently progress rapidly to severe morbidity or mortality if those treatments are not implemented immediately. Since front-line physicians frequently lack familiarity with these diseases, timely molecular diagnosis may not improve outcomes. Herein we describe Genome-to-Treatment, an automated, virtual system for genetic disease diagnosis and acute management guidance.
The incidence and mortality of early-onset colorectal cancer (EOCRC) are rising; outcomes appear to differ by race and ethnicity. We aimed to assess differences in mutational landscape and gene expression of EOCRC by racial and ethnic groups (non-Hispanic Asian, non-Hispanic Black, non-Hispanic White, White Hispanic) using data from the American Association for Cancer Research Project GENIE (10.2) and University of Texas Southwestern, the latter enriched in Hispanic patients.
Background: Clinical interpretation of genetic variants in the context of the patient's phenotype is becoming the largest component of cost and time expenditure for genome-based diagnosis of rare genetic diseases. Artificial intelligence (AI) holds promise to greatly simplify and speed genome interpretation by integrating predictive methods with the growing knowledge of genetic disease. Here we assess the diagnostic performance of Fabric GEM, a new, AI-based, clinical decision support tool for expediting genome interpretation.
In patients with invasive breast cancer, fluorescence in situ hybridization (FISH) testing for HER2 typically demonstrates the clear presence or lack of ERBB2 (HER2) amplification (i.e., groups 1 or 5).
In the version of this article initially published online, two pairs of headings were switched with each other in Table 4: "Recall (PCR free)" was switched with "Recall (with PCR)," and "Precision (PCR free)" was switched with "Precision (with PCR)." The error has been corrected in the print, PDF and HTML versions of this article.
Standardized benchmarking approaches are required to assess the accuracy of variants called from sequence data. Although variant-calling tools and the metrics used to assess their performance continue to improve, important challenges remain. Here, as part of the Global Alliance for Genomics and Health (GA4GH), we present a benchmarking framework for variant calling.
A new study highlights the biases and inaccuracies of polygenic risk scores (PRS) when predicting disease risk in individuals from populations other than those used in their derivation. The design bias of workhorse tools used for research, particularly genotyping arrays, contributes to these distortions. To avoid further inequities in health outcomes, the inclusion of diverse populations in research, unbiased genotyping, and methods of bias reduction in PRS are critical.
In vitro cancer cultures, including three-dimensional organoids, typically contain exclusively neoplastic epithelium but require artificial reconstitution to recapitulate the tumor microenvironment (TME). The co-culture of primary tumor epithelia with endogenous, syngeneic tumor-infiltrating lymphocytes (TILs) as a cohesive unit has been particularly elusive. Here, an air-liquid interface (ALI) method propagated patient-derived organoids (PDOs) from >100 human biopsies or mouse tumors in syngeneic immunocompetent hosts as tumor epithelia with native embedded immune cells (T, B, NK, macrophages).
Background: Medulloblastoma is associated with rare hereditary cancer predisposition syndromes; however, consensus medulloblastoma predisposition genes have not been defined and screening guidelines for genetic counselling and testing for paediatric patients are not available. We aimed to assess and define these genes to provide evidence for future screening guidelines.
Methods: In this international, multicentre study, we analysed patients with medulloblastoma from retrospective cohorts (International Cancer Genome Consortium [ICGC] PedBrain, Medulloblastoma Advanced Genomics International Consortium [MAGIC], and the CEFALO series) and from prospective cohorts from four clinical studies (SJMB03, SJMB12, SJYC07, and I-HIT-MED).
Next-generation deep sequencing of gene panels is being adopted as a diagnostic test to identify actionable mutations in cancer patient samples. However, clinical samples, such as formalin-fixed, paraffin-embedded specimens, frequently provide low quantities of degraded, poor quality DNA. To overcome these issues, many sequencing assays rely on extensive PCR amplification leading to an accumulation of bias and artifacts.
Motivation: Variant calling from next-generation sequencing (NGS) data is susceptible to false positive calls due to sequencing, mapping and other errors. To better distinguish true from false positive calls, we present a method that uses genotype array data from the sequenced samples, rather than public data such as HapMap or dbSNP, to train an accurate classifier using Random Forests. We demonstrate our method on a set of variant calls obtained from 642 African-ancestry genomes from the Consortium on Asthma among African-ancestry Populations in the Americas (CAAPA), sequenced to high depth (30X).
Proc Natl Acad Sci U S A
November 2015
Although a variety of genetic alterations have been found across cancer types, the identification and functional characterization of candidate driver genetic lesions in an individual patient and their translation into clinically actionable strategies remain major hurdles. Here, we use whole genome sequencing of a prostate cancer tumor, computational analyses, and experimental validation to identify and predict novel oncogenic activity arising from a point mutation in the phosphatase and tensin homolog (PTEN) tumor suppressor protein. We demonstrate that this mutation (p.
Population-scale sequencing of whole human genomes is becoming economically feasible; however, data management and analysis remain a formidable challenge for many research groups. Large sequencing studies, like the 1000 Genomes Project, have improved our understanding of human demography and the effect of rare genetic variation in disease. Variant calling on datasets of hundreds or thousands of genomes is time-consuming, expensive, and not easily reproducible given the myriad components of a variant calling pipeline.