Transposable elements are ubiquitous mobile DNA sequences that generate insertion polymorphisms and contribute to genomic diversity. We present GraffiTE, a flexible pipeline to analyze polymorphic mobile element insertions. By integrating state-of-the-art structural variant detection algorithms and graph genomes, GraffiTE identifies polymorphic mobile elements from genomic assemblies or long-read sequencing data, and genotypes these variants using short or long read sets.
Posterior fossa group A (PFA) ependymoma is a lethal brain cancer diagnosed in infants and young children. The lack of driver events in the PFA linear genome led us to search its 3D genome for characteristic features. Here, we reconstructed 3D genomes from diverse childhood tumor types and uncovered a global topology in PFA that is highly reminiscent of stem and progenitor cells in a variety of human tissues.
The COVID-19 pandemic led to a large global effort to sequence SARS-CoV-2 genomes from patient samples to track viral evolution and inform public health response. Millions of SARS-CoV-2 genome sequences have been deposited in global public repositories. The Canadian COVID-19 Genomics Network (CanCOGeN - VirusSeq), a consortium tasked with coordinating expanded sequencing of SARS-CoV-2 genomes across Canada early in the pandemic, created the Canadian VirusSeq Data Portal, with associated data pipelines and procedures, to support these efforts.
The basal breast cancer subtype is enriched for triple-negative breast cancer (TNBC) and displays consistent large chromosomal deletions. Here, we characterize evolution and maintenance of chromosome 4p (chr4p) loss in basal breast cancer. Analysis of The Cancer Genome Atlas data shows recurrent deletion of chr4p in basal breast cancer.
Motivation: Human epigenomic data has been generated by large consortia for thousands of cell types to be used as a reference map of normal and disease chromatin states. Since epigenetic data contains potentially identifiable information, similarly to genetic data, most raw files generated by these consortia are stored in controlled-access databases. It is important to protect identifiable information, but this should not hinder secure sharing of these valuable datasets.
Humans display remarkable interindividual variation in their immune response to identical challenges. Yet, our understanding of the genetic and epigenetic factors contributing to such variation remains limited. Here we performed in-depth genetic, epigenetic and transcriptional profiling on primary macrophages derived from individuals of European and African ancestry before and after infection with influenza A virus.
Childhood B-cell acute lymphoblastic leukemia (B-ALL) is a heterogeneous disease comprising multiple molecular subgroups with subtype-specific expression profiles. Recently, a new type of ncRNA, termed circular RNA (circRNA), has emerged as a promising biomarker in cancer, but little is known about its role in childhood B-ALL. Here, through RNA-seq analysis in 105 childhood B-ALL patients comprising six genetic subtypes and seven B-cell controls from two independent cohorts, we demonstrated that circRNAs properly stratified B-ALL subtypes.
Whole genome sequencing (WGS) at high depth (30X) allows the accurate discovery of variants in the coding and non-coding DNA regions and helps elucidate the genetic underpinnings of human health and diseases. Yet, due to the prohibitive cost of high-depth WGS, most large-scale genetic association studies use genotyping arrays or high-depth whole exome sequencing (WES). Here we propose a cost-effective method, which we call "Whole Exome Genome Sequencing" (WEGS), that combines low-depth WGS and high-depth WES with up to 8 samples pooled and sequenced simultaneously (multiplexed).
Rare DNA alterations that cause heritable diseases are only partially resolvable by clinical next-generation sequencing due to the difficulty of detecting structural variation (SV) in all genomic contexts. Long-read, high fidelity genome sequencing (HiFi-GS) detects SVs with increased sensitivity and enables assembling personal and graph genomes. We leverage standard reference genomes, public assemblies (n = 94) and a large collection of HiFi-GS data from a rare disease program (Genomic Answers for Kids, GA4K, n = 574 assemblies) to build a graph genome representing a unified SV callset in GA4K, identify common variation and prioritize SVs that are more likely to cause genetic disease (MAF < 0.
Here, we exploit a deep serological profiling strategy coupled with an integrated, computational framework for the analysis of SARS-CoV-2 humoral immune responses. Applying a high-density peptide array (HDPA) spanning the entire proteomes of SARS-CoV-2 and endemic human coronaviruses allowed us to identify B cell epitopes and relate them to their evolutionary and structural properties. We identify hotspots of pre-existing immunity and cross-reactive epitopes that contribute to increasing the overall humoral immune response to SARS-CoV-2.
Transcription-factor binding to cis-regulatory regions regulates the gene expression program of a cell, but occupancy is often a poor predictor of the gene response. Here, we show that glucocorticoid stimulation led to the reorganization of transcriptional coregulators MED1 and BRD4 within topologically associating domains (TADs), resulting in active or repressive gene environments. Indeed, we observed a bias toward the activation or repression of a TAD when their activities were defined by the number of regions gaining and losing MED1 and BRD4 following dexamethasone (Dex) stimulation.
ZMYM2 is a transcriptional repressor whose role in development is largely unexplored. We found that Zmym2-/- mice show embryonic lethality by E10.5.
Influenza A virus (IAV) infections are frequent every year and result in a range of disease severity. Here, we wanted to explore the potential contribution of transposable elements (TEs) to the variable human immune response. Transcriptome profiling in monocyte-derived macrophages from 39 individuals following IAV infection revealed significant inter-individual variation in viral load post-infection.
Background: Children conceived through assisted reproduction are at an increased risk for growth and genomic imprinting disorders, often linked to DNA methylation defects. It has been suggested that assisted reproductive technology (ART) and underlying parental infertility can induce epigenetic instability, specifically interfering with DNA methylation reprogramming events during germ cell and preimplantation development. To date, human studies exploring the association between ART and DNA methylation defects have reported inconsistent or inconclusive results, likely due to population heterogeneity and the use of technologies with limited coverage of the epigenome.
Here the Human Pangenome Reference Consortium presents a first draft of the human pangenome reference. The pangenome contains 47 phased, diploid assemblies from a cohort of genetically diverse individuals. These assemblies cover more than 99% of the expected sequence in each genome and are more than 99% accurate at the structural and base pair levels.
HostSeq was launched in April 2020 as a national initiative to integrate whole genome sequencing data from 10,000 Canadians infected with SARS-CoV-2 with clinical information related to their disease experience. The mandate of HostSeq is to support the Canadian and international research communities in their efforts to understand the risk factors for disease and associated health outcomes and support the development of interventions such as vaccines and therapeutics. HostSeq is a collaboration among 13 independent epidemiological studies of SARS-CoV-2 across five provinces in Canada.
Perivascular space (PVS) burden is an emerging, poorly understood, magnetic resonance imaging marker of cerebral small vessel disease, a leading cause of stroke and dementia. Genome-wide association studies in up to 40,095 participants (18 population-based cohorts, 66.3 ± 8.
Summary: Large-scale sharing of genomic quantification data requires standardized access interfaces. In this Global Alliance for Genomics and Health project, we developed RNAget, an API for secure access to genomic quantification data in matrix form. RNAget provides for slicing matrices to extract desired subsets of data and is applicable to all expression matrix-format data, including RNA sequencing and microarrays.
Background: The prevalence of medication nonadherence in the setting of resistant hypertension (RH) varies from 5% to 80% in the published literature. The aim of this systematic review was to establish the overall prevalence of nonadherence and evaluate the effect of the method of assessment on this estimate.
Methods: MEDLINE, EMBASE, Cochrane, CINAHL, and Web of Science (database inception to November 2020) were searched for relevant articles.
Papillary thyroid carcinoma (PTC) is the most common malignancy of the thyroid gland, and early stages are curable. However, a subset of PTCs shows an unusually aggressive phenotype with extensive lymph node metastasis and a higher incidence of locoregional recurrence. In this study, we investigated a large cohort of PTC cases with an unusually aggressive phenotype using high-throughput RNA sequencing (RNA-Seq) to identify differentially regulated genes associated with metastatic PTC.