Publications by authors named "Wouter De Coster"

Structural variants (SVs) are important contributors to human disease. However, their characterization remains difficult due to their size and association with repetitive regions. Long-read sequencing (LRS) and optical genome mapping (OGM) can help, as their molecules span multiple kilobases and capture SVs in full.

Article Synopsis
  • The 1000 Genomes Project and Oxford Nanopore Technologies are working together to produce LRS data from at least 800 samples to enhance the identification of genetic variations and better understand human genetic diversity.
  • Initial analysis of 100 samples shows high accuracy in detecting genetic variants, including structural variants that disrupt gene function, and provides valuable data for the clinical genetics community to advance research on pathogenic variations.
Article Synopsis
  • Scientists found that existing tools for analyzing single-cell data couldn't handle big data very well.
  • They created a new tool called scywalker that makes it easier to analyze long-read sequencing data from single cells or nuclei.
  • Scywalker can do many tasks in one go, like checking data quality and identifying different cell types, and it's available for anyone to use online.

The lack of population-scale databases hampers research and diagnostics for medically relevant tandem repeats and repeat expansions. We attempt to fill this gap using our pathSTR web tool, which leverages long-read sequencing of large cohorts to determine repeat length and sequence composition in a healthy population. The current version includes 1040 individuals of The 1000 Genomes Project cohort sequenced on the Oxford Nanopore Technologies PromethION.

Frontotemporal lobar degeneration with neuronal inclusions of the TAR DNA-binding protein 43 (FTLD-TDP) is a fatal neurodegenerative disorder with only a limited number of risk loci identified. We report our comprehensive genome-wide association study as part of the International FTLD-TDP Whole-Genome Sequencing Consortium, including 985 cases and 3,153 controls, and meta-analysis with the Dementia-seq cohort, compiled from 26 institutions/brain banks in the United States, Europe and Australia. We confirm as the strongest overall FTLD-TDP risk factor and identify as a novel FTLD-TDP risk factor.

Tandem repeats (TRs) are highly polymorphic in the human genome, have thousands of associated molecular traits and are linked to over 60 disease phenotypes. However, they are often excluded from at-scale studies because of challenges with variant calling and representation, as well as a lack of a genome-wide standard. Here, to promote the development of TR methods, we created a catalog of TR regions and explored TR properties across 86 haplotype-resolved long-read human assemblies.

Article Synopsis
  • The 1000 Genomes Project ONT Sequencing Consortium is working to generate LRS data from at least 800 samples to better understand human genetic variation and improve variant detection.
  • Initial data from the first 100 samples show high accuracy in identifying structural variants and methylation signatures, creating a useful public resource for finding disease-related genetic changes.

Tandem repeats (TRs) are highly polymorphic in the human genome, have thousands of associated molecular traits, and are linked to over 60 disease phenotypes. However, their complexity often excludes them from at-scale studies due to challenges with variant calling, representation, and lack of a genome-wide standard. To promote TR methods development, we create a comprehensive catalog of TR regions and explore its properties across 86 samples.

Summary: Increases in the cohort size in long-read sequencing projects necessitate more efficient software for quality assessment and processing of sequencing data from Oxford Nanopore Technologies and Pacific Biosciences. Here, we describe novel tools for summarizing experiments, filtering datasets, and visualizing phased alignment results, as well as updates to the NanoPack software suite.

Availability And Implementation: The cramino, chopper, kyber, and phasius tools are written in Rust and available as executable binaries without requiring installation or managing dependencies.

Acinetobacter baumannii is an opportunistic pathogenic bacterium prioritized by WHO and CDC because of its increasing antibiotic resistance. Heterogeneity among strains represents the hallmark of A. baumannii bacteria.

In October 2021, 59 scientists from 14 countries and 13 U.S. states collaborated virtually in the Third Annual Baylor College of Medicine & DNAnexus Structural Variation hackathon.

In this study, we characterize a new collection that comprises multidrug-resistant (MDR), extensively drug-resistant (XDR), pandrug-resistant (PDR), and carbapenem-resistant modern clinical isolates of Acinetobacter baumannii collected from hospitals through national microbiological surveillance in Belgium. Bacterial isolates (n = 43) were subjected to whole-genome sequencing (WGS), combining Illumina (MiSeq) and Nanopore (MinION) technologies, from which high-quality genomes (chromosome and plasmids) were assembled. Antimicrobial susceptibility testing was performed along with genome analyses, which identified intrinsic and acquired resistance determinants along with their genetic environments and vehicles.

Article Synopsis
  • The event aimed to assess the current status of research, highlight ongoing challenges, and explore how to leverage various strengths to enhance scientific progress.
  • Over four days, eight groups developed new open-source methods to improve species variation analysis and created a resource for the research community, with daily summaries and methods available on GitHub.

Long-read sequencing technologies have now reached a level of accuracy and yield that allows their application to variant detection at a scale of tens to thousands of samples. Concomitant with the development of new computational tools, the first population-scale studies involving long-read sequencing have emerged over the past 2 years and, given the continuous advancement of the field, many more are likely to follow. In this Review, we survey recent developments in population-scale long-read sequencing, highlight potential challenges of a scaled-up approach and provide guidance regarding experimental design.

Inactivating variants as well as a missense variant in the centrosomal gene have been identified in autosomal recessive cone-rod dystrophy with hearing loss (CRDHL), a rare syndromic inherited retinal disease distinct from Usher syndrome. Apart from this, a complex structural variant (SV) implicating has been reported in CRDHL. Here we aimed to expand the genetic architecture of typical CRDHL by the identification of complex SVs of the region and characterization of their underlying mechanisms.

Long-read sequencing has substantial advantages for structural variant discovery and phasing of variants compared to short-read technologies, but the required and optimal read length has not been assessed. In this work, we used long reads simulated from human genomes and evaluated structural variant discovery and variant phasing using current best practice bioinformatics methods. We determined that optimal discovery of structural variants from human genomes can be obtained with reads of minimally 20 kb.

Summary: Modified nucleotides play a crucial role in gene expression regulation. Here, we describe methplotlib, a tool developed for the visualization of modified nucleotides detected from Oxford Nanopore Technologies sequencing platforms, together with additional scripts for statistical analysis of allele-specific modification within-subjects and differential modification frequency across subjects.

Availability And Implementation: The methplotlib command-line tool is written in Python3, is compatible with Linux, Mac OS and the MS Windows 10 Subsystem for Linux and released under the MIT license.

Technological limitations have hindered the large-scale genetic investigation of tandem repeats in disease. We show that long-read sequencing with a single Oxford Nanopore Technologies PromethION flow cell per individual achieves 30× human genome coverage and enables accurate assessment of tandem repeats including the 10,000-bp Alzheimer's disease-associated ABCA7 VNTR. The Guppy "flip-flop" base caller and tandem-genotypes tandem repeat caller are efficient for large-scale tandem repeat assessment, but base calling and alignment challenges persist.

We sequenced the genome of the Yoruban reference individual NA19240 on the long-read sequencing platform Oxford Nanopore PromethION for evaluation and benchmarking of recently published aligners and germline structural variant calling tools, as well as a comparison with the performance of structural variant calling from short-read sequencing data. The structural variant caller Sniffles after NGMLR or minimap2 alignment provides the most accurate results, but additional confidence or sensitivity can be obtained by a combination of multiple variant callers. Sensitive and fast results can be obtained by minimap2 for alignment and a combination of Sniffles and SVIM for variant identification.

A substantial amount of structural variation in the human genome remains uninvestigated due to the limitations of existing technologies, the presence of repetitive sequences, and the complexity of a diploid genome. New technologies have been developed, increasing resolution and appreciation of structural variation and how it affects human diversity and disease. The genetic etiology of most patients with complex disorders such as neurodegenerative brain diseases is not yet elucidated, complicating disease diagnosis, genetic counseling, and understanding of underlying pathological mechanisms needed to develop therapeutic interventions.

Emerging evidence suggested a converging mechanism in neurodegenerative brain diseases (NBD) involving early neuronal network dysfunctions and alterations in the homeostasis of neuronal firing as culprits of neurodegeneration. In this study, we used paired-end short-read and direct long-read whole genome sequencing to investigate an unresolved autosomal dominant dementia family significantly linked to 7q36. We identified and validated a chromosomal inversion of ca.

We previously reported a granulin (GRN) null mutation, originating from a common founder, in multiple Belgian families with frontotemporal dementia. Here, we used data of a 10-year follow-up study to describe in detail the clinical heterogeneity observed in this extended founder pedigree. We identified 85 patients and 40 unaffected mutation carriers, belonging to 29 branches of the founder pedigree.

Summary: Here we describe NanoPack, a set of tools developed for visualization and processing of long-read sequencing data from Oxford Nanopore Technologies and Pacific Biosciences.

Availability And Implementation: The NanoPack tools are written in Python3 and released under the GNU GPL3.0 License.

Premature termination codon (PTC) mutations in the ATP-Binding Cassette, Sub-Family A, Member 7 gene (ABCA7) have recently been identified as an intermediate-to-high penetrant risk factor for late-onset Alzheimer's disease (LOAD). High variability, however, is observed in downstream ABCA7 mRNA and protein expression, disease penetrance, and onset age, indicative of unknown modifying factors. Here, we investigated the prevalence and disease penetrance of ABCA7 PTC mutations in a large early onset AD (EOAD)-control cohort, and examined the effect on transcript level with comprehensive third-generation long-read sequencing.
