Publications by authors named "William Harvey"

Motivation: Centromeres are chromosomal regions historically understudied with sequencing technologies due to their repetitive nature and short-read mapping limitations. However, recent improvements in long-read sequencing allow for the investigation of complex regions of the genome at the sequence and epigenetic levels.

Results: Here, we present Centromere Dip Region (CDR)-Finder: a tool to identify regions of hypomethylation within the centromeres of high-quality, contiguous genome assemblies.

Centromeres are chromosomal regions historically understudied with sequencing technologies due to their repetitive nature and short-read mapping limitations. However, recent improvements in long-read sequencing allowed for the investigation of complex regions of the genome at the sequence and epigenetic levels. Here, we present Centromere Dip Region (CDR)-Finder: a tool to identify regions of hypomethylation within the centromeres of high-quality, contiguous genome assemblies.

Clinical prediction models (CPMs) are tools that compute the risk of an outcome given a set of patient characteristics and are routinely used to inform patients, guide treatment decisions, and allocate resources. Although much hope has been placed on CPMs to mitigate human biases, CPMs may contribute to racial disparities in decision-making and resource allocation. While some policymakers, professional organizations, and scholars have called for eliminating race as a variable from CPMs, others raise concerns that excluding race may exacerbate healthcare disparities, and this controversy remains unresolved.

Previous studies suggested that the copy number of the human salivary amylase gene, AMY1, correlates with starch-rich diets. However, evolutionary analyses are hampered by the absence of accurate, sequence-resolved haplotype variation maps. We identified 30 structurally distinct haplotypes at nucleotide resolution among 98 present-day humans, revealing that the coding sequences of AMY1 copies are evolving under negative selection.

Haplotype information is crucial for biomedical and population genetics research. However, current strategies to produce de novo haplotype-resolved assemblies often require either difficult-to-acquire parental data or an intermediate haplotype-collapsed assembly. Here, we present Graphasing, a workflow which synthesizes the global phase signal of Strand-seq with assembly graph topology to produce chromosome-scale de novo haplotypes for diploid genomes.

Article Synopsis
  • It achieves a high level of completeness, closing 92% of previous assembly gaps and fully assembling complex regions, including 1,852 complex structural variants and 1,246 human centromeres.
  • The findings lead to significant improvements in genotyping accuracy and enable the detection of over 26,000 structural variants per sample, enhancing the potential for future disease association research.

Motivation: We are now in the era of being able to routinely generate highly contiguous (near telomere-to-telomere) genome assemblies of human and nonhuman species. Complex structural variation and regions of rapid evolutionary turnover are being discovered for the first time. Thus, efficient and informative visualization tools are needed to evaluate and directly observe structural differences between two or more genomes.

The 10q11.22 chromosomal region is a duplication-rich interval of the human genome and one of the last to be fully assembled. It carries copy number-variable genes associated with intellectual disability, bipolar disorder, and obesity.

Using five complementary short- and long-read sequencing technologies, we phased and assembled >95% of each diploid human genome in a four-generation, 28-member family (CEPH 1463), allowing us to systematically assess de novo mutations (DNMs) and recombination. From this family, we estimate an average of 192 DNMs per generation, including 75.5 single-nucleotide variants (SNVs), 7.

Article Synopsis
  • The study presents detailed genomes of six ape species, achieving high accuracy and complete sequencing of all their chromosomes.
  • It addresses complex genomic regions, leading to enhanced understanding of evolutionary relationships among these species.
  • The findings will serve as a crucial resource for future research on human evolution and our closest ape relatives.
Article Synopsis
  • MUC5AC and MUC5B are proteins that help protect our bodies by catching germs and helping us clear mucus.
  • Researchers studied the differences in these proteins by looking at DNA from humans and primates and found that MUC5B is mostly the same in humans, while MUC5AC has many variations.
  • The study also showed that people from East Asia have unique versions of the MUC5AC protein that might have helped them in survival, while another version is more common in Europeans.

Segmental duplications (SDs) contribute significantly to human disease, evolution, and diversity yet have been difficult to resolve at the sequence level. We present a population genetics survey of SDs by analyzing 170 human genome assemblies where the majority of SDs are fully resolved using long-read sequence assembly. Excluding the acrocentric short arms, we identify 173.

Article Synopsis
  • Apes have two sex chromosomes: the essential Y chromosome for male reproduction and the X chromosome necessary for both reproduction and cognition, with differences in mating patterns affecting their function.
  • Studying these chromosomes is challenging due to their repetitive structures, but researchers created gapless assemblies for five great apes and one lesser ape to explore their evolutionary complexities.
  • The Y chromosomes are highly variable and undergo significant changes compared to the more stable X chromosomes, and this research can provide insights into human evolution and aid in the conservation of endangered ape species.

Conventional life-history theory predicts that energy-demanding events such as reproduction and migration must be temporally segregated to avoid resource limitation. Here, we provide, to our knowledge, the first direct evidence of 'itinerant breeding' in a migratory bird, an incredibly rare breeding strategy (less than 0.1% of extant bird species) that involves the temporal overlap of migratory and reproductive periods of the annual cycle.

The secreted mucins MUC5AC and MUC5B play critical defensive roles in airway pathogen entrapment and mucociliary clearance by encoding large glycoproteins with variable number tandem repeats (VNTRs). These polymorphic and degenerate protein-coding VNTRs make the loci difficult to investigate with short reads. We characterize the structural diversity of MUC5AC and MUC5B by long-read sequencing and assembly of 206 human and 20 nonhuman primate (NHP) haplotypes.

Haplotype information is crucial for biomedical and population genetics research. However, current strategies to produce haplotype-resolved assemblies often require either difficult-to-acquire parental data or an intermediate haplotype-collapsed assembly. Here, we present Graphasing, a workflow which synthesizes the global phase signal of Strand-seq with assembly graph topology to produce chromosome-scale haplotypes for diploid genomes.

Down syndrome is the most common form of human intellectual disability caused by precocious segregation and nondisjunction of chromosome 21. Differences in centromere structure have been hypothesized to play a potential role in this process in addition to the well-established risk of advancing maternal age. Using long-read sequencing, we completely sequenced and assembled the centromeres from a parent-child trio where Trisomy 21 arose in the child as a result of a meiosis I error.

Article Synopsis
  • We discovered over 1.3 million lineage-specific structural variants (SVs) that impact thousands of protein-coding genes and regulatory elements, revealing significant genomic differences among primates, especially compared to humans.
  • Our research identified 1,607 regions with structural variations that are hotspots for gene loss and creation, indicating areas in the genome subject to rapid evolution and natural selection across primate species.

Advances in long-read sequencing (LRS) technologies continue to make whole-genome sequencing more complete, affordable, and accurate. LRS provides significant advantages over short-read sequencing approaches, including phased de novo genome assembly, access to previously excluded genomic regions, and discovery of more complex structural variants (SVs) associated with disease. Limitations remain with respect to cost, scalability, and platform-dependent read accuracy, and the tradeoffs between sequence coverage and sensitivity of variant discovery are important experimental considerations for the application of LRS.

Article Synopsis
  • Apes have two main sex chromosomes, X and Y, where Y is crucial for male reproduction and its deletions can lead to infertility, while X is important for both reproduction and brain function.
  • Recent advancements in genomic techniques helped researchers create complete structures of the X and Y chromosomes for multiple great ape species, allowing them to explore their evolutionary complexities.
  • Findings indicate that Y chromosomes are highly variable and undergo rapid changes due to unique genetic regions and transposable elements, while X chromosomes are more stable, highlighting differing evolutionary paths among great ape species.

Background: Animals select and interact with their environment in various ways, including to ensure their physiology is at its optimal capacity, access to prey is possible, and predators can be avoided. Often conflicting, the balance of choices made may vary depending on an individual's life-history and condition. The common lizard (Zootoca vivipara) has egg-laying and live-bearing lineages and displays a variety of dorsal patterns and colouration.

The human Y chromosome has been notoriously difficult to sequence and assemble because of its complex repeat structure that includes long palindromes, tandem repeats and segmental duplications. As a result, more than half of the Y chromosome is missing from the GRCh38 reference sequence and it remains the last human chromosome to be finished. Here, the Telomere-to-Telomere (T2T) consortium presents the complete 62,460,029-base-pair sequence of a human Y chromosome from the HG002 genome (T2T-Y) that corrects multiple errors in GRCh38-Y and adds over 30 million base pairs of sequence to the reference, showing the complete ampliconic structures of gene families TSPY, DAZ and RBMY; 41 additional protein-coding genes, mostly from the TSPY family; and an alternating pattern of human satellite 1 and 3 blocks in the heterochromatic Yq12 region.

The prevalence of highly repetitive sequences within the human Y chromosome has prevented its complete assembly to date and led to its systematic omission from genomic analyses. Here we present de novo assemblies of 43 Y chromosomes spanning 182,900 years of human evolution and report considerable diversity in size and structure. Half of the male-specific euchromatic region is subject to large inversions with a greater than twofold higher recurrence rate compared with all other chromosomes.

Because of the COVID-19 pandemic, the virus responsible, SARS-CoV-2, became a source of intense interest for non-expert audiences. The viral spike protein gained particular public interest as the main target for protective immune responses, including those elicited by vaccines. The rapid evolution of SARS-CoV-2 resulted in variations in the spike that enhanced transmissibility or weakened vaccine protection.

Objective: Missed or cancelled imaging tests may be invisible to the ordering clinician and result in diagnostic delay. We developed an outpatient results notification tool (ORNT) to alert physicians of patients' missed radiology studies.

Design: Randomised controlled evaluation of a quality improvement intervention.
