Publications by authors named "Stoye J"

Article Synopsis
  • Using only one linear reference genome limits the understanding of genomic diversity; the draft human pangenome shows the need for pangenomics to address these gaps and capture more genetic variation.
  • A new tool called Panacus (pangenome-abacus) has been developed to efficiently analyze pangenomes, capable of processing large human pangenome graphs quickly, producing interactive visualizations in under an hour.
  • Panacus is open-source and built in Rust, available for installation through Bioconda, with its source code and documentation accessible on GitHub.

Plasmids are a key vector of antibiotic resistance, but the current bioinformatics toolkit is not well suited to tracking them. The rapid structural changes seen in plasmid genomes present considerable challenges to evolutionary and epidemiological analysis. Typical approaches are either low resolution (replicon typing) or use shared k-mer content to define a genetic distance.
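As a rough illustration of a distance based on shared k-mer content, the sketch below computes a Jaccard distance over k-mer sets. The abstract does not say which k-mer measure is meant, so the Jaccard form, the function names, and the default k are assumptions made here for illustration only; reverse complements are ignored.

```python
def kmers(seq: str, k: int) -> set[str]:
    """Return the set of all length-k substrings of seq (empty if len(seq) < k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard_distance(a: str, b: str, k: int = 4) -> float:
    """Jaccard distance 1 - |A ∩ B| / |A ∪ B| over the k-mer sets of a and b.

    Illustrative assumption: the source only mentions "shared k-mer content".
    """
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    if not union:
        # Neither sequence has any k-mer; treat them as identical.
        return 0.0
    return 1.0 - len(ka & kb) / len(union)
```

A set-based measure like this only records which k-mers are present, not their order, so it does not reflect the rearrangements of shared sequence that the abstract describes as a challenge.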

Motivation: Using a single linear reference genome poses a limitation to exploring the full genomic diversity of a species. The release of a draft human pangenome underscores the increasing relevance of pangenomics to overcome these limitations. Pangenomes are commonly represented as graphs, which can represent billions of base pairs of sequence.

Computational pangenomics deals with the joint analysis of all genomic sequences of a species. It has already been successfully applied to various tasks in many research areas. Advances in DNA sequencing technologies continually make more genomic sequences available for many species, making pangenomic studies increasingly attractive.

The comparison of large-scale genome structures across distinct species offers valuable insights into the species' phylogeny, genome organization, and gene associations. In this chapter, we review the family-free genome comparison tool FFGC that, relying on built-in interfaces with a sequence comparison tool (either BLAST+ or DIAMOND) and with an ILP solver (either CPLEX or Gurobi), provides several methods for analyses that do not require prior classification of genes across the studied genomes. Taking annotated genome sequences as input, FFGC is a complete workflow for genome comparison allowing not only the computation of measures of similarity and dissimilarity but also the inference of gene families, simultaneously based on sequence similarities and large-scale genomic features.

Background: Two genomes [Formula: see text] and [Formula: see text] over the same set of gene families form a canonical pair when each of them has exactly one gene from each family. Denote by [Formula: see text] the number of common families of [Formula: see text] and [Formula: see text]. Different distances of canonical genomes can be derived from a structure called breakpoint graph, which represents the relation between the two given genomes as a collection of cycles of even length and paths.

Genomic regions under positive selection harbor variation linked, for example, to adaptation. Most tools for detecting positively selected variants have computational resource requirements that render them impractical on population genomic datasets with hundreds of thousands of individuals or more. We have developed and implemented an efficient haplotype-based approach able to scan large datasets and accurately detect positive selection.

X-linked retinoschisis (XLRS) is a rare vitreoretinal dystrophy caused by molecular genetic changes in the gene. It usually manifests itself at a young age with symmetrical splitting within different layers of the retina and leads to a significant reduction in visual acuity. Correct diagnosis at older ages is difficult due to nonspecific changes in OCT scans.

SARS-CoV-2 is a betacoronavirus and the etiological agent of COVID-19, a devastating infectious disease. Due to its far-reaching effect on human health, there is an urgent and growing need to understand the viral molecular biology of SARS-CoV-2 and its interaction with the host cell. SARS-CoV-2 encodes 9 predicted accessory proteins, which are presumed to be dispensable for in vitro replication, most likely having a role in modulating the host cell environment to aid viral replication.

One of the most basic kinds of analysis to be performed on a pangenome is the detection of its core, i.e., the information shared among all members.
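To make the notion of a core concrete, here is a minimal sketch that takes "information shared among all members" to mean the k-mers present in every genome. That reading, the function name `core_kmers`, and the choice of plain strings as input are assumptions for illustration; the truncated abstract does not state how the core is defined or computed.

```python
def core_kmers(genomes: list[str], k: int) -> set[str]:
    """Return the k-mers that occur in every genome of the collection.

    Hypothetical helper: the core is modelled as the intersection of
    per-genome k-mer sets, which is an assumption, not the paper's method.
    """
    if not genomes:
        return set()
    kmer_sets = [
        {g[i:i + k] for i in range(len(g) - k + 1)}
        for g in genomes
    ]
    # A k-mer belongs to the core only if no genome lacks it.
    return set.intersection(*kmer_sets)
```

For example, `core_kmers(["ACGTA", "CGTAC", "TACGT"], 3)` yields `{"CGT"}`, the only 3-mer common to all three strings.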

Eukaryotes are continually subjected to viral infections and, in response, have evolved a wide range of defence mechanisms. Two recent studies show how a duplicated copy of a cellular protein needed for cell growth and virus egress evolved to inhibit viruses while preserving cell viability.

Viruses in the family are found in a wide variety of vertebrate hosts. Enveloped virions are 80-100 nm in diameter with an inner core containing the viral genome and replicative enzymes. Core morphology is often characteristic for viruses within the same genus.

Excessive replication of Saccharomyces cerevisiae Ty1 retrotransposons is regulated by Copy Number Control, a process requiring the p22/p18 protein produced from a sub-genomic transcript initiated within Ty1 GAG. In retrotransposition, Gag performs the capsid functions required for replication and re-integration. To minimize genomic damage, p22/p18 interrupts virus-like particle function by interaction with Gag.

Sterile α-motif/histidine-aspartate domain-containing protein 1 (SAMHD1) inhibits replication of HIV-1 in quiescent myeloid cells. U937 cells are widely used as a convenient cell system for analyzing SAMHD1 activity due to a low level of SAMHD1 RNA expression, leading to undetectable endogenous protein expression. Based on similar assays developed in the Stoye laboratory to characterize other retroviral restriction factors, the Bishop lab developed a two-color restriction assay to analyze SAMHD1 in U937 cells.

We use unique data from seven intermediate economics courses taught at four R1 institutions to examine the effects of the COVID-19 pandemic on student learning. Because the same assessments of course knowledge mastery were administered across semesters, we can cleanly infer the impact of the unanticipated switch to remote teaching in Spring 2020. During the pandemic, total assessment scores declined by 0.

The genomes of inbred mice harbor around 50 endogenous murine leukemia virus (MLV) loci, although the specific complement varies greatly between strains. The Gv1 locus is known to control the transcription of endogenous MLVs and to be the dominant determinant of cell-surface presentation of MLV envelope, the GIX antigen. Here, we identify a single Krüppel-associated box zinc finger protein (ZFP) gene, Zfp998, as Gv1 and show it to be necessary and sufficient to determine the GIX+ phenotype.

Motivation: The increasing number of individual genomes sequenced per species motivates the use of pangenomic approaches. Pangenomes may be represented as graphical structures, e.g.

Tumors contain multiple subpopulations of genetically distinct cancer cells. Reconstructing their evolutionary history can improve our understanding of how cancers develop and respond to treatment. Subclonal reconstruction methods cluster mutations into groups that co-occur within the same subpopulations, estimate the frequency of cells belonging to each subpopulation, and infer the ancestral relationships among the subpopulations by constructing a clone tree.

The computation of genomic distances has been a very active field of computational comparative genomics over the past 25 years. Substantial results include the polynomial-time computability of the inversion distance by Hannenhalli and Pevzner in 1995 and the introduction of the double cut and join distance by Yancopoulos et al. in 2005.

Third-generation sequencing technologies from companies such as Oxford Nanopore and Pacific Biosciences have paved the way for building more contiguous and potentially gap-free assemblies. The larger effective length of their reads has provided a means to overcome the challenges of short to mid-range repeats. Currently, accurate long read assemblers are computationally expensive, whereas faster methods are not as accurate.

Viruses and their hosts are locked in an evolutionary race where resistance to infection is acquired by the hosts while viruses develop strategies to circumvent these host defenses. Forming one arm of the host defense armory are cell autonomous restriction factors like Fv1. Originally described as protecting laboratory mice from infection by murine leukemia virus (MLV), Fv1s from some wild mice have also been found to restrict non-MLV retroviruses, suggesting an important role in the protection against viruses in nature.

As obligate parasites, viruses hijack, modify and repurpose the cellular machinery for their own replication. Viral proteins have, therefore, evolved biological functions, such as signalling potential, that alter host cell physiology in ways that are still incompletely understood. Retroviral envelope glycoproteins interact with several host proteins, extracellularly with their cellular receptor and anti-envelope antibodies, and intracellularly with proteins of the cytoskeleton or sorting, endocytosis and recirculation pathways.

Computing the Inversion-Indel Distance.

IEEE/ACM Trans Comput Biol Bioinform

January 2022

The inversion distance, that is, the distance between two unichromosomal genomes with the same content allowing only inversions of DNA segments, can be computed exactly thanks to a pioneering approach of Hannenhalli and Pevzner from 1995. In 2000, El-Mabrouk extended the inversion model to compare unichromosomal genomes with unequal contents, combining inversions with insertions and deletions (indels) of DNA segments, giving rise to the inversion-indel distance. However, only a heuristic was provided for its computation.
