An annotation is a set of genomic intervals sharing a particular function or property. Examples include genes or their exons, sequence repeats, regions with a particular epigenetic state, and copy number variants. A common task is to compare two annotations to determine if one is enriched or depleted in the regions covered by the other.
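To make the comparison concrete, the following is a minimal sketch, not the method of the paper. It assumes each annotation is a list of half-open `(start, end)` intervals on a single sequence, and it uses a naive independence baseline (hypothetical helper `enrichment`) rather than a proper significance test.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one sequence (assumption)


def merge(intervals: List[Interval]) -> List[Interval]:
    """Sort intervals and fuse any that overlap or touch."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def overlap_bases(a: List[Interval], b: List[Interval]) -> int:
    """Number of positions covered by both annotations (two-pointer sweep)."""
    a, b = merge(a), merge(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def enrichment(a: List[Interval], b: List[Interval], genome_length: int) -> float:
    """Observed overlap divided by the overlap expected if a and b were
    placed independently; > 1 suggests enrichment, < 1 depletion.
    Assumes both annotations cover at least one base."""
    cov_a = sum(end - start for start, end in merge(a))
    cov_b = sum(end - start for start, end in merge(b))
    expected = cov_a * cov_b / genome_length
    return overlap_bases(a, b) / expected


genes = [(0, 10), (20, 30)]
repeats = [(5, 25)]
print(overlap_bases(genes, repeats))    # shared bases
print(enrichment(genes, repeats, 100))  # ratio against the naive baseline
```

A ratio alone does not say whether the deviation is statistically significant; that requires a null model for how intervals are placed.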
Microbiol Resour Announc
September 2024
We report the genome sequence of the pathogenic yeast strain SR23 (CBS 7157) used in a number of experimental studies. The nuclear genome assembly consists of eight chromosome-sized contigs with a total size of 13.04 Mbp (N50 2.
We generalize a problem of finding maximum-scoring segment sets, previously studied by Csűrös (IEEE/ACM Transactions on Computational Biology and Bioinformatics, 2004, 1, 139-150), from sequences to graphs. Namely, given a vertex-weighted graph and a non-negative startup penalty, we can find a set of vertex-disjoint paths in the graph with maximum total score when each path's score is its vertices' total weight minus the penalty. We call this new problem maximum-scoring path sets (MSPS).
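As an illustration of the scoring rule only, here is a minimal sketch of the sequence special case (a graph that is a single path), where the chosen paths are disjoint segments. It is a straightforward dynamic program written for this example, not the graph algorithm of the paper; the function name and the symbol `penalty` are assumptions.

```python
from typing import Sequence


def max_scoring_segments(weights: Sequence[float], penalty: float) -> float:
    """Best total score of disjoint segments of `weights`, where each
    segment scores the sum of its weights minus `penalty`.
    Choosing no segment at all scores 0."""
    best_out = 0.0            # best score so far with the last position unused
    best_in = float("-inf")   # best score so far with a segment ending at the last position
    for w in weights:
        # Either extend the open segment or start a new one (paying the penalty).
        new_in = max(best_in, best_out - penalty) + w
        # Leave this position out: close any open segment or stay outside.
        new_out = max(best_out, best_in)
        best_in, best_out = new_in, new_out
    return max(best_in, best_out)


# Segment [3, -1, 4] scores 6 - 2 = 4 and segment [5] scores 5 - 2 = 3.
print(max_scoring_segments([3, -1, 4, -10, 5], penalty=2))
```

The startup penalty is what makes it worthwhile to bridge the small negative weight between 3 and 4 instead of paying for two separate segments.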
Lodderomyces beijingensis is an ascosporic ascomycetous yeast. In contrast to related species Lodderomyces elongisporus, which is a recently emerging human pathogen, L. beijingensis is associated with insects.
An annotation is a set of genomic intervals sharing a particular function or property. Examples include genes or their exons, evolutionarily conserved elements, and regions with a particular epigenetic state. A common task is to compare two annotations to determine if one is enriched or depleted in the regions covered by the other.
Identification of plasmids from sequencing data is an important and challenging problem related to antimicrobial resistance spread and other One-Health issues. We provide a new architecture for identifying plasmid contigs in fragmented genome assemblies built from short-read data. We employ graph neural networks (GNNs) and the assembly graph to propagate the information from nearby nodes, which leads to more accurate classification, especially for short contigs that are difficult to classify based on sequence features or database searches alone.
Motivation: The analysis of bacterial isolates to detect plasmids is important due to their role in the propagation of antimicrobial resistance. In short-read sequence assemblies, both plasmids and bacterial chromosomes are typically split into several contigs of various lengths, making identification of plasmids a challenging problem. In plasmid contig binning, the goal is to distinguish short-read assembly contigs based on their origin into plasmid and chromosomal contigs and subsequently sort plasmid contigs into bins, each bin corresponding to a single plasmid.
Motivation: Short tandem repeats (STRs) are regions of a genome containing many consecutive copies of the same short motif, possibly with small variations. Analysis of STRs has many clinical uses but is limited by technology mainly due to STRs surpassing the used read length. Nanopore sequencing, as one of long-read sequencing technologies, produces very long reads, thus offering more possibilities to study and analyze STRs.
Candida verbasci is an anamorphic ascomycetous yeast. We report the genome sequence of its type strain, 11-1055 (CBS 12699). The nuclear genome assembly consists of seven chromosome-sized contigs with a total size of 12.
Background: The genomes of SARS-CoV-2 are classified into variants, some of which are monitored as variants of concern (e.g. the Delta variant B.
Motivation: Genome annotations are a common way to represent genomic features such as genes, regulatory elements or epigenetic modifications. The amount of overlap between two annotations is often used to ascertain if there is an underlying biological connection between them. In order to distinguish between true biological association and overlap by pure chance, a robust measure of significance is required.
Many fungal species utilize hydroxyderivatives of benzene and benzoic acid as carbon sources. The yeast Candida parapsilosis metabolizes these compounds via the 3-oxoadipate and gentisate pathways, whose components are encoded by two metabolic gene clusters. In this study, we determine the chromosome level assembly of the C.
IEEE/ACM Trans Comput Biol Bioinform
April 2023
In nanopore sequencing, electrical signal is measured as DNA molecules pass through the sequencing pores. Translating these signals into DNA bases (base calling) is a highly non-trivial task, and its quality has a large impact on the sequencing accuracy. The most successful nanopore base callers to date use convolutional neural networks (CNN) to accomplish the task.
Surveillance of the SARS-CoV-2 variants including the quickly spreading mutants by rapid and near real-time sequencing of the viral genome provides an important tool for effective health policy decision making in the ongoing COVID-19 pandemic. Here we evaluated PCR-tiling of short (~400-bp) and long (~2 and ~2.5-kb) amplicons combined with nanopore sequencing on a MinION device for analysis of the SARS-CoV-2 genome sequences.
SARS-CoV-2 mutants carrying the ∆H69/∆V70 deletion in the amino-terminal domain of the Spike protein emerged independently in at least six lineages of the virus (namely, B.1.1.
Motivation: MinION is a portable nanopore sequencing device that can be easily operated in the field with features including monitoring of run progress and selective sequencing. To fully exploit these features, real-time base calling is required. To date, this has only been achieved at the cost of high computing requirements that pose limitations in terms of hardware availability in common laptops and energy consumption.
Motivation: Oxford Nanopore MinION is a portable DNA sequencer that is marketed as a device that can be deployed anywhere. Current base callers, however, require a powerful GPU to analyze data produced by MinION in real time, which hampers field applications.
Results: We have developed a fast base caller DeepNano-blitz that can analyze stream from up to two MinION runs in real time using a common laptop CPU (i7-7700HQ), with no GPU requirements.
Chromosome-scale genome assembly of the yeast CBS 517.90 was determined by a combination of technologies producing short (HiSeq X; Illumina) and long (MinION; Oxford Nanopore Technologies) reads. The 21.
Computing similarity between 2 nucleotide sequences is one of the fundamental problems in bioinformatics. Current methods are based mainly on 2 major approaches: (1) sequence alignment, which is computationally expensive, and (2) faster, but less accurate, alignment-free methods based on various statistical summaries, for example, short word counts. We propose a new distance measure based on mathematical transforms from the domain of signal processing.
Microbiol Resour Announc
April 2019
is an arthroconidial yeast classified in the / clade of the subphylum Saccharomycotina. Here, we report the genome sequence of holotype strain CBS 625.85, assembled to five putative chromosomes.
Microbiol Resour Announc
February 2019
Saprochaete suaveolens is an ascomycetous yeast that produces a range of fruity flavors and fragrances. Here, we report the high-contiguity genome sequence of the ex-holotype strain, NRRL Y-17571 (CBS 152.25).