Publications by authors named "Sri Parameswaran"

Motivation: Nanopore sequencing current signal data can be 'basecalled' into sequence information or analysed directly, with the capacity to identify diverse molecular features, such as DNA/RNA base modifications and secondary structures. However, raw signal data is large and complex, and there is a need for improved visualization strategies to facilitate signal analysis, exploration and tool development.

Results: Squigualiser (Squiggle visualiser) is a toolkit for intuitive, interactive visualization of sequence-aligned signal data, which currently supports both DNA and RNA sequencing data from Oxford Nanopore Technologies instruments.

minimap2 is the gold-standard software for reference-based sequence mapping in third-generation long-read sequencing. While minimap2 is relatively fast, further speedup is desirable, especially when processing a multitude of large datasets. In this work, we present minimap2-fpga, a hardware-accelerated version of minimap2 that speeds up the mapping process by integrating an FPGA kernel optimised for chaining.

Background: Third-generation nanopore sequencers offer selective sequencing, or "Read Until", which allows genomic reads to be analyzed in real time and abandoned partway through if they do not belong to a genomic region of interest. This selective sequencing opens the door to important applications such as rapid and low-cost genetic tests. For selective sequencing to be effective, analysis latency should be as low as possible so that unnecessary reads can be rejected as early as possible.

Nanopore sequencing is being rapidly adopted in genomics. We recently developed SLOW5, a new file format with advantages for storage and analysis of raw signal data from nanopore experiments. Here we introduce slow5tools, an intuitive toolkit for handling nanopore data in SLOW5 format.

Nanopore sequencing depends on the FAST5 file format, which does not allow efficient parallel analysis. Here we introduce SLOW5, an alternative format engineered for efficient parallelization and acceleration of nanopore data analysis. Using the example of DNA methylation profiling of a human genome, analysis runtime is reduced from more than two weeks to approximately 10.

Background: Nanopore sequencing enables portable, real-time sequencing applications, including point-of-care diagnostics and in-the-field genotyping. Achieving these outcomes requires efficient bioinformatic algorithms for the analysis of raw nanopore signal data. However, comparing raw nanopore signals to a biological reference sequence is a computationally complex task.

Background: Pairwise alignment of short DNA sequences with affine-gap scoring is a common processing step performed in a range of bioinformatics analyses. Dynamic programming (i.e.
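As a rough illustration of what affine-gap scoring means in a dynamic programming setting, the following minimal sketch computes a global alignment score with three score tables (a Gotoh-style recurrence). It is not the method of the listed article; the function name `affine_gap_score` and the scoring parameters are assumptions chosen for the example, with a gap of length k costing `gap_open + k * gap_extend`.

```python
NEG_INF = float("-inf")


def affine_gap_score(a, b, match=1, mismatch=-1, gap_open=-2, gap_extend=-1):
    """Return the best global alignment score of a and b under affine gaps.

    Hypothetical helper for illustration only. A gap of length k costs
    gap_open + k * gap_extend (both values are negative here).
    """
    n, m = len(a), len(b)
    # M: alignment ends with a[i-1] aligned to b[j-1]
    # X: alignment ends with a[i-1] aligned to a gap
    # Y: alignment ends with b[j-1] aligned to a gap
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    X = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Y = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        X[i][0] = gap_open + i * gap_extend
    for j in range(1, m + 1):
        Y[0][j] = gap_open + j * gap_extend

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], X[i - 1][j - 1], Y[i - 1][j - 1]) + s
            # Opening a new gap pays gap_open + gap_extend; extending pays gap_extend.
            X[i][j] = max(
                M[i - 1][j] + gap_open + gap_extend,
                Y[i - 1][j] + gap_open + gap_extend,
                X[i - 1][j] + gap_extend,
            )
            Y[i][j] = max(
                M[i][j - 1] + gap_open + gap_extend,
                X[i][j - 1] + gap_open + gap_extend,
                Y[i][j - 1] + gap_extend,
            )
    return max(M[n][m], X[n][m], Y[n][m])
```

The sketch uses full O(nm) tables for readability; practical aligners typically keep only a few rows and vectorise the inner loop.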

The advent of nanopore sequencing has made portable genomic research and applications possible. However, state-of-the-art long-read aligners and large reference genomes are not compatible with most mobile computing devices due to their high memory requirements. We show how memory requirements can be reduced through parameter optimisation and reference genome partitioning, but highlight the associated limitations and caveats of these approaches.
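To make the idea of reference genome partitioning concrete, here is a minimal sketch that groups reference sequences into parts whose combined length stays under a size cap, so that each part could be indexed and loaded separately. This is an assumption-laden illustration, not the procedure from the listed article: the function name `partition_reference`, the greedy strategy, and the example lengths are all hypothetical.

```python
def partition_reference(seq_lengths, max_bases):
    """Greedily group sequences into parts of at most max_bases total length.

    Hypothetical helper for illustration only. seq_lengths maps a sequence
    name to its length in bases. A single sequence longer than max_bases
    still gets a part of its own.
    """
    parts = []
    current, current_size = [], 0
    # Visit the longest sequences first so large chromosomes anchor the parts.
    for name, length in sorted(seq_lengths.items(), key=lambda kv: -kv[1]):
        if current and current_size + length > max_bases:
            parts.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += length
    if current:
        parts.append(current)
    return parts


# Toy lengths, not real chromosome sizes.
example = {"chr1": 100, "chr2": 80, "chr3": 50, "chr4": 30}
print(partition_reference(example, max_bases=130))
```

Smaller parts lower peak memory per index, but reads then have to be mapped against every part and the per-part results reconciled, which is one plausible source of the limitations and caveats the abstract mentions.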

A variant caller is used to identify variations in an individual genome (compared to the reference genome) in a genome processing pipeline. For the sake of accuracy, modern variant callers perform many local re-assemblies on small regions of the genome using a graph-based algorithm. However, such graph-based data structures are inefficiently stored in the linear memory of modern computers, which in turn reduces computing efficiency.

De novo genome assembly is a challenging computational problem for which several pipelines have been developed. The advent of long-read sequencing technology has resulted in a new set of algorithmic approaches for the assembly process. In this work, we identify that one of these new and fast long-read assembly techniques (using Minimap2 and Miniasm) can be modified for the short-read assembly process.

Motivation: The Variant Call Format (VCF) is widely used to store data about genetic variation. Variant calling workflows detect potential variants in large numbers of short sequence reads generated by DNA sequencing and report them in VCF format. To evaluate the accuracy of variant callers, it is critical to correctly compare their output against a reference VCF file containing a gold standard set of variants.