DNA sequencing is the physical/biochemical process of identifying the order of the four bases (Adenine, Guanine, Cytosine, Thymine) in a DNA strand. Just as semiconductor technology revolutionized computing, modern DNA sequencing technology (termed Next Generation Sequencing, NGS) has revolutionized genomic research. As a result, modern NGS platforms can sequence hundreds of millions of short DNA fragments in parallel. The sequenced DNA fragments, which form the output of NGS platforms, are termed reads. Besides genomic variations, NGS imperfections induce noise in reads. Mapping each read to (the most similar portion of) a reference genome of the same species, i.e., read mapping, is a common and critical first step in a diverse set of emerging bioinformatics applications. Mapping is a search-heavy, memory-intensive similarity matching problem and can therefore greatly benefit from near-memory processing. Intuition suggests using the fast associative search that Ternary Content Addressable Memory (TCAM) enables by construction. However, excessive energy consumption and the lack of support for similarity matching (under noise induced by NGS and genomic variation) render direct application of TCAM infeasible, irrespective of volatility, and only non-volatile TCAM can accommodate the large memory footprint in an area-efficient way. This paper introduces GeNVoM, a scalable, energy-efficient and high-throughput solution. Instead of optimizing an algorithm developed for general-purpose computers or GPUs, GeNVoM rethinks the algorithm and the non-volatile TCAM-based accelerator design together from the ground up. GeNVoM can thereby improve throughput by up to 3.67× and energy consumption by up to 1.36× compared to an ASIC baseline, which represents one of the highest-throughput implementations known.
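
To make the notion of read mapping as similarity matching concrete, the following is a minimal software sketch, not GeNVoM's algorithm or hardware design. It assumes substitution-only noise (Hamming distance) and scans the reference by brute force; the function names, the `max_mismatches` parameter, and the toy sequences are all hypothetical. It also contrasts this with a ternary match, in which a stored pattern may contain a don't-care symbol (here `X`) but every other position must match exactly, which is why plain ternary matching alone does not tolerate noise at arbitrary positions.

```python
from typing import Optional, Tuple


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def map_read(read: str, reference: str,
             max_mismatches: int = 2) -> Optional[Tuple[int, int]]:
    """Return (position, mismatches) of the best placement of `read` in
    `reference`, or None if every placement exceeds `max_mismatches`.
    Illustrative brute force only: substitutions are tolerated, but
    insertions and deletions are not modeled."""
    best = None
    for pos in range(len(reference) - len(read) + 1):
        d = hamming(read, reference[pos:pos + len(read)])
        if d <= max_mismatches and (best is None or d < best[1]):
            best = (pos, d)  # keep the first placement with the fewest mismatches
    return best


def ternary_match(pattern: str, word: str) -> bool:
    """Ternary-style exact match: 'X' in the stored pattern is a don't-care,
    every other symbol must be identical to the query symbol."""
    return len(pattern) == len(word) and all(
        p == "X" or p == w for p, w in zip(pattern, word))


# Hypothetical toy data: the read differs from reference[7:16] in one base.
reference = "ACGTTAGCCATGGATCCGTA"
read = "CCATGCATC"
result = map_read(read, reference)  # (7, 1)
```

In this toy case the read is placed at offset 7 with one mismatch, whereas a ternary pattern would match it only if the noisy position happened to be stored as a don't-care.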

Source
http://dx.doi.org/10.1109/TCBB.2021.3118018
