cljam: a library for handling DNA sequence alignment/map (SAM) with parallel processing.

Source Code Biol Med

Xcoo, Inc., 4-2-5, Hongo, Bunkyo-ku, Tokyo, Japan.

Published: August 2016

Background: Next-generation sequencing can determine DNA bases, and the results of sequence alignments are generally stored in files in the Sequence Alignment/Map (SAM) format or its compressed binary version (BAM). SAMtools is a typical tool for dealing with files in the SAM/BAM format. It has various functions, including detection of variants, visualization of alignments, indexing, extraction of parts of the data and loci, and conversion of file formats. It is written in C and runs fast. However, SAMtools requires an additional implementation to be used in parallel with, for example, OpenMP (Open Multi-Processing) libraries. As next-generation sequencing data accumulate, a simple parallelization program that can support cloud and PC cluster environments is required.

Results: We have developed cljam, a library for handling SAM/BAM data, using the Clojure programming language, which simplifies parallel programming. Cljam can run in any Java runtime environment (e.g., on Windows, Linux, or Mac OS X) with Clojure.

Conclusions: Cljam can process and analyze SAM/BAM files in parallel and at high speed. Its execution time is almost the same as that of SAMtools. The cljam code is written in Clojure and has fewer lines than other similar tools.
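
The kind of work described in the abstract, reading SAM alignment records and processing them in parallel, can be illustrated with a minimal sketch. This is not cljam's API and not the authors' implementation: it is written in Python rather than Clojure, and the sample data and the names `parse_alignment`, `count_chunk`, and `count_mapped_reads` are hypothetical. It relies only on the SAM text layout (header lines start with `@`; each alignment line has tab-separated fields beginning QNAME, FLAG, RNAME, POS, MAPQ, CIGAR) and counts mapped reads per reference sequence by splitting the records into chunks handled by a worker pool.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory SAM text: three header lines, four alignment records.
SAM_TEXT = "\n".join([
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:1000",
    "@SQ\tSN:chr2\tLN:1000",
    "r1\t0\tchr1\t10\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t16\tchr1\t20\t60\t4M\t*\t0\t0\tTTGA\tIIII",
    "r3\t0\tchr2\t5\t30\t4M\t*\t0\t0\tGGCC\tIIII",
    "r4\t4\t*\t0\t0\t*\t*\t0\t0\tAAAA\tIIII",  # FLAG 0x4: unmapped read
])

UNMAPPED_FLAG = 0x4  # SAM FLAG bit meaning "segment unmapped"


def parse_alignment(line):
    """Parse a few of the mandatory SAM fields from one alignment line."""
    fields = line.split("\t")
    return {
        "qname": fields[0],
        "flag": int(fields[1]),
        "rname": fields[2],
        "pos": int(fields[3]),
        "mapq": int(fields[4]),
        "cigar": fields[5],
    }


def count_chunk(lines):
    """Count mapped reads per reference name within one chunk of lines."""
    counts = Counter()
    for line in lines:
        record = parse_alignment(line)
        if record["flag"] & UNMAPPED_FLAG:
            continue  # skip unmapped reads
        counts[record["rname"]] += 1
    return counts


def count_mapped_reads(sam_text, workers=2, chunk_size=2):
    """Split alignment lines into chunks, count each chunk in a worker, merge."""
    records = [ln for ln in sam_text.splitlines() if ln and not ln.startswith("@")]
    chunks = [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(count_chunk, chunks):
            total.update(partial)
    return dict(total)


if __name__ == "__main__":
    print(count_mapped_reads(SAM_TEXT))
```

The sketch uses a thread pool only to keep the example portable and self-contained; for CPU-bound parsing in Python, a process pool would be the more likely choice, and BAM input would additionally require decompression, which is not shown here.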

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4987990
DOI: http://dx.doi.org/10.1186/s13029-016-0058-6
