Mingle: A Command Line Utility for Merging Multi-fasta Files.

J Comput Biol

Department of Experimental Surgery-Cancer Metastasis, Medical Faculty Mannheim, Centre for Biomedicine and Medical Technology Mannheim (CBTM), Ruprecht-Karls University of Heidelberg, Mannheim, Germany.

Published: April 2019

Massively parallel sequencing (MPS) has become a standard technique in molecular biology, and its application has spread from the analysis of the human genome to that of virtually all other organisms. MPS analysis requires a reference genome and, in some cases, multiple genomes need to be handled as a single unit to carry out genetic analysis. Nucleic acid sequences are typically stored in "fasta" files, which can contain multiple genomes ("multi-fasta"). Although it is possible to convert a multi-fasta file into a single sequence using specific computer commands, the resulting file does not keep track of the boundaries of the original sequences, making it difficult to determine which genome a read obtained from MPS belongs to. In this study we introduce mingle, a shell script that can be used to create a custom reference genome by merging multi-fasta files while providing a list of the boundaries of the individual genomes that can be used for downstream analysis.
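The idea of merging a multi-fasta file while keeping a list of boundaries can be illustrated with a minimal Python sketch. This is not the mingle script itself (which the abstract describes as a shell script) and does not reproduce its options or output format; the function names `merge_multifasta`, `to_fasta`, and `locate`, the 1-based inclusive coordinates, and the 60-column line width are all assumptions made for illustration.

```python
from typing import Iterable, List, Optional, Tuple

# (record name, 1-based start, 1-based inclusive end) in the merged sequence.
# This coordinate convention is an assumption of the sketch.
Boundary = Tuple[str, int, int]


def merge_multifasta(lines: Iterable[str]) -> Tuple[str, List[Boundary]]:
    """Concatenate all records of a multi-fasta and record where each one lands."""
    names: List[str] = []
    chunks: List[List[str]] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the record name.
            header = line[1:].split()
            names.append(header[0] if header else f"record{len(names) + 1}")
            chunks.append([])
        elif chunks:
            chunks[-1].append(line)
        # Sequence lines that appear before any header are ignored.

    merged_parts: List[str] = []
    boundaries: List[Boundary] = []
    offset = 0
    for name, parts in zip(names, chunks):
        seq = "".join(parts)
        if not seq:
            continue  # records without sequence get no boundary entry
        boundaries.append((name, offset + 1, offset + len(seq)))
        merged_parts.append(seq)
        offset += len(seq)
    return "".join(merged_parts), boundaries


def to_fasta(name: str, sequence: str, width: int = 60) -> str:
    """Format one sequence as a single-record fasta string."""
    body = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join([f">{name}"] + body) + "\n"


def locate(position: int, boundaries: List[Boundary]) -> Optional[str]:
    """Return the original record that contains a 1-based merged position."""
    for name, start, end in boundaries:
        if start <= position <= end:
            return name
    return None


if __name__ == "__main__":
    example = [">genomeA first genome", "ACGT", "AC", ">genomeB", "GGG"]
    merged, bounds = merge_multifasta(example)
    print(to_fasta("merged", merged), end="")
    for name, start, end in bounds:
        print(f"{name}\t{start}\t{end}")
```

With a boundary list of this kind, a read mapped to a position in the merged reference can be assigned back to its genome of origin with a lookup such as `locate(position, bounds)`.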


Source
http://dx.doi.org/10.1089/cmb.2018.0243


Similar Publications

VirusViz: comparative analysis and effective visualization of viral nucleotide and amino acid variants.

Nucleic Acids Res

September 2021

Dipartimento di Elettronica, Informazione e Bioingegneria, Politecnico di Milano, Via Ponzio 34/5, 20133 Milano, Italy.

Variant visualization plays an important role in supporting the analysis of viral evolution, which has been extremely valuable during the COVID-19 pandemic. VirusViz is a web-based application for comparing variants of selected viral populations and their sub-populations; it is primarily focused on SARS-CoV-2 variants, although the tool also supports other viral species (SARS-CoV, MERS-CoV, Dengue, Ebola). As input, VirusViz imports the results of queries that extract variants and metadata from the large database ViruSurf, which integrates information about most SARS-CoV-2 sequences publicly deposited worldwide.
