AI Article Synopsis

  • SEDA is a user-friendly desktop application for managing FASTA files containing DNA or protein sequences, designed for researchers without programming skills.
  • It offers both simple tools (such as filtering and sorting) and advanced tools (such as BLAST searching and sequence alignment) that are not commonly found in similar software.
  • SEDA is open-source, easy to install, and helps in building high-quality datasets for various genetic studies, making it a valuable resource for life science researchers.

Article Abstract

SEDA (SEquence DAtaset builder) is a multiplatform desktop application for the manipulation of FASTA files containing DNA or protein sequences. Its graphical user interface gives access to a collection of simple (filtering, sorting, or file reformatting, among others) and advanced (BLAST searching, protein domain annotation, gene annotation, and sequence alignment) utilities not present in similar applications. This eases the work of life science researchers working with DNA and/or protein sequences, especially those who have no programming skills. This paper presents general guidelines on how to build efficient data handling protocols using SEDA, as well as practical examples of how to prepare high-quality datasets for single-gene phylogenetic studies, the characterization of protein families, or phylogenomic studies. The user-friendliness of SEDA also relies on two important features: (i) the availability of easy-to-install distributable versions and installers of SEDA, including a Docker image for Linux, and (ii) the ease with which users can manage large datasets. SEDA is open-source, released under the GNU General Public License v3.0, and publicly available on GitHub (https://github.com/sing-group/seda). SEDA installers and documentation are available at https://www.sing-group.org/seda/.
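To make the "simple" operations mentioned in the abstract concrete, the following is a minimal Python sketch of filtering FASTA records by minimum length and sorting them by length. It is an illustration only and not SEDA's implementation or API: the function names (parse_fasta, filter_and_sort, format_fasta), the min_length parameter, and the sample sequences are all assumptions made for this example.

```python
from typing import Iterable, Iterator, List, Tuple

# A record is (header without the leading '>', full sequence string).
Record = Tuple[str, str]


def parse_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (header, sequence) pairs from FASTA-formatted lines."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_and_sort(records: Iterable[Record], min_length: int) -> List[Record]:
    """Keep sequences of at least min_length, longest first (hypothetical helper)."""
    kept = [r for r in records if len(r[1]) >= min_length]
    return sorted(kept, key=lambda r: len(r[1]), reverse=True)


def format_fasta(records: Iterable[Record], width: int = 60) -> str:
    """Serialize records back to FASTA text, wrapping sequences at width."""
    out: List[str] = []
    for header, seq in records:
        out.append(f">{header}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "".join(line + "\n" for line in out)


if __name__ == "__main__":
    # Made-up sample data for illustration.
    example = """>seq1 short
ATG
>seq2 long
ATGGCC
AAATTT
>seq3 medium
ATGGCCAAA
"""
    records = filter_and_sort(parse_fasta(example.splitlines()), min_length=5)
    print(format_fasta(records), end="")
```

In this sketch the three-residue sequence is dropped by the length filter and the remaining records are written longest first. A tool such as SEDA exposes comparable operations through its graphical interface, so users do not need to write code like this themselves.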


Source
http://dx.doi.org/10.1109/TCBB.2020.3040383

Publication Analysis

Top Keywords

fasta files: 8
protein sequences: 8
seda: 7
seda desktop: 4
desktop tool: 4
tool suite: 4
suite fasta: 4
files processing: 4
processing seda: 4
seda sequence: 4

Similar Publications

Disocactus ackermannii, commonly referred to as Orchid Cactus, is a striking succulent belonging to the Cactaceae family. Its unique appearance and captivating characteristics make it a sought-after addition for garden and courtyard beautification. In June 2023, 20-30% of D.


Background: To understand the emergence and spread of drug-resistant parasites in malaria-endemic areas, accurate assessment and monitoring of antimalarial drug resistance markers is critical. Recent advances in next-generation sequencing (NGS) technologies have enabled the tracking of drug-resistant malaria parasites.

Methods: In this study, we used Targeted Amplicon Deep Sequencing (TADS) to characterise the genetic diversity of the Pfk13, Pfdhfr, Pfdhps, and Pfmdr1 genes among primary school-going children in 15 counties in Kenya (Bungoma, Busia, Homa Bay, Migori, Kakamega, Kilifi, Kirinyaga, Kisii, Kisumu, Kwale, Siaya, Tana River, Turkana, Vihiga and West Pokot).


Deoxyribonucleic acid (DNA) or ribonucleic acid (RNA) sequence compressors for novel species frequently face challenges when processing wide-scale raw, FASTA, or multi-FASTA structured data. For years, molecular sequence databases have favored the widely used general-purpose Gzip and Zstd compressors. The absence of sequence-specific characteristics in these encoders results in subpar performance, and their use depends on time-consuming parameter adjustments.


Lossless and reference-free compression of FASTQ/A files using GeneSqueeze.

Sci Rep

January 2025

Rajant Health Incorporated, 200 Chesterfield Parkway, Malvern, PA 19355, USA.

As sequencing becomes more accessible, there is an acute need for novel compression methods to efficiently store sequencing files. Omics analytics can leverage sequencing technologies to enhance biomedical research and individualize patient care, but sequencing files demand immense storage capabilities, particularly when sequencing is utilized for longitudinal studies. Addressing the storage challenges posed by these technologies is crucial for omics analytics to achieve their full potential.


[Clinical Impact of Current Variants in COVID-19 Patients: Clinical Characteristics in Variant EG.5].

Mikrobiyol Bul

October 2024

University of Health Sciences, Ankara Bilkent City Health Application and Research Center, Clinic of Medical Microbiology, Ankara, Türkiye.

The severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) virus has mutated at a high rate since the beginning of the pandemic, leading to the formation of different variants. Alpha, Beta, Gamma, Delta and Omicron have emerged as concerning variants identified by the World Health Organization (WHO). The Omicron variant and its sublineages became dominant worldwide in 2022.

