AI Article Synopsis

  • Genome annotation tools standardize outputs but make it hard for users to customize annotation databases, which reduces their flexibility in downstream analysis.
  • StandEnA is a new command-line tool for Linux that lets users build custom databases by retrieving protein sequences from multiple public databases, guided by a user-defined list of standard names.
  • StandEnA is open-source and can be accessed online, with supplementary data available for further research.

Article Abstract

Motivation: Several genome annotation tools standardize annotation outputs for comparability. During standardization, these tools do not allow user-friendly customization of annotation databases, limiting their flexibility and applicability in downstream analysis.

Results: StandEnA is a user-friendly command-line tool for Linux that facilitates the generation of custom databases by retrieving protein sequences from multiple databases. Directed by a user-defined list of standard names, StandEnA retrieves synonyms to search for corresponding sequences in a set of public databases. Custom databases are used in prokaryotic genome annotation to generate standardized presence-absence matrices and reference files containing standard database identifiers. To showcase StandEnA, we applied it to six metagenome-assembled genomes to analyze three different pathways.
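The presence-absence matrix mentioned above can be illustrated with a minimal sketch. This is not StandEnA's code or interface: the function name `presence_absence_matrix`, the genome labels, the enzyme names, and the hit list are all hypothetical and serve only to show the idea of recording, for each genome, whether each standard name was found.

```python
def presence_absence_matrix(genomes, standard_names, hits):
    """Build a presence-absence matrix from annotation hits.

    genomes:        list of genome identifiers (hypothetical labels)
    standard_names: list of standard names that define the columns
    hits:           iterable of (genome, standard_name) pairs
    Returns a dict mapping each genome to a row of 0/1 values,
    ordered like standard_names.
    """
    found = {genome: set() for genome in genomes}
    wanted = set(standard_names)
    for genome, name in hits:
        # Ignore hits for unknown genomes or names outside the standard list.
        if genome in found and name in wanted:
            found[genome].add(name)
    return {
        genome: [1 if name in found[genome] else 0 for name in standard_names]
        for genome in genomes
    }


# Hypothetical example data, not taken from the article.
genomes = ["MAG_1", "MAG_2"]
standard_names = ["nitrate reductase", "nitrite reductase"]
hits = [
    ("MAG_1", "nitrate reductase"),
    ("MAG_2", "nitrate reductase"),
    ("MAG_2", "nitrite reductase"),
]

matrix = presence_absence_matrix(genomes, standard_names, hits)
for genome in genomes:
    print(genome, matrix[genome])
```

Because every genome shares the same column order, rows produced this way can be compared directly across genomes, which is the point of using standard names.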

Availability and Implementation: StandEnA is open-source software available at https://github.com/mdsufz/StandEnA.

Supplementary Information: Supplementary data are available online.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10336186
DOI: http://dx.doi.org/10.1093/bioadv/vbad069

Publication Analysis

Top Keywords (keyword: count)

  • genome annotation: 8
  • custom databases: 8
  • standena: 5
  • annotation: 5
  • databases: 5
  • standena customizable: 4
  • customizable workflow: 4
  • workflow standardized: 4
  • standardized annotation: 4
  • annotation generating: 4

Similar Publications

Background: Mining functional gene modules from genomic data is an important step to detect gene members of pathways or other relations such as protein-protein interactions. This work explores the plausibility of detecting functional gene modules by factorizing gene-phenotype association matrix from the phenotype ontology data rather than the conventionally used gene expression data. Recently, the hierarchical structure of phenotype ontologies has not been sufficiently utilized in gene clustering while functionally related genes are consistently associated with phenotypes on the same path in phenotype ontologies.

Background: Phaius Lour. (Collabieae, Orchidaceae) is a small genus consisting of about 45 species, with highly ornamental and medicinal values. However, the phylogenetic relationship of Phaius among Calanthe s.

Comprehensive analysis of the multi-rings mitochondrial genome of Populus tomentosa.

BMC Genomics

January 2025

State Key Laboratory of Tree Genetics and Breeding, National Engineering Research Center of Tree Breeding and Ecological Restoration, Beijing Advanced Innovation Center for Tree Breeding by Molecular Design, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, 100083, China.

Background: Populus tomentosa, known as Chinese white poplar, is indigenous to China and distributed across large areas of the country, where it plays multiple important roles in forestry, agriculture, conservation, and urban horticulture. However, limited accessibility to the mitochondrial (mt) genome of P. tomentosa impedes phylogenetic and population genetic analyses and restricts functional gene research in the Salicaceae family.

The current mortality rates for breast cancer underscore the need for better prognostic tools; moreover, LIM and calponin homology domain 1 (LIMCH1), which is a protein with dual roles in cancer, is a promising candidate for investigation. This study employed an integrative approach combining bioinformatics analysis of The Cancer Genome Atlas (TCGA) cohort and clinical immunohistochemistry (IHC) cohort data. We analysed LIMCH1 expression patterns, its associations with clinicopathological features and prognosis, and its impact on the tumour immune microenvironment (TIME).

The trait-specific timing of accelerated genomic change in the human lineage.

Cell Genom

January 2025

Department of Integrative Biology, The University of Texas at Austin, Austin, TX, USA; Department of Statistics and Data Science, The University of Texas at Austin, Austin, TX, USA.

Humans exhibit distinct characteristics compared to our primate and ancient hominin ancestors. To investigate genomic bursts in the evolution of these traits, we use two complementary approaches to examine enrichment among genome-wide association study loci spanning diseases and AI-based image-derived brain, heart, and skeletal tissue phenotypes with genomic regions reflecting four evolutionary divergence points. These regions cover epigenetic differences among humans and rhesus macaques, human accelerated regions (HARs), ancient selective sweeps, and Neanderthal-introgressed alleles.
