AnnotaPipeline: An integrated tool to annotate eukaryotic proteins using multi-omics data.

Front Genet

Laboratório de Bioinformática, Universidade Federal de Santa Catarina (UFSC), Campus João David Ferreira Lima, Florianópolis, Brazil.

Published: November 2022

Assigning gene function is a crucial, laborious, and time-consuming step in genomics. Because a variety of sequencing platforms generate ever-increasing amounts of data, manual annotation is no longer feasible. An integrated, automated pipeline that uses experimental data to validate predicted gene function is therefore highly relevant. Here, we present AnnotaPipeline, a computational workflow that integrates distinct software and data types in a proteogenomic approach to annotate and validate predicted features in genomic sequences. Starting from (i) FASTA nucleotide sequences, (ii) FASTA protein sequences, or (iii) structural annotation files (GFF3), users can additionally supply RNA-seq data in FASTQ format and MS/MS data in mzXML or similar formats. The pipeline uses this transcriptomic and proteomic information to corroborate annotations and validate gene predictions, providing transcription and expression evidence for functional annotation. Reannotation of the available , and genomes was performed using AnnotaPipeline, resulting in a higher proportion of annotated proteins and a lower proportion of hypothetical proteins than in the publicly available annotations for these organisms. AnnotaPipeline is a Unix-based pipeline developed in Python and is available at: https://github.com/bioinformatics-ufsc/AnnotaPipeline.
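
The comparison metric mentioned in the abstract, the proportion of hypothetical proteins in an annotation, can be illustrated with a minimal sketch. This is not AnnotaPipeline's code: the function name `hypothetical_fraction`, the rule that a record counts as hypothetical when its FASTA header contains the phrase "hypothetical protein", and the toy records are all assumptions made for illustration.

```python
from typing import Iterable


def hypothetical_fraction(fasta_lines: Iterable[str]) -> float:
    """Fraction of FASTA records whose header mentions 'hypothetical protein'.

    Assumption: the header description is the only evidence used; real
    annotation sets may label uncharacterized proteins differently.
    """
    total = 0
    hypothetical = 0
    for line in fasta_lines:
        if line.startswith(">"):  # header line marks the start of a record
            total += 1
            if "hypothetical protein" in line.lower():
                hypothetical += 1
    if total == 0:
        raise ValueError("no FASTA records found")
    return hypothetical / total


# Toy records, invented for illustration only.
example = [
    ">prot1 hypothetical protein",
    "MKTAYIAKQR",
    ">prot2 heat shock protein 70",
    "MSKGPAVGID",
    ">prot3 Hypothetical protein, conserved",
    "MADEEKLPPG",
    ">prot4 alpha tubulin",
    "MREVISIHVG",
]
print(hypothetical_fraction(example))  # 2 of 4 headers match
```

Running the same function on the headers of a public annotation and of a reannotation would give the two proportions that the abstract compares, under the header-matching assumption stated above.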

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9723129
DOI: http://dx.doi.org/10.3389/fgene.2022.1020100

Similar Publications

scRNA + BCR-seq identifies proportions and characteristics of dual BCR B cells in the peritoneal cavity of mice and peripheral blood of healthy human donors across different ages.

Immun Ageing

December 2024

Department of Immunology, Center of Immuno-molecular Engineering, Innovation & Practice Base for Graduate Students Education, Zunyi Medical University, Zunyi, China.

The increased incidence of inflammatory diseases, infectious diseases, autoimmune disorders, and tumors in elderly individuals is closely associated with several well-established features of immunosenescence, including reduced B cell genesis and dampened immune responses. Recent studies have highlighted the critical role of dual receptor lymphocytes in tumors and autoimmune diseases. This study utilized shared data generated through scRNA-seq + scBCR-seq technology to investigate the presence of dual receptor-expressing B cells in the peritoneum of mouse and peripheral blood of healthy volunteers, and whether there are age-related differences in dual receptor B cell populations.

Background: Cutaneous melanoma is one of the most invasive and lethal malignant skin tumors. Compared to primary melanoma, metastatic melanoma (MM) has poorer treatment outcomes and a higher mortality rate. The tumor microenvironment (TME) plays a critical role in MM progression and immunotherapy resistance.

As molecular research on hemp (Cannabis sativa L.) continues to advance, there is a growing need for more diverse genome data and more accurate genome assemblies. In this study, we report three-way assembly data for the cannabidiol (CBD)-rich cannabis cultivar 'Pink Pepper', generated with three sequencing technologies: PacBio Single Molecule Real-Time (SMRT) sequencing, Illumina sequencing, and Oxford Nanopore Technology (ONT).

Background: Epistasis, the phenomenon where the effect of one gene (or variant) is masked or modified by one or more other genes, significantly contributes to the phenotypic variance of complex traits. Traditionally, epistasis has been modeled using the Cartesian epistatic model, a multiplicative approach based on standard statistical regression. However, a recent study investigating epistasis in obesity-related traits has identified potential limitations of the Cartesian epistatic model, revealing that it likely only detects a fraction of the genetic interactions occurring in natural systems.

Background: Long-term consumption of Western Diet (WD) is a well-established risk factor for the development of cardiovascular disease (CVD); however, there is a paucity of studies on the long-term effects of WD on the pathophysiology of CVD and sex-specific responses.

Methods: Our study aimed to investigate sex-specific pathophysiological changes in left ventricular (LV) function using transthoracic echocardiography (ECHO) and LV tissue transcriptomics in C57BL/6 J mice fed a WD for 125 days, from 300 to 425 days of age.

Results: In female mice, consumption of the WD showed long-term effects on LV structure and possible development of an HFpEF-like phenotype with compensatory cardiac structural changes later in life.
