Background: Several genome annotation services have been developed in recent years and are widely used. However, the functional annotations they produce often disagree, so a scheme that integrates the different results into a consensus functional annotation is needed.
Results: This article presents a semi-automated scheme that compares functional annotations from different sources and derives a consensus functional annotation for a genome. In this study, we used four automated annotation services to annotate a newly sequenced genome, Arcobacter butzleri ED-1. The scheme has two parts: annotation comparison and annotation determination. In the comparison part, we used gene synonym lists to resolve differences in terminology and applied several information retrieval techniques to preprocess the functional annotations. Based on the comparison results, we designed a decision tree to obtain a consensus functional annotation. Experimental results show that our approach greatly reduces the workload of manual comparison by automatically comparing 87% of the functional annotations. It also automatically determined 87% of the functional annotations, leaving only 13% of the genes for manual curation. To assess the consistency of its performance, we applied the approach to six phylogenetically different genomes. On average, the scheme automatically performed 73% of the annotation comparison tasks and 86% of the annotation determination tasks.
Conclusions: We propose an effective semi-automated scheme to compare and determine genome functional annotations. It greatly reduces the manual work required in genome functional annotation. Because the scheme does not require any specific biological knowledge, it is readily applicable to genome annotation comparison and genome re-annotation projects.
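The comparison and determination steps described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the synonym list, the stop-word list, the service names, and the function names `normalize` and `consensus` are all hypothetical, and a simple majority vote stands in for the decision tree, whose rules the abstract does not give.

```python
import re
from collections import Counter

# Hypothetical synonym list (illustrative only, not taken from the paper).
# Mapping a term to "" drops it as an uninformative qualifier.
SYNONYMS = {"hypothetical": "uncharacterized", "putative": "", "probable": "", "possible": ""}

# Hypothetical stop words that carry little functional meaning.
STOPWORDS = {"protein", "family", "domain", "of", "the", "and"}


def normalize(annotation: str) -> frozenset:
    """Lowercase, tokenize, map synonyms, and drop stop words."""
    tokens = re.findall(r"[a-z0-9]+", annotation.lower())
    mapped = (SYNONYMS.get(t, t) for t in tokens)
    return frozenset(t for t in mapped if t and t not in STOPWORDS)


def consensus(annotations: dict, min_votes: int = 3):
    """Return (label, decided) for one gene.

    `annotations` maps a service name to its annotation string.
    `decided` is False when the gene should go to manual curation.
    A majority vote is used here in place of the paper's decision tree.
    """
    groups = Counter(normalize(a) for a in annotations.values())
    key, votes = groups.most_common(1)[0]
    if votes < min_votes or not key:
        return None, False
    # Report the first original string that normalizes to the winning key.
    label = next(a for a in annotations.values() if normalize(a) == key)
    return label, True


if __name__ == "__main__":
    calls = {
        "service_1": "DNA gyrase subunit A",
        "service_2": "putative DNA gyrase, subunit A",
        "service_3": "DNA gyrase subunit A protein",
        "service_4": "hypothetical protein",
    }
    print(consensus(calls))
```

Representing each annotation as a set of normalized tokens makes the comparison insensitive to word order, punctuation, and qualifiers such as "putative". Genes for which fewer than `min_votes` services agree are returned undecided, which corresponds to the portion left for manual curation.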
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3680241 | PMC |
| http://dx.doi.org/10.1186/1471-2105-14-172 | DOI Listing |
Am J Hum Genet
January 2025
UC Santa Cruz Genomics Institute, University of California, Santa Cruz, Santa Cruz, CA, USA.
More than 50% of families with suspected rare monogenic diseases remain unsolved after whole-genome analysis by short-read sequencing (SRS). Long-read sequencing (LRS) could help bridge this diagnostic gap by capturing variants inaccessible to SRS, facilitating long-range mapping and phasing and providing haplotype-resolved methylation profiling. To evaluate LRS's additional diagnostic yield, we sequenced a rare-disease cohort of 98 samples from 41 families, using nanopore sequencing, achieving per sample ∼36× average coverage and 32-kb read N50 from a single flow cell.
Comput Methods Programs Biomed
January 2025
Regional Institute of Ophthalmology, Indira Gandhi Institute of Medical Sciences, Patna, 800025, Bihar, India.
Background And Objectives: Hypertensive Retinopathy (HR) is a retinal manifestation resulting from persistently elevated blood pressure. Severity grading of HR is essential for patient risk stratification, effective management, progression monitoring, timely intervention, and minimizing the risk of vision impairment. Computer-aided diagnosis and artificial intelligence (AI) systems play vital roles in the diagnosis and grading of HR.
Plants (Basel)
January 2025
Instituto Tecnológico de Sonora, 5 de Febrero 818, Col. Centro, Cd. Obregón 85000, Mexico.
Strain TE5 was isolated from a wheat ( L. subsp. ) rhizosphere grown in a commercial field of wheat in the Yaqui Valley in Mexico.
Sensors (Basel)
January 2025
School of Information and Communications Engineering, Xi'an Jiaotong University, Xi'an 710049, China.
This review offers a comprehensive and in-depth analysis of face mask detection and recognition technologies, emphasizing their critical role in both public health and technological advancements. Existing detection methods are systematically categorized into three primary classes: feature-extraction-and-classification-based approaches, object-detection-model-based methods, and multi-sensor-fusion-based methods. Through a detailed comparison, their respective workflows, strengths, limitations, and applicability across different contexts are examined.
Bioinformatics
January 2025
European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom.
Summary: In recent years there has been a surge in prokaryotic genome assemblies, coming from both isolated organisms and environmental samples. These assemblies often include novel species that are poorly represented in reference databases, creating a need for a tool that can annotate both well-described and novel taxa and can run at scale. Here, we present mettannotator, a comprehensive, scalable Nextflow pipeline for prokaryotic genome annotation that identifies coding and non-coding regions, predicts protein functions, including antimicrobial resistance, and delineates gene clusters.