The KEGG Orthology (KO) database is a widely used molecular function reference database which can be used to conduct functional annotation of most microorganisms. At present, there are many KEGG tools based on the KO entries for annotating functional orthologs. However, determining how to efficiently extract and sort the annotation results of KEGG still hinders the subsequent genome analysis. There is a lack of effective measures used to quickly extract and classify the gene sequences and species information of the KEGG annotations. Here, we present a supporting tool: KEGG_Extractor for species-specific genes extraction and classification, which can output the results through an iterative keyword matching algorithm. It can not only extract and classify the amino acid sequences, but also the nucleotide sequences, and it has proved to be fast and efficient for microbial analysis. Analysis of the ancient Wood Ljungdahl (WL) pathway through the KEGG_Extractor reveals that ~226 archaeal strains contained the WL pathway-related genes. Most of them were , and members of the , and genus. Using the KEGG_Extractor, the ARWL database was constructed, which had a high accuracy and complement. This tool helps to link genes with the KEGG pathway and promote the reconstruction of molecular networks. Availability and implementation: KEGG_Extractor is freely available from the GitHub.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9956942 | PMC |
http://dx.doi.org/10.3390/genes14020386 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!