airpg: automatically accessing the inverted repeats of archived plastid genomes.

BMC Bioinformatics

Institut für Biologie, Freie Universität Berlin, 14195, Berlin, Germany.

Published: August 2021

Background: In most flowering plants, the plastid genome exhibits a quadripartite structure, comprising a large and a small single-copy region as well as two inverted repeat regions. Thousands of plastid genomes have been sequenced and submitted to public sequence repositories in recent years. The quality of sequence annotations in many of these submissions is known to be problematic, especially regarding annotations that specify the length and location of the inverted repeats: such annotations are either missing or portray the length or location of the repeats incorrectly. However, many biological investigations take publicly available plastid genomes at face value and implicitly assume that their sequence annotations are correct.

Results: We introduce airpg, a Python package that automatically assesses the frequency of incomplete or incorrect annotations of the inverted repeats among publicly available plastid genomes. Specifically, the tool automatically retrieves plastid genomes from NCBI Nucleotide under variable search parameters, surveys them for length and location specifications of inverted repeats, and confirms any inverted repeat annotations through self-comparisons of the genome sequences. The package also includes functionality for automatic identification and removal of duplicate genome records and accounts for taxa that genuinely lack inverted repeats. A survey of the presence of inverted repeat annotations among all plastid genomes of flowering plants submitted to NCBI Nucleotide until the end of 2020 using airpg, followed by a statistical analysis of potential associations with record metadata, highlights that release year and publication status of the genome records have a significant effect on the frequency of complete and equal-length inverted repeat annotations.
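The confirmation step described above, checking annotated inverted repeats by comparing the genome sequence against itself, can be illustrated with a minimal sketch. This is not airpg's implementation or API: the abstract does not say how the self-comparison is carried out, so exact string matching against the reverse complement stands in for it here. The function names, the 0-based end-exclusive coordinate convention, and the toy sequences are all assumptions made for illustration.

```python
from typing import Dict, Tuple

# Illustrative sketch only; hypothetical helper names, not part of airpg.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def check_ir_annotation(
    genome: str, ira: Tuple[int, int], irb: Tuple[int, int]
) -> Dict[str, object]:
    """Compare two annotated inverted repeats.

    Coordinates are assumed to be 0-based and end-exclusive.
    Reports the two lengths, whether they are equal, and whether
    one repeat is the exact reverse complement of the other.
    """
    ira_seq = genome[ira[0]:ira[1]]
    irb_seq = genome[irb[0]:irb[1]]
    return {
        "ira_length": len(ira_seq),
        "irb_length": len(irb_seq),
        "equal_length": len(ira_seq) == len(irb_seq),
        "reverse_complement_match": ira_seq == reverse_complement(irb_seq),
    }


# Toy quadripartite "genome": single copy + repeat + single copy + inverted repeat.
lsc = "ATGCCGTTAGCA"   # 12 bp, positions 0-12
ir = "GGATCCTTAAGC"    # 12 bp, positions 12-24
ssc = "TTTACG"         # 6 bp, positions 24-30
toy_genome = lsc + ir + ssc + reverse_complement(ir)  # second repeat at 30-42

# A correct annotation and one whose second repeat starts one base too late.
correct = check_ir_annotation(toy_genome, (12, 24), (30, 42))
shifted = check_ir_annotation(toy_genome, (12, 24), (31, 42))
```

In this sketch, `correct` reports equal lengths and a reverse-complement match, while `shifted` fails both checks, which mirrors the kind of unequal-length or misplaced annotation the survey counts. Real plastid repeats span thousands of bases and may differ by a few mismatches, so a practical check would need an alignment-based comparison rather than exact equality.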

Conclusion: The number of plastid genomes on NCBI Nucleotide has increased dramatically in recent years, and many more genomes will likely be submitted over the next decade. airpg enables researchers to automatically access and evaluate the inverted repeats of these plastid genomes as well as their sequence annotations and thus helps to increase the reliability of publicly available plastid genomes. The software is freely available via the Python Package Index at http://pypi.python.org/pypi/airpg.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8379869
DOI: http://dx.doi.org/10.1186/s12859-021-04309-y

Similar Publications

Plant mitochondrial genomes (mitogenomes) experience remarkable levels of horizontal gene transfer (HGT), including the recent discovery that orchids anciently acquired DNA from fungal mitogenomes. Thus far, however, there is no evidence that any of the genes from this interkingdom HGT are functional in orchid mitogenomes. Here, we applied a specialized sequencing approach to the orchid Corallorhiza maculata and found that some fungal-derived tRNA genes in the transferred region are transcribed, post-transcriptionally modified, and aminoacylated.

The chloroplast RNA-binding protein CP29A supports expression during cold acclimation.

Proc Natl Acad Sci U S A

February 2025

Molecular Genetics, Institute of Biology, Faculty of Life Sciences, Humboldt Universität zu Berlin, Berlin 10115, Germany.

The chloroplast genome encodes key components of the photosynthetic light reaction machinery as well as the large subunit of the enzyme central to carbon fixation, ribulose-1,5-bisphosphate carboxylase/oxygenase (RuBisCO). Its expression is predominantly regulated posttranscriptionally, with nuclear-encoded RNA-binding proteins (RBPs) playing a key role. Mutants of chloroplast gene expression factors often exhibit impaired chloroplast biogenesis, especially in cold conditions.

Background: Paeonia lactiflora Pall., a member of the Paeoniaceae family, is a medicinal herb widely used in traditional Chinese medicine. Chloroplasts are multifunctional organelles containing distinct genetic material.

is a fully mycoheterotrophic orchid that lacks both leaves and roots, belonging to the genus in the subtribe Calypsoinae. In this study, we assembled and annotated its mitochondrial genome (397,867 bp, GC content: 42.70%), identifying 55 genes, including 37 protein-coding genes (PCGs), 16 tRNAs, and 2 rRNAs, and conducted analyses of relative synonymous codon usage (RSCU), repeat sequences, horizontal gene transfers (HGTs), and gene selective pressure (dN/dS).

Complete Chloroplast Genomes of 9 Species: Genome Structure, Comparative Analysis, and Phylogenetic Relationships.

Int J Mol Sci

January 2025

College of Landscape Architecture and Horticulture Sciences, Southwest Research Center for Engineering Technology of Landscape Architecture (State Forestry and Grassland Administration), Yunnan Engineering Research Center for Functional Flower Resources and Industrialization, Research and Development Center of Landscape Plants and Horticulture Flowers, Southwest Forestry University, Kunming 650224, China.

is a genus of functional herbaceous plants in the Balsaminaceae, which are not only of great ornamental value and one of the world's top three flower bedding plants but also have a wide range of medicinal and edible uses. Currently, the taxonomy and phylogenetic relationships of species are still controversial. In order to better understand their chloroplast properties and phylogenetic evolution, nine plants (, , , , , , , , ) were sequenced, and their complete chloroplast genomes were analysed.
