Every day, more plant genomes become available in public databases, and additional massive sequencing projects (i.e., projects that aim to sequence thousands of individuals) are formulated and released. Nevertheless, there are not enough automatic tools to analyze this large amount of genomic information. LTR retrotransposons are the most frequent repetitive sequences in plant genomes; however, their detection and classification are commonly performed using semi-automatic and time-consuming programs. Although several bioinformatic tools follow different approaches to detect and classify them, none of these tools can obtain accurate results on its own. Here, we used Machine Learning algorithms based on k-mer counts to distinguish LTR retrotransposons from other genomic sequences and to classify them into lineages/families with an F1-Score of 95%, contributing to the development of an alignment-free and automatic method to analyze these sequences.
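
As a rough illustration of what k-mer counts look like as machine learning features, the following minimal sketch turns a DNA sequence into a fixed-length count vector. It is not the authors' implementation: the function name `kmer_counts`, the default `k=3`, and the choice to ignore windows containing non-ACGT characters are all assumptions made for this example, since the abstract does not state them.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k=3, alphabet="ACGT"):
    """Return a fixed-length list of k-mer counts for one DNA sequence.

    Hypothetical helper for illustration; k and the alphabet are assumptions.
    """
    seq = sequence.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Enumerate all possible k-mers in a fixed order so that every
    # sequence maps to the same feature columns.
    vocabulary = ["".join(p) for p in product(alphabet, repeat=k)]
    # Windows with characters outside the alphabet (e.g. N) are not in the
    # vocabulary, so they are simply left out of the feature vector.
    return [counts[kmer] for kmer in vocabulary]


# Example: 2-mer counts for a short toy sequence (16 features).
features = kmer_counts("ACGTACGT", k=2)
```

Vectors like this can be fed to any standard classifier without aligning the sequences first, which is the sense in which such a method is alignment-free.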

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8140598
DOI: http://dx.doi.org/10.7717/peerj.11456

Similar Publications

Photosynthetic microalgae are promising green cell factories for the sustainable production of high-value chemicals and biopharmaceuticals. The chloroplast organelle is being developed as a chassis for synthetic biology because it contains its own genome (the plastome) and offers advantages such as high recombinant protein titers and a diverse and dynamic metabolism. However, chloroplast engineering is currently hampered by the lack of standardized cloning tools and Design-Build-Test-Learn workflows to ease genomic and metabolic engineering.

Gamma-aminobutyric acid (GABA) functions as an inhibitory neurotransmitter that blocks impulses between nerve cells in the brain. Owing to increasing awareness of the health-promoting benefits associated with GABA, it is also artificially synthesized and consumed as a nutritional supplement in some regions of the world. Among fresh vegetables, tomato fruits contain a comparatively higher amount of GABA (0.

Maize transcription factor ZmEREB167 negatively regulates starch accumulation and kernel size.

J Genet Genomics

January 2025

State Key Laboratory of Maize Bio-breeding, Key Laboratory of Genome Editing Research and Application, Ministry of Agriculture and Rural Affairs, Department of Plant Genetics and Breeding, National Maize Improvement Center, College of Agronomy and Biotechnology, China Agricultural University, Beijing 100193, China; Frontiers Science Center for Molecular Design Breeding, Beijing 100193, China.

Transcription factors play critical roles in the regulation of gene expression during maize kernel development. The maize endosperm is a large storage organ, accounting for nearly 90% of the dry weight of the mature kernel, and is also the main site of starch storage. In this study, we identify an endosperm-specific EREB gene, ZmEREB167, which encodes a nucleus-localized EREB protein.

Genomic Epidemiology of Erwinia amylovora Strains That Caused the Fire Blight Outbreak in Korea.

Plant Dis

January 2025

50 Yonsei-ro, Seodaemun-gu, Seoul, Republic of Korea, 03722

Fire blight, a devastating bacterial disease affecting rosaceous plants such as apples and pears, is caused by Erwinia amylovora. The disease, known for its rapid spread and destructive potential, can lead to severe symptoms and often results in the death of infected plants. In Korea, Erwinia amylovora was first recorded in 2015, and subsequent dissemination has been noted across the peninsula.

Modulation of stomatal development and movement is a promising approach for creating water-conserving plants. Here, we identified and characterized the PagHCF106 gene of poplar (Populus alba × Populus glandulosa). The PagHCF106 protein localized predominantly to the chloroplast, and the PagHCF106 gene exhibited a tissue-specific expression pattern.
