Automatic curation of LTR retrotransposon libraries from plant genomes through machine learning.

J Integr Bioinform

Department of Electronics and Automation, Universidad Autónoma de Manizales, Manizales, Colombia.

Published: September 2022

Transposable elements (TEs) are mobile sequences that can move and insert themselves into chromosomes. They activate under internal or external stimuli, giving the organism the ability to adapt to its environment. Annotating transposable elements in genomic data is currently considered crucial for understanding key aspects of organisms such as phenotype variability, species evolution, and genome size. Because of the way they replicate, LTR retrotransposons are the most common transposable elements in plants, accounting in some cases for up to 80% of all DNA. To annotate these elements, a reference library is usually created and then curated to eliminate TE fragments and false positives; the elements are then annotated in the genome using the homology method. However, the curation process can take weeks and requires extensive manual work and the execution of multiple time-consuming bioinformatics tools. Here, we propose a machine learning-based approach to perform this process automatically on plant genomes, obtaining up to 91.18% F1-score. This approach was tested with four plant species, obtaining up to 93.6% F1-score () in only 22.61 s, whereas bioinformatics methods took approximately 6 h. This acceleration demonstrates that the ML-based approach is efficient and could be used in massive sequencing projects.
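For readers unfamiliar with the two figures reported in the abstract, the following minimal sketch shows how an F1-score is computed from classification counts and how the quoted run times translate into a speedup factor. The function names and the example counts are hypothetical and are not taken from the paper; only the 22.61 s and approximately 6 h timings come from the abstract.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.

    tp, fp, fn are counts of true positives, false positives and false
    negatives, e.g. library sequences correctly or incorrectly kept as
    intact LTR retrotransposons (illustrative interpretation only).
    """
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def speedup(baseline_seconds: float, new_seconds: float) -> float:
    """Return how many times faster the new method is than the baseline."""
    return baseline_seconds / new_seconds


if __name__ == "__main__":
    # Hypothetical counts, chosen only to illustrate the formula.
    print(f"F1 = {f1_score(tp=8, fp=2, fn=2):.2f}")

    # Timings quoted in the abstract: ~6 h for bioinformatics methods
    # versus 22.61 s for the ML-based approach.
    factor = speedup(baseline_seconds=6 * 3600, new_seconds=22.61)
    print(f"Speedup is roughly {factor:.0f}x")
```

Because the 6 h baseline is itself approximate, the resulting factor (on the order of 950-fold) should be read as an order-of-magnitude estimate rather than a precise benchmark.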

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9521825
DOI: http://dx.doi.org/10.1515/jib-2021-0036

Similar Publications

Emergence of fungal hybrids - potential threat to humans.

Microb Pathog

January 2025

Cell Biology and Molecular Genetics, Yenepoya Research Centre, Yenepoya (Deemed to be University), Mangalore 575018, India.

Fungal hybrids arise through the interbreeding of distinct species. This hybridization process fosters increased genetic diversity and the emergence of new traits. Mechanisms driving hybridization include the loss of heterozygosity, copy number variations, and horizontal gene transfer.

Unlabelled: is one of the three most frequently mutated genes in age-related clonal hematopoiesis (CH), alongside and . CH can progress to myeloid malignancies including chronic monomyelocytic leukemia (CMML), and is also strongly associated with inflammatory cardiovascular disease and all-cause mortality in humans. DNMT3A and TET2 regulate DNA methylation and demethylation pathways respectively, and loss-of-function mutations in these genes reduce DNA methylation in heterochromatin, allowing de-repression of silenced elements in heterochromatin.

Comparative genomic analysis of Fusarium oxysporum f. sp. lycopersici reveals telomeric duplications of a lineage-specific region carrying SIX8 and PSL1 and genome-wide expansion of Foxy transposable elements.

Int J Biol Macromol

January 2025

State Key Laboratory of North China Crop Improvement and Regulation, Hebei Agricultural University, Baoding 071000, China; Key Laboratory of Vegetable Germplasm Innovation and Utilization of Hebei, Ministry of Education of China-Hebei Province Joint Innovation Center for Efficient Green Vegetable Industry, College of Horticulture, Hebei Agricultural University, Baoding 071000, China; Division of Plant Sciences, Research School of Biology, Australian National University, Canberra, ACT 2601, Australia.

Fusarium oxysporum f. sp. lycopersici (Fol), the causal agent of tomato wilt disease, is a soil-borne, vascular-colonizing fungal pathogen that severely impacts tomato production in most growing regions worldwide.

Stinging nettles () have a long history of association with human civilization, having been used as a source of textile fibers, food and medicine. Here, we present a chromosome-level, phased genome assembly for a diploid female clone of from Romania. Using a combination of PacBio HiFi, Oxford Nanopore, and Illumina sequencing, as well as Hi-C long-range interaction data (using a novel Hi-C protocol presented here), we assembled two haplotypes of 574.

LINE-1, the NORth star of nucleolar organization.

Genes Dev

January 2025

Institute for Research on Cancer and Aging of Nice (IRCAN), Institut National de la Santé et de la Recherche Médicale (INSERM), Centre National de la Recherche Scientifique (CNRS), University Cote d'Azur, Nice 06107, France

Long interspersed element-1 (LINE-1) retrotransposons are abundant transposable elements in mammals and significantly influence chromosome structure, chromatin organization, and 3D genome architecture. In this issue of , Ataei et al. (doi:10.
