Motivation: The annotation of the Arabidopsis thaliana genome remains a problem in terms of time and quality. To improve the annotation process, we want to choose the most appropriate tools to use inside a computer-assisted annotation platform. We therefore need to evaluate prediction programs on Arabidopsis sequences containing multiple genes.
Results: We have developed AraSet, a data set of contigs of validated genes, enabling the evaluation of multi-gene models for the Arabidopsis genome. Besides conventional metrics to evaluate gene prediction at the site and the exon levels, new measures were introduced for the prediction at the protein sequence level as well as for the evaluation of gene models. This evaluation method is of general interest and could apply to any new gene prediction software and to any eukaryotic genome. The GeneMark.hmm program appears to be the most accurate software at all three levels for the Arabidopsis genomic sequences. Gene modeling could be further improved by combination of prediction software.
Availability: The AraSet sequence set, the Perl programs and complementary results and notes are available at http://sphinx.rug.ac.be:8080/biocomp/napav/.
Contact: Pierre.Rouze@gengenp.rug.ac.be.
DOI: http://dx.doi.org/10.1093/bioinformatics/15.11.887
In 2021, a year before ChatGPT took the world by storm amid the excitement about generative artificial intelligence (AI), AlphaFold 2 cracked the 50-year-old protein-folding problem, predicting three-dimensional (3D) structures for more than 200 million proteins from their amino acid sequences. This accomplishment was a precursor to an unprecedented burgeoning of large language models (LLMs) in the life sciences. That was just the beginning.
PLoS Biol
January 2025
Institute for Biological Physics, University of Cologne, Cologne, Germany.
Type 4 pili (T4P) are multifunctional filaments involved in adhesion, surface motility, biofilm formation, and horizontal gene transfer. These extracellular polymers are surface-exposed and, therefore, act as antigens. The human pathogen Neisseria gonorrhoeae uses pilin antigenic variation to escape immune surveillance, yet it is unclear how antigenic variation impacts most other functions of T4P.
PLoS One
January 2025
Department of Reproductive Medicine, Guangzhou Women and Children's Medical Center Liuzhou Hospital, Liuzhou, Guangxi, China.
Endometrial cancer (UCEC) is the most prevalent gynecological malignancy in high-income countries, and its incidence is rising globally. Although early-stage UCEC can be treated with surgery, advanced cases have a poor prognosis, highlighting the need for effective molecular biomarkers to improve diagnosis and prognosis. In this study, we analyzed mRNA and miRNA sequencing data from UCEC tissues and adjacent non-cancerous tissues from the TCGA database.
PLoS One
January 2025
Department of Gastroenterology, The First Affiliated Hospital of Chongqing Medical University, Chongqing, China.
Colon cancer, as a highly prevalent malignant tumor globally, poses a significant threat to human health. In recent years, ferroptosis and cuproptosis, as two novel forms of cell death, have attracted widespread attention for their potential roles in the development and treatment of colon cancer. However, the investigation into the subtypes and their impact on the survival of colon cancer patients remains understudied.
Plant Dis
January 2025
Henan University of Science and Technology, Agricultural College, Luoyang, China.
Sweetpotato Stem Rot Nematode () causes the most devastating disease affecting sweetpotato production in China. The objectives of this study were: i) establish a quantification method using real-time PCR for of sweetpotato; ii) analyze the effect of density at harvest on the percentage of disease incidence in sweetpotatoes; and iii) evaluate the effect of soil physical properties on disease incidence. Populations of isolated from 28 different production areas in Henan Province exhibited identical sequences, and then real-time PCR specific primers (PRNf and PRNr) were designed.