We present a highly accurate gene-prediction system for eukaryotic genomes, called mGene. It combines, in an unprecedented manner, the flexibility of generalized hidden Markov models (gHMMs) with the predictive power of modern machine learning methods, such as Support Vector Machines (SVMs). Its excellent performance was demonstrated in an objective competition based on the genome of the nematode Caenorhabditis elegans. Considering the average of sensitivity and specificity, the developmental version of mGene exhibited the best prediction performance at the nucleotide, exon, and transcript levels for ab initio and multiple-genome gene-prediction tasks. The fully developed version shows superior performance in 10 out of 12 evaluation criteria compared with the other participating gene finders, including Fgenesh++ and Augustus. An in-depth analysis of mGene's genome-wide predictions revealed that approximately 2200 predicted genes were not contained in the current genome annotation. Testing a subset of 57 of these genes by RT-PCR and sequencing, we confirmed expression for 24 (42%) of them. mGene missed 300 annotated genes, of which 205 were unconfirmed. RT-PCR testing of 24 of these genes resulted in a success rate of merely 8%. These findings suggest that even the gene catalog of a well-studied organism such as C. elegans can be substantially improved by mGene's predictions. We also provide gene predictions for the four nematodes C. briggsae, C. brenneri, C. japonica, and C. remanei. Comparing the resulting proteomes among these organisms and to the known protein universe, we identified many species-specific gene inventions. In a quality assessment of several available annotations for these genomes, we find that mGene's predictions are most accurate.
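The ranking criterion mentioned above, the average of sensitivity and specificity, can be illustrated with a minimal sketch. It assumes the convention common in gene-finding evaluations, where "specificity" is the fraction of predicted features that are correct (what is elsewhere called precision); the abstract itself does not spell out the definition. The function names and the exon-level counts are hypothetical and are not taken from the paper. Only the RT-PCR figure (24 of 57) comes from the text.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of annotated features that were predicted correctly."""
    return tp / (tp + fn)


def specificity(tp: int, fp: int) -> float:
    """Fraction of predicted features that are correct.

    Assumption: gene-finding convention (equivalent to precision),
    not the TN / (TN + FP) definition used in other fields.
    """
    return tp / (tp + fp)


def mean_sn_sp(tp: int, fp: int, fn: int) -> float:
    """Average of sensitivity and specificity, the summary score in the text."""
    return (sensitivity(tp, fn) + specificity(tp, fp)) / 2


# Hypothetical exon-level counts, for illustration only.
tp, fp, fn = 80, 20, 10
score = mean_sn_sp(tp, fp, fn)
print(f"mean(Sn, Sp) = {score:.3f}")

# RT-PCR confirmation rate reported in the abstract: 24 of 57 tested genes.
rtpcr_rate = 24 / 57
print(f"RT-PCR confirmation rate = {rtpcr_rate:.0%}")
```

Averaging the two measures rewards a gene finder only when it is both complete and precise, which is why a single combined score is convenient for ranking at each evaluation level.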
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2775605 | PMC |
http://dx.doi.org/10.1101/gr.090597.108 | DOI Listing |