Transductive learning as an alternative to translation initiation site identification.

Cristiano Lacerda Nunes Pinto, Cristiane Neri Nobre, Luis Enrique Zárate

BMC Bioinformatics

Pontifical Catholic University of Minas Gerais - PUC-MG, 255, Walter Ianni Street, Belo Horizonte, 31980-110, Brazil.

Published: February 2017

Background: Correctly identifying protein coding regions is an important and still unresolved problem in molecular biology. It is challenging because biological systems are not deeply understood and the conserved characteristics of messenger RNA (mRNA) remain poorly known. It is therefore fundamental to investigate computational methods that help discover patterns for identifying Translation Initiation Sites (TIS). In Bioinformatics, machine learning methods based on inductive inference, such as the Inductive Support Vector Machine (ISVM), have been widely applied. In contrast, little attention has been given to machine learning methods based on transductive inference, such as the Transductive Support Vector Machine (TSVM). Transductive inference performs well on problems in which unlabeled sequences considerably outnumber labeled ones. TIS prediction may likewise benefit from transductive methods, because the number of new sequences grows rapidly with the progress of genome projects, which enable the study of new organisms. Consequently, this work investigates transductive learning for TIS identification and compares its results with those obtained by the inductive method.

Results: Transductive inference yields better F-measure and sensitivity than the inductive method for TIS prediction. It also has the lowest failure rate in identifying TIS, producing fewer False Negatives (FN) than the ISVM. The ISVM and TSVM methods were validated on molecules from the most representative organisms in the RefSeq database: Rattus norvegicus, Mus musculus, Homo sapiens, Drosophila melanogaster and Arabidopsis thaliana. The transductive method achieved F-measure and sensitivity above 90%, exceeding the results obtained with ISVM. The ISVM and TSVM approaches were implemented in the TransduTIS tool as TransduTIS-I and TransduTIS-T, respectively, and are available through a web interface. These approaches were compared with the TISHunter, TIS Miner, and NetStart tools, with satisfactory results.
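The metrics named in the Results (sensitivity, precision, F-measure, and the role of False Negatives) follow from the standard confusion-matrix definitions. The snippet below is a minimal sketch of those definitions in Python, not code from the TransduTIS tool. The `ConfusionCounts` class, the function names, and the example counts are hypothetical and chosen only for illustration; they are not results from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Hypothetical container for the outcome of a TIS classifier."""
    tp: int  # true TIS predicted as TIS
    fp: int  # non-TIS predicted as TIS
    fn: int  # true TIS missed by the classifier (False Negatives)
    tn: int  # non-TIS predicted as non-TIS


def sensitivity(c: ConfusionCounts) -> float:
    """Fraction of true TIS that were found: TP / (TP + FN)."""
    total = c.tp + c.fn
    return c.tp / total if total else 0.0


def precision(c: ConfusionCounts) -> float:
    """Fraction of predicted TIS that are correct: TP / (TP + FP)."""
    total = c.tp + c.fp
    return c.tp / total if total else 0.0


def f_measure(c: ConfusionCounts) -> float:
    """Harmonic mean of precision and sensitivity."""
    p, r = precision(c), sensitivity(c)
    return 2 * p * r / (p + r) if (p + r) else 0.0


# Illustrative counts only (assumed, not taken from the paper).
example = ConfusionCounts(tp=90, fp=10, fn=10, tn=890)
print(sensitivity(example), precision(example), f_measure(example))
```

Because sensitivity has FN in its denominator, a classifier that misses fewer true sites raises sensitivity directly, and the F-measure rises with it as long as precision stays about the same.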

Conclusions: The ISVM and TSVM classifiers achieve similar precision. However, the results show that the TSVM approach brings an improvement, especially in F-measure and sensitivity. Moreover, the results point to a promising application of TSVM: organisms in the initial phase of study, for which few identified sequences exist in the databases.

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5290616	PMC
http://dx.doi.org/10.1186/s12859-017-1502-6	DOI Listing


