SubFeat: Feature subspacing ensemble classifier for function prediction of DNA, RNA and protein sequences.

Comput Biol Chem

Department of Computer Science and Engineering, United International University, United City, Madani Avenue, Badda, Dhaka 1212, Bangladesh.

Published: June 2021

The information of a cell is primarily contained in deoxyribonucleic acid (DNA). This information flows from DNA to protein sequences via ribonucleic acid (RNA) through transcription and translation. These entities are vital to genetic processes. Recent developments in epigenetics also show the importance of genetic material and of knowing its attributes and functions. However, the number of known features and functions of these entities grows slowly because in vitro experimental methods are time-consuming and expensive. In this paper, we propose an ensemble classification algorithm called SubFeat to predict the functions of biological entities from different types of datasets. Our model uses a novel ensemble method based on feature subspaces: it divides the feature space into subspaces, and each subspace is used to train an individual base classifier. The ensemble combines these base classifiers through a weighted majority voting mechanism. SubFeat was tested on four datasets, comprising two DNA, one RNA, and one protein dataset, and it outperformed all the existing single classifiers and ensemble classifiers. SubFeat is freely available online as a Python-based package, along with a user manual, at https://github.com/fazlulhaquejony/SubFeat.
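The general idea of a feature-subspace ensemble with weighted majority voting can be sketched as follows. This is a minimal illustration, not the SubFeat implementation: the names `split_features`, `NearestCentroid`, and `SubspaceEnsemble` are hypothetical, the nearest-centroid base learner is a stand-in, and the use of disjoint random subspaces and of training accuracy as the vote weight are assumptions made only to keep the example short.

```python
import random
from collections import defaultdict


def split_features(n_features, n_subspaces, seed=0):
    """Shuffle feature indices and deal them into disjoint subspaces (assumed scheme)."""
    idx = list(range(n_features))
    random.Random(seed).shuffle(idx)
    return [sorted(idx[i::n_subspaces]) for i in range(n_subspaces)]


class NearestCentroid:
    """Tiny stand-in base learner; any classifier could be used here."""

    def fit(self, X, y):
        sums, counts = {}, defaultdict(int)
        for row, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(row))
            for j, value in enumerate(row):
                acc[j] += value
            counts[label] += 1
        # Per-class mean of every feature in this subspace.
        self.centroids = {c: [s / counts[c] for s in acc] for c, acc in sums.items()}
        return self

    def predict_one(self, row):
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(row, self.centroids[label]))
        return min(sorted(self.centroids), key=sq_dist)


class SubspaceEnsemble:
    def __init__(self, n_subspaces=3, seed=0):
        self.n_subspaces = n_subspaces
        self.seed = seed

    def fit(self, X, y):
        self.subspaces = split_features(len(X[0]), self.n_subspaces, self.seed)
        self.models, self.weights = [], []
        for cols in self.subspaces:
            Xs = [[row[j] for j in cols] for row in X]
            model = NearestCentroid().fit(Xs, y)
            # Vote weight = training accuracy (an assumption; a held-out
            # split would give a less optimistic estimate).
            hits = sum(model.predict_one(r) == t for r, t in zip(Xs, y))
            self.models.append(model)
            self.weights.append(hits / len(y))
        return self

    def predict(self, X):
        predictions = []
        for row in X:
            votes = defaultdict(float)
            for cols, model, weight in zip(self.subspaces, self.models, self.weights):
                votes[model.predict_one([row[j] for j in cols])] += weight
            # Weighted majority vote; ties go to the smallest label.
            predictions.append(max(sorted(votes), key=votes.get))
        return predictions
```

In this sketch each base classifier sees only its own slice of the features, so a model that performs poorly on an uninformative subspace receives a smaller weight and has less influence on the final vote.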

Source
http://dx.doi.org/10.1016/j.compbiolchem.2021.107489
