Background: A high-dimensional feature space generally degrades classification accuracy in many applications. In this paper, we propose a strategy called gene masking, in which non-contributing dimensions are heuristically removed from the data to improve classification accuracy.

Methods: Gene masking is implemented via a binary-encoded genetic algorithm that can be integrated seamlessly with classifiers during the training phase to perform feature selection. It can also be used to identify the features that contribute most to the classification, thereby allowing researchers to isolate features that may have special significance.
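
The idea in the Methods paragraph can be sketched as a short, standard-library-only example. This is a minimal illustration under stated assumptions, not the authors' implementation: the nearest-centroid classifier, the toy data, the per-feature penalty in the fitness function, and all function names and parameter values (`make_toy_data`, `masked_accuracy`, `gene_masking`, population size, mutation rate) are hypothetical choices made here for demonstration.

```python
import random


def make_toy_data(n_samples, n_features, n_informative, rng):
    """Two-class toy data: only the first n_informative features carry signal."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        X.append([rng.gauss(2.0 * label if j < n_informative else 0.0, 1.0)
                  for j in range(n_features)])
        y.append(label)
    return X, y


def masked_accuracy(mask, X_train, y_train, X_val, y_val):
    """Validation accuracy of a nearest-centroid classifier on unmasked features."""
    keep = [j for j, bit in enumerate(mask) if bit]
    if not keep:
        return 0.0  # a mask that removes every feature cannot classify
    centroids = {}
    for label in set(y_train):
        rows = [x for x, t in zip(X_train, y_train) if t == label]
        centroids[label] = [sum(r[j] for r in rows) / len(rows) for j in keep]

    def predict(x):
        return min(centroids, key=lambda c: sum(
            (x[j] - m) ** 2 for j, m in zip(keep, centroids[c])))

    return sum(predict(x) == t for x, t in zip(X_val, y_val)) / len(y_val)


def gene_masking(fitness, n_features, pop_size=20, generations=30,
                 mutation_rate=0.05, seed=0):
    """Evolve a binary mask (1 = keep feature, 0 = mask it) that maximizes fitness."""
    rng = random.Random(seed)
    # Start from the unmasked baseline plus random masks.
    population = [[1] * n_features] + [
        [rng.randint(0, 1) for _ in range(n_features)]
        for _ in range(pop_size - 1)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]        # truncation selection
        children = [ranked[0][:]]               # elitism: best mask survives unchanged
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # single-point crossover
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in a[:cut] + b[cut:]]
            children.append(child)
        population = children
    return max(population, key=fitness)


rng = random.Random(42)
X_train, y_train = make_toy_data(40, 10, 2, rng)
X_val, y_val = make_toy_data(40, 10, 2, rng)


def fitness(mask):
    # Assumed objective: validation accuracy minus a small penalty per kept feature.
    return masked_accuracy(mask, X_train, y_train, X_val, y_val) - 0.01 * sum(mask)


best_mask = gene_masking(fitness, n_features=10)
print("mask:", best_mask, "features kept:", sum(best_mask))
```

Because the initial population includes the all-ones mask and the best individual is carried over unchanged each generation, the returned mask never scores worse than the unmasked baseline under the chosen fitness. The surviving 1-bits indicate which features the search judged most useful, which is one way to read the "special significance" point above.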

Results: This technique was applied to publicly available datasets, where it substantially reduced the number of features used for classification while maintaining high accuracy.

Conclusion: The proposed technique is useful for feature selection because it heuristically removes non-contributing features to improve classifier performance.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5260793
DOI: http://dx.doi.org/10.1186/s12920-016-0233-2

Publication Analysis

Top Keywords

gene masking: 12
feature selection: 8
classification: 6
masking technique: 4
technique improve: 4
improve accuracy: 4
accuracy cancer: 4
cancer classification: 4
classification high: 4
high dimensionality: 4

Similar Publications

The established consensus sequence for human 5' splice sites masks the presence of two major splice site classes defined by preferential base-pairing potentials with either U5 snRNA loop 1 or the U6 snRNA ACAGA box. The two 5' splice site classes are separable in genome sequences, sensitized by specific genotypes and associated with splicing complexity. The two classes reflect the commitment to 5' splice site usage occurring primarily during 5' splice site transfer to U6 snRNA.

Timescale and genetic linkage explain the variable impact of defense systems on horizontal gene transfer.

Genome Res

January 2025

Centro de Astrobiología (CAB), CSIC-INTA, Institute for Biocomputation and Physics of Complex Systems (BIFI), University of Zaragoza

Prokaryotes have evolved a wide repertoire of defense systems to prevent invasion by mobile genetic elements (MGE). However, because MGE are vehicles for the exchange of beneficial accessory genes, defense systems could consequently impede rapid adaptation in microbial populations. Here, we study how defense systems impact horizontal gene transfer (HGT) in the short and long terms.

Chromosome-level haplotype-resolved genome of the tropical loach (Oreonectes platycephalus).

Sci Data

January 2025

Area of Ecology and Biodiversity, School of Biological Sciences, The University of Hong Kong, Hong Kong SAR, China.

The flat-headed loach (Oreonectes platycephalus) is a small fish inhabiting headwaters of hillstreams of southern China. Its local populations are characterized by low genetic diversity and exceptionally high differentiation, making it an ideal model for studying small population isolates' persistence and adaptive potential. However, the lack of Oreonectes reference genomes limits endeavours toward these ambitions.

Small RNA CjNC110 regulates the activated methyl cycle to enable optimal chicken colonization by Campylobacter jejuni.

mSphere

January 2025

Department of Veterinary Microbiology and Preventive Medicine, College of Veterinary Medicine, Iowa State University, Ames, Iowa, USA.

Post-transcriptional gene regulation by non-coding small RNAs (sRNAs) is critical for colonization and survival of enteric pathogens, including the zoonotic pathogen Campylobacter jejuni. In this study, we utilized IA3902 (a representative isolate of the sheep abortion clone) and W7 (a highly motile variant of NCTC 11168, a human gastroenteritis strain) to further investigate regulation by sRNA CjNC110. Both motility and autoagglutination ability were confirmed to be phenotypes of conserved regulation by CjNC110.

STMGraph: spatial-context-aware of transcriptomes via a dual-remasked dynamic graph attention model.

Brief Bioinform

November 2024

Center for Genomics and Biotechnology, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Haixia Institute of Science and Technology, Fujian Agriculture and Forestry University, No. 15 Shangxiadian Road, Cangshan District, Fuzhou 350002, China.

Spatial transcriptomics (ST) technologies enable dissecting the tissue architecture in spatial context. To perceive the global contextual information of gene expression patterns in tissue, the spatial dependence of cells must be fully considered by integrating both local and non-local features in a spatial-context-aware manner. However, current ST integration algorithms ignore ST dropouts, which impedes spatially aware learning of ST features and reduces the accuracy and robustness of microenvironmental heterogeneity detection, spatial domain clustering, and batch-effect correction.
