Background and Objective: Single Nucleotide Polymorphisms (SNPs) are becoming the marker of choice for biological analyses across a wide range of applications of great medical, biological, economic and environmental interest. Classification tasks, i.e. the assignment of individuals to groups of origin based on their (multi-locus) genotypes, are performed in many fields, such as forensic investigations and discrimination between wild and/or farmed populations. These tasks should be performed with a small number of loci, for computational as well as biological reasons. Thus, feature selection should precede classification, especially for SNP datasets, where the number of features can amount to hundreds of thousands or millions.
Methods: In this paper, we present a novel data mining approach, called FIFS - Frequent Item Feature Selection, based on the use of frequent items for selection of the most informative markers from population genomic data. It is a modular method, consisting of two main components. The first one identifies the most frequent and unique genotypes for each sampled population. The second one selects the most appropriate among them, in order to create the informative SNP subsets to be returned.
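The two components described above can be illustrated with a minimal sketch. This is not the authors' FIFS implementation: the function names, the `min_support` threshold, the `per_population` parameter, and the toy genotypes are all assumptions made for illustration. An "item" here is taken to be a (SNP index, genotype) pair; the first function keeps items that are frequent in exactly one population, and the second picks the best-supported items per population to form a SNP panel.

```python
from collections import Counter, defaultdict


def frequent_unique_genotypes(genotypes, labels, min_support=0.6):
    """Hypothetical component 1: frequent and unique genotypes per population.

    Returns {population: [(snp_index, genotype, support), ...]}, where an item
    is kept if its within-population frequency is >= min_support and it is not
    frequent (by the same threshold) in any other population.
    """
    pop_sizes = Counter(labels)
    counts = defaultdict(Counter)  # population -> counts of (snp, genotype)
    for row, pop in zip(genotypes, labels):
        for snp, genotype in enumerate(row):
            counts[pop][(snp, genotype)] += 1

    # Items that reach the support threshold inside each population.
    frequent = {}
    for pop, counter in counts.items():
        frequent[pop] = {
            item: n / pop_sizes[pop]
            for item, n in counter.items()
            if n / pop_sizes[pop] >= min_support
        }

    # Keep only items that are not frequent in any other population.
    result = {}
    for pop, items in frequent.items():
        others = set()
        for other_pop, other_items in frequent.items():
            if other_pop != pop:
                others.update(other_items)
        unique = [(snp, g, s) for (snp, g), s in items.items() if (snp, g) not in others]
        result[pop] = sorted(unique, key=lambda t: (-t[2], t[0], t[1]))
    return result


def select_snps(candidates, per_population=1):
    """Hypothetical component 2: take the top items per population as the panel."""
    panel = []
    for pop in sorted(candidates):
        for snp, _genotype, _support in candidates[pop][:per_population]:
            if snp not in panel:
                panel.append(snp)
    return panel


# Toy data (invented): 6 individuals, 2 populations, 3 SNPs.
labels = ["A", "A", "A", "B", "B", "B"]
genotypes = [
    ["AA", "CC", "GT"],
    ["AA", "CT", "GT"],
    ["AA", "CC", "GG"],
    ["GG", "CC", "GT"],
    ["GG", "CC", "TT"],
    ["AG", "CC", "GT"],
]

candidates = frequent_unique_genotypes(genotypes, labels, min_support=0.6)
panel = select_snps(candidates, per_population=1)
print(candidates)
print(panel)
```

In this toy example only the first SNP carries genotypes that are frequent in one population and not in the other, so it is the only SNP selected. A real selection step would likely use a more refined ranking than raw support.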
Results: The proposed method (FIFS) was tested on a real dataset comprising a comprehensive coverage of the pig breed types present in Britain. This dataset consisted of 446 individuals divided into 14 sub-populations, genotyped at 59,436 SNPs. Our method outperformed the state-of-the-art and baseline methods in every case. More specifically, it surpassed the 95% assignment accuracy threshold with fewer than half the SNPs required by the other methods (FIFS: 28 SNPs, Delta: 70 SNPs, Pairwise FST: 70 SNPs, In: 100 SNPs).
Conclusion: Our approach successfully deals with the problem of informative marker selection in high-dimensional genomic datasets. It offers better results than existing approaches and can aid biologists in selecting the most informative markers with maximum discrimination power, supporting the optimization of cost-effective panels for applications such as species identification, wildlife management, and forensics.
DOI: http://dx.doi.org/10.1016/j.compbiomed.2017.09.020
Expert Opin Drug Saf
January 2025
Department of Endocrinology, Guang'anmen Hospital of China Academy of Chinese Medical Sciences, Beijing, China.
Background: Fulminant type 1 diabetes mellitus (FT1DM) is a severe subtype of type 1 diabetes characterized by rapid onset, metabolic disturbances, and irreversible insulin secretion failure. Recent studies have suggested associations between FT1DM and certain medications, warranting further investigation.
Objectives: This study aims to analyze drugs associated with an increased risk of FT1DM using the Food and Drug Administration Adverse Event Reporting System (FAERS) database.
Sensors (Basel)
December 2024
School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi 214122, China.
With the rapid development of blockchain technology, fraudulent activities have significantly increased, posing a major threat to the personal assets of blockchain users. The blockchain transaction network formed during user transactions can be represented as a graph consisting of nodes and edges, making it suitable for a graph data structure. Fraudulent nodes in the transaction network are referred to as anomalous nodes.
Int J Mol Sci
January 2025
Department of Biosciences and Bioinformatics, School of Science, Xi'an Jiaotong-Liverpool University, Suzhou 215123, China.
Melatonin is a hormone released by the pineal gland that regulates the sleep-wake cycle. It has been widely studied for its therapeutic effects on Alzheimer's disease (AD), particularly through the amyloidosis, oxidative stress, and neuroinflammation pathways. Nevertheless, the mechanisms through which it exerts its neuroprotective effects in AD are still largely unknown.
Materials (Basel)
December 2024
Sustainable Mining Engineering Research Group, Department of Mining, Mechanic, Energetic and Construction Engineering, Higher Technical School of Engineering, University of Huelva, 21007 Huelva, Spain.
This article shows the behavior of the corrosive effect of acid mine water on carbon steel metal alloys. Mining equipment, composed of various steel alloys, is particularly prone to damage from highly acidic water. This corrosion results in material thinning, brittle fractures, fatigue cracks, and ultimately, equipment failure.
Diagnostics (Basel)
December 2024
Department of Family Medicine, Taichung Veterans General Hospital, Taichung 407219, Taiwan.
The prevalence of diabetes is increasing worldwide, particularly in the Pacific Ocean island nations. Although machine learning (ML) models and data mining approaches have been applied to diabetes research, no study has used ML models to predict diabetes incidence in Taiwan. We aimed to predict the onset of diabetes in order to raise health awareness, thereby promoting any necessary lifestyle modifications and helping to mitigate disease burden.