High confidence rule mining for microarray analysis.

IEEE/ACM Trans Comput Biol Bioinform

Published: January 2008

We present an association rule mining method for mining high confidence rules, which describe interesting gene relationships in microarray datasets. Microarray datasets typically contain an order of magnitude more genes than experiments, rendering many data mining methods impractical as they are optimised for sparse datasets. A new family of row-enumeration rule mining algorithms has emerged to facilitate mining in dense datasets. These algorithms use the support measure to prune infrequent relationships and so reduce the search space. This is a major shortcoming, because it prunes many potentially interesting rules with low support but high confidence. We propose a new row-enumeration rule mining method, MaxConf, to mine high confidence rules from microarray data. MaxConf is a support-free algorithm which directly uses the confidence measure to effectively prune the search space. Experiments on three microarray datasets show that MaxConf outperforms support-based rule mining with respect to scalability and rule extraction. Furthermore, detailed biological analyses demonstrate the effectiveness of our approach: the rules discovered by MaxConf are substantially more interesting and meaningful than those found by support-based methods.
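
The distinction between support and confidence drawn in the abstract can be illustrated with a minimal sketch. This is not the MaxConf algorithm, whose pruning strategy the abstract does not detail; it only computes the two standard measures on a toy dataset. The gene names, the rows, and the 0.3 support threshold are all hypothetical.

```python
from typing import FrozenSet, List

# Toy data (hypothetical): each row is one experiment, represented by the
# set of genes considered "expressed" in that experiment.
rows: List[FrozenSet[str]] = [
    frozenset({"g1", "g2", "g3"}),
    frozenset({"g1", "g2"}),
    frozenset({"g3", "g4"}),
    frozenset({"g3", "g4", "g5"}),
    frozenset({"g3", "g5"}),
    frozenset({"g3"}),
    frozenset({"g5"}),
    frozenset({"g4", "g5"}),
    frozenset({"g3", "g5"}),
    frozenset({"g4"}),
]


def support(itemset: FrozenSet[str], data: List[FrozenSet[str]]) -> float:
    """Fraction of rows that contain every item in `itemset`."""
    return sum(1 for row in data if itemset <= row) / len(data)


def confidence(
    antecedent: FrozenSet[str],
    consequent: FrozenSet[str],
    data: List[FrozenSet[str]],
) -> float:
    """Among rows containing `antecedent`, the fraction also containing `consequent`."""
    covered = [row for row in data if antecedent <= row]
    if not covered:
        return 0.0
    return sum(1 for row in covered if consequent <= row) / len(covered)


# Rule g1 -> g2: appears in only 2 of 10 rows, but always holds when g1 is present.
rare_support = support(frozenset({"g1", "g2"}), rows)
rare_confidence = confidence(frozenset({"g1"}), frozenset({"g2"}), rows)

# Rule g3 -> g5: appears in more rows, but holds only half the time g3 is present.
common_support = support(frozenset({"g3", "g5"}), rows)
common_confidence = confidence(frozenset({"g3"}), frozenset({"g5"}), rows)

MIN_SUPPORT = 0.3  # hypothetical threshold for a support-based miner
print(rare_support >= MIN_SUPPORT, rare_confidence)
print(common_support >= MIN_SUPPORT, common_confidence)
```

In this toy case a miner that prunes on a minimum support of 0.3 would discard the rule g1 -> g2 even though it holds in every row where g1 appears, while keeping the weaker rule g3 -> g5. This is the kind of low support, high confidence rule the abstract says support-based pruning loses.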

Source
http://dx.doi.org/10.1109/tcbb.2007.1050
