Text analysis of MEDLINE for discovering functional relationships among genes: evaluation of keyword extraction weighting schemes.

Int J Data Min Bioinform

Laboratory for Bioinformatics and Medical Informatics, Department of Computer Science, University of Texas at Dallas, Richardson, TX 75083-0688, USA.

Published: May 2008

One of the key challenges of microarray studies is to derive biological insights from the gene-expression patterns. Clustering genes by functional keyword association can provide direct information about the functional links among genes. However, the quality of the keyword lists significantly affects the clustering results. We compared two keyword weighting schemes: normalised z-score and term frequency-inverse document frequency (TFIDF). Two gene sets were tested to evaluate the effectiveness of the weighting schemes for keyword extraction for gene clustering. Using established measures of cluster quality, the results produced from TFIDF-weighted keywords outperformed those produced from normalised z-score weighted keywords. The optimised algorithms should be useful for partitioning genes from microarray lists into functionally discrete clusters.
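The two weighting schemes compared above can be illustrated with a minimal sketch. The abstract does not give the formulas used in the study, so this sketch assumes the textbook TF-IDF definition and, as a stand-in for the normalised z-score, a plain z-score of each term's relative frequency across genes. The gene names, tokens, and function names (`tfidf_weights`, `zscore_weights`) are hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy data: each gene maps to tokens pooled from its abstracts.
toy_docs = {
    "geneA": ["kinase", "signal", "kinase", "cell"],
    "geneB": ["kinase", "apoptosis", "cell"],
    "geneC": ["transport", "membrane", "cell"],
}

def tfidf_weights(gene_docs):
    """Textbook TF-IDF: (term count / tokens for gene) * log(N / genes containing term)."""
    n_genes = len(gene_docs)
    doc_freq = Counter()
    for tokens in gene_docs.values():
        doc_freq.update(set(tokens))
    weights = {}
    for gene, tokens in gene_docs.items():
        total = len(tokens) or 1  # guard against an empty token list
        counts = Counter(tokens)
        weights[gene] = {
            term: (count / total) * math.log(n_genes / doc_freq[term])
            for term, count in counts.items()
        }
    return weights

def zscore_weights(gene_docs):
    """Assumed stand-in: z-score of a term's relative frequency across all genes."""
    n_genes = len(gene_docs)
    rel_freq = {
        gene: {t: c / (len(tokens) or 1) for t, c in Counter(tokens).items()}
        for gene, tokens in gene_docs.items()
    }
    vocab = {t for freqs in rel_freq.values() for t in freqs}
    weights = {gene: {} for gene in gene_docs}
    for term in vocab:
        values = [rel_freq[gene].get(term, 0.0) for gene in gene_docs]
        mean = sum(values) / n_genes
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / n_genes)
        if std == 0:
            continue  # term is equally frequent everywhere; no signal
        for gene in gene_docs:
            if term in rel_freq[gene]:
                weights[gene][term] = (rel_freq[gene][term] - mean) / std
    return weights

tfidf = tfidf_weights(toy_docs)
zscores = zscore_weights(toy_docs)

# Rank each gene's keywords under both schemes.
for gene in toy_docs:
    top_tfidf = sorted(tfidf[gene], key=tfidf[gene].get, reverse=True)
    top_z = sorted(zscores[gene], key=zscores[gene].get, reverse=True)
    print(gene, "TF-IDF:", top_tfidf, "z-score:", top_z)
```

In this toy data, "cell" occurs for every gene, so its TF-IDF weight is exactly zero, while a z-score still assigns it a value whenever its relative frequency varies between genes. Keyword lists ranked this way would then feed a clustering step, which is not shown here.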


Source
http://dx.doi.org/10.1504/ijdmb.2006.009923

Publication Analysis

Top Keywords

weighting schemes: 12
keyword extraction: 8
normalised z-score: 8
keyword: 5
text analysis: 4
analysis medline: 4
medline discovering: 4
discovering functional: 4
functional relationships: 4
genes: 4
