A tree-based model for homogeneous groupings of multinomials.

Stat Med

Department of Mathematics, Myongji University, Yongin, Korea.

Published: November 2005

This paper provides a tree-based method for grouping multinomial data according to their classification probability vectors. We produce an initial tree by binary recursive partitioning, whereby the multinomials are successively split into two subsets and each split is chosen to maximize the likelihood function. If the number of multinomials k is too large, we propose to order the multinomials and then build the initial tree from a dramatically smaller number, k-1, of possible splits. The tree is then pruned from the bottom up. The pruning process involves a sequence of hypothesis tests of a single homogeneous group against the alternative that there are two distinct, internally homogeneous groups. As pruning criteria, the Bayesian information criterion and the Wilcoxon rank-sum test are proposed. The tree-based model is illustrated on genetic sequence data. Homogeneous groupings of genetic sequences present new opportunities to understand and align these sequences.
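
The split search and the BIC-based pruning decision described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the paper's implementation. It assumes the multinomials are already ordered (the abstract does not say how the ordering is obtained), so only the k-1 cut points need scanning, rather than the 2^(k-1) - 1 ways of dividing k unordered items into two non-empty subsets. It also assumes the BIC sample size is the total number of observations, and it does not cover the Wilcoxon rank-sum criterion. The function names `pooled_loglik`, `best_ordered_split`, and `bic_prefers_split` are hypothetical.

```python
import math


def pooled_loglik(rows):
    """Maximized multinomial log-likelihood when all rows share one
    probability vector. The multinomial coefficients are dropped: they
    do not depend on the grouping, so they cancel in comparisons."""
    totals = [sum(col) for col in zip(*rows)]
    n = sum(totals)
    return sum(c * math.log(c / n) for c in totals if c > 0)


def best_ordered_split(rows):
    """Scan the k-1 cut points of an already ordered list of count
    vectors and return the cut with the highest two-group likelihood."""
    best_cut, best_ll = None, -math.inf
    for cut in range(1, len(rows)):
        ll = pooled_loglik(rows[:cut]) + pooled_loglik(rows[cut:])
        if ll > best_ll:
            best_cut, best_ll = cut, ll
    return best_cut, best_ll


def bic_prefers_split(rows, cut):
    """Compare one homogeneous group against two groups using BIC.
    Assumption: sample size is the total count over all rows; each
    group contributes (number of categories - 1) free parameters."""
    n = sum(sum(r) for r in rows)
    free = len(rows[0]) - 1
    ll_one = pooled_loglik(rows)
    ll_two = pooled_loglik(rows[:cut]) + pooled_loglik(rows[cut:])
    bic_one = -2 * ll_one + free * math.log(n)
    bic_two = -2 * ll_two + 2 * free * math.log(n)
    return bic_two < bic_one


# Made-up counts for four multinomials over three categories.
rows = [[30, 5, 5], [28, 6, 6], [5, 30, 5], [6, 29, 5]]
cut, _ = best_ordered_split(rows)
keep_split = bic_prefers_split(rows, cut)
```

Applying `best_ordered_split` recursively to each resulting subset would grow the initial tree. Pruning from the bottom up would then collapse any split for which `bic_prefers_split` returns `False`.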

DOI: http://dx.doi.org/10.1002/sim.2182
