The phenotypic characteristics of a plant species are its physical properties, as cataloged by plant biologists at research centers around the world. Clustering species by their phenotypic characteristics yields diverse sets of parents that are useful in breeding programs. Hierarchical Clustering (HC) is the current standard algorithm for clustering phenotypic data.
We develop a Vector Quantized Spectral Clustering (VQSC) algorithm that combines spectral clustering (SC) with vector quantization (VQ) sampling to group plant genome sequences. The idea is to use SC for its accuracy and VQ to keep the algorithm computationally cheap, since the complexity of SC is cubic in the input size. Although the combination of SC and VQ is not new, the novelty of our work lies in constructing the crucial similarity matrix in SC and in using k-medoids in VQ, both adapted to plant genome data.
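As a rough illustration of the VQ step only, the sketch below uses k-medoids to pick a few representative sequences, so that a costlier method such as SC could then run on the representatives rather than on the full input. It is not the authors' implementation: the names `mismatch_distance` and `k_medoids` are hypothetical, and the position-wise mismatch fraction is an assumed stand-in for the genome-specific similarity described in the abstract.

```python
import random


def mismatch_distance(a, b):
    """Fraction of positions at which two equal-length sequences differ.

    Assumption: a simple stand-in, not the similarity used in the paper.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def k_medoids(items, k, distance, max_iter=100, seed=0):
    """Plain alternating k-medoids.

    Returns (medoids, labels): medoids is a sorted list of item indices,
    and labels[i] is the medoid index that item i is assigned to.
    """
    n = len(items)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(items)")
    rng = random.Random(seed)

    # Precompute all pairwise distances once.
    dist = [[distance(items[i], items[j]) for j in range(n)] for i in range(n)]

    def assign(medoids):
        # Each item goes to its nearest medoid (ties go to the first one).
        return [min(medoids, key=lambda m: dist[i][m]) for i in range(n)]

    medoids = rng.sample(range(n), k)
    for _ in range(max_iter):
        labels = assign(medoids)
        new_medoids = []
        for m in medoids:
            members = [i for i in range(n) if labels[i] == m]
            if not members:
                # Possible with duplicate items; keep the old medoid.
                new_medoids.append(m)
                continue
            # The new medoid minimizes total distance to its cluster members.
            best = min(members, key=lambda c: sum(dist[c][j] for j in members))
            new_medoids.append(best)
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids

    return sorted(medoids), assign(medoids)


# Toy usage with made-up sequences (not real genome data).
seqs = ["AAAAAAAA", "AAAAAAAT", "AAAAAATT", "GGGGGGGG", "GGGGGGGC", "GGGGGGCC"]
medoids, labels = k_medoids(seqs, k=2, distance=mismatch_distance)
representatives = [seqs[m] for m in medoids]
```

Because medoids are always actual input sequences, this works with any pairwise dissimilarity and does not require averaging sequences, which is one plausible reason to prefer k-medoids over k-means for this kind of data.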