In recent years, the analysis of orthologous genes based on phylogenetic profiles has gained popularity in bioinformatics. We propose a new method that uses independent component analysis (ICA) to extract organism groups and their hierarchy from phylogenetic profiles. The method first finds independent axes in a projected space derived from the multivariate data matrix that represents the phylogenetic profiles of a set of orthologous genes. The extracted axes are then correlated with major organism groups according to how strongly the gene scores on each axis are affiliated with specific organisms. We applied ICA to phylogenetic profiles built from the KEGG/GENES database for 2,875 orthologs in 77 organisms. Of the 18 predefined components, 9 corresponded well to the organism groups as categorized in KEGG. Furthermore, we performed a cluster analysis and obtained the hierarchy of the organism groups.
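As a rough illustration of the pipeline described above, the following sketch runs a small NumPy implementation of symmetric FastICA on a synthetic binary profile matrix and then correlates each component's gene scores with each organism's presence/absence column. It is a sketch under stated assumptions, not the authors' implementation: the data are randomly generated rather than taken from KEGG/GENES, the choice of FastICA with a tanh nonlinearity, PCA whitening as the projection step, and genes-as-rows orientation are all assumptions, and the helper names (`sym_decorrelate`, `fast_ica`, `component_organism_corr`) are hypothetical.

```python
import numpy as np

# Toy data (made up, NOT from KEGG/GENES): 300 orthologs x 6 organisms.
# The labels are KEGG-style organism codes used purely for illustration.
organisms = ["eco", "bsu", "hin", "hsa", "sce", "ath"]
rng = np.random.default_rng(0)
group_mask = np.array([
    [1, 1, 1, 0, 0, 0],   # present only in the first three organisms
    [0, 0, 0, 1, 1, 1],   # present only in the last three organisms
    [1, 1, 1, 1, 1, 1],   # present everywhere
], dtype=float)
kinds = rng.integers(0, 3, size=300)
X = group_mask[kinds]
flip = (rng.random(X.shape) < 0.1).astype(float)
X = np.abs(X - flip)      # flip about 10% of entries as noise


def sym_decorrelate(W):
    """Return (W W^T)^(-1/2) W so that the rows of W become orthonormal."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    vals = np.maximum(vals, 1e-12)
    return vecs @ np.diag(1.0 / np.sqrt(vals)) @ vecs.T @ W


def fast_ica(X, k, n_iter=200, seed=0):
    """Symmetric FastICA (tanh nonlinearity) on genes-as-rows data.

    Returns S with shape (n_genes, k): each gene's score on each axis.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)
    # PCA projection and whitening via SVD: Z.T @ Z / n is the identity.
    U, _, _ = np.linalg.svd(Xc, full_matrices=False)
    Z = U[:, :k] * np.sqrt(n)
    W = sym_decorrelate(np.random.default_rng(seed).standard_normal((k, k)))
    for _ in range(n_iter):
        Y = Z @ W.T
        G = np.tanh(Y)
        G_prime = 1.0 - G ** 2
        # Fixed-point update: w <- E[z g(w.z)] - E[g'(w.z)] w, per row of W.
        W_new = (G.T @ Z) / n - G_prime.mean(axis=0)[:, None] * W
        W = sym_decorrelate(W_new)
    return Z @ W.T


def component_organism_corr(S, X):
    """Pearson correlation between each component and each organism column."""
    Ss = (S - S.mean(axis=0)) / S.std(axis=0)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    return Ss.T @ Xs / X.shape[0]


S = fast_ica(X, k=2)
corr = component_organism_corr(S, X)
for i, row in enumerate(corr):
    print(f"component {i}:", dict(zip(organisms, np.round(row, 2))))
```

Organisms whose columns correlate strongly, in absolute value, with the same component can be read as candidates for one group. The sign and order of ICA components are arbitrary, so only the pattern of magnitudes is meaningful.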