Correlating multiple SNPs and multiple disease phenotypes: penalized non-linear canonical correlation analysis.

Bioinformatics

Department of Clinical Epidemiology, Biostatistics and Bioinformatics, Academic Medical Center, University of Amsterdam, Meibergdreef 9, 1100 DD Amsterdam, The Netherlands.

Published: November 2009

Article Abstract

Motivation: Canonical correlation analysis (CCA) can be used to capture the underlying genetic background of a complex disease by associating two datasets that contain patients' phenotypic and genetic information. Genetic information is often measured on a qualitative scale, so ordinary CCA cannot be applied to such data. Moreover, the datasets in genetic studies can be enormous, which makes the results difficult to interpret.

Results: We developed a penalized non-linear CCA approach that can deal with qualitative data by transforming each qualitative variable into a continuous variable through optimal scaling. Additionally, sparse results were obtained by adapting soft-thresholding to this non-linear version of the CCA. By means of simulation studies, we show that our method is capable of extracting relevant variables out of high-dimensional sets. We applied our method to a genetic dataset containing 144 patients with glial cancer.
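The two ingredients named in the Results, optimal scaling and soft-thresholding, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `soft_threshold` and `optimal_scale`, the penalty value, and the toy data are all hypothetical, and the category-mean quantification shown is only one common form of optimal scaling, assumed here for illustration.

```python
import numpy as np

def soft_threshold(w, lam):
    """Shrink each weight toward zero by lam; weights with |w| <= lam become 0."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)

def optimal_scale(categories, target):
    """Quantify a categorical variable against a continuous target.

    Assumption: each category is replaced by the mean of `target` within
    that category, and the result is standardized to mean 0, variance 1.
    """
    categories = np.asarray(categories)
    target = np.asarray(target, dtype=float)
    scaled = np.empty_like(target)
    for c in np.unique(categories):
        mask = categories == c
        scaled[mask] = target[mask].mean()
    scaled -= scaled.mean()
    sd = scaled.std()
    return scaled / sd if sd > 0 else scaled

# Hypothetical canonical weights; a penalty of 0.1 zeroes out the small one.
weights = np.array([0.8, -0.05, 0.3, -0.6])
sparse_weights = soft_threshold(weights, 0.1)

# Hypothetical SNP genotypes (0, 1, 2) and a toy canonical variate.
snp = np.array([0, 0, 1, 1, 2, 2])
variate = np.array([1.0, 3.0, 2.0, 4.0, 7.0, 9.0])
snp_scaled = optimal_scale(snp, variate)
```

In a penalized non-linear CCA of the kind described, steps like these would typically alternate inside an iterative loop: the qualitative variables are re-quantified against the current canonical variate, and the weight vectors are thresholded so that only a subset of variables keeps a non-zero weight.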

Contact: s.waaijenborg@amc.uva.nl.

Source
http://dx.doi.org/10.1093/bioinformatics/btp491
