Hereditary predisposition and causative environmental exposures have long been recognized in human malignancies. In most instances, cancer cases occur sporadically, suggesting that environmental influences are critical in determining cancer risk. To test the influence of genetic polymorphisms on breast cancer risk, we genotyped 98 single nucleotide polymorphisms (SNPs) distributed over 45 genes of potential relevance to breast cancer etiology in 174 patients and compared them with matched normal controls. Using machine learning techniques, namely support vector machines (SVMs), decision trees, and naïve Bayes, we identified a subset of three SNPs as key discriminators between breast cancer patients and controls. The SVM performed best among the predictive models, achieving 69% predictive power in distinguishing between the two groups, compared with a 50% baseline obtained from the same data after repeated random permutation of the class labels (cancer or control). However, the simpler naïve Bayes and decision tree models performed quite similarly to the SVM. The three SNP sites most useful in this model were (a) the +4536T/C site of the aldosterone synthase gene CYP11B2 at amino acid residue 386 Val/Ala (T/C) (rs4541); (b) the +4328C/G site of the aryl hydrocarbon hydroxylase CYP1B1 at amino acid residue 293 Leu/Val (C/G) (rs5292); and (c) the +4449C/T site of the transcription factor BCL6 at amino acid residue 387 Asp/Asp (rs1056932). No single SNP site on its own achieved more than 60% predictive accuracy. We have shown that multiple SNP sites from different genes in distant parts of the genome identify breast cancer patients better than any one SNP alone. As high-throughput SNP technology improves and as more SNPs are identified, it is likely that much higher predictive accuracy will be achieved and a useful clinical tool developed.
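
The comparison against a label-permutation baseline can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes genotypes coded as 0, 1, or 2 copies of the minor allele at three SNPs, uses a small hand-written categorical naïve Bayes classifier, and scores it with 5-fold cross-validation (the abstract does not state the validation scheme, so that choice is an assumption). The data are synthetic, and the allele frequencies and all function names are hypothetical.

```python
import math
import random
from collections import Counter


def make_synthetic(n_per_class=174, seed=42):
    """Simulate genotypes (0/1/2 minor-allele copies) at three SNPs.

    The allele frequencies are invented for illustration only; they are
    NOT taken from the study.
    """
    rng = random.Random(seed)
    freqs = {0: [0.25, 0.45, 0.20],   # hypothetical controls
             1: [0.45, 0.25, 0.40]}   # hypothetical cases
    X, y = [], []
    for label, mafs in freqs.items():
        for _ in range(n_per_class):
            X.append([sum(rng.random() < f for _ in range(2)) for f in mafs])
            y.append(label)
    return X, y


def train_nb(X, y, n_levels=3, alpha=1.0):
    """Fit a categorical naive Bayes model with Laplace smoothing."""
    classes = sorted(set(y))
    n_class = Counter(y)
    # counts[c][j][g]: samples of class c with genotype g at SNP j
    counts = {c: [Counter() for _ in X[0]] for c in classes}
    for row, label in zip(X, y):
        for j, g in enumerate(row):
            counts[label][j][g] += 1
    log_prior = {c: math.log(n_class[c] / len(y)) for c in classes}
    log_like = {
        c: [{g: math.log((counts[c][j][g] + alpha)
                         / (n_class[c] + alpha * n_levels))
             for g in range(n_levels)}
            for j in range(len(X[0]))]
        for c in classes
    }
    return log_prior, log_like


def predict_nb(model, row):
    """Return the class with the highest posterior log-score."""
    log_prior, log_like = model
    scores = {c: log_prior[c] + sum(log_like[c][j][g]
                                    for j, g in enumerate(row))
              for c in log_prior}
    return max(scores, key=scores.get)


def cv_accuracy(X, y, k=5, seed=0):
    """k-fold cross-validated accuracy of the naive Bayes classifier."""
    idx = list(range(len(y)))
    random.Random(seed).shuffle(idx)
    correct = 0
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        model = train_nb([X[i] for i in train], [y[i] for i in train])
        correct += sum(predict_nb(model, X[i]) == y[i] for i in fold)
    return correct / len(y)


def permutation_baseline(X, y, n_perm=50, seed=1):
    """Mean accuracy after repeatedly shuffling the class labels."""
    rng = random.Random(seed)
    accs = []
    for _ in range(n_perm):
        shuffled = y[:]
        rng.shuffle(shuffled)   # break any genotype-label association
        accs.append(cv_accuracy(X, shuffled))
    return sum(accs) / len(accs)


X, y = make_synthetic()
observed = cv_accuracy(X, y)
baseline = permutation_baseline(X, y)
print(f"observed accuracy: {observed:.2f}, permuted-label baseline: {baseline:.2f}")
```

With balanced classes, shuffling the labels removes any real genotype-to-phenotype signal, so the permuted accuracy should sit near 50%. Accuracy on the true labels is only meaningful to the extent that it exceeds that baseline.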

Source
http://dx.doi.org/10.1158/1078-0432.ccr-1115-03
