Proc Natl Acad Sci U S A
December 2016
We propose approaching prediction from a framework grounded in the theoretical correct prediction rate of a variable set as a parameter of interest. This framework allows us to define a measure of predictivity with which variable sets can be assessed for (preferably high) predictivity. We first define the prediction rate for a variable set and consider, and ultimately reject, the naive estimator, a statistic based on the observed sample data, because of its inflated bias for moderate sample sizes and its sensitivity to noisy, useless variables.
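The sensitivity of a plug-in estimate to noise variables can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the authors' definition or code: it assumes equal numbers of cases and controls, binary variables, and a plug-in estimate equal to half the sum, over observed cells, of the larger of the case and control cell frequencies. The function names `naive_prediction_rate` and `simulate_noise` are hypothetical. Cases and controls are drawn from the same distribution, so the true correct prediction rate is 0.5.

```python
import random
from collections import Counter


def naive_prediction_rate(cases, controls):
    """Plug-in estimate (assumed form): 0.5 * sum over cells of
    max(case cell frequency, control cell frequency)."""
    case_counts = Counter(cases)
    control_counts = Counter(controls)
    cells = set(case_counts) | set(control_counts)
    return 0.5 * sum(
        max(case_counts[c] / len(cases), control_counts[c] / len(controls))
        for c in cells
    )


def simulate_noise(num_vars, n_per_group, rng):
    """Draw cases and controls from the SAME distribution (pure noise),
    then return the plug-in estimate computed on that sample."""
    def draw():
        return [
            tuple(rng.randint(0, 1) for _ in range(num_vars))
            for _ in range(n_per_group)
        ]
    return naive_prediction_rate(draw(), draw())


rng = random.Random(0)  # fixed seed so the run is reproducible
estimates = {m: simulate_noise(m, 100, rng) for m in (1, 4, 8)}
for m, est in estimates.items():
    print(f"{m} noise variables: naive estimate = {est:.3f} (true rate 0.5)")
```

Under these assumptions the estimate can never fall below 0.5, and it tends to climb as more uninformative variables are added, because sparse cells are almost always dominated by one group purely by chance.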
Proc Natl Acad Sci U S A
November 2015
Thus far, genome-wide association studies (GWAS) have been disappointing: investigators have been unable to use the statistically significant variants identified in complex diseases to make predictions useful for personalized medicine. Why are significant variables not leading to good prediction of outcomes? We point out that this problem is prevalent in simple as well as complex data, in the sciences as well as the social sciences. We offer a brief explanation and some statistical insights on why higher significance cannot automatically imply stronger predictivity and illustrate through simulations and a real breast cancer example.
Proc Natl Acad Sci U S A
August 2008
Analysis of a subset of case-control sporadic breast cancer data from the National Cancer Institute's Cancer Genetic Markers of Susceptibility (CGEMS) initiative, focusing on 18 breast cancer-related genes with 304 SNPs, indicates that there are many interesting interactions that form two- and three-way networks in which BRCA1 plays a dominant and central role. The apparent interactions of BRCA1 with many other genes suggest the conjecture that BRCA1 serves as a protective gene and that some mutations in it or in related genes may prevent it from carrying out this protective function even if the patients are not carriers of known cancer-predisposing BRCA1 mutations. The method of analysis features the evaluation of the effect of a gene by averaging the effects of the SNPs covered by that gene.
We have revisited the acoustic evidence in the Kennedy assassination--recordings of the two Dallas police radio channels upon which our original NRC report (Ramsey NF et al., Report of the Committee on Ballistic Acoustics. National Research Council (US).
An improved understanding of cellular responses during normal anterior cruciate ligament (ACL) function or repair is essential for clinical assessments, understanding ligament biology, and the implementation of tissue engineering strategies. The present study utilized quantitative real-time RT-PCR combined with univariate and multivariate statistical analyses to establish a quantitative database of marker transcript expression that can provide a "blueprint" of ACL wound healing. Selected markers (collagen types I and III, biglycan, decorin, MMP-1, MMP-2, MMP-9, and TIMP-1) were assessed from 33 torn ACLs harvested during reconstructive surgery.