Modern large-scale genetic association studies generate increasingly high-dimensional datasets. A variable selection procedure should therefore be applied before traditional data analysis methods, both for computational efficiency and to limit overfitting. We describe here a "wrapper" strategy (SIZEFIT) for variable selection that couples a Random Forests classifier with various local search/optimization algorithms.
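To make the wrapper idea concrete, here is a minimal sketch of one possible local search: first-improvement hill climbing over feature subsets. It is not the SIZEFIT implementation, and the abstract does not say which search algorithms were used. The function `hill_climb_select` and the scorer `toy_score` are hypothetical. In a real wrapper the scorer would be something like the cross-validated accuracy of a Random Forests classifier trained on the candidate subset; a dependency-free stand-in is used here so the sketch runs on its own.

```python
import random
from typing import Callable, FrozenSet, Tuple


def hill_climb_select(
    n_features: int,
    score: Callable[[FrozenSet[int]], float],
    max_iters: int = 200,
    seed: int = 0,
) -> Tuple[FrozenSet[int], float]:
    """First-improvement hill climbing over feature subsets (hypothetical helper)."""
    rng = random.Random(seed)
    # Start from a random subset of feature indices.
    current = frozenset(i for i in range(n_features) if rng.random() < 0.5)
    current_score = score(current)
    for _ in range(max_iters):
        improved = False
        order = list(range(n_features))
        rng.shuffle(order)
        for i in order:
            # Neighbour: flip feature i in or out of the subset.
            candidate = current ^ {i}
            candidate_score = score(candidate)
            if candidate_score > current_score:
                current, current_score = candidate, candidate_score
                improved = True
                break
        if not improved:
            break  # local optimum: no single flip improves the score
    return current, current_score


# Stand-in scorer (assumption): features 0 and 3 are "informative",
# and every selected feature carries a small size penalty.
INFORMATIVE = {0, 3}


def toy_score(subset: FrozenSet[int]) -> float:
    return len(subset & INFORMATIVE) - 0.1 * len(subset)


selected, best = hill_climb_select(n_features=6, score=toy_score)
print(sorted(selected), round(best, 2))
```

Because the classifier is hidden behind the `score` callable, the search code stays the same whichever classifier or search variant is plugged in.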
Objective: In isolated populations, 'background' linkage disequilibrium (LD) has been shown to extend over large genetic distances. This, together with the reduced environmental and genetic heterogeneity of such populations, has stimulated interest in their potential for association mapping. We compared LD unit map distances with pair-wise measurements of LD in a dense single nucleotide polymorphism (SNP) set.
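As an illustration of what a pair-wise LD measurement involves, the sketch below computes the standard statistics D, |D'| and r² for two biallelic SNPs from phased haplotype counts. The abstract does not state which pair-wise measures were used, so the choice of these statistics is an assumption. The function name `pairwise_ld` and the example counts are hypothetical.

```python
from typing import Tuple


def pairwise_ld(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> Tuple[float, float, float]:
    """Return (D, |D'|, r^2) for two biallelic SNPs from haplotype counts."""
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("no haplotypes supplied")
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total  # frequency of allele A at SNP 1
    p_B = (n_AB + n_aB) / total  # frequency of allele B at SNP 2
    if p_A in (0.0, 1.0) or p_B in (0.0, 1.0):
        raise ValueError("both SNPs must be polymorphic")
    # D: deviation of the observed haplotype frequency from independence.
    d = p_AB - p_A * p_B
    # D' rescales D by its maximum attainable magnitude given the allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max
    # r^2: squared correlation between the alleles at the two SNPs.
    r2 = d * d / (p_A * (1 - p_A) * p_B * (1 - p_B))
    return d, d_prime, r2


# Hypothetical haplotype counts, for illustration only.
d, d_prime, r2 = pairwise_ld(40, 10, 10, 40)
print(round(d, 3), round(d_prime, 3), round(r2, 3))
```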
Expression of prolactin and of prolactin and estrogen receptors in lymphocytes, bone marrow, and lymphoma cell lines suggests that hormonal modulation may influence lymphoma risk. Prolactin and estrogen promote the proliferation and survival of B cells, factors that may increase non-Hodgkin lymphoma risk, and effects of estrogen may be modified by catechol-O-methyltransferase (COMT), an enzyme that alters estrogenic activity. Cytochrome P450 17A1 (CYP17A1), a key enzyme in estrogen biosynthesis, has been associated with increased cancer risk and may affect lymphoma susceptibility.
Admixture mapping (also known as "mapping by admixture linkage disequilibrium," or MALD) provides a way of localizing disease-causing genes in admixed ethnic groups such as African Americans, with approximately 100 times fewer markers than are required for whole-genome haplotype scans. However, it has not been possible to perform powerful scans with admixture mapping because the method requires a dense map of validated markers known to have large frequency differences between Europeans and Africans. To create such a map, we screened through databases containing approximately 450,000 single-nucleotide polymorphisms (SNPs) for which frequencies had been estimated in African and European population samples.
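The screening step can be pictured as a filter on the absolute allele-frequency difference between the two population samples. The sketch below is a simplified illustration, not the authors' pipeline: the function `screen_markers`, the SNP identifiers, the frequencies, and the cutoff `min_delta` are all hypothetical, and the abstract does not state the criteria actually used.

```python
from typing import Dict, List, Tuple


def screen_markers(
    freqs: Dict[str, Tuple[float, float]],
    min_delta: float,
) -> List[Tuple[str, float]]:
    """Keep SNPs whose allele-frequency difference is at least min_delta.

    freqs maps a SNP identifier to (frequency in the African sample,
    frequency in the European sample) for the same reference allele.
    """
    kept = []
    for snp, (f_african, f_european) in freqs.items():
        delta = abs(f_african - f_european)
        if delta >= min_delta:
            kept.append((snp, delta))
    # Most informative markers (largest frequency difference) first.
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept


# Made-up frequencies, for illustration only.
example_freqs = {
    "snp_a": (0.90, 0.10),
    "snp_b": (0.55, 0.50),
    "snp_c": (0.20, 0.75),
}
ranked = screen_markers(example_freqs, min_delta=0.5)
print([snp for snp, _ in ranked])
```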
The laboratory rat (Rattus norvegicus) is an indispensable tool in experimental medicine and drug development, having made inestimable contributions to human health. We report here the genome sequence of the Brown Norway (BN) rat strain. The sequence represents a high-quality 'draft' covering over 90% of the genome.
The entire 2.9-billion-letter sequence (nucleotide base pairs) of the human genome is available as a resource for scientific discovery. Some of the findings from the completion of the human genome were expected, confirming knowledge anticipated by many years of research and analysis in both human and comparative genetics.