Background: Previous studies identifying patients with inflammatory bowel disease using administrative codes have yielded inconsistent results. Our objective was to develop a robust electronic medical record-based model for classification of inflammatory bowel disease leveraging the combination of codified data and information from clinical text notes using natural language processing.
Methods: Using the electronic medical records of 2 large academic centers, we created data marts for Crohn's disease (CD) and ulcerative colitis (UC) comprising patients with ≥1 International Classification of Diseases, 9th edition, code for each disease.
The importance of developing personalized risk prediction estimates has become increasingly evident in recent years. In general, patient populations may be heterogeneous and represent a mixture of different unknown subtypes of disease. When the source of this heterogeneity and resulting subtypes of disease are unknown, accurate prediction of survival may be difficult.
Objectives: Psychiatric comorbidity is common in Crohn's disease (CD) and ulcerative colitis (UC). Inflammatory bowel disease (IBD)-related surgery or hospitalizations represent major events in the natural history of the disease. The objective of this study is to examine whether there is a difference in the risk of psychiatric comorbidity following surgery in CD and UC.
In recent years, a wide range of markers have become available as potential tools to predict risk or progression of disease. In addition to such biological and genetic markers, short-term outcome information may be useful in predicting long-term disease outcomes. When such information is available, it would be desirable to combine it with predictive markers to improve the prediction of long-term survival.
Lifetime Data Anal
April 2013
In many clinical applications, understanding when measurement of new markers is necessary to provide added accuracy to existing prediction tools could lead to more cost effective disease management. Many statistical tools for evaluating the incremental value (IncV) of the novel markers over the routine clinical risk factors have been developed in recent years. However, most existing literature focuses primarily on global assessment.
Developing individualized prediction rules for disease risk and prognosis has played a key role in modern medicine. When new genomic or biological markers become available to assist in risk prediction, it is essential to assess the improvement in clinical usefulness of the new markers over existing routine variables. Net reclassification improvement (NRI) has been proposed to assess improvement in risk reclassification in the context of comparing two risk models and the concept has been quickly adopted in medical journals (Pencina et al.
Objective: The significance of non-rheumatoid arthritis (RA) autoantibodies in patients with RA is unclear. The aim of this study was to assess associations of autoantibodies with autoimmune risk alleles and with clinical diagnoses from the electronic medical records (EMRs) among RA cases and non-RA controls.
Methods: Data on 1,290 RA cases and 1,236 non-RA controls of European genetic ancestry were obtained from the EMRs of 2 large academic centers.
Identification of novel biomarkers for risk assessment is important for both effective disease prevention and optimal treatment recommendation. Discovery relies on the precious yet limited resource of stored biological samples from large prospective cohort studies. Case-cohort sampling design provides a cost-effective tool in the context of biomarker evaluation, especially when the clinical condition of interest is rare.
Risk prediction procedures can be quite useful for the patient's treatment selection, prevention strategy, or disease management in evidence-based medicine. Often, potentially important new predictors are available in addition to the conventional markers. The question is how to quantify the improvement from the new markers for prediction of the patient's risk in order to aid cost-benefit decisions.
In recent years, genome-wide association studies (GWAS) and gene-expression profiling have generated a large number of valuable datasets for assessing how genetic variations are related to disease outcomes. With such datasets, it is often of interest to assess the overall effect of a set of genetic markers, assembled based on biological knowledge. Genetic marker-set analyses have been advocated as more reliable and powerful approaches compared with the traditional marginal approaches (Curtis and others, 2005.
Objective: The purpose of this article is to retrospectively compare right ventricular-to-left ventricular (RV/LV) diameter ratios measured on the standard axial view versus the reformatted four-chamber view as predictors of mortality after acute pulmonary embolism (PE).
Materials And Methods: Six hundred seventy-four consecutive patients (mean age, 58 years; 372 women) with a diagnosis of acute PE on pulmonary CT angiography were considered. The axial and reformatted four-chamber RV/LV diameter ratios were compared as predictors of 30-day all-cause and PE-related mortality.
In this paper we develop model checking techniques for assessing functional form specifications of covariates in censored linear regression models. These procedures are based on a censored data analog to taking cumulative sums of "robust" residuals over the space of the covariate under investigation. These cumulative sums are formed by integrating certain Kaplan-Meier estimators and may be viewed as "robust" censored data analogs to the processes considered by Lin, Wei & Ying (2002).
Objectives: Electronic health records (EHR) can allow for the generation of large cohorts of individuals with given diseases for clinical and genomic research. A rate-limiting step is the development of electronic phenotype selection algorithms to find such cohorts. This study evaluated the portability of a published phenotype algorithm to identify rheumatoid arthritis (RA) patients from EHR records at three institutions with different EHR systems.
To estimate an overall treatment difference with data from a randomized comparative clinical study, baseline covariates are often utilized to increase the estimation precision. Using the standard analysis of covariance technique for making inferences about such an average treatment difference may not be appropriate, especially when the fitted model is nonlinear. On the other hand, the novel augmentation procedure recently studied, for example, by Zhang and others (2008.
Objectives: The aim of this study was to evaluate whether coronary heart disease (CHD)-susceptibility loci identified by genome-wide association studies of the general population also contribute to CHD in type 2 diabetes.
Background: No study has examined the effects of these genetic variants on CHD in diabetic patients.
Methods: We genotyped 15 genetic markers of 12 loci in 3 studies of diabetic patients: the prospective Nurses' Health Study (309 CHD cases, and 544 control subjects) and Health Professional Follow-up Study (345 CHD cases, and 451 control subjects) and the cross-sectional Joslin Heart Study (422 CHD cases, and 435 control subjects).
To develop more targeted intervention strategies, an important research goal is to identify markers predictive of clinical events. A crucial step toward this goal is to characterize the clinical performance of a marker for predicting different types of events. In this article, we present statistical methods for evaluating the performance of a prognostic marker in predicting multiple competing events.
Biostatistics
January 2012
The nested case-control (NCC) design is used frequently in epidemiological studies as a cost-effective subcohort sampling strategy to conduct biomarker research. This sampling strategy, on the other hand, creates challenges for data analysis because of outcome-dependent missingness in biomarker measurements. In this paper, we propose inverse probability weighted (IPW) methods for making inference about the prognostic accuracy of a novel biomarker for predicting future events with data from NCC studies.
In this article, we develop a powerful test for identifying single nucleotide polymorphism (SNP)-sets that are predictive of survival with data from genome-wide association studies. We first group typed SNPs into SNP-sets based on genomic features and then apply a score test to assess the overall effect of each SNP-set on the survival outcome through a kernel machine Cox regression framework. This approach uses genetic information from all SNPs in the SNP-set simultaneously and accounts for linkage disequilibrium (LD), leading to a powerful test with reduced degrees of freedom when the typed SNPs are in LD with each other.
Sequencing studies are increasingly being conducted to identify rare variants associated with complex traits. The limited power of classical single-marker association analysis for rare variants poses a central challenge in such studies. We propose the sequence kernel association test (SKAT), a supervised, flexible, computationally efficient regression method to test for association between genetic variants (common and rare) in a region and a continuous or dichotomous trait while easily adjusting for covariates.
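As a rough illustration of the kind of region-level score statistic a kernel association test builds on, the sketch below computes a weighted linear-kernel statistic Q = sum_j w_j (G_j' r)^2 for a continuous trait. It is a simplification and not the published procedure: the null model here is intercept-only (no covariate adjustment), the per-variant weights are supplied by the caller, no p-value is computed, and the function name `kernel_score_statistic` is hypothetical.

```python
def kernel_score_statistic(y, genotypes, weights=None):
    """Weighted linear-kernel score statistic for one region.

    y         : list of trait values, one per subject
    genotypes : list of rows, one per subject; each row holds allele
                counts (0, 1, 2) for every variant in the region
    weights   : optional per-variant weights w_j (assumed nonnegative);
                defaults to 1 for every variant

    Assumption: intercept-only null model, so the residuals are simply
    y minus its sample mean. Covariate adjustment is left out.
    """
    n = len(y)
    if n == 0 or len(genotypes) != n:
        raise ValueError("y and genotypes must be non-empty and aligned")
    n_variants = len(genotypes[0])
    if weights is None:
        weights = [1.0] * n_variants

    # Residuals under the intercept-only null model.
    mean_y = sum(y) / n
    residuals = [value - mean_y for value in y]

    q = 0.0
    for j in range(n_variants):
        # Score for variant j: inner product of its genotype column
        # with the null-model residuals.
        score_j = sum(genotypes[i][j] * residuals[i] for i in range(n))
        q += weights[j] * score_j ** 2
    return q
```

Larger values of Q suggest that the variants in the region jointly track the trait; turning Q into a p-value requires the null distribution of the statistic, which this sketch does not attempt.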
Quantitative procedures for evaluating added values from new markers over a conventional risk scoring system for predicting event rates at specific time points have been extensively studied. However, a single summary statistic, for example, the area under the receiver operating characteristic curve or its derivatives, may not provide a clear picture about the relationship between the conventional and the new risk scoring systems. When there are no censored event time observations in the data, two simple scatterplots with individual conventional and new scores for "cases" and "controls" provide valuable information regarding the overall and the subject-specific level incremental values from the new markers.
For modern evidence-based medicine, a well thought-out risk scoring system for predicting the occurrence of a clinical event plays an important role in selecting prevention and treatment strategies. Such an index system is often established based on the subject's 'baseline' genetic or clinical markers via a working parametric or semi-parametric model. To evaluate the adequacy of such a system, C-statistics are routinely used in the medical literature to quantify the capacity of the estimated risk score in discriminating among subjects with different event times.
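To make the idea of a C-statistic concrete, the following is a minimal sketch of the widely used Harrell-type concordance index for right-censored data. It is shown only as background and is not necessarily the estimator studied in the article above; the function name `concordance_index` and the tie-handling rule (ties in score count as one half) are illustrative assumptions.

```python
def concordance_index(times, events, scores):
    """Harrell-type concordance for right-censored survival data.

    times  : observed follow-up times
    events : 1 if the event was observed, 0 if censored
    scores : risk scores, where a higher score means higher predicted risk

    A pair (i, j) is usable when subject i had an observed event strictly
    before subject j's follow-up time. The pair is concordant when the
    subject with the earlier event also has the higher risk score.
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot anchor a usable pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5  # assumption: tied scores count half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable
```

A value near 1 means the score orders subjects almost perfectly by event time, and a value near 0.5 means it does no better than chance. Because usable pairs depend on which subjects are censored, this version is sensitive to the censoring pattern.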
Biostatistics
October 2011
Accurate risk prediction is an important step in developing optimal strategies for disease prevention and treatment. Based on the predicted risks, patients can be stratified to different risk categories where each category corresponds to a particular clinical intervention. Incorrect or suboptimal interventions are likely to result in unnecessary financial and medical consequences.
In disease screening and prognosis studies, an important task is to determine useful markers for identifying high-risk subgroups. Once such markers are established, they can be incorporated into public health practice to provide appropriate strategies for treatment or disease monitoring based on each individual's predicted risk. In recent years, genetic and biological markers have been examined extensively for their potential to signal progression or risk of disease.
There is growing evidence that genomic and proteomic research holds great potential for changing irrevocably the practice of medicine. The ability to identify important genomic and biological markers for risk assessment can have a great impact in public health from disease prevention, to detection, to treatment selection. However, the potentially large number of markers and the complexity in the relationship between the markers and the outcome of interest impose a grand challenge in developing accurate risk prediction models.