Identifying and categorizing spurious weight data in electronic medical records.

Am J Clin Nutr

Geriatric Research, Education, and Clinical Center and Research and Development, Puget Sound VA Medical Center, Seattle, WA.

Published: March 2018

Background: Spurious weights compromise the validity of summary measures, such as averages and trends. Even rare errors in weight records can undermine the utility of electronic medical record (EMR) data.

Objective: We sought to estimate the prevalence of spurious weight values in a large EMR, to ascertain the likely causes, and to develop and test straightforward algorithms for identifying spurious weight data.

Design: Using EMR data from 10,000 randomly selected patients aged ≥65 y in the VA system, we examined the percentage of weight change across various time intervals, from 1 to 3000 d. We examined descriptive results and developed 3 algorithms to categorize degree of weight change over time. On the basis of distributions, we identified cases that were most likely spurious. We manually reviewed these and categorized the type of error.
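
The abstract does not give the 3 algorithms or their cutoffs, so the following is only a minimal sketch of the general idea: compute percentage weight change between consecutive measurements and flag a value that departs sharply from both neighbors while the neighbors agree with each other. The function names, the `(date, weight)` record layout, and the 10% threshold are all illustrative assumptions, not the authors' method.

```python
from datetime import date


def pairwise_changes(records):
    """Return (days_apart, percent_change) for each consecutive pair.

    `records` is a list of (date, weight) tuples sorted by date.
    This layout is a hypothetical choice for the sketch.
    """
    changes = []
    for (d0, w0), (d1, w1) in zip(records, records[1:]):
        changes.append(((d1 - d0).days, 100.0 * (w1 - w0) / w0))
    return changes


def flag_isolated_outliers(records, threshold_pct=10.0):
    """Return indices of weights that look spurious.

    A weight is flagged when it differs from BOTH the previous and the
    next weight by more than `threshold_pct`, while those two neighbors
    are within `threshold_pct` of each other. The 10% default is an
    assumed value, not one taken from the paper.
    """
    flagged = []
    for i in range(1, len(records) - 1):
        prev_w = records[i - 1][1]
        w = records[i][1]
        next_w = records[i + 1][1]
        from_prev = 100.0 * (w - prev_w) / prev_w
        to_next = 100.0 * (next_w - w) / w
        neighbors = 100.0 * (next_w - prev_w) / prev_w
        if (abs(from_prev) > threshold_pct
                and abs(to_next) > threshold_pct
                and abs(neighbors) <= threshold_pct):
            flagged.append(i)
    return flagged


# Made-up example series in which the third value is a likely misentry.
example = [
    (date(2015, 1, 1), 178.0),
    (date(2015, 1, 31), 179.0),
    (date(2015, 3, 2), 148.0),
    (date(2015, 4, 1), 177.0),
    (date(2015, 5, 1), 176.0),
]
suspect_indices = flag_isolated_outliers(example)
```

Requiring agreement between the neighboring weights is one way to separate an isolated entry error from a real, sustained weight change. A real implementation would also need to account for the time between measurements, which the study examined over intervals of 1 to 3000 d.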

Results: The data followed the expected distributions. The algorithms reliably identified spurious weights. Approximately 0.8% of all weights in the record appeared to be spurious, and ∼1 in 5 patient charts included ≥1 spurious weight value. The most common type of error involved the misentry of a single digit (e.g., 148 for 178).
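
The single-digit misentry pattern can be illustrated with a small check. In the study, error types were assigned by manual review, so this helper (a hypothetical name, `differs_by_one_digit`) is only a sketch of how such a pattern might be screened for automatically, assuming whole-number weights and a plausible reference value such as a neighboring measurement.

```python
def differs_by_one_digit(recorded, reference):
    """True if two whole-number weights have the same number of digits
    and differ in exactly one digit position (e.g., 148 vs. 178)."""
    a, b = str(int(recorded)), str(int(reference))
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) == 1


# The example from the abstract: 148 entered in place of 178.
is_single_digit_error = differs_by_one_digit(148, 178)
```

A match indicates only that the recorded value is consistent with a one-digit keying error. It does not confirm that such an error occurred.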

Conclusions: Spurious weights are common in EMRs. Straightforward algorithms can identify and remove them, and thus enhance the reliability of EMR data.

Source
http://dx.doi.org/10.1093/ajcn/nqx056
