Objectives: Automatic de-identification to remove protected health information (PHI) from clinical text can use a "binary" model that replaces redacted text with a generic tag (e.g., "
Clinical narratives (the text notes found in patients' medical records) are important information sources for secondary use in research. However, to protect patient privacy, they must be de-identified prior to use. Manual de-identification is considered the gold-standard approach but is tedious, expensive, slow, and impractical for large-scale clinical data.
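The "binary" replacement strategy mentioned above can be illustrated with a minimal sketch. The patterns and the `[PHI]` tag below are toy assumptions for illustration, not the actual model or tag set used in the study; a real system would detect PHI with a trained model rather than a few hand-written expressions.

```python
import re

# Toy PHI detectors (illustrative only; a real de-identification system
# would use a trained model, not a handful of regular expressions).
PHI_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # SSN-like numbers
    re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),  # dates
    re.compile(r"\bDr\.\s+[A-Z][a-z]+\b"),       # clinician names
]

def binary_deidentify(text, tag="[PHI]"):
    """Replace every detected PHI span with one generic tag."""
    for pattern in PHI_PATTERNS:
        text = pattern.sub(tag, text)
    return text

note = "Seen by Dr. Smith on 03/14/2021, SSN 123-45-6789."
print(binary_deidentify(note))
# → Seen by [PHI] on [PHI], SSN [PHI].
```

The point of the binary scheme is that every redaction looks identical to the reader, regardless of whether the span was a name, a date, or an identifier.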
Objectives: Extracting data from publication reports is a standard process in systematic review (SR) development. However, the data extraction process still relies heavily on manual effort, which is slow, costly, and subject to human error. In this study, we developed a text summarization system aimed at enhancing productivity and reducing errors in the traditional data extraction process.
Objectives: Data extraction from original study reports is a time-consuming, error-prone process in systematic review development. Information extraction (IE) systems have the potential to assist humans in the extraction task; however, the majority of IE systems were not designed to work on Portable Document Format (PDF) documents, an important and common extraction source for systematic reviews. In a PDF document, narrative content is often mixed with publication metadata or semi-structured text, which adds challenges for the underlying natural language processing algorithms.
Objective: Literature database search is a crucial step in the development of clinical practice guidelines and systematic reviews. Even in the age of information technology, literature search is still conducted manually, making it costly, slow, and subject to human error. In this research, we sought to improve the traditional search approach using innovative query expansion and citation ranking techniques.
Objective: We developed a novel computer application called Glyph that automatically converts text into sets of illustrations using natural language processing and computer graphics techniques, providing high-quality pictographs for health communication. In this study, we evaluated the ability of the Glyph system to illustrate a set of actual patient instructions and tested patient recall of the original and Glyph-illustrated instructions.
Methods: We used Glyph to illustrate 49 patient instructions representing 10 different discharge templates from the University of Utah Cardiology Service.
Objectives: Natural language processing (NLP) applications typically use regular expressions that have been developed manually by human experts. Our goal is to automate both the creation and utilization of regular expressions in text classification.
Methods: We designed a novel regular expression discovery (RED) algorithm and implemented two text classifiers based on RED.
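To make the idea of regex-based text classification concrete, the sketch below shows how a classifier might apply a set of regular expressions at prediction time. The class name, labels, and patterns here are hand-written stand-ins for illustration; in the study described above, the expressions would be discovered automatically by the RED algorithm rather than authored by hand.

```python
import re

class RegexClassifier:
    """Illustrative regex-based text classifier (patterns are assumed,
    not the output of the RED algorithm)."""

    def __init__(self, patterns_by_label):
        self.compiled = {
            label: [re.compile(p, re.IGNORECASE) for p in patterns]
            for label, patterns in patterns_by_label.items()
        }

    def predict(self, text):
        # Score each label by how many of its expressions match the text,
        # then return the best-scoring label.
        scores = {
            label: sum(1 for p in patterns if p.search(text))
            for label, patterns in self.compiled.items()
        }
        return max(scores, key=scores.get)

clf = RegexClassifier({
    "smoker": [r"\bsmok(es|ing|er)\b", r"\bpack[- ]years?\b"],
    "non-smoker": [r"\bdenies smoking\b", r"\bnever smoked\b"],
})
print(clf.predict("Patient has never smoked."))
# → non-smoker
```

An automated discovery step would replace the hand-written dictionary above with expressions induced from labeled training text, which is the part the RED algorithm automates.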