Publications by Jung-wei Fan | LitMetric

Publications by authors named "Jung-wei Fan"

Midwest rural-urban disparities in use of patient online services for COVID-19.

Ming Huang, Andrew Wen, Huan He, Liwei Wang, Sijia Liu, Jung-Wei Fan, Hongfang Liu

J Rural Health

September 2022

Purpose: Rural populations are disproportionately affected by the COVID-19 pandemic. We characterized urban-rural disparities in patient portal messaging utilization for COVID-19 among those who used the portal during the pandemic's early stage in the Midwest.

Methods: We collected over 1 million portal messages generated by midwestern Mayo Clinic patients from February to August 2020.

A review of auditing techniques for the Unified Medical Language System.

Ling Zheng, Zhe He, Duo Wei, Vipina Keloth, Jung-Wei Fan

J Am Med Inform Assoc

October 2020

Objective: The study sought to describe the literature related to the development of methods for auditing the Unified Medical Language System (UMLS), with particular attention to identifying errors and inconsistencies of attributes of the concepts in the UMLS Metathesaurus.

Materials And Methods: We applied the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) approach by searching the MEDLINE database and Google Scholar for studies referencing the UMLS and any of several terms related to auditing, error detection, and quality assurance. A qualitative analysis and summarization of articles that met inclusion criteria were performed.

Word-of-Mouth Innovation: Hypothesis Generation for Supplement Repurposing based on Consumer Reviews.

Jung-Wei Fan, Yves A Lussier

AMIA Annu Symp Proc

February 2019

Dietary supplements remain a relatively underexplored source for drug repurposing. A systematic approach to soliciting responses from a large consumer population is desirable to speed up innovation. We tested a workflow that mines unexpected benefits of dietary supplements from massive consumer reviews.

Semantic Modeling for Exposomics with Exploratory Evaluation in Clinical Context.

Jung-Wei Fan, Jianrong Li, Yves A Lussier

J Healthc Eng

July 2019

Exposome is a critical dimension in the precision medicine paradigm. Effective representation of exposomics knowledge is instrumental to melding nongenetic factors into data analytics for clinical research. There is still limited work in (1) modeling exposome entities and relations with proper integration to mainstream ontologies and (2) systematically studying their presence in clinical context.

Mining Health-Related Issues in Consumer Product Reviews by Using Scalable Text Analytics.

Manabu Torii, Sameer S Tilak, Son Doan, Daniel S Zisook, Jung-Wei Fan

Biomed Inform Insights

July 2016

In an era when most of our life activities are digitized and recorded, opportunities abound to gain insights about population health. Online product reviews present a unique data source that is currently underexplored. Health-related information, although scarce, can be systematically mined in online product reviews.

Risk factor detection for heart disease by applying text analytics in electronic medical records.

Manabu Torii, Jung-Wei Fan, Wei-Li Yang, Theodore Lee, Matthew T Wiley

J Biomed Inform

December 2015

In the United States, about 600,000 people die of heart disease every year. The annual cost of care services, medications, and lost productivity reportedly exceeds 108.9 billion dollars.

Parsing clinical text: how good are the state-of-the-art parsers?

Min Jiang, Yang Huang, Jung-wei Fan, Buzhou Tang, Josh Denny

BMC Med Inform Decis Mak

March 2016

Background: Parsing, which generates a syntactic structure of a sentence (a parse tree), is a critical component of natural language processing (NLP) research in any domain including medicine. Although parsers developed in the general English domain, such as the Stanford parser, have been applied to clinical text, there are no formal evaluations and comparisons of their performance in the medical domain.

Methods: In this study, we investigated the performance of three state-of-the-art parsers (the Stanford parser, the Bikel parser, and the Charniak parser) using the following two datasets: (1) a Treebank containing 1,100 sentences that were randomly selected from progress notes used in the 2010 i2b2 NLP challenge and manually annotated according to a Penn Treebank-based guideline; and (2) the MiPACQ Treebank, which was developed from pathology notes and clinical notes and contains 13,091 sentences.

Syntactic parsing of clinical text: guideline and corpus development with handling ill-formed sentences.

Jung-wei Fan, Elly W Yang, Min Jiang, Rashmi Prasad, Richard M Loomis

J Am Med Inform Assoc

December 2013

Objective: To develop, evaluate, and share: (1) syntactic parsing guidelines for clinical text, with a new approach to handling ill-formed sentences; and (2) a clinical Treebank annotated according to the guidelines. To document the process and findings for readers with similar interest.

Methods: Using random samples from a shared natural language processing challenge dataset, we developed a handbook of domain-customized syntactic parsing guidelines based on iterative annotation and adjudication between two institutions.

Part-of-speech tagging for clinical text: wall or bridge between institutions?

Jung-wei Fan, Rashmi Prasad, Rommel M Yabut, Richard M Loomis, Daniel S Zisook

AMIA Annu Symp Proc

February 2013

Part-of-speech (POS) tagging is a fundamental step required by various NLP systems. The training of a POS tagger relies on sufficient quality annotations. However, the annotation process is both knowledge-intensive and time-consuming in the clinical domain.

Deriving a probabilistic syntacto-semantic grammar for biomedicine based on domain-specific terminologies.

Jung-Wei Fan, Carol Friedman

J Biomed Inform

October 2011

Biomedical natural language processing (BioNLP) is a useful technique that unlocks valuable information stored in textual data for practice and/or research. Syntactic parsing is a critical component of BioNLP applications that rely on correctly determining the sentence and phrase structure of free text. In addition to dealing with the vast amount of domain-specific terms, a robust biomedical parser needs to model the semantic grammar to obtain viable syntactic structures.

A review of auditing methods applied to the content of controlled biomedical terminologies.

Xinxin Zhu, Jung-Wei Fan, David M Baorto, Chunhua Weng, James J Cimino

J Biomed Inform

June 2009

Although controlled biomedical terminologies have been with us for centuries, it is only in the last couple of decades that close attention has been paid to the quality of these terminologies. The result of this attention has been the development of auditing methods that apply formal methods to assessing whether terminologies are complete and accurate. We have performed an extensive literature review to identify published descriptions of these methods and have created a framework for characterizing them.

Generating quality word sense disambiguation test sets based on MeSH indexing.

Jung-Wei Fan, Carol Friedman

AMIA Annu Symp Proc

November 2009

Word sense disambiguation (WSD) determines the correct meaning of a word that has more than one meaning. It is a critical step in biomedical natural language processing, because the information in a text can be interpreted correctly only if the meanings of its component terms are correctly identified first. Quality evaluation sets are important to WSD because they can be used as representative samples for developing automatic programs and as referees for comparing different WSD programs. To help create quality test sets for WSD, we developed a MeSH-based automatic sense-tagging method that preferentially annotates terms that are topical to the text.

Word sense disambiguation via semantic type classification.

Jung-Wei Fan, Carol Friedman

AMIA Annu Symp Proc

November 2008

Accurate concept identification is crucial to biomedical natural language processing. However, ambiguity is common during the process of mapping terms to biomedical concepts (one term can be mapped to several concepts). A cost-effective approach to disambiguation relating to training is via semantic classification of the ambiguous terms, provided that the semantic classes of the concepts are available and are all different.

Combining contextual and lexical features to classify UMLS concepts.

Jung-Wei Fan, Carol Friedman

AMIA Annu Symp Proc

October 2007

Semantic classification is important for biomedical terminologies and the many applications that depend on them. Previously we developed two classifiers for 8 broad clinically relevant classes to reclassify and validate UMLS concepts. We found them to be complementary, and then combined them using a manual approach.

Semantic reclassification of the UMLS concepts.

Jung-Wei Fan, Carol Friedman

Bioinformatics

September 2008

Accurate semantic classification is valuable for text mining and knowledge-based tasks that perform inference based on semantic classes. To benefit applications using the semantic classification of the Unified Medical Language System (UMLS) concepts, we automatically reclassified the concepts based on their lexical and contextual features. The new classification is useful for auditing the original UMLS semantic classification and for building biomedical text mining applications.

Using distributional analysis to semantically classify UMLS concepts.

Jung-Wei Fan, Hua Xu, Carol Friedman

Stud Health Technol Inform

November 2007

The UMLS is a widely used and comprehensive knowledge source in the biomedical domain. It specifies biomedical concepts and their semantic categories, and therefore is valuable for Natural Language Processing (NLP) and other knowledge-based systems. However, the UMLS semantic classification is not always accurate, which adversely affects performance of these systems.

Using contextual and lexical features to restructure and validate the classification of biomedical concepts.

Jung-Wei Fan, Hua Xu, Carol Friedman

BMC Bioinformatics

July 2007

Background: Biomedical ontologies are critical for integration of data from diverse sources and for use by knowledge-based biomedical applications, especially natural language processing as well as associated mining and reasoning systems. The effectiveness of these systems is heavily dependent on the quality of the ontological terms and their classifications. To assist in developing and maintaining the ontologies objectively, we propose automatic approaches to classify and/or validate their semantic categories.

Semantic classification of biomedical concepts using distributional similarity.

Jung-Wei Fan, Carol Friedman

J Am Med Inform Assoc

July 2007

Objective: To develop an automated, high-throughput, and reproducible method for reclassifying and validating ontological concepts for natural language processing applications.

Design: We developed a distributional similarity approach to classify the Unified Medical Language System (UMLS) concepts. Classification models were built for seven broad biomedically relevant semantic classes created by grouping subsets of the UMLS semantic types.

Gene symbol disambiguation using knowledge-based profiles.

Hua Xu, Jung-Wei Fan, George Hripcsak, Eneida A Mendonça, Marianthi Markatou

Bioinformatics

April 2007

Article Synopsis

The ambiguity of gene symbols poses a significant challenge for text-mining in the biomedical field, which can be addressed using existing databases like Entrez Gene and MEDLINE for better disambiguation.
Researchers developed profiles for genes by extracting information from MEDLINE abstracts and annotated sources, applying an information retrieval method to determine the correct gene context.
The method was tested on mouse, fly, and yeast organisms, achieving high precision rates of 93.9%, 77.8%, and 89.5%, respectively, with the results and tools available online.
