Modern generative artificial intelligence techniques like retrieval-augmented generation (RAG) may be applied in support of precision oncology treatment discussions. Experts routinely review published literature for evidence and recommendations of treatments in a labor-intensive process. A RAG pipeline may help reduce this effort by providing chunks of text from these publications to an off-the-shelf large language model (LLM), allowing it to answer related questions without any fine-tuning.
Stud Health Technol Inform
August 2024
There is a critical need for a streamlined process to identify genotype-matched individuals eligible for enrollment into clinical trials and/or targeted therapies, as current methodologies face challenges in integrating diverse molecular data sources. We have developed a precision oncology platform to assist molecular tumor boards and community oncologists in reviewing patients' phenotypes, evaluating related knowledge, and identifying genotype-matched therapies.
Expert Opin Drug Saf
November 2023
Introduction: Pharmacovigilance (PV) involves monitoring and aggregating adverse event information from a variety of data sources, including health records, biomedical literature, spontaneous adverse event reports, product labels, and patient-generated content like social media posts, but the most pertinent details in these sources are typically available in narrative free-text formats. Natural language processing (NLP) techniques can be used to extract clinically relevant information from PV texts to inform decision-making.
Areas Covered: We conducted a non-systematic literature review by querying the PubMed database to examine the uses of NLP in drug safety and distilled the findings to present our expert opinion on the topic.
Identifying patient cohorts meeting the criteria of specific phenotypes is essential in biomedicine and particularly timely in precision medicine. Many research groups deliver pipelines that automatically retrieve and analyze data elements from one or more sources to automate this task and deliver high-performing computable phenotypes. We applied a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines to conduct a thorough scoping review on computable clinical phenotyping.
Stud Health Technol Inform
June 2022
Many decision support methods and systems in pharmacovigilance are built without explicitly addressing specific challenges that jeopardize their eventual success. We describe two sets of challenges and appropriate strategies to address them. The first set consists of data-related challenges, which include using extensive multi-source data of poor quality, incomplete information integration, and inefficient data visualization.
The accelerating impact of genomic data in clinical decision-making has generated a paradigm shift from treatment based on the anatomic origin of the tumor to the incorporation of key genomic features to guide therapy. Assessing the clinical validity and utility of the genomic background of a patient's cancer represents one of the emerging challenges in oncology practice, demanding the development of automated platforms for extracting clinically relevant genomic information from medical texts. We developed PubMiner, a natural language processing tool to extract and interpret cancer type, therapy, and genomic information from biomedical abstracts.
Processing unstructured clinical texts is often necessary to support certain tasks in biomedicine, such as matching patients to clinical trials. Among other methods, domain-specific language models have been built to utilize free-text information. This study evaluated the performance of Bidirectional Encoder Representations from Transformers (BERT) models in assessing the similarity between clinical trial texts.
Background: Our objective was to support the automated classification of Food and Drug Administration (FDA) Adverse Event Reporting System (FAERS) reports for their usefulness in assessing the possibility of a causal relationship between a drug product and an adverse event.
Method: We used a data set of 326 redacted FAERS reports that had previously been annotated by a group of safety evaluators (SEs) at the FDA using a modified version of the World Health Organization-Uppsala Monitoring Centre criteria for drug causality assessment, and that had supported a similar study on the classification of reports using supervised machine learning and text engineering methods. We explored many potential features, including the incorporation of natural language processing on report text and information from external data sources, for supervised learning and developed models for predicting the classification status of reports.
Stud Health Technol Inform
May 2021
The Fast Healthcare Interoperability Resources (FHIR) contain multiple data-exchange standards that aim at optimizing healthcare information exchange. One of them, the FHIR AdverseEvent, may support post-market safety surveillance. We examined its readiness using the Food and Drug Administration's (FDA) Adverse Event Reporting System (FAERS).
Research findings in biomedical science are often summarized in statistical plots and sophisticated data presentations. Such visualizations are challenging for people who lack the appropriate scientific background or even experts who work in other areas. Scientists have to maximize knowledge dissemination by improving the communication of their findings to the public.
Introduction: The US FDA receives more than 2 million postmarket reports each year. Safety Evaluators (SEs) review these reports, as well as external information, to identify potential safety signals. With the increasing number of reports and the size of external information, more efficient solutions for data integration and decision making are needed.
Background: Diabetes mellitus (DM) is a metabolic disorder that causes abnormal blood glucose (BG) regulation that might result in short and long-term health complications and even death if not properly managed. Currently, there is no cure for diabetes. However, self-management of the disease, especially keeping BG in the recommended range, is central to the treatment.
Background: Diabetes mellitus is a chronic metabolic disorder that results in abnormal blood glucose (BG) regulation. The BG level is preferably maintained close to normality through self-management practices, which involve actively tracking BG levels and taking proper actions including adjusting diet and insulin medications. BG anomalies can be defined as any undesirable reading due to a reason that is either precisely known to the patient (normal cause variation) or unknown (special cause variation).
Introduction: In May 2008, the Food and Drug Administration launched the Sentinel Initiative, a multi-year program for the establishment of a national electronic monitoring system for medical product safety that led, in 2016, to the launch of the full Sentinel System. Under the Mini-Sentinel pilot, several algorithms for identifying health outcomes of interest, including one for anaphylaxis, were developed and evaluated using data available from the Sentinel common data model.
Purpose: To evaluate whether features extracted from unstructured narrative data using natural language processing (NLP) could be used to classify anaphylaxis cases.
As part of a collaborative project between the US Food and Drug Administration (FDA) and the Centers for Disease Control and Prevention for the development of a web-based natural language processing (NLP) workbench, we created a corpus of 1000 Vaccine Adverse Event Reporting System (VAERS) reports annotated for 36,726 clinical features, 13,365 temporal features, and 22,395 clinical-temporal links. This paper describes the final corpus, as well as the methodology used to create it, so that clinical NLP researchers outside FDA can evaluate the utility of the corpus to aid their own work. The creation of this standard went through four phases: pre-training, pre-production, production-clinical feature annotation, and production-temporal annotation.
Introduction: The FDA Adverse Event Reporting System (FAERS) is a primary data source for identifying unlabeled adverse events (AEs) in a drug or biologic drug product's postmarketing phase. Many AE reports must be reviewed by drug safety experts to identify unlabeled AEs, even if the reported AEs are previously identified, labeled AEs. Integrating the labeling status of drug product AEs into FAERS could increase report triage and review efficiency.
Structured Product Labels follow an XML-based document markup standard approved by the Health Level Seven organization and adopted by the US Food and Drug Administration as a mechanism for exchanging medical products information. Their current organization makes their secondary use rather challenging. We used the Side Effect Resource database and DailyMed to generate a comparison dataset of 1159 Structured Product Labels.
People with diabetes experience elevated blood glucose (BG) levels at the time of an infection. We propose to utilize patient-gathered information in an Electronic Disease Surveillance Monitoring Network (EDMON), which may support the identification of a cluster of infected people with elevated BG levels on a spatiotemporal basis. The system incorporates data gathered from diabetes apps, continuous glucose monitoring (CGM) devices, and other appropriate physiological indicators from people with type 1 diabetes.
Literature review is critical but time-consuming in the post-market surveillance of medical products. We focused on the safety signal of intussusception after the vaccination of infants with the Rotashield Vaccine in 1999 and retrieved all PubMed abstracts for rotavirus vaccines published after January 1, 1998. We used the Event-based Text-mining of Health Electronic Records system, the MetaMap tool, and the National Center for Biomedical Ontologies Annotator to process the abstracts and generate coded terms stamped with the date of publication.
We followed a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses to identify existing clinical natural language processing (NLP) systems that generate structured information from unstructured free text. Seven literature databases were searched with a query combining the concepts of natural language processing and structured data capture. Two reviewers screened all records for relevance during two screening phases, and information about clinical NLP systems was collected from the final set of papers.
Objective: To evaluate the feasibility of automated dose and adverse event information retrieval in supporting the identification of safety patterns.
Methods: We extracted all rabbit Anti-Thymocyte Globulin (rATG) reports submitted to the United States Food and Drug Administration Adverse Event Reporting System (FAERS) from the product's initial licensure on April 16, 1984 through February 8, 2016. We processed the narratives using the Medication Extraction (MedEx) and the Event-based Text-mining of Health Electronic Records (ETHER) systems and retrieved the appropriate medication, clinical, and temporal information.
Introduction: Duplicate case reports in spontaneous adverse event reporting systems pose a challenge for medical reviewers to efficiently perform individual and aggregate safety analyses. Duplicate cases can bias data mining by generating spurious signals of disproportional reporting of product-adverse event pairs.
Objective: We have developed a probabilistic record linkage algorithm for identifying duplicate cases in the US Vaccine Adverse Event Reporting System (VAERS) and the US Food and Drug Administration Adverse Event Reporting System (FAERS).
We have developed a Decision Support Environment (DSE) for medical experts at the US Food and Drug Administration (FDA). The DSE contains two integrated systems: The Event-based Text-mining of Health Electronic Records (ETHER) and the Pattern-based and Advanced Network Analyzer for Clinical Evaluation and Assessment (PANACEA). These systems assist medical experts in reviewing reports submitted to the Vaccine Adverse Event Reporting System (VAERS) and the FDA Adverse Event Reporting System (FAERS).
The sheer volume of textual information that needs to be reviewed and analyzed in many clinical settings requires the automated retrieval of key clinical and temporal information. The existing natural language processing systems are often challenged by the low quality of clinical texts and do not demonstrate the required performance. In this study, we focus on medical product safety report narratives and investigate the association of the clinical events with appropriate time information.