Publications by authors named "Juan M Banda"

Article Synopsis
  • Electronic phenotyping uses various data analysis methods, including machine learning and natural language processing, to define patient characteristics, but the current process is slow and labor-intensive.
  • Large language models could automate phenotype definition extraction but have reliability issues and potential risks of generating misleading information.
  • The study aims to create a standard evaluation set for assessing large language models' outputs and to test different prompting methods, showing promising results that still need human validation to ensure accuracy and efficiency in phenotype extraction.
View Article and Find Full Text PDF
Article Synopsis
  • The diversity of clinical notes in electronic health records (EHRs) highlights the need for standardization to improve data retrieval and integration, which is where the LOINC Document Ontology (DO) comes in, specifically designed for naming clinical documents.
  • This study evaluated the LOINC DO by mapping clinical note titles from five institutions, categorizing them into three classes based on how similar they are to LOINC DO codes, and developed an automated pipeline for this mapping that doesn't require accessing note content.
  • The automated mapping system, powered by various language models, demonstrated a high accuracy of 0.90, and the research compared its results with manual mappings to assess LOINC DO's effectiveness and identify opportunities for expanding
View Article and Find Full Text PDF

Objective: The aim of the Social Media Mining for Health Applications (#SMM4H) shared tasks is to take a community-driven approach to address the natural language processing and machine learning challenges inherent to utilizing social media data for health informatics. In this paper, we present the annotated corpora, a technical summary of participants' systems, and the performance results.

Methods: The eighth iteration of the #SMM4H shared tasks was hosted at the AMIA 2023 Annual Symposium and consisted of 5 tasks that represented various social media platforms (Twitter and Reddit), languages (English and Spanish), methods (binary classification, multi-class classification, extraction, and normalization), and topics (COVID-19, therapies, social anxiety disorder, and adverse drug events).

View Article and Find Full Text PDF
Article Synopsis
  • The latest iteration included five tasks across platforms like Twitter and Reddit, covering topics such as COVID-19, therapies, and drug-related events in both English and Spanish, with 29 teams participating from 18 countries.
  • The top systems in competitions utilized advanced deep learning techniques, particularly pre-trained transformer models, and a dataset of over 61,000 social media posts will be available for future research.
View Article and Find Full Text PDF

Objective: Biases within probabilistic electronic phenotyping algorithms are largely unexplored. In this work, we characterize differences in subgroup performance of phenotyping algorithms for Alzheimer's disease and related dementias (ADRD) in older adults.

Materials And Methods: We created an experimental framework to characterize the performance of probabilistic phenotyping algorithms under different racial distributions allowing us to identify which algorithms may have differential performance, by how much, and under what conditions.

View Article and Find Full Text PDF
Article Synopsis
  • Common data models standardize electronic health record (EHR) data but struggle to fully integrate the necessary resources for deep phenotyping.
  • The OMOP2OBO algorithm automates the mapping of Observational Medical Outcomes Partnership (OMOP) vocabularies to Open Biological and Biomedical Ontology (OBO) ontologies, significantly reducing the need for manual curation.
  • With OMOP2OBO, mappings for a large number of conditions, drugs, and measurements were created, facilitating the identification of undiagnosed patients in rare diseases and enhancing opportunities for EHR-based deep phenotyping.
View Article and Find Full Text PDF
Article Synopsis
  • The OHDSI consortium's NLP Working Group created methods and tools to improve the use of textual data in observational studies, detailing a framework for integrating this information into the OMOP Common Data Model (CDM).
  • The authors also highlight the workflow for extracting and transforming data from clinical notes, share current applications of the NLP solution, and discuss challenges and lessons learned to aid other researchers in implementing NLP in their studies.
View Article and Find Full Text PDF
Article Synopsis
  • This study investigates how different interpretations of an observational study's design can affect the results when independent researchers attempt to reproduce it.
  • The researchers found that out of ten criteria for including patients, teams only agreed, on average, 4 of 10 times, leading to significant variability in the size and characteristics of the resulting patient cohorts.
  • The study concludes that providing open analytical code and a standardized data model can improve reproduction accuracy and consistency in observational research.
View Article and Find Full Text PDF

This study presents the outcomes of the shared task competition BioCreative VII (Task 3) focusing on the extraction of medication names from a Twitter user's publicly available tweets (the user's 'timeline'). In general, detecting health-related tweets is notoriously challenging for natural language processing tools. The main challenge, aside from the informality of the language used, is that people tweet about any and all topics, and most of their tweets are not related to health.

View Article and Find Full Text PDF
Article Synopsis
  • The study emphasizes the importance of real world data (RWD) for understanding and responding to the COVID-19 pandemic using a standardized approach through the CHARYBDIS framework.
  • Researchers conducted a retrospective database study across multiple countries, including the US and parts of Europe and Asia, involving over 4.5 million individuals and focusing on their clinical characteristics and outcomes.
  • Findings reveal higher diagnoses among women but more hospitalizations among men, common comorbidities like diabetes and heart disease, and key symptoms such as cough and fever; this data helps to identify trends in COVID-19 across different populations and time periods.
View Article and Find Full Text PDF

Colombia announced the first case of severe acute respiratory syndrome coronavirus 2 on March 6, 2020. Since then, the country has reported a total of 5,002,387 cases and 127,258 deaths as of October 31, 2021. The aggressive transmission dynamics of SARS-CoV-2 motivate an investigation of COVID-19 at the national and regional levels in Colombia.

View Article and Find Full Text PDF

Twitter has been a remarkable resource for research in pharmacovigilance in the last decade. Traditionally, rule- or lexicon-based methods have been utilized for automatically extracting drug tweets for human annotation. The process of human annotation to create labeled sets for machine learning models is laborious, time consuming and not scalable.

View Article and Find Full Text PDF

The COVID-19 pandemic hit society hard, strongly affecting people's emotions and wellbeing. It is difficult to measure how the pandemic has affected people's sentiment, not to mention how people responded to the dramatic events that took place during the pandemic. This study contributes to this discussion by showing that people's negative perception of the COVID-19 pandemic is dropping.

View Article and Find Full Text PDF

The use of social media data, like Twitter, for biomedical research has been gradually increasing over the years. With the coronavirus disease 2019 (COVID-19) pandemic, researchers have turned to more non-traditional sources of clinical data to characterize the disease in near-real time, study the societal implications of interventions, as well as the sequelae that recovered COVID-19 cases present. However, manually curated social media datasets are difficult to come by due to the expensive costs of manual annotation and the efforts needed to identify the correct texts.

View Article and Find Full Text PDF

As the COVID-19 pandemic continues to spread worldwide, an unprecedented amount of open data is being generated for medical, genetics, and epidemiological research. The unparalleled rate at which many research groups around the world are releasing data and publications on the ongoing pandemic is allowing other scientists to learn from local experiences and data generated on the front lines of the COVID-19 pandemic. However, there is a need to integrate additional data sources that map and measure the role of social dynamics of such a unique worldwide event in biomedical, biological, and epidemiological analyses.

View Article and Find Full Text PDF

The use of social media data, like Twitter, for biomedical research has been gradually increasing over the years. With the COVID-19 pandemic, researchers have turned to more nontraditional sources of clinical data to characterize the disease in near real-time, study the societal implications of interventions, as well as the sequelae that recovered COVID-19 cases present (Long-COVID). However, manually curated social media datasets are difficult to come by due to the expensive costs of manual annotation and the efforts needed to identify the correct texts.

View Article and Find Full Text PDF

Mexico has experienced one of the highest COVID-19 mortality rates in the world. A delayed implementation of social distancing interventions in late March 2020 and a phased reopening of the country in June 2020 have facilitated sustained disease transmission in the region. In this study, we systematically generate and compare 30-day-ahead forecasts using previously validated growth models based on mortality trends from the Institute for Health Metrics and Evaluation for Mexico and Mexico City in near real-time.

View Article and Find Full Text PDF

Background: News media coverage of antimask protests, COVID-19 conspiracies, and pandemic politicization has overemphasized extreme views but has done little to represent views of the general public. Investigating the public's response to various pandemic restrictions can provide a more balanced assessment of current views, allowing policy makers to craft better public health messages in anticipation of poor reactions to controversial restrictions.

Objective: Using data from social media, this infoveillance study aims to understand the changes in public opinion associated with the implementation of COVID-19 restrictions (eg, business and school closures, regional lockdown differences, and additional public health restrictions, such as social distancing and masking).

View Article and Find Full Text PDF

The rapid evolution of the COVID-19 pandemic has underscored the need to quickly disseminate the latest clinical knowledge during a public-health emergency. One surprisingly effective platform for healthcare professionals (HCPs) to share knowledge and experiences from the front lines has been social media (for example, the "#medtwitter" community on Twitter). However, identifying clinically-relevant content in social media without manual labeling is a challenge because of the sheer volume of irrelevant data.

View Article and Find Full Text PDF

Background: Low testing rates and delays in reporting hinder the estimation of the mortality burden associated with the COVID-19 pandemic. During a public health emergency, estimating all-cause excess deaths above an expected level of death can provide a more reliable picture of the mortality burden. Here, we aim to estimate the absolute and relative mortality impact of the COVID-19 pandemic in Mexico.

View Article and Find Full Text PDF

The normalization of clinical documents is essential for health information management with the enormous amount of clinical documentation generated each year. The LOINC Document Ontology (DO) is a universal clinical document standard in a hierarchical structure. The objective of this study is to investigate the feasibility and generalizability of LOINC DO by mapping from clinical note titles across five institutions to five DO axes.

View Article and Find Full Text PDF

Despite the significant health impacts of adverse events associated with drug-drug interactions, no standard models exist for managing and sharing evidence describing potential interactions between medications. Minimal information models have been used in other communities to establish community consensus around simple models capable of communicating useful information. This paper reports on a new minimal information model for describing potential drug-drug interactions.

View Article and Find Full Text PDF

Objective: To propose a paradigm for a scalable time-aware clinical data search, and to describe the design, implementation and use of a search engine realizing this paradigm.

Materials And Methods: The Advanced Cohort Engine (ACE) uses a temporal query language and in-memory datastore of patient objects to provide a fast, scalable, and expressive time-aware search. ACE accepts data in the Observational Medical Outcomes Partnership Common Data Model, and is configurable to balance performance with compute cost.

View Article and Find Full Text PDF

Physicians' beliefs and attitudes about COVID-19 are important to ascertain because of their central role in providing care to patients during the pandemic. Identifying topics and sentiments discussed by physicians and other healthcare workers can lead to identification of gaps relating to the COVID-19 pandemic response within the healthcare system. To better understand physicians' perspectives on the COVID-19 response, we extracted Twitter data from a specific user group that allows physicians to stay anonymous while expressing their perspectives about the COVID-19 pandemic.

View Article and Find Full Text PDF
Article Synopsis
  • Routinely collected real-world data (RWD) is essential for understanding and responding to the COVID-19 pandemic, as demonstrated by the CHARYBDIS framework for standardizing and analyzing this data.
  • A descriptive cohort study involving over 4.5 million individuals was conducted across the U.S., Europe, and Asia to examine COVID-19-related health risks and outcomes, with detailed information available on an interactive website.
  • The findings from the CHARYBDIS study serve as benchmarks to enhance our knowledge of COVID-19's progression and management, facilitating timely evaluations of new preventative and therapeutic strategies.
View Article and Find Full Text PDF
