Publications by authors named "Judy Gichoya"

Large language models (LLMs) are rapidly being adopted in healthcare, necessitating standardized reporting guidelines. We present the Transparent Reporting of a multivariable prediction model for Individual Prognosis Or Diagnosis (TRIPOD)-LLM statement, an extension of the TRIPOD+AI statement, addressing the unique challenges of LLMs in biomedical applications. TRIPOD-LLM provides a comprehensive checklist of 19 main items and 50 subitems, covering key aspects from title to discussion.

Article Synopsis
  • There is a significant risk of reinforcing existing health inequalities in AI health technologies due to biases, primarily stemming from the datasets used.
  • The STANDING Together recommendations focus on transparency in health datasets and proactive evaluation of their impacts on different population groups, informed by a comprehensive research process with over 350 global contributors.
  • The 29 recommendations are divided into guidance for documenting health datasets and strategies for using them, aiming to identify and reduce algorithmic biases while promoting awareness of the inherent limitations in all datasets.
Article Synopsis
  • This review analyzes various mammography datasets used for AI development in breast cancer screening, focusing on their transparency, content, and accessibility.
  • A search identified 254 datasets, with only 28 being accessible; most datasets came from Europe, East Asia, and North America, raising concerns over poor demographic representation.
  • The findings highlight significant gaps in diversity within these datasets, underscoring the need for better documentation and inclusivity to enhance the effectiveness of AI technologies in breast cancer research.

Objective: To demonstrate and test the capabilities of the American College of Radiology (ACR) Connect and AI-LAB software platform by implementing multi-institutional artificial intelligence (AI) training and validation for breast density classification.

Methods: In this proof-of-concept study, six U.S.


Musculoskeletal (MSK) pain leads to significant healthcare utilization, decreased productivity, and disability globally. Due to its complex etiology, MSK pain is often chronic and challenging to manage effectively. Disparities in pain management, influenced by provider implicit biases and by patient race, gender, age, and socioeconomic status, contribute to inconsistent outcomes.


Pressure injury (PI) detection is challenging, especially in dark skin tones, due to the unreliability of visual inspection. Thermography may serve as a viable alternative as temperature differences in the skin can indicate impending tissue damage. Although deep learning models hold considerable promise toward reliably detecting PI, existing work fails to evaluate performance on diverse skin tones and varying data collection protocols.

Article Synopsis
  • Some people are excited because large language models (LLMs) can pass important medical tests.
  • This makes doctors think about what skills they need to work well with these AI tools.
  • To prepare future doctors for using AI, medical schools might need to change how they teach.

Objective: Pulse oximetry, a ubiquitous vital sign in modern medicine, has inequitable accuracy that disproportionately affects minority Black and Hispanic patients, with associated increases in mortality, organ dysfunction, and oxygen therapy. Previous retrospective studies used self-reported race or ethnicity as a surrogate for skin tone which is believed to be the root cause of the disparity. Our objective was to determine the utility of skin tone in explaining pulse oximetry discrepancies.


Healthcare AI faces an ethical dilemma between selective and equitable deployment, exacerbated by flawed performance metrics. These metrics inadequately capture real-world complexities and biases, leading to premature assertions of effectiveness. Improved evaluation practices, including continuous monitoring and silent evaluation periods, are crucial.


This narrative review focuses on the role of clinical prediction models in supporting informed decision-making in critical care, emphasizing their two forms: traditional scores and artificial intelligence (AI)-based models. Acknowledging the potential for both types to embed biases, the authors underscore the importance of critical appraisal to increase our trust in models. The authors outline recommendations and critical care examples for managing the risk of bias in AI models.

Article Synopsis
  • TRIPOD-LLM is a new set of reporting guidelines specifically designed for the use of Large Language Models (LLMs) in biomedical research, aiming to standardize transparency and quality in healthcare applications.
  • The guidelines include a checklist with 19 main items and 50 subitems, adaptable to various research designs, emphasizing the importance of human oversight and task-specific performance.
  • An interactive website is provided to help researchers easily complete the guidelines and generate submissions, with the intention of continually updating the document as the field evolves.

Background: Although hypothesized to be the root cause of pulse oximetry disparities, skin tone and its use for improving medical therapies have yet to be extensively studied. Previous studies used self-reported race as a proxy for skin tone. However, this approach cannot account for skin tone variability within racial groups and risks confounding by other non-biological factors when modeling data.

Article Synopsis
  • Current tools to measure health equity are limited, often focusing on specific areas of patient care rather than the entire healthcare process.
  • A study introduced a process mining framework to track patient care actions, revealing that while treatment was similar for men and women, non-English speaking patients experienced delays despite having similar illness severity.
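The disparity this kind of process mining surfaces, an arrival-to-treatment delay that differs by patient group despite similar severity, can be sketched with stdlib Python alone. The event tuples and group names below are hypothetical illustrations, not the study's data:

```python
from statistics import median

def median_delay_by_group(events):
    """Median arrival-to-treatment delay (hours) per patient group.

    events: iterable of (group, arrival_hour, treatment_hour) tuples.
    """
    delays = {}
    for group, arrival, treatment in events:
        delays.setdefault(group, []).append(treatment - arrival)
    return {group: median(d) for group, d in delays.items()}

# Hypothetical event log: non-English speakers wait longer for the same step.
log = [
    ("english", 0.0, 1.5), ("english", 2.0, 3.0),
    ("non_english", 0.0, 4.0), ("non_english", 1.0, 4.5),
]
print(median_delay_by_group(log))  # {'english': 1.25, 'non_english': 3.75}
```

A real process-mining pipeline would track many more event types; the point here is only that the fairness signal is a per-group summary over a shared event log.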

Rationale And Objectives: Radiology residents often receive limited feedback on preliminary reports issued during independent call. This study aimed to determine if Large Language Models (LLMs) can supplement traditional feedback by identifying missed diagnoses in radiology residents' preliminary reports.

Materials & Methods: A randomly selected subset of 500 paired preliminary and final reports (250 training/250 validation) issued between 12/17/2022 and 5/22/2023 was extracted and de-identified from our institutional database.
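Before any LLM is involved, the paired-report comparison can be approximated by surfacing sentences that appear in the final report but not in the resident's preliminary one. This is an illustrative stdlib baseline, not the study's LLM pipeline, and the sample reports are invented:

```python
import re

def candidate_discrepancies(preliminary, final):
    """Return sentences in the final report that never appear in the preliminary one.

    These are candidate missed findings to surface as feedback.
    """
    def sentences(text):
        return {s.strip().lower() for s in re.split(r"[.\n]+", text) if s.strip()}
    return sorted(sentences(final) - sentences(preliminary))

prelim = "Lungs are clear. No acute fracture."
final = "Lungs are clear. No acute fracture. Small right pleural effusion."
print(candidate_discrepancies(prelim, final))  # ['small right pleural effusion']
```

An LLM adds value over this diff by matching paraphrased findings and judging clinical significance, which exact sentence matching cannot do.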


Increasing evidence supports reduced accuracy of noninvasive assessment tools, such as pulse oximetry, temperature probes, and AI skin diagnosis benchmarks, in patients with darker skin tones. The FDA is exploring potential strategies for device regulation to improve performance across diverse skin tones by including skin tone criteria. However, there is no consensus about how prospective studies should perform skin tone assessment in order to take this bias into account.


De-identification of medical images intended for research is a core requirement for data sharing initiatives, particularly as the demand for data for artificial intelligence (AI) applications grows. The Center for Biomedical Informatics and Information Technology (CBIIT) of the United States National Cancer Institute (NCI) convened a two half-day virtual workshop with the intent of summarizing the state of the art in de-identification technology and processes and exploring interesting aspects of the subject. This paper summarizes the highlights of the second day of the workshop, the recordings and presentations of which are publicly available for review.

Article Synopsis
  • The study aimed to create and validate machine learning models to predict failure of high-flow nasal cannula (HFNC) therapy in COVID-19 patients, while comparing these models to the traditional ROX index and examining accuracy across different races.
  • Conducted as a retrospective cohort study at four Emory University hospitals, it analyzed data from 984 adult COVID-19 patients who received HFNC therapy, identifying that 32.2% experienced HFNC failure.
  • The eXtreme Gradient Boosting (XGB) model showed superior performance (AUROC of 0.707) compared to the ROX index (AUROC of 0.616), but also highlighted significant racial disparities in prediction accuracy, which were less pronounced
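The headline metric in that comparison, AUROC, is the probability that a model scores a randomly chosen failure case above a randomly chosen non-failure. A minimal sketch of how figures like 0.707 vs 0.616 are computed, using invented scores rather than the study's XGB or ROX values:

```python
def auroc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney (rank-sum) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count pairs where the positive outranks the negative; ties count half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = [0, 0, 1, 1]                 # 1 = HFNC failure (hypothetical)
model = [0.10, 0.40, 0.35, 0.80]      # hypothetical risk scores
baseline = [0.30, 0.30, 0.30, 0.30]   # uninformative constant score
print(auroc(model, labels))     # 0.75
print(auroc(baseline, labels))  # 0.5
```

Reporting AUROC separately per racial subgroup, as the study does, is what exposes disparities that a single pooled AUROC would hide.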

As artificial intelligence (AI) rapidly approaches human-level performance in medical imaging, it is crucial that it does not exacerbate or propagate healthcare disparities. Previous research established AI's capacity to infer demographic data from chest X-rays, leading to a key concern: do models using demographic shortcuts have unfair predictions across subpopulations? In this study, we conducted a thorough investigation into the extent to which medical AI uses demographic encodings, focusing on potential fairness discrepancies within both in-distribution training sets and external test sets. Our analysis covers three key medical imaging disciplines (radiology, dermatology, and ophthalmology) and incorporates data from six global chest X-ray datasets.


The potential of artificial intelligence (AI) in medicine lies in its ability to enhance clinicians' capacity to analyse medical images, thereby improving diagnostic precision and accuracy and thus enhancing current tests. However, the integration of AI within health care is fraught with difficulties. Heterogeneity among health care system applications, reliance on proprietary closed-source software, and rising cybersecurity threats pose significant challenges.


Background: Chest X-rays (CXR) are essential for diagnosing a variety of conditions, but when used on new populations, model generalizability issues limit their efficacy. Generative AI, particularly denoising diffusion probabilistic models (DDPMs), offers a promising approach to generating synthetic images, enhancing dataset diversity. This study investigates the impact of synthetic data supplementation on the performance and generalizability of medical imaging models.


Pulse oximeters noninvasively measure peripheral oxygen saturation (SpO2), while the gold standard, arterial oxygen saturation (SaO2), requires arterial blood gas measurement. There are known racial and ethnic disparities in pulse oximeter performance. BOLD is a dataset that aims to underscore the importance of addressing biases in pulse oximetry accuracy, which disproportionately affect darker-skinned patients.
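The disparity a dataset like BOLD exposes is typically summarized as mean bias, SpO2 minus SaO2, computed per demographic group over paired measurements. A minimal stdlib sketch with invented readings (the group labels and values are hypothetical, not BOLD data):

```python
from collections import defaultdict

def mean_bias_by_group(pairs):
    """Mean SpO2 - SaO2 bias per group; positive means the oximeter overreads."""
    totals = defaultdict(lambda: [0.0, 0])
    for group, spo2, sao2 in pairs:
        totals[group][0] += spo2 - sao2
        totals[group][1] += 1
    return {group: round(total / n, 2) for group, (total, n) in totals.items()}

# Hypothetical paired readings: (group, SpO2 %, SaO2 %).
pairs = [
    ("lighter", 95.0, 94.5), ("lighter", 97.0, 96.9),
    ("darker", 96.0, 93.0), ("darker", 95.0, 92.8),
]
print(mean_bias_by_group(pairs))  # {'lighter': 0.3, 'darker': 2.6}
```

A positive bias is clinically dangerous because it can mask occult hypoxemia: the oximeter reports a reassuring SpO2 while the true SaO2 is low.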


Background: The ethical governance of Artificial Intelligence (AI) in health care and public health continues to be an urgent issue for attention in policy, research, and practice. In this paper we report on central themes related to challenges and strategies for promoting ethics in research involving AI in global health, arising from the Global Forum on Bioethics in Research (GFBR), held in Cape Town, South Africa in November 2022.

Methods: The GFBR is an annual meeting organized by the World Health Organization and supported by the Wellcome Trust, the US National Institutes of Health, the UK Medical Research Council (MRC) and the South African MRC.


Despite significant technical advances in machine learning (ML) over the past several years, the tangible impact of this technology in healthcare has been limited. This is due not only to the particular complexities of healthcare, but also due to structural issues in the machine learning for healthcare (MLHC) community which broadly reward technical novelty over tangible, equitable impact. We structure our work as a healthcare-focused echo of the 2012 paper "Machine Learning that Matters", which highlighted such structural issues in the ML community at large, and offered a series of clearly defined "Impact Challenges" to which the field should orient itself.
