Publications by authors named "Leo A Celi"

Diabetic retinopathy (DR) is a serious diabetes complication that can lead to vision loss, making timely identification crucial. Existing data-driven algorithms for DR staging from digital fundus images (DFIs) often struggle with generalization due to distribution shifts between training and target domains. To address this, DRStageNet, a deep learning model, was developed using six public and independent datasets with 91,984 DFIs from diverse demographics.

Large language models (LLMs) are rapidly being adopted in healthcare, necessitating standardized reporting guidelines. We present transparent reporting of a multivariable model for individual prognosis or diagnosis (TRIPOD)-LLM, an extension of the TRIPOD + artificial intelligence statement, addressing the unique challenges of LLMs in biomedical applications. TRIPOD-LLM provides a comprehensive checklist of 19 main items and 50 subitems, covering key aspects from title to discussion.

The Clever Hans effect occurs when machine learning models rely on spurious correlations instead of clinically relevant features and poses significant challenges to the development of reliable artificial intelligence (AI) systems in medical imaging. This scoping review provides an overview of methods for identifying and addressing the Clever Hans effect in medical imaging AI algorithms. A total of 173 papers published between 2010 and 2024 were reviewed, and 37 articles were selected for detailed analysis, with classification into two categories: detection and mitigation approaches.

Objective: Determine how each organ component of the SOFA score differs in its contribution to mortality risk and how that contribution may change over time.

Methods: We performed multivariate logistic regression analysis to assess the contribution of each organ component to mortality risk on Days 1 and 7 of an intensive care unit stay. We used data from two publicly available datasets, eICU Collaborative Research Database (eICU-CRD) (208 hospitals) and Medical Information Mart for Intensive Care IV (MIMIC-IV) (1 hospital).

Large language models (LLMs) continue to exhibit noteworthy capabilities across a spectrum of areas, including emerging proficiencies across the health care continuum. Successful LLM implementation and adoption depend on digital readiness, modern infrastructure, a trained workforce, privacy, and an ethical regulatory landscape. These factors can vary significantly across health care ecosystems, dictating the choice of a particular LLM implementation pathway.

Objectives: Septic shock is a common condition necessitating timely management including hemodynamic support with vasopressors. Despite the high prevalence and mortality, there is limited data characterizing patients who require three or more vasopressors. We sought to define the demographics, outcomes, and prognostic determinants associated with septic shock requiring three or more vasopressors.

Objective: To challenge clinicians and informaticians to learn about potential sources of bias in medical machine learning models through investigation of data and predictions from an open-source severity of illness score.

Methods: Over a two-day period (total elapsed time approximately 28 hours), we conducted a datathon that challenged interdisciplinary teams to investigate potential sources of bias in the Global Open Source Severity of Illness Score. Teams were invited to develop hypotheses, to use tools of their choosing to identify potential sources of bias, and to provide a final report.

Article Synopsis
  • The past two decades have seen a surge in interest and investment in AI, fueled by increased data, better computing power, and advanced modeling techniques.
  • Ophthalmology is one field utilizing AI to enhance telemedicine and imaging, but there are challenges to address before widespread implementation.
  • The article outlines a seven-step AI lifecycle, highlighting risks and proposing strategies for each phase, from data collection to post-deployment evaluation.
Article Synopsis
  • Enthusiasm has grown because large language models (LLMs) can pass important medical examinations.
  • This prompts clinicians to consider which skills they need to work effectively with these AI tools.
  • To prepare future doctors to use AI, medical schools may need to change how they teach.

Background: Ejection fraction (EF) estimation informs patient plans in the ICU, and low EF can indicate ventricular systolic dysfunction, which increases the risk of adverse events including heart failure. Automated echocardiography models are an attractive solution for high-variance human EF estimation, and key to this goal are echocardiogram vector embeddings, which are a critical resource for computational researchers.

Objectives: The authors aimed to extract the vector embeddings from each echocardiogram in the EchoNet dataset using a classifier trained to classify EF as healthy (>50%) or unhealthy (≤50%) to create an embeddings dataset for computational researchers.
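
As a rough illustration of the labeling rule stated in the objectives (EF above 50% is healthy, EF at or below 50% is unhealthy), the sketch below maps EF values to the two classes. It is not the authors' code: the function names `label_ef` and `label_all` are hypothetical, and the sketch covers only the label definition, not the echocardiogram classifier or the embedding extraction.

```python
from typing import Iterable, List

# Cutoff in percent, taken from the abstract: EF > 50% is "healthy".
EF_THRESHOLD = 50.0


def label_ef(ef_percent: float) -> str:
    """Return 'healthy' for EF above the cutoff, otherwise 'unhealthy'.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= ef_percent <= 100.0:
        raise ValueError(f"EF must be a percentage in [0, 100], got {ef_percent}")
    return "healthy" if ef_percent > EF_THRESHOLD else "unhealthy"


def label_all(ef_values: Iterable[float]) -> List[str]:
    """Apply the same rule to a sequence of EF measurements."""
    return [label_ef(value) for value in ef_values]


if __name__ == "__main__":
    # Example values are made up; they are not drawn from EchoNet.
    for ef in (35.0, 50.0, 62.5):
        print(ef, label_ef(ef))
```

Note that a value of exactly 50% falls in the unhealthy class under this rule, since the healthy class is defined by a strict inequality.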

Large language models (LLMs) hold promise to serve complex health information needs but also have the potential to introduce harm and exacerbate health disparities. Reliably evaluating equity-related model failures is a critical step toward developing systems that promote health equity. We present resources and methodologies for surfacing biases with potential to precipitate equity-related harms in long-form, LLM-generated answers to medical questions and conduct a large-scale empirical case study with the Med-PaLM 2 LLM.

In recent years, there has been substantial work in low-cost medical diagnostics based on the physical manifestations of disease. This is due to advancements in data analysis techniques and classification algorithms and the increased availability of computing power through smart devices. Smartphones and their ability to interface with simple sensors such as inertial measurement units (IMUs), microphones, piezoelectric sensors, etc.

Background: Labeling color fundus photos (CFP) is an important step in the development of artificial intelligence screening algorithms for the detection of diabetic retinopathy (DR). Most studies use the International Classification of Diabetic Retinopathy (ICDR) to assign labels to CFP, plus the presence or absence of macular edema (ME). Images can be grouped as referrable or nonreferrable according to these classifications.

Healthcare AI faces an ethical dilemma between selective and equitable deployment, exacerbated by flawed performance metrics. These metrics inadequately capture real-world complexities and biases, leading to premature assertions of effectiveness. Improved evaluation practices, including continuous monitoring and silent evaluation periods, are crucial.

This narrative review focuses on the role of clinical prediction models in supporting informed decision-making in critical care, emphasizing their 2 forms: traditional scores and artificial intelligence (AI)-based models. Acknowledging the potential for both types to embed biases, the authors underscore the importance of critical appraisal to increase our trust in models. The authors outline recommendations and critical care examples to manage risk of bias in AI models.

Importance: Maneuvers assessing fluid responsiveness before an intravascular volume expansion may limit useless fluid administration, which in turn may improve outcomes.

Objective: To describe maneuvers for assessing fluid responsiveness in mechanically ventilated patients.

Registration: The protocol was registered at PROSPERO: CRD42019146781.

Article Synopsis
  • TRIPOD-LLM is a new set of reporting guidelines designed for the use of large language models (LLMs) in biomedical research, aiming to standardize transparency and quality in healthcare applications.
  • The guidelines include a checklist with 19 main items and 50 subitems, adaptable to various research designs, emphasizing the importance of human oversight and task-specific performance.
  • An interactive website is provided to help researchers easily complete the guidelines and generate submissions, with the intention of continually updating the document as the field evolves.
Article Synopsis
  • Current tools to measure health equity are limited, often focusing on specific areas of patient care rather than the entire healthcare process.
  • A study introduced a process mining framework to track patient care actions, revealing that while treatment was similar for men and women, non-English speaking patients experienced delays despite having similar illness severity.

Given the potential benefits of artificial intelligence and machine learning (AI/ML) within healthcare, it is critical to consider how these technologies can be deployed in pediatric research and practice. Currently, healthcare AI/ML has not yet adapted to the specific technical considerations related to pediatric data nor adequately addressed the specific vulnerabilities of children and young people (CYP) in relation to AI. While the greatest burden of disease in CYP is firmly concentrated in lower and middle-income countries (LMICs), existing applied pediatric AI/ML efforts are concentrated in a small number of high-income countries (HICs).

Background: There are known racial disparities in the organ transplant allocation system in the United States. However, prior work has yet to establish if transplant center decisions on offer acceptance, the final step in the allocation process, contribute to these disparities.

Objective: To estimate racial differences in the acceptance of organ offers by transplant center physicians on behalf of their patients.
