Publications by authors named "Eyal Klang"

Article Synopsis
  • The review analyzes the use of large language models (LLMs) in melanoma care, highlighting their effectiveness in patient education, diagnosis, and clinical management.
  • While LLMs have shown high accuracy in educating patients, challenges remain with readability, and diagnostic accuracy can be affected by image quality and context.
  • The study suggests that future research should focus on improving LLMs with diverse data and expert input to overcome limitations in generalizability and decision-making depth.
View Article and Find Full Text PDF

Background: Amidst the increasing use of AI in medical research, this study specifically aims to assess and compare the accuracy and credibility of OpenAI's GPT-4 and Google's Gemini in their ability to generate medical research introductions, focusing on the precision and reliability of their citations across five medical fields.

Methods: We compared the two models, OpenAI's GPT-4 and Google's Gemini Ultra, across five medical fields, focusing on the credibility and accuracy of citations, alongside the analysis of introduction length and unreferenced data.

Results: Gemini outperformed GPT-4 in reference precision.

View Article and Find Full Text PDF

Background: Empathy, a fundamental aspect of human interaction, is characterized as the ability to experience another being's emotions within oneself. In health care, empathy is fundamental to the interaction between health care professionals and patients. It is a quality unique to humans that large language models (LLMs) are believed to lack.

View Article and Find Full Text PDF

Aim: Diagnostic imaging is an integral part of identifying spondyloarthropathies (SpA), yet the interpretation of these images can be challenging. This review evaluated the use of deep learning models to enhance the diagnostic accuracy of SpA imaging.

Methods: Following PRISMA guidelines, we systematically searched major databases up to February 2024, focusing on studies that applied deep learning to SpA imaging.

View Article and Find Full Text PDF

Background/aim: Contrast-enhanced mammography (CEM) is a relatively novel imaging technique that enables both anatomical and functional breast imaging, with improved diagnostic performance compared to standard 2D mammography. The aim of this study is to systematically review the literature on deep learning (DL) applications for CEM, exploring how these models can further enhance CEM diagnostic potential.

Methods: This systematic review was reported according to the PRISMA guidelines.

View Article and Find Full Text PDF

Aim: To evaluate the accuracy of the Emergency Severity Index (ESI) assignments by GPT-4, a large language model (LLM), compared to senior emergency department (ED) nurses and physicians.

Method: An observational study of 100 consecutive adult ED patients was conducted. ESI scores were assigned by GPT-4, by triage nurses, and by a senior clinician.

View Article and Find Full Text PDF

Background: Large language models (LLMs) have shown promise in various professional fields, including medicine and law. However, their performance in highly specialized tasks, such as extracting ICD-10-CM codes from patient notes, remains underexplored.

Objective: The primary objective was to evaluate and compare the performance of ICD-10-CM code extraction by different LLMs with that of a human coder.

View Article and Find Full Text PDF

Importance: Medical ethics is inherently complex, shaped by a broad spectrum of opinions, experiences, and cultural perspectives. The integration of large language models (LLMs) in healthcare is new and requires an understanding of their consistent adherence to ethical standards.

Objective: To compare the agreement rates in answering questions based on ethically ambiguous situations between three frontier LLMs (GPT-4, Gemini-pro-1.

View Article and Find Full Text PDF

Background: Healthcare reimbursement and coding depend on accurate extraction of International Classification of Diseases, Tenth Revision, Clinical Modification (ICD-10-CM) codes from clinical documentation. Attempts to automate this task have had limited success. This study aimed to evaluate the performance of large language models (LLMs) in extracting ICD-10-CM codes from unstructured inpatient notes and benchmark them against a human coder.

View Article and Find Full Text PDF

Purpose: To evaluate the ability of AI-based chatbots to accurately answer common patient questions in the field of ophthalmology.

Methods: An experienced ophthalmologist curated a set of 20 representative questions, and responses were sought from two AI generative models: OpenAI's ChatGPT and Google's Bard (Gemini Pro). Eight expert ophthalmologists from different sub-specialties assessed each response, blinded to the source, and ranked them on a 1-5 scale by three metrics: accuracy, comprehensiveness, and clarity.

View Article and Find Full Text PDF
Article Synopsis
  • While published trials show potential benefits in areas like clinical documentation and medical decision-making, they also raise concerns about the models' accuracy.
  • The review highlights the challenges of evaluating LLMs in clinical settings and discusses research gaps, aiming to guide future studies and the integration of LLMs into healthcare practices.
View Article and Find Full Text PDF

Large language models (LLMs) can optimize clinical workflows; however, the economic and computational challenges of their utilization at the health system scale are underexplored. We evaluated how concatenating queries with multiple clinical notes and tasks simultaneously affects model performance under increasing computational loads. We assessed ten LLMs of different capacities and sizes utilizing real-world patient data.

View Article and Find Full Text PDF

Purpose: While mammography is considered the gold standard for screening women for breast cancer, its accuracy declines in women with dense breasts. The purpose of the study is to evaluate the diagnostic accuracy of contrast-enhanced mammography (CEM) for detecting breast cancer in intermediate- and high-risk women, including those with genetic predispositions, over a decade-long cohort at a tertiary center.

Methods: We retrospectively analyzed all CEM examinations performed for screening purposes at a tertiary center between 2012 and 2023.

View Article and Find Full Text PDF

Background: While clinical practice guidelines advocate for multidisciplinary heart team (MDHT) discussions in coronary revascularization, variability in implementation across health care settings remains a challenge. This variability could potentially be addressed by large language models like ChatGPT, offering decision-making support in diverse health care environments. Our study aims to critically evaluate the concordance between recommendations made by MDHT and those generated by large language models in coronary revascularization decision-making.

View Article and Find Full Text PDF

Background: Accurate medical coding is essential for clinical and administrative purposes but is complicated, time-consuming, and subject to bias. This study compares Retrieval-Augmented Generation (RAG)-enhanced LLMs to provider-assigned codes in producing ICD-10-CM codes from emergency department (ED) clinical records.

Methods: Retrospective cohort study using 500 ED visits randomly selected from the Mount Sinai Health System between January and April 2024.

View Article and Find Full Text PDF

Multimodal technology is poised to revolutionize clinical practice by integrating artificial intelligence with traditional diagnostic modalities. This evolution traces its roots from Hippocrates' humoral theory to the use of sophisticated AI-driven platforms that synthesize data across multiple sensory channels. The interplay between historical medical practices and modern technology challenges conventional patient-clinician interactions and redefines diagnostic accuracy.

View Article and Find Full Text PDF

Background: Differential diagnosis in radiology relies on the accurate identification of imaging patterns. The use of large language models (LLMs) in radiology holds promise, with many potential applications that may enhance the efficiency of radiologists' workflow. The study aimed to evaluate the efficacy of generative pre-trained transformer (GPT)-4, an LLM, in providing differential diagnoses in neuroradiology, comparing its performance with that of board-certified neuroradiologists.

View Article and Find Full Text PDF
Article Synopsis
  • The review highlights how natural language processing (NLP) and large language models (LLMs) are transforming healthcare by analyzing unstructured clinical texts to improve disease detection, diagnosis, and patient management in rheumatology.
  • Researchers screened 1491 articles and found 35 relevant studies that demonstrated high accuracy in using NLP tools to identify conditions like rheumatoid arthritis (RA) and gout from data sources like electronic medical records.
  • While current applications show promise, particularly for common diseases, the review calls for further research to expand NLP's utility in diagnosing rarer and more complex rheumatological conditions.
View Article and Find Full Text PDF
Article Synopsis
  • A study analyzed electronic health records from over 21,000 patients to see if heparin therapy led to a delay in AD dementia diagnosis, factoring in various patient characteristics.
  • Results showed that heparin therapy was linked to a significant delay in the clinical diagnosis of AD dementia by approximately 1 year in both health system cohorts, suggesting potential protective effects of heparin-like drugs against AD.
View Article and Find Full Text PDF

Background: Sheba Medical Center (SMC) is the largest hospital in Israel and has been coping with a steady increase in total Emergency Department (ED) visits. Over 140,000 patients arrive at the SMC's ED every year. Of those, 19% are admitted to the medical wards.

View Article and Find Full Text PDF

Large language models (LLMs) have significantly impacted various fields with their ability to understand and generate human-like text. This study explores the potential benefits and limitations of integrating LLMs, such as ChatGPT, into haematology practices. Utilizing systematic review methodologies, we analysed studies published after 1 December 2022 from databases like PubMed, Web of Science and Scopus, and assessed each for bias with the QUADAS-2 tool.

View Article and Find Full Text PDF

Objectives: This study aims to assess the performance of GPT-4V, a multimodal artificial intelligence (AI) model capable of analyzing both images and textual data, in interpreting radiological images. It focuses on a range of modalities, anatomical regions, and pathologies to explore the potential of zero-shot generative AI in enhancing diagnostic processes in radiology.

Methods: We analyzed 230 anonymized emergency room diagnostic images, consecutively collected over 1 week, using GPT-4V.

View Article and Find Full Text PDF

Objective: To develop a new automated framework based on machine learning to diagnose malignant eyelid skin tumors.

Methods: This study used eyelid lesion images from Sheba Medical Center, a large tertiary center in Israel. Before model training, we pretrained our models on the International Skin Imaging Collaboration (ISIC) 2019 dataset consisting of 25,332 images.

View Article and Find Full Text PDF

Large language models (LLMs) are transforming the field of natural language processing (NLP). These models offer opportunities for radiologists to make a meaningful impact in their field. NLP is a part of artificial intelligence (AI) that uses computer algorithms to study and understand text data.

View Article and Find Full Text PDF