AI Article Synopsis

  • AI can significantly improve preoperative planning for breast reconstruction by automating radiology reporting, particularly through interpreting computed tomography angiography (CTA) scans.
  • Four advanced large language models (LLMs)—ChatGPT-4, BARD, Perplexity, and BingAI—were evaluated for their ability to answer questions about CTA scans, with responses assessed by expert surgeons for accuracy and readability.
  • BingAI outperformed the other models on accuracy and on two of the reported metrics (Flesch Reading Ease and DISCERN). The study concludes that, despite current limitations, LLMs have the potential to enhance future CTA reporting and preoperative planning in surgical contexts.

Article Abstract

Background: Artificial intelligence (AI) has the potential to transform preoperative planning for breast reconstruction by enhancing the efficiency, accuracy, and reliability of radiology reporting through automatic interpretation and perforator identification. Large language models (LLMs) have recently advanced significantly in medicine. This study aimed to evaluate the proficiency of contemporary LLMs in interpreting computed tomography angiography (CTA) scans for deep inferior epigastric perforator (DIEP) flap preoperative planning.

Methods: Four prominent LLMs (ChatGPT-4, BARD, Perplexity, and BingAI) answered six questions on CTA scan reporting. A panel of expert plastic surgeons with extensive experience in breast reconstruction assessed the responses using a Likert scale. Readability was evaluated using the Flesch Reading Ease score, the Flesch-Kincaid Grade level, and the Coleman-Liau Index. The DISCERN score was used to determine the responses' suitability. Statistical significance was assessed with a t-test, and P-values < 0.05 were considered significant.
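
The three readability metrics named in the Methods are standard published formulas, and a minimal sketch of them may help in reading the scores reported in the Results. The abstract does not say which tool the authors used to compute them, so the function names and the choice to pass pre-counted words, sentences, syllables, and letters are assumptions made here for illustration only.

```python
# Illustrative sketch of the standard readability formulas.
# Assumption: the caller supplies the counts; the abstract does not
# state how the authors counted syllables, sentences, or letters.

def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Higher scores indicate text that is easier to read."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    """Approximate US school grade level needed to read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


def coleman_liau_index(letters: int, words: int, sentences: int) -> float:
    """Grade-level estimate based on letters and sentences per 100 words."""
    letters_per_100_words = letters / words * 100
    sentences_per_100_words = sentences / words * 100
    return 0.0588 * letters_per_100_words - 0.296 * sentences_per_100_words - 15.8


if __name__ == "__main__":
    # Hypothetical counts for a 100-word passage (not data from the study).
    words, sentences, syllables, letters = 100, 5, 150, 450
    print(f"Flesch Reading Ease:  {flesch_reading_ease(words, sentences, syllables):.1f}")
    print(f"Flesch-Kincaid Grade: {flesch_kincaid_grade(words, sentences, syllables):.1f}")
    print(f"Coleman-Liau Index:   {coleman_liau_index(letters, words, sentences):.1f}")
```

On these scales a higher Flesch Reading Ease score means easier text, whereas higher Flesch-Kincaid Grade level and Coleman-Liau Index values mean the text demands more schooling to read.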

Results: BingAI provided the most accurate and useful responses to prompts, followed by Perplexity, ChatGPT-4, and then BARD. BingAI had the greatest Flesch Reading Ease (34.7±5.5) and DISCERN (60.5±3.9) scores. Perplexity had higher Flesch-Kincaid Grade level (20.5±2.7) and Coleman-Liau Index (17.8±1.6) scores than the other LLMs.

Conclusion: LLMs are currently limited in their ability to report CTA scans for preoperative planning of breast reconstruction, yet the rapid advancements in technology hint at a promising future. AI stands poised to enhance education in CTA reporting and aid preoperative planning. In the future, AI technology could provide automatic CTA interpretation, enhancing the efficiency, accuracy, and reliability of CTA reports.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11067004
DOI: http://dx.doi.org/10.1016/j.jpra.2024.03.010

Publication Analysis

Top Keywords

preoperative planning: 16
breast reconstruction: 12
deep inferior: 8
inferior epigastric: 8
planning breast: 8
enhancing efficiency: 8
efficiency accuracy: 8
accuracy reliability: 8
reading ease: 8
flesch-kincaid grade: 8
