One of the major barriers to using large language models (LLMs) in medicine is the perception that they make clinical decisions using uninterpretable methods that are inherently different from the cognitive processes of clinicians. In this manuscript, we develop diagnostic reasoning prompts to study whether LLMs can imitate clinical reasoning while accurately forming a diagnosis. We find that GPT-4 can be prompted to mimic the common clinical reasoning processes of clinicians without sacrificing diagnostic accuracy. This is significant because an LLM that can imitate clinical reasoning to provide an interpretable rationale offers physicians a means to evaluate whether an LLM's response is likely correct and can be trusted for patient care. Prompting methods that use diagnostic reasoning have the potential to mitigate the "black box" limitations of LLMs, bringing them one step closer to safe and effective use in medicine.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10808088
DOI: http://dx.doi.org/10.1038/s41746-024-01010-1
