Application of NotebookLM, a large language model with retrieval-augmented generation, for lung cancer staging.

Ryota Tozuka, Hisashi Johno, Akitomo Amakawa, Junichi Sato, Mizuki Muto, Shoichiro Seki, Atsushi Komaba, Hiroshi Onishi

Jpn J Radiol

Department of Radiology, University of Yamanashi, 1110 Shimokato, Chuo, Yamanashi, 409-3898, Japan.

Published: November 2024

Purpose: In radiology, large language models (LLMs), including ChatGPT, have recently gained attention, and their utility is being rapidly evaluated. However, concerns have emerged regarding their reliability in clinical applications due to limitations such as hallucinations and insufficient referencing. To address these issues, we focus on the latest technology, retrieval-augmented generation (RAG), which enables LLMs to reference reliable external knowledge (REK). Specifically, this study examines the utility and reliability of a recently released RAG-equipped LLM (RAG-LLM), NotebookLM, for staging lung cancer.

Materials and methods: We summarized the current lung cancer staging guideline in Japan and provided this as REK to NotebookLM. We then tasked NotebookLM with staging 100 fictional lung cancer cases based on CT findings and evaluated its accuracy. For comparison, we performed the same task using a gold-standard LLM, GPT-4 Omni (GPT-4o), both with and without the REK. For GPT-4o, the REK was provided directly within the prompt rather than through RAG.
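
The two GPT-4o conditions differ only in whether the REK text is placed in the prompt. A minimal sketch of that difference follows, assuming a plain-text prompt; the function name `build_staging_prompt`, the section labels, and the task wording are all hypothetical and are not the prompt used in the study.

```python
from typing import Optional


def build_staging_prompt(ct_findings: str, rek: Optional[str] = None) -> str:
    """Assemble a staging prompt for one case.

    Hypothetical helper: the wording and layout are illustrative only.
    When `rek` is given, the guideline summary is pasted directly into
    the prompt (no retrieval step); when it is None, the model sees only
    the CT findings and the task.
    """
    parts = []
    if rek is not None:
        parts.append("Reference guideline summary:\n" + rek)
    parts.append("CT findings:\n" + ct_findings)
    parts.append("Task: stage this lung cancer case.")
    return "\n\n".join(parts)


# Placeholder inputs; no real case or guideline text is reproduced here.
prompt_with_rek = build_staging_prompt("<CT findings>", rek="<guideline summary>")
prompt_without_rek = build_staging_prompt("<CT findings>")
```

Keeping both conditions behind one function means the only variable between them is the presence of the REK block, which is the comparison the methods describe.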

Results: NotebookLM achieved 86% diagnostic accuracy in the lung cancer staging experiment, outperforming GPT-4o, which recorded 39% accuracy with the REK and 25% without it. Moreover, NotebookLM demonstrated 95% accuracy in searching reference locations within the REK.
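
Accuracy here is the fraction of the 100 cases staged correctly, so 86% corresponds to 86 correct cases. The sketch below shows that calculation under the assumption that a case counts as correct only when the predicted stage label exactly matches the reference label; the abstract does not state the scoring rule, and the function name and placeholder labels are hypothetical.

```python
from typing import Sequence


def staging_accuracy(predicted: Sequence[str], reference: Sequence[str]) -> float:
    """Return the fraction of cases whose predicted label equals the reference.

    Assumes exact string match as the correctness criterion (an assumption,
    not a rule taken from the study).
    """
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one case is required")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)


# Placeholder labels only: 86 matches out of 100 cases.
reference = ["A"] * 100
predicted = ["A"] * 86 + ["B"] * 14
accuracy = staging_accuracy(predicted, reference)  # 0.86
```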

Conclusion: NotebookLM, a RAG-LLM, successfully performed lung cancer staging by utilizing the REK, demonstrating superior performance compared to GPT-4o (without RAG). Additionally, it provided highly accurate reference locations within the REK, allowing radiologists to efficiently evaluate the reliability of NotebookLM's responses and detect possible hallucinations. Overall, this study highlights the potential of NotebookLM, a RAG-LLM, in image diagnosis.

DOI: http://dx.doi.org/10.1007/s11604-024-01705-1

