Accuracy of Current Large Language Models and The Retrieval Augmented Generation Model in Determining Dietary Principles in Chronic Kidney Disease.

J Ren Nutr

Assistant professor, Department of Electrical and Electronics Engineering, Faculty of Engineering and Architecture, Burdur Mehmet Akif Ersoy University, Burdur, Türkiye.

Published: January 2025

Objective: Large language models (LLMs) have emerged as powerful tools with significant potential for quickly accessing information in nutrition and health, as in many other fields. Retrieval-augmented generation (RAG) is a framework used in artificial intelligence (AI)-powered chatbots to increase the accuracy and capability of LLMs. This study aimed to evaluate the accuracy of three LLMs (GPT4, Gemini, and Llama) and a RAG model in determining dietary principles in chronic kidney disease (CKD).

Design and Methods: The nutrition guideline published by the National Kidney Foundation in 2020 was used as the external information source for the developed RAG model. Each of the four chatbots answered 12 medical nutritional therapy prompts for CKD. The accuracy of the resulting 48 answers was evaluated on a 5-point Likert scale.
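The abstract does not describe how the RAG model retrieves guideline text, so the following is only a minimal sketch of the general retrieve-then-prompt idea, assuming a simple bag-of-words cosine similarity over guideline excerpts. The function names, the prompt wording, and the `PASSAGES` list are hypothetical; the passages are placeholders, not text from the National Kidney Foundation guideline.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, passages, k=2):
    """Return up to k passages most similar to the question."""
    q = Counter(tokenize(question))
    scored = [(cosine(q, Counter(tokenize(p))), p) for p in passages]
    # Stable sort: ties keep their original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:k] if score > 0]


def build_prompt(question, passages, k=2):
    """Prepend the retrieved excerpts to the question (hypothetical wording)."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return (
        "Answer using only the guideline excerpts below.\n\n"
        f"Excerpts:\n{context}\n\nQuestion: {question}"
    )


# Placeholder excerpts (assumption): stand-ins for real guideline passages.
PASSAGES = [
    "Placeholder excerpt about protein intake in CKD.",
    "Placeholder excerpt about sodium intake in CKD.",
    "Placeholder excerpt about energy intake in CKD.",
]
```

In a real system the resulting prompt would be sent to an LLM, and retrieval would more likely use dense embeddings than word counts; both choices are outside what the abstract reports.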

Results: Gemini and RAG had the highest accuracy scores (median: 4.0 each), followed by GPT4 (median: 2.5) and Llama (median: 1.5). In pairwise comparisons of accuracy scores, a significant difference was detected between every pair of chatbots except Gemini and RAG.
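To make the structure of this comparison concrete, the sketch below enumerates the pairs that four chatbots give rise to and shows how a per-chatbot median of Likert scores could be computed. It is an illustration only: the abstract does not name the statistical test used, so none is implemented, and the function names and any scores passed in are hypothetical rather than study data.

```python
from itertools import combinations
from statistics import median

CHATBOTS = ["GPT4", "Gemini", "Llama", "RAG"]
N_PROMPTS = 12

# 4 chatbots x 12 prompts = 48 answers, as stated in the methods.
TOTAL_ANSWERS = len(CHATBOTS) * N_PROMPTS


def pairwise(names):
    """All unordered pairs of chatbots to compare."""
    return list(combinations(names, 2))


def median_scores(scores):
    """Median Likert score per chatbot; rejects values outside 1..5."""
    for name, values in scores.items():
        if any(v not in range(1, 6) for v in values):
            raise ValueError(f"non-Likert score for {name}")
    return {name: median(values) for name, values in scores.items()}


# Four chatbots yield six pairs; per the results, every pair except
# (Gemini, RAG) differed significantly, i.e. five of the six.
PAIRS = pairwise(CHATBOTS)
```

Calling `median_scores` with one list of 12 scores per chatbot would reproduce the kind of summary reported above, provided the raw ratings were available.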

Conclusion: These chatbots produced both completely correct answers and false information that could lead to harmful clinical outcomes. Customizing LLMs for specific areas such as nutrition, or developing a nutrition-specific RAG framework that augments LLMs with current guidelines and articles, may be an important strategy to increase the accuracy of AI-powered chatbots.


Source
http://dx.doi.org/10.1053/j.jrn.2025.01.004

