Objective: This study aimed to fine-tune a large language model (LLM) for domain-specific text generation in surgical and anesthesia residency education.
Summary Background Data: Amid growing interest in artificial intelligence (AI) for medical training, we explore the potential of LLMs to transform residency education.
Methods: The 7-billion-parameter base model "Vicuna v1.5" was fine-tuned on 266,342 lines of text from 821 peer-reviewed documents. We evaluated the model with 150 surgical or anesthesia queries, assessing accuracy, token count, and inference speed across various reasoning tasks. Tests of significance were conducted using ANOVA and chi-square analysis.
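The abstract does not specify the training framework; the sketch below illustrates one plausible setup, assuming Hugging Face transformers/peft/datasets, the publicly available "lmsys/vicuna-7b-v1.5" checkpoint, and a hypothetical corpus.txt holding the training lines. It is a minimal illustration, not the authors' actual pipeline.

```python
# Minimal sketch: LoRA fine-tuning of a 7B causal LM on a plain-text corpus.
# Assumptions (not stated in the abstract): HF transformers/peft/datasets,
# the lmsys/vicuna-7b-v1.5 checkpoint, and a hypothetical corpus.txt file.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer, TrainingArguments)

model_name = "lmsys/vicuna-7b-v1.5"                 # assumed Vicuna v1.5 7B checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token           # Llama-style tokenizers lack a pad token
model = AutoModelForCausalLM.from_pretrained(model_name)

# Wrap the base model with low-rank adapters so only a small fraction of weights train.
lora = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
                  lora_dropout=0.05, task_type="CAUSAL_LM")
model = get_peft_model(model, lora)

# Hypothetical corpus: one line of peer-reviewed text per record.
dataset = load_dataset("text", data_files={"train": "corpus.txt"})["train"]
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
                      batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="vicuna-surg-anes",
                           per_device_train_batch_size=2,
                           gradient_accumulation_steps=8,
                           num_train_epochs=3,
                           learning_rate=2e-4,
                           logging_steps=50),
    train_dataset=dataset,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```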
Results: Our model achieved 65.3% accuracy, excelling in surgical case-based tasks. Accuracy did not differ significantly between knowledge domains (P = 0.081), but it varied significantly with output length (P = 0.002), with longer responses showing poorer accuracy.
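For illustration only, the snippet below shows how the reported comparisons could be computed with scipy; the per-query data are hypothetical placeholders, not the study's results.

```python
# Minimal sketch of the reported significance tests using scipy.
# All counts and score vectors below are hypothetical placeholders.
from scipy.stats import f_oneway, chi2_contingency

# Hypothetical per-domain correctness scores (1 = correct, 0 = incorrect).
surgery    = [1, 0, 1, 1, 0, 1, 1]
anesthesia = [1, 1, 0, 1, 0, 0, 1]

# One-way ANOVA across knowledge domains (the abstract reports P = 0.081).
f_stat, p_domain = f_oneway(surgery, anesthesia)

# Chi-square test of accuracy versus output-length bin (the abstract reports P = 0.002).
# Rows: short vs. long responses; columns: correct vs. incorrect counts (hypothetical).
contingency = [[60, 20],
               [38, 32]]
chi2, p_length, dof, expected = chi2_contingency(contingency)

print(f"ANOVA across domains: P = {p_domain:.3f}")
print(f"Chi-square by output length: P = {p_length:.3f}")
```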
Conclusions: LLMs show potential in enhancing residency education. Our model's efficiency and task-specific accuracy highlight this promise, though its limited parameter count diminishes accuracy for longer responses. Our findings illustrate how AI may be integrated effectively into future residency training.
DOI: http://dx.doi.org/10.1016/j.amjsurg.2024.02.016