Purpose: To investigate the accuracy of ChatGPT (Chat generative pretrained transformer), a large language model, in calculating sample size for sport-sciences and sports-medicine research studies.
Methods: We analyzed 4 published papers (ie, examples 1-4) from 3 sport-science and -medicine journals, comprising 3 randomized controlled trials and 1 survey paper and covering various study designs and approaches to calculating sample size. We provided ChatGPT with all necessary data, such as mean, percentage SD, normal deviates (Zα/2 and Z1-β), and study design. The prompt from 1 example was subsequently reused to assess the reproducibility of ChatGPT's response.
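For orientation, the normal deviates named above enter a sample-size calculation roughly as follows. This is a minimal sketch that assumes the common textbook formula for comparing 2 independent group means, n per group = 2(Zα/2 + Z1-β)²σ²/Δ². The abstract does not state which formulas the 4 papers used, and the function name and input values here are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group_two_means(sd, delta, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided comparison of 2 independent means.

    Assumed formula (not taken from the abstract):
        n = 2 * (Z_{alpha/2} + Z_{1-beta})^2 * sd^2 / delta^2
    sd is the common standard deviation; delta is the difference to detect.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # Z_{alpha/2}, about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # Z_{1-beta}, about 0.84 for 80% power
    n = 2 * (z_alpha + z_beta) ** 2 * sd ** 2 / delta ** 2
    return math.ceil(n)  # round up to whole participants


# Hypothetical inputs: SD of 10 units, difference of 5 units to detect
print(n_per_group_two_means(sd=10, delta=5))
```

A deterministic calculation like this returns the same result for the same inputs, which makes it a convenient reference when checking a language model's answer.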
Results: ChatGPT correctly calculated the sample size for 1 randomized controlled trial but failed in the remaining 3 examples, including the survey paper, for which it identified the wrong formula. After further interaction with ChatGPT, the correct sample size was obtained for the survey paper. Intriguingly, when the prompt from example 3 was reused, ChatGPT provided a completely different sample size from its initial response.
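As an illustration of why formula identification matters for a survey design, the sketch below assumes the widely used single-proportion formula n = Z²p(1-p)/d². The abstract does not say which formula the survey paper used, so this is an example of one common choice rather than a reconstruction of that study. The function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def n_survey_proportion(p, margin, confidence=0.95):
    """Sample size to estimate a proportion within +/- margin.

    Assumed formula (not taken from the abstract):
        n = Z^2 * p * (1 - p) / margin^2
    p is the expected proportion; margin is the absolute precision.
    """
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided normal deviate
    n = z ** 2 * p * (1 - p) / margin ** 2
    return math.ceil(n)  # round up to whole respondents


# Hypothetical inputs: expected proportion 50%, precision of 5 percentage points
print(n_survey_proportion(p=0.5, margin=0.05))
```

Applying a comparison-of-means formula to the same inputs would give a different number, which is the kind of error the survey example describes.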
Conclusions: While artificial-intelligence tools hold great promise, their use may lead to errors and inconsistencies in sample-size calculations even when the tool is given the necessary correct information. As artificial-intelligence technology continues to advance and learn from human feedback, there is hope for improvement in sample-size calculation and other research tasks. However, scientists should exercise caution when using these tools. Future studies should assess more advanced versions of this tool (ie, ChatGPT4).
DOI: http://dx.doi.org/10.1123/ijspp.2023-0109