GPTZero Performance in Identifying Artificial Intelligence-Generated Medical Texts: A Preliminary Study.

J Korean Med Sci

Past President, World Association of Medical Editors (WAME), Editorial Consultant, The Lancet, Associate Editor, Frontiers in Epidemiology.

Published: September 2023

Background: With the emergence of chatbots that help authors with scientific writing, editors need tools to identify artificial intelligence-generated texts. GPTZero is among the first websites to have sought media attention by claiming to differentiate machine-generated from human-written texts.

Methods: The performance of GPTZero was assessed using 20 text pieces generated by ChatGPT in response to arbitrary questions on various topics in medicine and 30 pieces chosen from previously published medical articles.

Results: GPTZero had a sensitivity of 0.65 (95% confidence interval, 0.41-0.85); specificity, 0.90 (0.73-0.98); accuracy, 0.80 (0.66-0.90); and positive and negative likelihood ratios, 6.5 (2.1-19.9) and 0.4 (0.2-0.7), respectively.
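
The point estimates above can be reproduced from a 2x2 table. The sketch below is illustrative only: the abstract does not report the cell counts, so the values used here (13 of 20 ChatGPT texts flagged as machine-generated, 27 of 30 published texts classified as human-written) are an assumption inferred from the reported sensitivity, specificity, and sample sizes. The function name `diagnostic_metrics` is hypothetical, and the confidence intervals are not recomputed because the interval method is not stated.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Point estimates for a binary classifier from 2x2 table counts.

    tp: machine-generated texts flagged as machine-generated
    fn: machine-generated texts classified as human-written
    tn: human-written texts classified as human-written
    fp: human-written texts flagged as machine-generated
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # Likelihood ratios; undefined when specificity is exactly 1 (LR+)
    # or exactly 0 (LR-), which this sketch does not handle.
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
    }


# Assumed counts (not reported in the abstract): 13/20 and 27/30.
metrics = diagnostic_metrics(tp=13, fn=7, tn=27, fp=3)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Under these assumed counts the negative likelihood ratio comes out near 0.39, which agrees with the reported 0.4 after rounding to one decimal place.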

Conclusion: GPTZero has a low false-positive rate (classifying a human-written text as machine-generated) and a high false-negative rate (classifying a machine-generated text as human-written).

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10519776
DOI: http://dx.doi.org/10.3346/jkms.2023.38.e319

Similar Publications

Introduction: This study examines the ability of human readers, recurrence quantification analysis (RQA), and an online artificial intelligence (AI) detection tool (GPTZero) to distinguish between AI-generated and human-written personal statements in physical therapist education program applications.

Review Of Literature: The emergence of large language models such as ChatGPT and Google Gemini has raised concerns about the authenticity of personal statements. Previous studies have reported varying degrees of success in detecting AI-generated text.

Prevalence of Words and Phrases Associated With Large Language Model-Generated Text in the Nursing Literature.

Comput Inform Nurs

December 2024

Author Affiliations: Data Driven WV, John Chambers College of Business and Economics (Ms Bailey), and School of Nursing, West Virginia University (Dr Carter-Templeton), Morgantown; School of Library and Information Sciences, North Carolina Central University, Durham (Dr Peterson); Duke University School of Nursing, Durham, NC (Dr Oermann); Dwight Schar College of Nursing and Health Sciences, Ashland University, OH (Dr Owens).

All disciplines, including nursing, may be experiencing significant changes with the advent of free, publicly available generative artificial intelligence tools. Recent research has shown the difficulty of distinguishing artificial intelligence-generated text from content written by humans, which increases the probability of unverified information being shared in scholarly works. The purpose of this study was to determine the extent of generative artificial intelligence usage in published nursing articles.

Cortical lesions impact cognitive decline in multiple sclerosis via volume loss of nonlesional cortex.

Ann Clin Transl Neurol

December 2024

MS Center Amsterdam, Department of Anatomy and Neurosciences, Amsterdam Neuroscience, Vrije Universiteit Amsterdam, Amsterdam, The Netherlands.

Objective: To assess the interrelationship between cortical lesions and cortical thinning and volume loss in people with multiple sclerosis within cortical networks, and how this relates to future cognition.

Methods: In this longitudinal study, 230 people with multiple sclerosis and 60 healthy controls underwent 3 Tesla MRI at baseline and neuropsychological assessment at baseline and 5-year follow-up. Cortical regions (N = 212) were divided into seven functional networks.

Background: The proliferation of generative artificial intelligence (AI), such as ChatGPT, has added complexity and richness to the virtual environment by increasing the presence of AI-generated content (AIGC). Although social media platforms such as TikTok have begun labeling AIGC to help users distinguish it from human-generated content, little research has examined the effect of these AIGC labels.

Objective: This study investigated the impact of AIGC labels on perceived accuracy, message credibility, and sharing intention for misinformation through a web-based experimental design, aiming to refine the strategic application of AIGC labels.
