Evaluating LLMs' grammatical error correction performance in learner Chinese.

PLoS One

Yantai Institute of Technology, Yantai, China.

Published: October 2024

Large language models (LLMs) have recently shown strong capabilities in a range of English NLP tasks. However, their performance in Chinese grammatical error correction (CGEC) remains unexplored. This study evaluates the ability of state-of-the-art LLMs to correct errors in learner Chinese from a corpus-linguistic perspective. Performance is assessed with the standard MaxMatch evaluation metrics. Keyword and key n-gram analyses are conducted to quantitatively explore the linguistic features that differentiate LLM outputs from those of human annotators, and these keywords and key n-grams then serve as probes for a qualitative analysis of LLM performance in the syntactic and semantic dimensions. Results show that LLMs perform relatively well on test datasets with multiple annotators and poorly on those with a single annotator. Even when explicitly prompted to follow a "minimal edit" strategy, LLMs tend to overcorrect erroneous sentences, using more linguistic devices to produce fluent and grammatical output. They also struggle with under-correction and hallucination in reasoning-dependent situations. These findings highlight the strengths and limitations of LLMs in CGEC and suggest that future efforts should focus on curbing overcorrection and improving the handling of complex semantic contexts.
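To make the MaxMatch-style scoring mentioned in the abstract concrete, the following is a minimal sketch of how precision, recall, and an F-score can be computed over sets of edits. It is not the authors' evaluation code. It assumes the edits have already been extracted (the actual MaxMatch scorer additionally searches for the system edit sequence that best matches the gold annotation), and it assumes beta = 0.5, the value conventionally reported with this scorer; the abstract does not state which beta was used. The `Edit` tuple layout and the function name `f_beta` are hypothetical.

```python
from typing import Set, Tuple

# Hypothetical edit representation: (start_token, end_token, replacement).
Edit = Tuple[int, int, str]


def f_beta(system_edits: Set[Edit], gold_edits: Set[Edit],
           beta: float = 0.5) -> Tuple[float, float, float]:
    """Return (precision, recall, F_beta) for one sentence's edit sets.

    beta = 0.5 weights precision twice as much as recall (an assumption
    here, following common GEC practice).
    """
    true_positives = len(system_edits & gold_edits)
    # Convention assumed here: empty edit sets count as perfect on that side.
    precision = true_positives / len(system_edits) if system_edits else 1.0
    recall = true_positives / len(gold_edits) if gold_edits else 1.0
    if precision + recall == 0:
        return precision, recall, 0.0
    b2 = beta * beta
    score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, score


# Toy example: the system makes the gold edit plus one unnecessary edit.
gold = {(1, 2, "了")}
system = {(1, 2, "了"), (3, 4, "很")}
p, r, f = f_beta(system, gold)
print(f"P={p:.3f} R={r:.3f} F0.5={f:.3f}")
```

In this toy case the extra edit halves precision while recall stays at 1.0, so a precision-weighted F-score drops sharply. This is one way overcorrection of the kind the abstract reports can lower scores, particularly against a single-annotator reference that accepts only one correction.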

Full text:
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11524451
PLOS: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0312881
