Publications by Nanna Inie

Summon a demon and bind it: A grounded theory of LLM red teaming.

Nanna Inie, Jonathan Stray, Leon Derczynski

PLoS One

January 2025

Engaging in the deliberate generation of abnormal outputs from Large Language Models (LLMs) by attacking them is a novel human activity. This paper presents a thorough exposition of how and why people perform such attacks, defining LLM red-teaming based on extensive and diverse evidence. Using a formal qualitative methodology, we interviewed dozens of practitioners from a broad range of backgrounds, all contributors to this novel work of attempting to cause LLMs to fail.