Publications by authors named "Leon Derczynski"

Deliberately eliciting abnormal outputs from Large Language Models (LLMs) by attacking them is a novel human activity. This paper presents a thorough exposition of how and why people perform such attacks, defining LLM red-teaming based on extensive and diverse evidence. Using a formal qualitative methodology, we interviewed dozens of practitioners from a broad range of backgrounds, all of whom contribute to this novel effort to make LLMs fail.

Data-driven and machine learning based approaches for detecting, categorising and measuring abusive content such as hate speech and harassment have gained traction due to their scalability, robustness and increasingly high performance. Building effective detection systems for abusive content relies on having the right training datasets, reflecting a widely accepted mantra in computer science: Garbage In, Garbage Out. However, creating training datasets which are large, varied, theoretically informed and that minimise biases is difficult, laborious and requires deep expertise.

We aimed to investigate whether daily fluctuations in mental health-relevant Twitter posts are associated with daily fluctuations in mental health crisis episodes. We conducted a primary and replicated time-series analysis of retrospectively collected data from Twitter and two London mental healthcare providers. Daily numbers of 'crisis episodes' were defined as incident inpatient, home treatment team and crisis house referrals between 2010 and 2014.
