Publications by authors named "Johannes C Eichstaedt"

Large language models (LLMs) are becoming more widely used to simulate human participants, so understanding their biases is important. We developed an experimental framework using Big Five personality surveys and uncovered a previously undetected social desirability bias in a wide range of LLMs. By systematically varying the number of questions LLMs were exposed to, we demonstrate their ability to infer when they are being evaluated.

Social media can provide real-time insight into trends in substance use, addiction, and recovery. Prior studies have used platforms such as Reddit and X (formerly Twitter), but evolving policies around data access have threatened these platforms' usability in research. We evaluate the potential of a broad set of platforms to detect emerging trends in the opioid epidemic.

In the most comprehensive population surveys, mental health is only broadly captured through questionnaires asking about "mentally unhealthy days" or feelings of "sadness." Further, population mental health estimates are predominantly consolidated to yearly estimates at the state level, which is considerably coarser than the best estimates of physical health. Through the large-scale analysis of social media, robust estimation of population mental health is feasible at finer resolutions.

Large language models (LLMs) such as OpenAI's GPT-4 (which powers ChatGPT) and Google's Gemini, built on artificial intelligence, hold immense potential to support, augment, or even eventually automate psychotherapy. Enthusiasm about such applications is mounting in the field as well as in industry. These developments promise to address insufficient mental healthcare system capacity and scale individual access to personalized treatments.

The Cantril Ladder is among the most widely administered subjective well-being measures; every year, it is collected in 140+ countries in the Gallup World Poll and reported in the World Happiness Report. The measure asks respondents to evaluate their lives on a ladder from worst (bottom) to best (top). Prior work found that Cantril Ladder scores are sensitive to social comparison and reflect one's relative position in the income distribution.

Full national coverage below the state level is difficult to attain through survey-based data collection. Even the largest survey-based data collections, such as the CDC's Behavioral Risk Factor Surveillance System or the Gallup-Healthways Well-being Index (both with more than 300,000 responses p.a.

Opioid poisoning mortality is a substantial public health crisis in the United States, with opioids involved in approximately 75% of the nearly 1 million drug-related deaths since 1999. Research suggests that the epidemic is driven by both over-prescribing and social and psychological determinants such as economic stability, hopelessness, and isolation. Hindering this research is a lack of measurements of these social and psychological constructs at fine-grained spatial and temporal resolutions.

Many scholars have proposed that feeling what we believe others are feeling, often known as "empathy," is essential for other-regarding sentiments and plays an important role in our moral lives. Caring for and about others without necessarily sharing their feelings, often known as "compassion," is also frequently discussed as a relevant force for prosocial motivation and action. Here, we explore the relationship between empathy and compassion using the methods of computational linguistics.

Wellbeing is predominantly measured through surveys but is increasingly measured by analysing individuals' language on social media platforms using social media text mining (SMTM). To investigate whether the structure of wellbeing is similar across both data collection methods, we compared networks derived from survey items and social media language features collected from the same participants. The dataset was split into an independent exploration (n = 1169) and a final subset (n = 1000).

Background: An infodemic is excess information, including false or misleading information, that spreads in digital and physical environments during a public health emergency. The COVID-19 pandemic has been accompanied by an unprecedented global infodemic that has led to confusion about the benefits of medical and public health interventions, with substantial impact on risk-taking and health-seeking behaviors, eroding trust in health authorities and compromising the effectiveness of public health responses and policies. Standardized measures are needed to quantify the harmful impacts of the infodemic in a systematic and methodologically robust manner, and to harmonize the highly divergent approaches currently explored for this purpose.

Extensive evidence demonstrates the effects of area-based disadvantage on a variety of life outcomes, such as increased mortality and low economic mobility. Despite these well-established patterns, disadvantage, often measured using composite indices, is inconsistently operationalized across studies. To address this issue, we systematically compared 5 U.

Introduction: Although surveys are a well-established instrument to capture the population prevalence of mental health at a moment in time, public Twitter is a continuously available data source that can provide a broader window into population mental health. We characterized the relationship between COVID-19 case counts, stay-at-home orders because of COVID-19, and anxiety and depression in 7 major U.S.

We study the language differentially associated with loneliness and depression using 3.4 million Facebook posts from 2986 individuals, and uncover the statistical associations of survey-based depression and loneliness with both dictionary-based (Linguistic Inquiry and Word Count 2015) and open-vocabulary linguistic features (words, phrases, and topics). Loneliness and depression were found to have highly overlapping language profiles, including sickness, pain, and negative emotions as (cross-sectional) risk factors, and social relationships and activities as protective factors.

Background: Personal sensing has shown promise for detecting behavioral correlates of depression, but there is little work examining personal sensing of cognitive and affective states. Digital language, particularly through personal text messages, is one source that can measure these markers.

Methods: We correlated privacy-preserving sentiment analysis of text messages with self-reported depression symptom severity.
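
The correlation step named in the Methods can be illustrated with a minimal sketch. It assumes one mean sentiment score and one self-reported severity score per participant; the function name `pearson_r`, the variable names, and all values are hypothetical and are not taken from the study. The choice of Pearson's r is also an assumption, since the excerpt does not say which correlation was used, and the privacy-preserving scoring itself is not shown.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares around the means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-participant values (illustrative only, not study data):
# mean sentiment of a participant's messages and a symptom severity score.
mean_sentiment = [0.42, 0.10, -0.05, 0.31, -0.20]
depression_severity = [4, 11, 15, 6, 19]

r = pearson_r(mean_sentiment, depression_severity)
print(f"r = {r:.2f}")
```

In this made-up example, lower sentiment goes with higher severity, so the coefficient is negative; real data would need significance testing and attention to repeated measures.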

Technology now makes it possible to understand efficiently and at large scale how people use language to reveal their everyday thoughts, behaviors, and emotions. Written text has been analyzed through both theory-based, closed-vocabulary methods from the social sciences and data-driven, open-vocabulary methods from computer science, but these approaches have not been comprehensively compared. To provide guidance on best practices for automatically analyzing written text, this narrative review and quantitative synthesis compares five predominant closed- and open-vocabulary methods: Linguistic Inquiry and Word Count (LIWC), the General Inquirer, DICTION, Latent Dirichlet Allocation, and Differential Language Analysis.

Looking to supplement common economic indicators, politicians and policymakers are increasingly interested in how to measure and improve the subjective well-being of communities. Theories about nonprofit organizations suggest that they represent a potential policy-amenable lever to increase community subjective well-being. Using longitudinal cross-lagged panel models with IRS and Twitter data, this study explores whether communities with higher numbers of nonprofits per capita exhibit greater subjective well-being in the form of more expressions of positive emotion, engagement, and relationships.

On May 25, 2020, George Floyd, an unarmed Black American male, was killed by a White police officer. Footage of the murder was widely shared. We examined the psychological impact of Floyd's death using two population surveys that collected data before and after his death: one from Gallup (117,568 responses from N = 47,355) and one from the US Census (409,652 responses from N = 319,471).

Objective: We explore the personality of counties as assessed through linguistic patterns on social media. Such studies were previously limited by the cost and feasibility of large-scale surveys; however, language-based computational models applied to large social media datasets now allow for large-scale personality assessment.

Method: We applied a language-based assessment of the five factor model of personality to 6,064,267 U.

Background: Oral histories from 9/11 responders to the World Trade Center (WTC) attacks provide rich narratives about distress and resilience. Artificial Intelligence (AI) models promise to detect psychopathology in natural language, but they have been evaluated primarily in non-clinical settings using social media. This study sought to test the ability of AI-based language assessments to predict PTSD symptom trajectories among responders.

Psychological research has shown that subjective well-being is sensitive to social comparison effects; individuals report decreased happiness when their neighbors earn more than they do. In this work, we use Twitter language to estimate the well-being of users, and model both individual and neighborhood income using hierarchical modeling across counties in the United States (US). We show that language-based estimates from a sample of 5.

A rapidly growing literature has attempted to explain Donald Trump's success in the 2016 U.S. presidential election as a result of a wide variety of differences in individual characteristics, attitudes, and social processes.

Researchers and policy makers worldwide are interested in measuring the subjective well-being of populations. When users post on social media, they leave behind digital traces that reflect their thoughts and feelings. Aggregation of such digital traces may make it possible to monitor well-being at large scale.

Excessive alcohol use in the US contributes to over 88,000 deaths per year and costs over $250 billion annually. While previous studies have shown that excessive alcohol use can be detected from general patterns of social media engagement, we characterized how drinking-specific language varies across regions and cultures in the US. From a database of 38 billion public tweets, we selected those mentioning "drunk", found the words and phrases distinctive of drinking posts, and then clustered these into topics and sets of semantically related words.
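
The first two steps described above (selecting posts that mention a keyword, then finding words distinctive of those posts) can be sketched roughly as follows. This is not the authors' pipeline: the scoring used here, a smoothed log-ratio of relative word frequencies, is a simple stand-in chosen for illustration, and the function names, tokenizer, and example posts are all hypothetical. The later clustering into topics is not shown.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a post and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def distinctive_words(posts, keyword, top_k=3):
    """Rank words by how much more frequent they are in posts containing
    `keyword` than in all other posts (add-one smoothed log-ratio)."""
    target, background = Counter(), Counter()
    for post in posts:
        tokens = tokenize(post)
        bucket = target if keyword in tokens else background
        # The keyword itself is excluded so it does not dominate the ranking
        bucket.update(t for t in tokens if t != keyword)
    vocab = set(target) | set(background)
    # Add-one smoothing: each vocabulary word gets one pseudo-count per group
    n_target = sum(target.values()) + len(vocab)
    n_background = sum(background.values()) + len(vocab)
    scores = {
        w: math.log((target[w] + 1) / n_target)
        - math.log((background[w] + 1) / n_background)
        for w in vocab
    }
    # Highest score first; ties broken alphabetically for determinism
    return sorted(scores, key=lambda w: (-scores[w], w))[:top_k]


# Hypothetical posts (illustrative only, not data from the study)
posts = [
    "so drunk at the bar last night",
    "drunk again at the bar with friends",
    "meeting at the office this morning",
    "coffee with friends this morning",
]
top = distinctive_words(posts, "drunk", top_k=3)
print(top)
```

At the scale of billions of posts, a streaming or distributed count and a more robust statistic would likely be needed; the sketch only shows the shape of the comparison.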

We studied whether medical conditions across 21 broad categories were predictable from social media content across approximately 20 million words written by 999 consenting patients. Facebook language significantly improved upon the prediction accuracy of demographic variables for 18 of the 21 disease categories; it was particularly effective at predicting diabetes and mental health conditions including anxiety, depression and psychoses. Social media data are a quantifiable link into the otherwise elusive daily lives of patients, providing an avenue for study and assessment of behavioral and environmental disease risk factors.
