Public Health Discussions on Social Media: Evaluating Automated Sentiment Analysis Methods.

JMIR Form Res

Department of Health Administration, The College of Health Professions, Central Michigan University, Mt Pleasant, MI, United States.

Published: January 2025

AI Article Synopsis

  • Sentiment analysis is a key method for analyzing text, especially in social media research, where the choice between manual and automated methods is crucial.
  • The study compared several sentiment analysis tools, including VADER, TEXT2DATA, LIWC-22, and ChatGPT 4.0, against manually coded sentiment scores from YouTube comments on the opioid crisis, assessing factors like ease of use and cost.
  • Findings revealed that LIWC-22 excelled in identifying sentiment patterns, while VADER was best at classifying negative comments, but overall, automated tools showed only fair agreement with manual coding, with ChatGPT performing poorly.

Article Abstract

Background: Sentiment analysis is one of the most widely used methods for mining and examining text. Social media researchers need guidance on choosing between manual and automated sentiment analysis methods.

Objective: Popular sentiment analysis tools based on natural language processing (NLP; VADER [Valence Aware Dictionary for Sentiment Reasoning], TEXT2DATA [T2D], and Linguistic Inquiry and Word Count [LIWC-22]) and a large language model (ChatGPT 4.0) were compared with manually coded sentiment scores, as applied to the analysis of YouTube comments on videos discussing the opioid epidemic. The sentiment analysis methods were also compared on ease of programming, monetary cost, and other practical considerations.

Methods: Evaluation methods included descriptive statistics, receiver operating characteristic (ROC) curve analysis, confusion matrices, Cohen κ, accuracy, specificity, precision, sensitivity (recall), the F-score (harmonic mean of precision and recall), and the Matthews correlation coefficient (MCC). An inductive, iterative approach to content analysis of the data was used to obtain manual sentiment codes.
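All of the agreement and classification metrics named above can be derived from a single 2×2 confusion matrix. A minimal pure-Python sketch (not the authors' code; the counts at the bottom are hypothetical, not figures from the study):

```python
import math

def binary_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Agreement and classification metrics from a 2x2 confusion matrix."""
    n = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)              # sensitivity
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / n
    f1 = 2 * precision * recall / (precision + recall)
    # Cohen's kappa: observed agreement corrected for chance agreement
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (accuracy - p_chance) / (1 - p_chance)
    # Matthews correlation coefficient
    mcc = (tp * tn - fp * fn) / math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "specificity": specificity, "f1": f1, "kappa": kappa, "mcc": mcc}

# Hypothetical counts for illustration only:
m = binary_metrics(tp=40, fp=10, fn=5, tn=45)
```

In practice these are one-liners in scikit-learn (`cohen_kappa_score`, `matthews_corrcoef`, `f1_score`), but the explicit formulas make the chance-correction in κ visible.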

Results: A subset of comments was analyzed by a second coder, producing good agreement between the 2 coders' judgments (κ=0.734). YouTube comments about the opioid crisis included many more negative comments (4286/4871, 88%) than positive comments (79/662, 12%), making it possible to evaluate the performance of sentiment analysis models on an unbalanced dataset. The tone summary measure from LIWC-22 performed better than the other tools at estimating the prevalence of negative versus positive sentiment. According to the ROC curve analysis, VADER was best at classifying manually coded negative comments. A comparison of Cohen κ values indicated that the NLP tools (VADER, followed by LIWC's tone and T2D) showed only fair agreement with manual coding. In contrast, ChatGPT 4.0 had poor agreement and failed to generate binary sentiment scores in 2 out of 3 attempts. Variations in accuracy, specificity, precision, sensitivity, F-score, and MCC did not reveal a single superior model. F-score harmonic means were 0.34-0.38 (SD 0.02) for the NLP tools and very low (0.13) for ChatGPT 4.0. None of the MCCs reached a strong correlation level.
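The roughly 7:1 imbalance reported above is precisely the setting where accuracy is misleading and MCC is informative. A small illustrative sketch with made-up labels (not the study's data): a degenerate classifier that calls every comment negative scores 88% accuracy on an 88:12 split yet has zero MCC.

```python
import math

def mcc(tp: int, fp: int, fn: int, tn: int) -> float:
    """Matthews correlation coefficient with a zero-denominator guard."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical 88:12 split: 1 = negative comment, 0 = positive comment.
y_true = [1] * 88 + [0] * 12
y_pred = [1] * 100                      # calls everything negative

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))

accuracy = (tp + tn) / len(y_true)      # 0.88 -- looks good
score = mcc(tp, fp, fn, tn)             # 0.0  -- reveals no real skill
```

This is why the abstract reports MCC and F-scores alongside accuracy rather than accuracy alone.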

Conclusions: Researchers studying negative emotions, public worries, or dissatisfaction with social media face unique challenges in selecting models suitable for unbalanced datasets. We recommend VADER, the only cost-free tool we evaluated, due to its excellent discrimination, which can be further improved when the comments are at least 100 characters long. If estimating the prevalence of negative comments in an unbalanced dataset is important, we recommend the tone summary measure from LIWC-22. Researchers using T2D should be aware that it may score only part of the data and, compared with the other methods, can be more time-consuming and cost-prohibitive. A general-purpose large language model, ChatGPT 4.0, has yet to surpass the performance of NLP models, at least for unbalanced datasets with highly prevalent (7:1) negative comments.
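For readers unfamiliar with how a lexicon-based tool like VADER produces a score, here is a toy sketch of the approach: sum per-word valences from a rated lexicon, then squash the sum into [-1, 1]. The lexicon and valence values below are invented for illustration (the real VADER lexicon contains thousands of rated tokens), and the squashing function mirrors VADER's reported normalization, score/sqrt(score² + α) with α = 15, plus its conventional ±0.05 classification threshold.

```python
import math

# Toy lexicon with invented valences -- NOT the real VADER lexicon.
LEXICON = {"crisis": -1.5, "overdose": -2.5, "tragic": -2.1,
           "hope": 1.9, "recovery": 1.6}

def compound(text: str, alpha: float = 15.0) -> float:
    """Sum lexicon valences over tokens and squash into [-1, 1]."""
    s = sum(LEXICON.get(tok, 0.0) for tok in text.lower().split())
    return s / math.sqrt(s * s + alpha)

def label(text: str) -> str:
    """Map the compound score to a sentiment class via a +/-0.05 threshold."""
    c = compound(text)
    if c >= 0.05:
        return "positive"
    if c <= -0.05:
        return "negative"
    return "neutral"
```

The actual tool (the `vaderSentiment` package) adds rules for negation, intensifiers, punctuation, and capitalization on top of this basic valence-summing scheme.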

Source
http://dx.doi.org/10.2196/57395

Publication Analysis

Top Keywords: sentiment analysis (24); social media (16); negative comments (16); analysis methods (12); sentiment (11); analysis (10); automated sentiment (8); large language (8); language model (8); model chatgpt (8)

Similar Publications

Background: Social media platforms are increasingly used by people living with dementia and their care partners to seek information and advice, share personal stories, raise awareness, and offer support to others. Engagement with social media is often accompanied by a personal disclosure of a dementia diagnosis or identification as a care partner, but the impact of this disclosure remains unknown. Social media engagement can be beneficial by facilitating peer interactions and social support; however, experts have raised concerns about the potential for exposure to misinformation and stigma as a result of self-disclosure.


This paper addresses the limitations of existing methods for handling comments about tourist attractions. With the technical support of Artificial Intelligence (AI), an online comment analysis method for tourist attractions based on a text mining model and an attention mechanism is proposed. In the text mining process, the attention mechanism is used to calculate the contribution of each topic to the text representation in the topic layer of Latent Dirichlet Allocation (LDA).


To explore the attitudes of healthcare professionals and the public toward applying ChatGPT in clinical practice. The successful application of ChatGPT in clinical practice depends not only on technical performance but, critically, on the attitudes and perceptions of healthcare professionals and non-professionals. This study has a qualitative design based on artificial intelligence.



In this paper, we analyse the attitudes and sentiments of Romanian smallholders towards mole infestations, as expressed in online contexts. A corpus of texts on the topic of ground moles and how to get rid of them was collected from social media and blog thread discussions. The texts were analysed using topic modelling, clustering, and sentiment analysis, revealing both negative and positive sentiments and attitudes.

