Publications by authors named "Ari Z Klein"

Article Synopsis
  • Researchers annotated over 9,700 tweets from users reporting their pregnancies to analyze health trends.
  • They developed deep neural network classifiers that achieved an F-score of 0.93 for identifying specific childhood health conditions associated with pregnancy exposures.
  • The study highlights Twitter's potential as a valuable tool for assessing relationships between pregnancy factors and childhood health outcomes on a broad scale.
Article Synopsis
  • The text discusses the significance of real-world data from social media, particularly Twitter, for health and social science research, emphasizing the need to identify user demographics like age and gender to evaluate research representativeness.
  • It outlines the objective of a scoping review that summarizes existing literature on methods for predicting Twitter users' age and gender, noting the challenges involved in this process.
  • The review analyzed 684 studies and found 74 relevant ones that discussed age or gender prediction. Gender prediction methods predominated, and reported accuracy varied for both age and gender classification.

Objective: The aim of the Social Media Mining for Health Applications (#SMM4H) shared tasks is to take a community-driven approach to address the natural language processing and machine learning challenges inherent to utilizing social media data for health informatics. In this paper, we present the annotated corpora, a technical summary of participants' systems, and the performance results.

Methods: The eighth iteration of the #SMM4H shared tasks was hosted at the AMIA 2023 Annual Symposium and consisted of 5 tasks that represented various social media platforms (Twitter and Reddit), languages (English and Spanish), methods (binary classification, multi-class classification, extraction, and normalization), and topics (COVID-19, therapies, social anxiety disorder, and adverse drug events).

Article Synopsis
  • Preterm birth, defined as delivery before 37 weeks, is a critical global health issue, contributing to high neonatal and infant mortality rates, especially in the U.S., with recent studies suggesting a link between COVID-19 infection during pregnancy and increased preterm birth risk.
  • This study employed machine learning and natural language processing to analyze Twitter data from pregnant women to determine the correlation between the timing of COVID-19 infection during pregnancy and the incidence of preterm births.
  • The analysis identified 298 Twitter users who reported COVID-19 infections and their birth outcomes, with a distribution of cases across the first, second, and third trimesters, and found a notable percentage of preterm births among those infected.
Article Synopsis
  • The latest iteration included five tasks across platforms like Twitter and Reddit, covering topics such as COVID-19, therapies, and drug-related events in both English and Spanish, with 29 teams participating from 18 countries.
  • The top-performing systems used advanced deep learning techniques, particularly pre-trained transformer models, and a dataset of over 61,000 social media posts will be available for future research.

Background: More than 6 million people in the United States have Alzheimer disease and related dementias, receiving help from more than 11 million family or other informal caregivers. A range of traditional interventions has been developed to support family caregivers; however, most of them have not been implemented in practice and remain largely inaccessible. While recent studies have shown that family caregivers of people with dementia use Twitter to discuss their experiences, methods have not been developed to enable the use of Twitter for interventions.

Article Synopsis
  • A study was conducted to evaluate the risks associated with medication use during pregnancy, focusing on β-blockers, which are commonly prescribed but have unclear safety profiles for fetal development.
  • Researchers analyzed 2.75 billion tweets from users who announced their pregnancies to identify those who took β-blockers and to assess their pregnancy outcomes.
  • They found 5,114 tweets discussing β-blocker use in pregnancy, with over 45% of those tweets confirming that users self-reported taking the medication, allowing them to estimate prenatal periods for further analysis of pregnancy outcomes.
Article Synopsis
  • Researchers aimed to understand COVID-19 transmission in the UK using Twitter as a data source due to limited testing and information.
  • They collected geo-tagged tweets indicating possible COVID-19 exposure using natural language processing and machine learning methods.
  • Findings showed that Twitter reports aligned with lab-confirmed cases, often appearing up to 2 weeks earlier, suggesting tweets could help identify trends and inform public health policies.
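One way to check whether one time series leads another, as in the finding above, is a lagged correlation. The following is a minimal sketch and not the study's actual analysis: the function names, the use of Pearson correlation, and the synthetic counts are all assumptions made for illustration.

```python
from math import sqrt
from typing import List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_lead(tweets: Sequence[float], cases: Sequence[float],
              max_lead: int) -> Tuple[int, float]:
    """Return the lead (in time steps) at which tweets[t] best matches cases[t + lead]."""
    best = (0, pearson(tweets, cases))
    for lead in range(1, max_lead + 1):
        r = pearson(tweets[:-lead], cases[lead:])
        if r > best[1]:
            best = (lead, r)
    return best


# Synthetic data (not from the study): cases repeat the tweet counts 3 steps later.
tweet_counts: List[float] = [1, 3, 2, 5, 8, 13, 9, 6, 4, 2, 1, 1]
case_counts: List[float] = [0, 0, 0] + tweet_counts[:-3]

lead, r = best_lead(tweet_counts, case_counts, max_lead=5)
```

A positive `lead` means the tweet series moves first. A real analysis would also need to account for noise, reporting delays, and autocorrelation, which this sketch ignores.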

Background: Pre-exposure prophylaxis (PrEP) is highly effective at preventing the acquisition of HIV. There is a substantial gap, however, between the number of people in the United States who have indications for PrEP and the number of them who are prescribed PrEP. Although Twitter content has been analyzed as a source of PrEP-related data (eg, barriers), methods have not been developed to enable the use of Twitter as a platform for implementing PrEP-related interventions.

Article Synopsis
  • Researchers developed an automated method called ReportAGE to identify the exact age of social media users based on their self-reported ages in tweets.
  • The system uses natural language processing techniques, including a deep neural network model, and achieved high accuracy in detecting age-related tweets and extracting exact ages.
  • ReportAGE was tested on over 1.2 billion tweets and successfully predicted the ages of 132,637 users, highlighting its potential for enhancing social media data analysis in research.
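To illustrate what extracting a self-reported age from a tweet can look like, here is a minimal pattern-based sketch. It is not the ReportAGE pipeline, which the synopsis describes as using a deep neural network; the patterns, the function name, and the age bounds are hypothetical.

```python
import re
from typing import Optional

# Hypothetical patterns for illustration only; not taken from ReportAGE.
AGE_PATTERNS = [
    re.compile(r"\bI(?:'m| am)\s+(\d{1,3})\s*(?:years?\s+old|yrs?\s+old|yo)\b",
               re.IGNORECASE),
    re.compile(r"\bI(?:'m| am)\s+turning\s+(\d{1,3})\b", re.IGNORECASE),
    re.compile(r"\bmy\s+(\d{1,3})(?:st|nd|rd|th)\s+birthday\b", re.IGNORECASE),
]


def extract_self_reported_age(text: str, min_age: int = 13,
                              max_age: int = 99) -> Optional[int]:
    """Return the first plausible first-person age found in the text, else None."""
    for pattern in AGE_PATTERNS:
        match = pattern.search(text)
        if match:
            age = int(match.group(1))
            # The bounds are an assumed plausibility filter.
            if min_age <= age <= max_age:
                return age
    return None
```

Patterns like these miss many phrasings and cannot tell whose age is being reported in ambiguous tweets, which is one reason a learned classifier is a natural fit for this task.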
Article Synopsis
  • COVID-19 poses serious risks to pregnant individuals, leading to complications such as maternal death and preterm birth, yet many are hesitant to get vaccinated due to insufficient safety data.
  • This study aimed to explore the use of Twitter data to identify pregnant users who have received the COVID-19 vaccine and to track their pregnancy outcomes.
  • The findings revealed that Twitter can serve as a valuable tool for gathering vaccination data from pregnant individuals, which could help enhance the understanding of vaccine safety and potentially improve vaccination rates in this population.
Article Synopsis
  • Researchers aimed to create an automated system using natural language processing to analyze Twitter data for potential unreported COVID-19 cases in the U.S., addressing issues with traditional testing methods.
  • They collected tweets related to COVID-19 from January 2020 and developed a classifier using deep learning techniques, specifically a BERT model, to distinguish tweets that self-report infections.
  • The model achieved an F-score of 0.76, showing promise in identifying potential cases based on social media data. The team processed over 85 million tweets during their study.
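For readers unfamiliar with the metric, the F-score (F1) reported in several of these synopses is the harmonic mean of precision and recall. The sketch below shows the standard computation from confusion counts. The function name is hypothetical and the counts in the usage note are made up, not taken from any study listed here.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Compute precision, recall, and F1 from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

For example, with made-up counts of 76 true positives, 24 false positives, and 24 false negatives, precision and recall are both 0.76, so F1 is also 0.76.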
Article Synopsis
  • A study highlights the common issues of miscarriage, stillbirth, and preterm birth in the U.S. but reveals that their causes are still largely unknown.
  • Researchers collected a dataset of 6,487 tweets related to these adverse pregnancy outcomes from a larger pool of over 400 million public tweets by pregnant women on Twitter.
  • The tweets are labeled to identify personal experiences with these outcomes, allowing for deeper insights into patient experiences and potentially enabling future machine learning studies to recognize more cases across social media.
Article Synopsis
  • The study explores how social media, particularly Twitter, can be used to track COVID-19 information shared by users in the U.S.
  • Researchers employed natural language processing and machine learning techniques to analyze the timing and location of these reports.
  • The findings indicate that this approach could serve as an early warning system for predicting the spread of COVID-19.
Article Synopsis
  • The study explores using social media mining to track COVID-19 reports on Twitter in England.
  • It builds on methods previously used in the US to identify personal accounts of COVID-19 experiences.
  • The results show that natural language processing and machine learning can effectively monitor the spread of the virus geographically and over time.
Article Synopsis
  • Researchers are aiming to leverage social media, specifically Twitter, to study the causes of negative pregnancy outcomes like miscarriage and stillbirth, which remain largely unknown.
  • They created a natural language processing system to automatically identify and select users who have announced their pregnancies for potential research comparisons.
  • After analyzing 2,361 pregnancy-related tweets using machine learning, they achieved high accuracy in identifying users whose pregnancies ended with a healthy outcome, with plans to use this data for broader studies on pregnancy outcomes.
Article Synopsis
  • The rise of social media use in health research raises concerns about the credibility of information, particularly since some posts may not come from genuine personal accounts.
  • Existing bot detection methods have not been tested on users sharing health-related information, which is vital for reliable research.
  • This paper enhances a political bot detection system for health research; adding new features and a machine learning classifier improved performance, yielding an F1-score of 0.7 for identifying bots.
Article Synopsis
  • In the U.S., significant percentages of pregnancies result in fetal loss, and preterm birth is a major cause of infant mortality, yet the reasons remain largely unclear.
  • The study aims to analyze how women discuss miscarriage, stillbirth, and preterm birth on Twitter and develop techniques to automatically identify relevant cases for further research.
  • Through a detailed analysis of over 400 million tweets, researchers created a filtering method to better identify tweets reporting personal experiences of adverse pregnancy outcomes, resulting in thousands of relevant tweets identified for further evaluation.
Article Synopsis
  • Researchers used Twitter to study pregnancies with birth defects by developing natural language processing (NLP) methods to automatically identify relevant tweets from a larger group of users.
  • They trained machine learning algorithms on over 22,000 tweets to distinguish between tweets that explicitly report birth defects and those that just mention them, addressing the challenge of class imbalances.
  • The study found that their SVM classifier was effective for identifying relevant tweets and suggested a new approach to uncover additional users for research, alongside creating a publicly available dataset for future machine learning projects.
Article Synopsis
  • The Intelligent Clinical Text Evaluator (INCITE) was developed to streamline the tedious and costly process of evaluating unstructured answers in medical exams using natural language processing (NLP).
  • INCITE uses semi-supervised learning combined with fuzzy matching techniques to identify relevant concepts in responses, achieving an F-score of 0.89 when evaluated against human reviewers.
  • While INCITE struggles with complex phrases and variability in annotator opinions, its customizable features and ability to learn from limited data can improve assessment accuracy and reduce inconsistencies.
Article Synopsis
  • Social media is becoming a valuable tool for studying medication-related information, but traditional data sources are still commonly used in observational studies.
  • An analysis of 27,941 tweets has been conducted to train machine learning algorithms that can automatically identify users' medication intake.
  • While a classifier performs well across most medication types, it struggles with nervous system medications, indicating that a specialized approach may be needed for more accurate studies in that area.
Article Synopsis
  • The study investigates how social media, particularly Twitter, can be used to collect data on rare health events like birth defects, which are a major cause of infant deaths in the US.
  • Researchers mined over 432 million tweets from pregnant users to find mentions of birth defects, using advanced text analysis techniques to ensure accurate data collection.
  • After analyzing 16,822 tweets, they were able to identify a group of 646 users whose pregnancies experienced birth defects, which could lead to better epidemiological analyses in the future.