Natural language processing of medical records offers tremendous potential to improve the patient experience. Sentiment analysis of clinical notes has been performed with mixed results, often highlighting the issue that dictionary ratings are not domain specific. Here, for the first time, we re-calibrate the labMT sentiment dictionary on 3.
View Article and Find Full Text PDFThe murder of George Floyd by police in May 2020 sparked international protests and brought unparalleled levels of attention to the Black Lives Matter movement. As we show, his death set record levels of activity and amplification on Twitter, prompted the saddest day in the platform's history, and caused his name to appear among the ten most frequently used phrases in a day, where he is the only individual to have ever received that level of attention who was not known to the public earlier that same week. Importantly, we find that the Black Lives Matter movement's rhetorical strategy to connect and repeat the names of past Black victims of police violence-foregrounding racial injustice as an ongoing pattern rather than a singular event-was exceptionally effective following George Floyd's death: attention given to him extended to over 185 prior Black victims, more than other past moments in the movement's history.
View Article and Find Full Text PDFBackground: Mental health challenges are thought to affect approximately 10% of the global population each year, with many of those affected going untreated because of the stigma and limited access to services. As social media lowers the barrier for joining difficult conversations and finding supportive groups, Twitter is an open source of language data describing the changing experience of a stigmatized group.
Objective: By measuring changes in the conversation around mental health on Twitter, we aim to quantify the hypothesized increase in discussions and awareness of the topic as well as the corresponding reduction in stigma around mental health.
Measuring the specific kind, temporal ordering, diversity, and turnover rate of stories surrounding any given subject is essential to developing a complete reckoning of that subject's historical impact. Here, we use Twitter as a distributed news and opinion aggregation source to identify and track the dynamics of the dominant day-scale stories around Donald Trump, the 45th President of the United States. Working with a data set comprising around 20 billion 1-grams, we first compare each day's 1-gram and 2-gram usage frequencies to those of a year before, to create day- and week-scale timelines for Trump stories for 2016-2021.
View Article and Find Full Text PDFIn real time, Twitter strongly imprints world events, popular culture, and the day-to-day, recording an ever-growing compendium of language change. Vitally, and absent from many standard corpora such as books and news archives, Twitter also encodes popularity and spreading through retweets. Here, we describe Storywrangler, an ongoing curation of over 100 billion tweets containing 1 trillion 1-grams from 2008 to 2021.
View Article and Find Full Text PDFWe study collective attention paid towards hurricanes through the lens of n-grams on Twitter, a social media platform with global reach. Using hurricane name mentions as a proxy for awareness, we find that the exogenous temporal dynamics are remarkably similar across storms, but that overall collective attention varies widely even among storms causing comparable deaths and damage. We construct 'hurricane attention maps' and observe that hurricanes causing deaths on (or economic damage to) the continental United States generate substantially more attention in English language tweets than those that do not.
View Article and Find Full Text PDFThe past decade has witnessed a marked increase in the use of social media by politicians, most notably exemplified by the 45th President of the United States (POTUS), Donald Trump. On Twitter, POTUS messages consistently attract high levels of engagement as measured by likes, retweets, and replies. Here, we quantify the balance of these activities, also known as "ratios", and study their dynamics as a proxy for collective political engagement in response to presidential communications.
View Article and Find Full Text PDFWorking from a dataset of 118 billion messages running from the start of 2009 to the end of 2019, we identify and explore the relative daily use of over 150 languages on Twitter. We find that eight languages comprise 80% of all tweets, with English, Japanese, Spanish, Arabic, and Portuguese being the most dominant. To quantify social spreading in each language over time, we compute the 'contagion ratio': The balance of retweets to organic messages.
View Article and Find Full Text PDFIn confronting the global spread of the coronavirus disease COVID-19 pandemic we must have coordinated medical, operational, and political responses. In all efforts, data is crucial. Fundamentally, and in the possible absence of a vaccine for 12 to 18 months, we need universal, well-documented testing for both the presence of the disease as well as confirmed recovery through serological tests for antibodies, and we need to track major socioeconomic indices.
View Article and Find Full Text PDF