With the objective of extracting new knowledge about rare diseases from social media messages, we evaluated three models on a Named Entity Recognition (NER) task, consisting of extracting phenotypes and treatments from social media messages. We trained the three models on a dataset with social media messages about Developmental and Epileptic Encephalopathies and more common diseases. This preliminary study revealed that CamemBERT and CamemBERT-bio exhibit similar performance on social media testimonials, slightly outperforming DrBERT. It also highlighted that their performance was lower on this type of data than on structured health datasets. Limitations, including a narrow focus on NER performance and dataset-specific evaluation, call for further research to fully assess model capabilities on larger and more diverse datasets.

Download full-text PDF

Source
http://dx.doi.org/10.3233/SHTI240556DOI Listing

Publication Analysis

Top Keywords

social media
20
media messages
12
three models
8
social
5
media
5
evaluation bert-based
4
bert-based models
4
models patient
4
patient data
4
data french
4

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!