Publications by Sophie Rosset

A Study on the Impacts of Slot Types and Training Data on Joint Natural Language Understanding in a Spanish Medication Management Assistant Scenario.

Surya Roca, Sophie Rosset, José García, Álvaro Alesanco

Sensors (Basel)

March 2022

This study evaluates the impacts of slot tagging and training data length on joint natural language understanding (NLU) models for medication management scenarios using chatbots in Spanish. In this study, we define the intents (purposes of the sentences) for medication management scenarios and two types of slot tags. For training the model, we generated four datasets, combining long/short sentences with long/short slots, while for testing, we collected the data from real interactions of users with a chatbot.

Lessons Learned from the Usability Evaluation of a Simulated Patient Dialogue System.

Leonardo Campillos-Llanos, Catherine Thomas, Éric Bilinski, Antoine Neuraz, Sophie Rosset

J Med Syst

May 2021

Simulated consultations through virtual patients allow medical students to practice history-taking skills. Ideally, applications should provide interactions in natural language and be multi-case, multi-specialty. Nevertheless, few systems handle or are tested on a large variety of cases.

Survey on evaluation methods for dialogue systems.

Jan Deriu, Alvaro Rodrigo, Arantxa Otegi, Guillermo Echegoyen, Sophie Rosset

Artif Intell Rev

June 2020

In this paper, we survey the methods and concepts developed for the evaluation of dialogue systems. Evaluation, in and of itself, is a crucial part during the development process. Often, dialogue systems are evaluated by means of human evaluations and questionnaires.

The Impact of Specialized Corpora for Word Embeddings in Natural Langage Understanding.

Antoine Neuraz, Bastien Rance, Nicolas Garcelon, Leonardo Campillos Llanos, Anita Burgun, Sophie Rosset

Stud Health Technol Inform

June 2020

Recent studies in the biomedical domain suggest that learning statistical word representations (static or contextualized word embeddings) on large corpora of specialized data improves the results on downstream natural language processing (NLP) tasks. In this paper, we explore the impact of the data source of word representations on a natural language understanding task. We compared embeddings learned with Fasttext (static embedding) and ELMo (contextualized embedding) representations, learned either on the general domain (Wikipedia) or on specialized data (electronic health records, EHR).

Do You Need Embeddings Trained on a Massive Specialized Corpus for Your Clinical Natural Language Processing Task?

Antoine Neuraz, Vincent Looten, Bastien Rance, Nicolas Daniel, Nicolas Garcelon, Sophie Rosset

Stud Health Technol Inform

August 2019

We explore the impact of data source on word representations for different NLP tasks in the clinical domain in French (natural language understanding and text classification). We compared word embeddings (Fasttext) and language models (ELMo), learned either on the general domain (Wikipedia) or on specialized data (electronic health records, EHR). The best results were obtained with ELMo representations learned on EHR data for one of the two tasks (+7% and +8% gain in F1-score).

Combining an expert-based medical entity recognizer to a machine-learning system: methods and a case study.

Pierre Zweigenbaum, Thomas Lavergne, Natalia Grabar, Thierry Hamon, Sophie Rosset

Biomed Inform Insights

September 2013

Medical entity recognition is currently generally performed by data-driven methods based on supervised machine learning. Expert-based systems, where linguistic and domain expertise are directly provided to the system, are often combined with data-driven systems. We present here a case study where an existing expert-based medical entity recognition system, Ogmios, is combined with a data-driven system, Caramba, based on a linear-chain Conditional Random Field (CRF) classifier.

Eventual situations for timeline extraction from clinical reports.

Cyril Grouin, Natalia Grabar, Thierry Hamon, Sophie Rosset, Xavier Tannier

J Am Med Inform Assoc

December 2013

Objective: To identify the temporal relations between clinical events and temporal expressions in clinical reports, as defined in the i2b2/VA 2012 challenge.

Design: To detect clinical events, we used rules and Conditional Random Fields. We built Random Forest models to identify event modality and polarity.

Hybrid methods for improving information access in clinical documents: concept, assertion, and relation identification.

Anne-Lyse Minard, Anne-Laure Ligozat, Asma Ben Abacha, Delphine Bernhard, Bruno Cartoni, Sophie Rosset

J Am Med Inform Assoc

January 2012

Objective: This paper describes the approaches the authors developed while participating in the i2b2/VA 2010 challenge to automatically extract medical concepts and annotate assertions on concepts and relations between concepts.

Design: The authors' approaches rely on both rule-based and machine-learning methods. Natural language processing is used to extract features from the input texts; these features are then used in the authors' machine-learning approaches.