This dataset was created to investigate the impact of data collection modes and pre-processing techniques on the quality of free comment data related to consumers' sensory perceptions. A total of 200 consumers were recruited and divided into two groups of 100. Each group evaluated six madeleine samples (five distinct samples and one replicate) in a sensory analysis laboratory, using different free comment data collection modes. Consumers in the first group provided only words or short expressions, while those in the second group used complete sentences. Additionally, participants reported their liking for each sample. The collected data provided valuable insights into the effectiveness of the free comment method in sensory evaluation of food products. They emphasized the importance of data pre-processing and demonstrated how the chosen techniques can impact the quality of the results. The dataset is based on real-world consumer data, showcasing how individuals naturally express their subjective perceptions. It features descriptions that reflect authentic consumer language, including informal expressions, incorrect phrasing, spelling errors, and unstructured sentences. This raw textual data has been annotated and translated into English. The dataset can therefore be repurposed to assess and compare the performance of different text mining, natural language processing and sentiment analysis algorithms in both French and English, as well as to drive innovations in AI tools for sensory and consumer research.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11742558 | PMC |
http://dx.doi.org/10.1016/j.dib.2024.111250 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!