Mid-infrared spectra of dried and roasted cocoa (Theobroma cacao L.): A dataset for machine learning-based classification of cocoa varieties and prediction of theobromine and caffeine content.

Data Brief

Centro Surcolombiano de Investigación en Café (CESURCAFÉ), Departamento de Ingeniería Agrícola, Universidad Surcolombiana, Neiva-Huila 410001, Colombia.

Published: February 2025

This paper presents a dataset of mid-infrared spectra for dried and roasted cocoa beans (Theobroma cacao L.), along with their corresponding theobromine and caffeine content. Infrared data were acquired using Attenuated Total Reflectance-Fourier Transform Infrared (ATR-FTIR) spectroscopy, while High-Performance Liquid Chromatography (HPLC) was employed to quantify theobromine and caffeine in the dried cocoa beans. The theobromine/caffeine ratio served as a chemical marker for distinguishing between cocoa varieties. The dataset provides a basis for further research: mid-infrared spectral data can be combined with HPLC measurements (as the reference standard) to fine-tune machine learning and deep learning models. Such models could simultaneously predict theobromine content, caffeine content, and cocoa variety in both dried and roasted samples, using a non-destructive approach based on spectral data. Tools developed from this dataset could advance automated processes in the cocoa industry and support decision-making on an industrial scale. Potential uses include real-time quality control of cocoa-based products, improved cocoa variety classification, and optimized bean selection, blending strategies, and product formulation, while reducing the need for labor-intensive and costly quantification methods. The dataset is organized into Excel sheets and structured according to experimental conditions and replicates, providing a framework for further analysis, model development, and calibration of multivariate statistical models.
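The theobromine/caffeine marker mentioned above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the variety labels, the numeric values, and the function names `tb_cf_ratio` and `mean_ratio_by_variety` are hypothetical placeholders, and the records are synthetic rather than taken from the dataset. The sketch only shows how a per-sample ratio might be computed from HPLC-style measurements and averaged within each variety.

```python
from collections import defaultdict
from statistics import mean

# Synthetic placeholder records (NOT values from the dataset):
# (variety label, theobromine, caffeine), both in the same unit.
samples = [
    ("variety_A", 10.0, 4.0),
    ("variety_A", 11.0, 5.0),
    ("variety_B", 12.0, 2.0),
    ("variety_B", 15.0, 3.0),
]


def tb_cf_ratio(theobromine, caffeine):
    """Return the theobromine/caffeine ratio; caffeine must be positive."""
    if caffeine <= 0:
        raise ValueError("caffeine content must be positive")
    return theobromine / caffeine


def mean_ratio_by_variety(records):
    """Group records by variety label and average the ratio in each group."""
    grouped = defaultdict(list)
    for variety, theobromine, caffeine in records:
        grouped[variety].append(tb_cf_ratio(theobromine, caffeine))
    return {variety: mean(ratios) for variety, ratios in grouped.items()}


ratios = mean_ratio_by_variety(samples)
for variety, value in sorted(ratios.items()):
    print(f"{variety}: mean theobromine/caffeine ratio = {value:.2f}")
```

With real data, the records would instead be read from the dataset's Excel sheets, and the ratio could serve as one reference label alongside the spectra when training a classifier.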

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11748727
DOI: http://dx.doi.org/10.1016/j.dib.2024.111243
