Utilizing deep features from electroencephalography (EEG) data for emotional music composition provides a novel approach for creating personalized and emotionally rich music. Compared to textual data, converting continuous EEG and music data into discrete units presents significant challenges, particularly the lack of a clear and fixed vocabulary for standardizing EEG and audio data. The lack of this standard makes the mapping relationship between EEG signals and musical elements (such as rhythm, melody, and emotion) blurry and complex. Therefore, we propose a method of using clustering to create discrete representations and using the Transformer model to reverse mapping relationships. Specifically, the model uses clustering labels to segment signals and independently encodes EEG and emotional music data to construct a vocabulary, thereby achieving discrete representation. A time series dictionary was developed using clustering algorithms, which more effectively captures and utilizes the temporal and structural relationships between EEG and audio data. In response to the insensitivity to temporal information in heterogeneous data, we adopted a multi head attention mechanism and positional encoding technology to enable the model to focus on information in different subspaces, thereby enhancing the understanding of the complex internal structure of EEG and audio data. In addition, to address the mismatch between local and global information in emotion driven music generation, we introduce an audio masking prediction loss learning method. Our method generates music that 20 On the indicator, a performance of 68.19% was achieved, which improved the score by 4.9% compared to other methods, indicating the effectiveness of this method.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11366740 | PMC |
http://dx.doi.org/10.3389/fnbot.2024.1437737 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!