Time-frequency feature representation using multi-resolution texture analysis and acoustic activity detector for real-life speech emotion recognition.

Sensors (Basel)

Department of Information Technology & Communication, Shih Chien University, 200 University Road, Neimen, Kaohsiung 84550, Taiwan.

Published: January 2015

The classification of emotional speech is widely studied in speech-related research on human-computer interaction (HCI). This paper presents a novel feature extraction method based on multi-resolution texture image information (MRTII). The MRTII feature set is derived from multi-resolution texture analysis and is used to characterize and classify the different emotions in a speech signal. The motivation is that emotions have different intensity values in different frequency bands. In terms of human visual perception, the texture properties of an emotional speech spectrogram at multiple resolutions should be a good feature set for emotion classification in speech. Furthermore, multi-resolution texture analysis can discriminate between emotions more clearly than uniform-resolution texture analysis. To provide high accuracy of emotional discrimination, especially in real-life conditions, an acoustic activity detection (AAD) algorithm is applied in the MRTII-based feature extraction. Because many blended emotions are present in real life, this paper uses two corpora of naturally occurring dialogs recorded in real-life call centers. Compared with traditional Mel-scale Frequency Cepstral Coefficients (MFCC) and state-of-the-art features, the MRTII features also improve the correct classification rates of the proposed systems across databases in different languages. Experimental results show that the proposed MRTII-based features, inspired by human visual perception of the spectrogram image, can provide significant classification performance for real-life speech emotion recognition.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4327087
DOI: http://dx.doi.org/10.3390/s150101458

Publication Analysis

Top Keywords

multi-resolution texture: 8
texture analysis: 8
acoustic activity: 8
emotional speech: 8
feature extraction: 8
feature set: 8
human visual: 8
analysis texture: 8
mrtii-based feature: 8
texture: 6

Similar Publications

Article Synopsis
  • The biomedical imaging field has expanded significantly, with an increasing need for computer-assisted diagnosis, especially highlighted by the COVID-19 pandemic.
  • A novel content-based medical image retrieval system is introduced, utilizing a unique pattern descriptor called MsNrRiTxP to extract multi-resolution, noise-resistant, and rotation-invariant texture features from medical images.
  • The proposed method showed superior retrieval performance in tests across various CT and MRI datasets, outperforming existing pattern-based systems in both noise-free and noisy conditions.

Electrofacies analysis captures distribution effects throughout the reservoir despite the difficulty of characterizing stratigraphic relationships. Clustering methods quantitatively separate reservoir from non-reservoir zones on the basis of electrofacies. The Asmari Formation is the most significant reservoir of the Mansouri oilfield in SW Iran and is generally composed of carbonate and sandstone layers.

Arbitrary scale super-resolution diffusion model for brain MRI images.

Comput Biol Med

March 2024

School of Information Science and Engineering, Shandong Normal University, Jinan, 250358, China.

Given the constraints posed by hardware capacity, scan duration, and patient cooperation, the reconstruction of magnetic resonance imaging (MRI) images is a pivotal aspect of medical imaging research. Deep learning-based super-resolution (SR) methods are now widely discussed in medical image processing because they can reconstruct high-quality, high-resolution (HR) images from low-resolution (LR) inputs. However, most existing MRI SR methods are designed for specific magnifications and cannot generate MRI images at arbitrary scales, which hinders radiologists from fully visualizing lesions.

Recent clinical research describes a subset of glioblastoma patients who exhibit rapid early progression (REP) prior to the start of radiation therapy. The literature has thus far described this population using clinicopathologic features. To our knowledge, this study is the first to investigate the potential of conventional radiomics, sophisticated multi-resolution fractal texture features, and different molecular features (MGMT, IDH mutations) as diagnostic and prognostic tools for distinguishing REP from non-REP cases using computational and statistical modeling methods.

Multi-Resolution 3D Rendering for High-Performance Web AR.

Sensors (Basel)

August 2023

Laboratory of Photogrammetry, School of Rural, Surveying and Geoinformatics Engineering, National Technical University of Athens, 15780 Athens, Greece.

In the context of web augmented reality (AR), 3D rendering that maintains visual quality and frame rate requirements remains a challenge. The lack of a dedicated and efficient 3D format often results in the degraded visual quality of the original data and compromises the user experience. This paper examines the integration of web-streamable view-dependent representations of large-sized and high-resolution 3D models in web AR applications.
