Publications by authors named "Carlos Ishi"

Article Synopsis
  • Emotion recognition in conversation (ERC) involves interpreting human emotions by analyzing various contextual and multimodal data, but current research often neglects issues related to unimodal discrepancies and speaker dependencies.
  • The paper introduces a new model called HiMul-LGG, which uses a hierarchical decision fusion approach to align different modalities and a local-global graph neural network to better capture speaker-specific dependencies.
  • Testing on two datasets, IEMOCAP and MELD, shows that HiMul-LGG outperforms existing methods, with additional findings supporting the effectiveness of its design strategies.

Android robots capable of emotional interactions with humans have considerable potential for application to research. While several studies developed androids that can exhibit human-like emotional facial expressions, few have empirically validated androids' facial expressions. To investigate this issue, we developed an android head called Nikola based on human psychology and conducted three studies to test the validity of its facial expressions.

Emotion recognition has drawn consistent attention from researchers recently. Although the gesture modality plays an important role in expressing emotion, it is seldom considered in the field of emotion recognition. A key reason is the scarcity of labeled datasets that include 3D skeleton data.

Emotion recognition has been gaining attention in recent years due to its applications in artificial agents. To achieve good performance on this task, much research has been conducted on multimodal emotion recognition models that leverage the different strengths of each modality. However, a research question remains: what is the most appropriate way to fuse information from different modalities? In this paper, we propose audio sample augmentation and an emotion-oriented encoder-decoder to improve emotion recognition performance, and we discuss an inter-modality, decision-level fusion method based on a graph attention network (GAT).

Purpose: The purpose of this study was to investigate the influence of room acoustics on the relationship between vowel height and vocal fry.

Methods: This was a cross-sectional study. Participants (college students, n = 40) read the first six sentences of "The Rainbow Passage" under nine simulated room acoustic conditions.
