Semantic comprehension aims to faithfully recover people's real intentions or thoughts, e.g., sentiment, humor, sarcasm, motivation, and offensiveness, from multiple modalities. It can be formulated as a multimodal multitask classification problem and applied to scenarios such as online public opinion supervision and political stance analysis. Previous methods generally either employ multimodal learning alone to handle varied modalities or exploit multitask learning alone to solve various tasks; few unify both in an integrated framework. Moreover, multimodal-multitask cooperative learning inevitably faces the challenge of modeling high-order relationships, i.e., intramodal, intermodal, and intertask relationships. Related research in brain science shows that the human brain achieves multimodal perception and multitask cognition for semantic comprehension through decomposing, associating, and synthesizing processes. Establishing a brain-inspired semantic comprehension framework that bridges the gap between multimodal and multitask learning is therefore the primary motivation of this work. Motivated by the strength of hypergraphs in modeling high-order relations, in this article we propose a hypergraph-induced multimodal-multitask (HIMM) network for semantic comprehension. HIMM incorporates monomodal, multimodal, and multitask hypergraph networks that mimic the decomposing, associating, and synthesizing processes to tackle the intramodal, intermodal, and intertask relationships, respectively. Furthermore, temporal and spatial hypergraph constructions are designed to model the relationships within modalities that have sequential and spatial structures, respectively. We also devise an alternating hypergraph updating algorithm in which vertices are aggregated to update hyperedges and hyperedges are in turn aggregated to update their connected vertices. Experiments on a dataset with two modalities and five tasks verify the effectiveness of HIMM for semantic comprehension.
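The abstract does not spell out the update rule, so the following is only a minimal sketch of one plausible vertex-to-hyperedge and hyperedge-to-vertex update, not the HIMM implementation. It assumes plain mean aggregation with no learned weights, and it assumes a sliding-window construction for the temporal hypergraph. The function names `temporal_incidence` and `alternate_update`, and the parameters `window` and `num_rounds`, are hypothetical and chosen here for illustration.

```python
import numpy as np


def temporal_incidence(n_steps, window):
    """Build a vertex-by-hyperedge incidence matrix for a sequence.

    ASSUMPTION: each hyperedge joins `window` consecutive time steps.
    The abstract does not describe the actual temporal construction.
    """
    if not 1 <= window <= n_steps:
        raise ValueError("window must be between 1 and n_steps")
    n_edges = n_steps - window + 1
    H = np.zeros((n_steps, n_edges))
    for e in range(n_edges):
        H[e:e + window, e] = 1.0  # hyperedge e covers steps e .. e+window-1
    return H


def alternate_update(X, H, num_rounds=1):
    """Alternate between updating hyperedges and updating vertices.

    X: (n_vertices, d) vertex features.
    H: (n_vertices, n_edges) 0/1 incidence matrix.
    ASSUMPTION: unweighted mean aggregation in both directions.
    """
    if num_rounds < 1:
        raise ValueError("num_rounds must be at least 1")
    X = np.asarray(X, dtype=float)
    H = np.asarray(H, dtype=float)
    edge_degree = H.sum(axis=0)    # number of vertices in each hyperedge
    vertex_degree = H.sum(axis=1)  # number of hyperedges touching each vertex
    isolated = vertex_degree == 0
    # Replace zero degrees by 1 to avoid division by zero.
    safe_edge = np.where(edge_degree == 0, 1.0, edge_degree)
    safe_vertex = np.where(isolated, 1.0, vertex_degree)
    for _ in range(num_rounds):
        # Step 1: vertices aggregate to update hyperedge features.
        E = (H.T @ X) / safe_edge[:, None]
        # Step 2: hyperedges aggregate to update their connected vertices.
        X_new = (H @ E) / safe_vertex[:, None]
        # Vertices in no hyperedge keep their previous features.
        X = np.where(isolated[:, None], X, X_new)
    return X, E


# Usage: four time steps with scalar features, hyperedges of two steps each.
H = temporal_incidence(n_steps=4, window=2)
X = np.array([[0.0], [2.0], [4.0], [6.0]])
X_new, E = alternate_update(X, H)
```

In this toy run the three hyperedges cover steps {0, 1}, {1, 2}, and {2, 3}, so each hyperedge feature is the mean of two neighboring steps and each vertex then averages the hyperedges it belongs to. A trained model would replace the plain means with learned transformations or attention weights.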


Source: http://dx.doi.org/10.1109/TNNLS.2023.3252359

