LFSRM: Few-Shot Diagram-Sentence Matching via Local-Feedback Self-Regulating Memory.

Lingling Zhang Wenjun Wu Jun Liu Xiaojun Chang Xin Hu Yuhui Zheng Yaqiang Wu Qinghua Zheng

IEEE Trans Pattern Anal Mach Intell

Published: January 2025

Image-sentence matching, which aims to understand the correspondence between vision and language, has achieved significant progress with various deep methods trained under large-scale supervision. Unlike natural images taken by a camera, diagrams in textbooks contain more graphic objects, drawings, and natural objects, and diagram-sentence matching plays an important role in textbook understanding and question answering. However, existing matching models are not well suited to the challenging task of matching diagrams with sentences, because the few-shot content and incomplete description problems are more severe there. In this paper, we propose a novel local-feedback self-regulating memory framework (LFSRM) for diagram-sentence matching. On the one hand, LFSRM includes an external memory that stores useful multi-modal information, especially uncommon content, to overcome the few-shot content problem; the memory is updated flexibly according to local feedback from visual-textual alignment scores. On the other hand, LFSRM applies an attention mechanism to local-level alignment scores and a strengthening factor to the sentence-to-diagram matching direction to alleviate the incomplete description problem. Extensive experiments on three datasets show that LFSRM achieves satisfactory results on conventional image-sentence matching and outperforms SOTA methods on few-shot image/diagram-sentence matching by a large margin. The dataset for diagram-sentence matching, called AI2D, and the LFSRM code are available on GitHub: https://github.com/TeamResearchWork/LFSRM.
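The abstract mentions attention over local-level alignment scores without giving formulas. As a rough illustration only, the sketch below shows one generic way such pooling could work: each word vector is scored against its best-matching diagram region, and a softmax over those local scores weights the final match score. The function names, the cosine/max scoring, and the `temperature` parameter are all assumptions made for this example; none of it is taken from the paper or the LFSRM code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def softmax(xs, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [x / temperature for x in xs]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def local_alignment_scores(regions, words):
    """For each word, the best cosine similarity against any region.

    Hypothetical stand-in for local (word-to-region) alignment scores.
    """
    return [max(cosine(w, r) for r in regions) for w in words]


def attended_match_score(regions, words, temperature=0.1):
    """Attention-weighted pooling of local scores (assumed form).

    Well-aligned words receive larger weights, so poorly described
    parts of the sentence contribute less to the final score.
    """
    local = local_alignment_scores(regions, words)
    weights = softmax(local, temperature)
    return sum(w * s for w, s in zip(weights, local))


if __name__ == "__main__":
    # Toy 2-D features: two diagram regions and two sentence words.
    regions = [[1.0, 0.0], [0.0, 1.0]]
    words = [[1.0, 0.0], [1.0, 1.0]]
    print(local_alignment_scores(regions, words))
    print(attended_match_score(regions, words))
```

Because the softmax weights increase with the local scores, the pooled score always lies between the plain mean and the maximum of the local scores; lowering `temperature` moves it toward the maximum.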

DOI: http://dx.doi.org/10.1109/TPAMI.2025.3528723

Publication Analysis

Top Keywords

diagram-sentence matching

matching

local-feedback self-regulating

self-regulating memory

image-sentence matching

few-shot content

incomplete description

alignment scores

lfsrm

Similar Publications

Fs-DSM: Few-Shot Diagram-Sentence Matching via Cross-Modal Attention Graph Model.

IEEE Trans Image Process

September 2021

Xin Hu Lingling Zhang Jun Liu Qinghua Zheng Jianlong Zhou

Diagram-sentence matching is a valuable research topic because it can help learners understand diagrams effectively with the assistance of sentences. However, there are many uncommon objects, i.e.
