A Survey on Learning Objects' Relationship for Image Captioning.

Comput Intell Neurosci

Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing, China.

Published: June 2023

Image captioning is a challenging modality-transformation task at the intersection of computer vision and natural language processing: it aims to understand the content of an image and describe it in natural language. Recently, relationship information between the objects in an image has been shown to be important for generating more vivid and readable sentences. Much research has addressed mining and learning these relationships and incorporating them into captioning models. This paper summarizes the methods of relational representation and relational encoding used in image captioning. We also discuss the advantages and disadvantages of these methods and list commonly used datasets for the relational captioning task. Finally, we highlight the current problems and challenges in this task.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10241575
DOI: http://dx.doi.org/10.1155/2023/8600853
