Learning Aligned Image-Text Representations Using Graph Attentive Relational Network

Image-text matching aims to measure the similarity between images and textual descriptions, a task that has made great progress recently. The key to this cross-modal matching task is building a latent semantic alignment between visual objects and words. Because sentence structures vary widely, it is difficult to learn this alignment from global cross-modal features alone. Many previous methods attempt to learn aligned image-text representations with an attention mechanism but generally ignore the relationships within textual descriptions that determine whether words belong to the same visual object. In this paper, we propose a graph attentive relational network (GARN) that learns aligned image-text representations by modeling the relationships between noun phrases in a text, targeting identity-aware image-text matching. In the GARN, we first decompose images and texts into regions and noun phrases, respectively. A skip graph neural network (skip-GNN) then learns effective textual representations that mix textual features with relational features. Finally, a graph attention network obtains the probabilities that noun phrases belong to image regions by modeling the relationships between the noun phrases. We perform extensive experiments on the CUHK Person Description dataset (CUHK-PEDES), the Caltech-UCSD Birds dataset (CUB), the Oxford-102 Flowers dataset, and the Flickr30K dataset to verify the effectiveness of each component of our model. Experimental results show that our approach achieves state-of-the-art results on these four benchmark datasets.
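The abstract describes the pipeline only at a high level, so the following is a minimal PyTorch sketch of the two ingredients it names: propagating relational context between noun-phrase nodes with graph attention, then scoring how likely each phrase belongs to each image region. All module names, dimensions, and the cosine-similarity scoring are illustrative assumptions, not the paper's exact GARN architecture.

import torch
import torch.nn as nn
import torch.nn.functional as F

class PhraseRegionAttention(nn.Module):
    """Hypothetical stand-in for the skip-GNN + graph-attention pipeline:
    one graph-attention step over the phrase relation graph, followed by
    phrase-to-region assignment probabilities."""

    def __init__(self, dim: int):
        super().__init__()
        self.phrase_proj = nn.Linear(dim, dim)   # refine phrase features
        self.edge_score = nn.Linear(2 * dim, 1)  # attention over phrase pairs

    def forward(self, phrases: torch.Tensor, regions: torch.Tensor,
                adj: torch.Tensor) -> torch.Tensor:
        # phrases: (P, dim) noun-phrase embeddings
        # regions: (R, dim) image-region embeddings
        # adj:     (P, P) binary adjacency of the phrase relation graph
        P = phrases.size(0)
        h = self.phrase_proj(phrases)                       # (P, dim)

        # Add self-loops so every softmax row is well-defined.
        adj = adj + torch.eye(P, device=adj.device)

        # GAT-style pairwise attention logits between phrases.
        pairs = torch.cat([h.unsqueeze(1).expand(P, P, -1),
                           h.unsqueeze(0).expand(P, P, -1)], dim=-1)
        logits = self.edge_score(pairs).squeeze(-1)         # (P, P)
        logits = logits.masked_fill(adj == 0, float("-inf"))
        attn = F.softmax(logits, dim=-1)

        # Mix relational context into each phrase: textual features
        # plus relational features, loosely mirroring the skip-GNN output.
        h = h + attn @ h                                    # (P, dim)

        # Probability that each phrase belongs to each image region.
        sim = F.normalize(h, dim=-1) @ F.normalize(regions, dim=-1).T
        return F.softmax(sim, dim=-1)                       # (P, R)

The paper presumably stacks several such propagation steps with skip connections (the "skip" in skip-GNN) and trains with a matching loss; this single layer only illustrates the relation-aware phrase features and the phrase-to-region probabilities the abstract mentions.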

Source
http://dx.doi.org/10.1109/TIP.2020.3048627

Publication Analysis

Top Keywords

noun phrases: 16
aligned image-text: 12
image-text representations: 12
graph attentive: 8
attentive relational: 8
relational network: 8
image-text matching: 8
textual descriptions: 8
latent semantic: 8
semantic alignment: 8
