Visual Question Answering (VQA) has attracted much attention in both the computer vision and natural language processing communities, not least because it offers insight into the relationships between two important sources of information. Current datasets, and the models built upon them, have focused on questions that are answerable by direct analysis of the question and image alone. The set of such questions, which require no external information to answer, is interesting but very limited. It excludes, for example, questions that require common sense or basic factual knowledge to answer. Here we introduce FVQA (Fact-based VQA), a VQA dataset that requires, and supports, much deeper reasoning. FVQA primarily contains questions that require external information to answer. We thus extend a conventional visual question answering dataset, which contains image-question-answer triplets, with additional image-question-answer-supporting-fact tuples. Each supporting fact is represented as a structural triplet, such as .
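
The record layout described above can be sketched as a small data structure. This is a minimal illustration only, not the dataset's actual schema: the class names, field names, identifiers, and the example fact values below are all assumptions chosen for readability.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SupportingFact:
    """A structural triplet. Field names are hypothetical."""
    subject: str
    relation: str
    obj: str


@dataclass(frozen=True)
class VQASample:
    """An image-question-answer record, optionally extended with a supporting fact."""
    image_id: str
    question: str
    answer: str
    supporting_fact: Optional[SupportingFact] = None

    def needs_external_knowledge(self) -> bool:
        # A record that carries a supporting fact is a fact-based question.
        return self.supporting_fact is not None


# Conventional triplet: answerable from the image and question alone.
conventional = VQASample("img_001", "What color is the animal?", "brown")

# Extended tuple: the fact values are illustrative placeholders, not dataset entries.
fact_based = VQASample(
    "img_001",
    "Which animal in this image is a mammal?",
    "dog",
    SupportingFact(subject="dog", relation="IsA", obj="mammal"),
)
```

Making the supporting fact optional lets conventional image-question-answer triplets and the extended tuples share one record type, which is one possible design among several.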

Source
http://dx.doi.org/10.1109/TPAMI.2017.2754246
