Visual question answering (VQA) requires a system to provide an accurate natural language answer given an image and a natural language question. However, it is widely recognized that previous generic VQA methods tend to memorize biases present in the training data rather than learn proper behaviors, such as grounding images before predicting answers. As a result, these methods usually achieve high in-distribution but poor out-of-distribution performance. In recent years, various datasets and debiasing methods have been proposed to evaluate and enhance VQA robustness, respectively. This paper provides the first comprehensive survey focused on this emerging topic. Specifically, we first provide an overview of the development of datasets from in-distribution and out-of-distribution perspectives. Second, we examine the evaluation metrics employed by these datasets. Third, we propose a typology that presents the development process, similarities and differences, robustness comparison, and technical features of existing debiasing methods. Furthermore, we analyze and discuss the robustness of representative vision-and-language pre-training models on VQA. Finally, through a thorough review of the available literature and experimental analysis, we discuss the key areas for future research from various viewpoints.
DOI: http://dx.doi.org/10.1109/TPAMI.2024.3366154
Can J Ophthalmol
January 2025
Faculty of Medicine, University of Montreal, Montreal, QC, Canada; Department of Ophthalmology, Centre Hospitalier de l'Université de Montréal, Montreal, QC, Canada.
Objective: To evaluate the performance of large language models (LLMs), specifically Microsoft Copilot, GPT-4 (GPT-4o and GPT-4o mini), and Google Gemini (Gemini and Gemini Advanced), in answering ophthalmological questions and to assess the impact of prompting techniques on their accuracy.
Design: Prospective qualitative study.
Participants: Microsoft Copilot, GPT-4 (GPT-4o and GPT-4o mini), and Google Gemini (Gemini and Gemini Advanced).
Eur Stroke J
January 2025
Department of Neurology, University Medical Center Hamburg-Eppendorf, Hamburg, Germany.
Background: We aimed to assess impairments in health-related quality of life and mental health resulting from retinal artery occlusion (RAO) with monocular visual field loss and posterior circulation ischemic stroke (PCIS) with full or partial hemianopia, using patient-reported outcome measures (PROMs).
Methods: In a prospective study, consecutive patients with acute RAO on fundoscopy and PCIS on imaging were recruited during their surveillance on a stroke unit over a period of 15 months. Baseline characteristics were determined from medical records and interviews.
Sensors (Basel)
January 2025
Faculty of Artificial Intelligence in Education, Central China Normal University, Wuhan 430079, China.
Behavioral computing based on visual cues has become increasingly important, as it can capture and annotate teachers' and students' classroom states on a large scale and in real time. However, there is a lack of consensus on the research status and future trends of computer vision-based classroom behavior recognition. The present study conducted a systematic literature review of 80 peer-reviewed journal articles following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.
J Clin Med
January 2025
Department of Periodontology and Oral Mucosa Diseases, Medical University of Warsaw, Binieckiego 6 St., 02-097 Warsaw, Poland.
The aim of the present narrative review is to synthesize the available scientific evidence on the effects of submarginal instrumentation with periodontal endoscopy and evaluate its potential efficacy in the non-surgical therapy of peri-implantitis. The literature search was performed via electronic databases, including PubMed, Web of Science, Cochrane, and Scopus, and was supplemented by manual searching. A literature review was conducted addressing the following PICOS questions: (1) What is the efficacy of non-surgical submarginal instrumentation of the implant surface with the aid of a periodontal endoscope in patients with peri-implantitis? (2) What is the efficacy of non-surgical subgingival instrumentation performed with the aid of a periodontal endoscope compared with conventional subgingival instrumentation in patients with periodontitis, in terms of clinical parameters and patient-reported outcomes? Mechanical decontamination of the implant surface is crucial for resolving inflammation and arresting further bone loss.
J Clin Med
January 2025
Department of Internal Medicine and Hematology, Semmelweis University, 1088 Budapest, Hungary.
Limited research has explored histamine intolerance from the perspective of primary caregivers. Our objective was to develop a practical symptom profile from the standpoint of general practice. We also aimed to gather data on the frequency and timing of disease progression and to establish a staging system.