Understanding and explaining deep learning models is an imperative task. Towards this, we propose a method that obtains gradient-based certainty estimates that also provide visual attention maps. Particularly, we solve for visual question answering task. We incorporate modern probabilistic deep learning methods that we further improve by using the gradients for these estimates. These have two-fold benefits: a) improvement in obtaining the certainty estimates that correlate better with misclassified samples and b) improved attention maps that provide state-of-the-art results in terms of correlation with human attention regions. The improved attention maps result in consistent improvement for various methods for visual question answering. Therefore, the proposed technique can be thought of as a tool for obtaining improved certainty estimates and explanations for deep learning models. We provide detailed empirical analysis for the visual question answering task on all standard benchmarks and comparison with state of the art methods.
Download full-text PDF |
Source |
---|---|
http://dx.doi.org/10.1109/TIP.2020.3046916 | DOI Listing |
Sci Rep
December 2024
KAUST Center of Excellence for Smart Health (KCSH), King Abdullah University of Science and Technology, Thuwal, 23955, Saudi Arabia.
Analyzing microbial samples remains computationally challenging due to their diversity and complexity. The lack of robust de novo protein function prediction methods exacerbates the difficulty in deriving functional insights from these samples. Traditional prediction methods, dependent on homology and sequence similarity, often fail to predict functions for novel proteins and proteins without known homologs.
View Article and Find Full Text PDFSci Rep
December 2024
Department of Informatics, University of Hamburg, Hamburg, Germany.
Central to the development of universal learning systems is the ability to solve multiple tasks without retraining from scratch when new data arrives. This is crucial because each task requires significant training time. Addressing the problem of continual learning necessitates various methods due to the complexity of the problem space.
View Article and Find Full Text PDFSci Rep
December 2024
Department of Computer Science, Birzeit University, P.O. Box 14, Birzeit, West Bank, Palestine.
Accurate classification of logos is a challenging task in image recognition due to variations in logo size, orientation, and background complexity. Deep learning models, such as VGG16, have demonstrated promising results in handling such tasks. However, their performance is highly dependent on optimal hyperparameter settings, whose fine-tuning is both labor-intensive and time-consuming.
View Article and Find Full Text PDFSci Rep
December 2024
Faculty of Dental Medicine and Oral Health Sciences, McGill University, Montreal, Canada.
Accurate diagnosis of oral lesions, early indicators of oral cancer, is a complex clinical challenge. Recent advances in deep learning have demonstrated potential in supporting clinical decisions. This paper introduces a deep learning model for classifying oral lesions, focusing on accuracy, interpretability, and reducing dataset bias.
View Article and Find Full Text PDFSci Rep
December 2024
Department of Civil Engineering, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.
Deep learning models are widely used for traffic forecasting on freeways due to their ability to learn complex temporal and spatial relationships. In particular, graph neural networks, which integrate graph theory into deep learning, have become popular for modeling traffic sensor networks. However, traditional graph convolutional networks (GCNs) face limitations in capturing long-range spatial correlations, which can hinder accurate long-term predictions.
View Article and Find Full Text PDFEnter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!