Is learning more knowledge always better for vision-and-language models? In this paper, we study knowledge transferability in multi-modal tasks. The current tendency in machine learning is to assume that joining multiple datasets from different tasks improves overall performance. However, we show that not all knowledge transfers well or has a positive impact on related tasks, even when they share a common goal.
The content-style duality is a fundamental element in art. These two dimensions can be easily differentiated by humans: content refers to the objects and concepts in an artwork, and style to the way it looks. Yet, we have not found a way to fully capture this duality with visual representations.
We introduce an emotional stimuli detection task that targets extracting the regions in artworks that evoke people's emotions (i.e., emotional stimuli).