Matching hand-drawn sketches with photos (a.k.a. sketch-photo recognition or re-identification) faces an information asymmetry challenge due to the abstract nature of the sketch modality. Existing works tend to learn shared embedding spaces with CNN models, either by discarding the appearance cues of photo images or by introducing GANs for sketch-photo synthesis. The former unavoidably loses discriminability, while the latter introduces generation noise that cannot be removed. In this paper, we make the first attempt to design an information-aligned sketch transformer (SketchTrans) via cross-modal disentangled prototype learning, motivated by the great promise transformers have shown for discriminative visual modelling. Specifically, we design an asymmetric disentanglement scheme with a dynamically updatable auxiliary sketch (A-sketch) to align the modality representations without sacrificing information. The asymmetric disentanglement decomposes the photo representations into sketch-relevant and sketch-irrelevant cues, transferring sketch-irrelevant knowledge into the sketch modality to compensate for the missing information. Moreover, considering the feature discrepancy between the two modalities, we present a modality-aware prototype contrastive learning method that mines representative modality-sharing information using modality-aware prototypes rather than the original feature representations. Extensive experiments on category- and instance-level sketch-based datasets validate the superiority of our proposed method under various metrics.
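The abstract does not specify the loss, so the following is only a minimal sketch of what contrasting modality-aware prototypes (rather than raw features) could look like. It assumes each prototype is the L2-normalized mean feature of a class within one modality, and it swaps in a generic InfoNCE-style objective; the function names `class_prototypes` and `prototype_contrastive_loss` and the `temperature` parameter are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def l2_normalize(vector):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in vector)) or 1.0
    return [x / norm for x in vector]


def class_prototypes(features, labels):
    """Assumed prototype definition: normalized mean feature per class."""
    groups = defaultdict(list)
    for feature, label in zip(features, labels):
        groups[label].append(feature)
    prototypes = {}
    for label, members in groups.items():
        mean = [sum(column) / len(members) for column in zip(*members)]
        prototypes[label] = l2_normalize(mean)
    return prototypes


def prototype_contrastive_loss(sketch_protos, photo_protos, temperature=0.1):
    """Generic InfoNCE over prototypes (an assumption, not the paper's loss).

    For each class, the sketch prototype is pulled towards the photo
    prototype of the same class and pushed away from the photo
    prototypes of all other classes.
    """
    classes = sorted(set(sketch_protos) & set(photo_protos))
    if not classes:
        raise ValueError("no class has a prototype in both modalities")
    total = 0.0
    for index, cls in enumerate(classes):
        logits = [
            sum(a * b for a, b in zip(sketch_protos[cls], photo_protos[other]))
            / temperature
            for other in classes
        ]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        total += log_denominator - logits[index]
    return total / len(classes)


# Toy usage: two classes, two-dimensional features per modality.
sketch_features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
sketch_labels = [0, 0, 1, 1]
photo_features = [[1.0, 0.1], [0.1, 1.0]]
photo_labels = [0, 1]

sketch_protos = class_prototypes(sketch_features, sketch_labels)
photo_protos = class_prototypes(photo_features, photo_labels)
loss = prototype_contrastive_loss(sketch_protos, photo_protos)
```

Working with one prototype per class and modality keeps the comparison independent of how many samples each modality contributes, which is one plausible reading of why prototypes are preferred over original feature representations here.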
DOI: http://dx.doi.org/10.1109/TPAMI.2023.3337005
J Phys Condens Matter
December 2024
Institute of Nano Science and Technology, Knowledge City, Sector 81, Mohali, Punjab 140306, India.
Orbitronics and valleytronics, analogous to spintronics, leverage the orbital degree of freedom and the valley degree of freedom of electrons to carry information, promising significant advancements in information processing. In this study, we disentangle the orbital and valley Nernst effect (VNE) in 2D monolayers, based on the global symmetry of the monolayers. We conduct an in-depth analysis of the orbital (valley) Nernst effect in inversion symmetric (asymmetric) monolayers, using an analytical tight binding model.
Phys Rev Lett
November 2024
Department of Applied Physics and Materials Science, and Department of Physics, California Institute of Technology, Pasadena, California 91125, USA.
The spectral and transport properties of strongly correlated metals, such as SrVO₃ (SVO), are widely attributed to electron-electron (e-e) interactions, with lattice vibrations (phonons) playing a secondary role. Here, using first-principles electron-phonon (e-ph) and dynamical mean field theory calculations, we show that e-ph interactions play an essential role in SVO: they govern the electron scattering and resistivity in a wide temperature range down to 30 K, and induce an experimentally observed kink in the spectral function. In contrast, the e-e interactions control quasiparticle renormalization and low temperature transport, and enhance the e-ph coupling.
Phys Chem Chem Phys
August 2024
Department of Chemistry, Washington University in St. Louis, St. Louis, MO 63130, USA.
The competition between host-guest binding and solvent interactions is a crucial factor in determining the binding affinities and selectivity of molecular receptor species. The interplay between these competing interactions, however, has been difficult to disentangle. In particular, the development of molecular-level descriptions of solute-solvent interactions remains a grand experimental challenge.
IEEE Trans Pattern Anal Mach Intell
December 2024
Deep cooperative multi-agent reinforcement learning has demonstrated its remarkable success over a wide spectrum of complex control tasks. However, recent advances in multi-agent learning mainly focus on value decomposition while leaving entity interactions still intertwined, which easily leads to over-fitting on noisy interactions between entities. In this work, we introduce a novel interactiOn Pattern disenTangling (OPT) method, to disentangle the entity interactions into interaction prototypes, each of which represents an underlying interaction pattern within a subgroup of the entities.
IEEE Trans Neural Netw Learn Syst
May 2024
This article studies an emerging practical problem called heterogeneous prototype learning (HPL). Unlike the conventional heterogeneous face synthesis (HFS) problem that focuses on precisely translating a face image from a source domain to another target one without removing facial variations, HPL aims at learning the variation-free prototype of an image in the target domain while preserving the identity characteristics. HPL is a compounded problem involving two cross-coupled subproblems, that is, domain transfer and prototype learning (PL), thus making most of the existing HFS methods that simply transfer the domain style of images unsuitable for HPL.