A performance gap persists between Vision Transformers (ViTs) and Convolutional Neural Networks (CNNs) because ViTs lack inductive bias, most notably when they are trained from scratch on limited datasets. This paper identifies two crucial shortcomings in ViTs: weak spatial relevance and insufficiently diverse channel representation. With insufficient data, ViTs therefore struggle to learn fine-grained spatial features and robust channel representations. We propose the Dynamic Hybrid Vision Transformer (DHVT) to address these challenges. On the spatial side, DHVT introduces convolution into the feature embedding phase and the feature projection modules to enhance spatial relevance. On the channel side, a dynamic aggregation mechanism and a novel "head token" design recalibrate and harmonize disparate channel representations. We also investigate the choice of network meta-structure and adopt the optimal multi-stage hybrid structure, which omits the conventional class token. We then extend the method with a novel dimensional variable residual connection mechanism to exploit the potential of this structure more fully. The resulting variant, DHVT2, offers a more computationally efficient solution for vision tasks. DHVT and DHVT2 achieve state-of-the-art image recognition results, effectively bridging the performance gap between CNNs and ViTs, and downstream experiments further demonstrate their strong generalization capacity.
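
As a rough illustration of what "convolution in the feature embedding phase" can mean, the sketch below turns a tiny single-channel image into tokens with an overlapping, strided convolution instead of a non-overlapping patch split. It is a minimal pure-Python sketch under stated assumptions, not the DHVT implementation: the function names `conv2d_single` and `conv_embed_tokens`, the single channel, the single 3x3 kernel, and the stride and padding values are all hypothetical choices made for illustration.

```python
# Hypothetical illustration only; not taken from the DHVT code base.

def conv2d_single(image, kernel, stride=1, padding=0):
    """Single-channel 2D cross-correlation on nested lists with zero padding."""
    h, w = len(image), len(image[0])
    k = len(kernel)  # assumes a square k x k kernel
    # Build a zero-padded copy of the image.
    padded = [[0.0] * (w + 2 * padding) for _ in range(h + 2 * padding)]
    for i in range(h):
        for j in range(w):
            padded[i + padding][j + padding] = float(image[i][j])
    out_h = (h + 2 * padding - k) // stride + 1
    out_w = (w + 2 * padding - k) // stride + 1
    out = []
    for oi in range(out_h):
        row = []
        for oj in range(out_w):
            acc = 0.0
            for ki in range(k):
                for kj in range(k):
                    acc += padded[oi * stride + ki][oj * stride + kj] * kernel[ki][kj]
            row.append(acc)
        out.append(row)
    return out


def conv_embed_tokens(image, kernel, stride=2, padding=1):
    """Flatten the strided convolution output into a token sequence.

    With a 3x3 kernel and stride 2, neighboring windows overlap, so each
    token also sees pixels that belong to adjacent tokens.
    """
    feature_map = conv2d_single(image, kernel, stride, padding)
    return [value for row in feature_map for value in row]


image = [[1.0] * 4 for _ in range(4)]   # 4x4 image of ones
kernel = [[1.0] * 3 for _ in range(3)]  # 3x3 summing kernel
tokens = conv_embed_tokens(image, kernel)
print(tokens)  # [4.0, 6.0, 6.0, 9.0]
```

Because adjacent windows share pixels, each token carries information about its neighborhood, which is one simple way to inject a locality bias. A real model would use many learned kernels over multi-channel inputs.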

Source
http://dx.doi.org/10.1109/TPAMI.2025.3528228
