DctViT: Discrete Cosine Transform meet vision transformers.

Neural Netw

Changchun Institute of Optics, Fine Mechanics and Physics, Chinese Academy of Sciences, Changchun, 130033, Jilin, China.

Published: April 2024

Vision transformers (ViTs) have become one of the dominant frameworks for vision tasks in recent years because self-attention lets them capture long-range dependencies efficiently in image recognition. Both CNNs and ViTs have advantages and disadvantages in vision tasks, and some studies suggest that combining them may be an effective way to balance performance and computational cost. In this paper, we propose a new hybrid network based on CNNs and transformers, using a CNN to extract local features and a transformer to capture long-range dependencies. We also propose a new method for reducing feature map resolution based on the Discrete Cosine Transform and self-attention, named DCT-Attention Down-sample (DAD). Our DctViT-L achieves 84.8% top-1 accuracy on ImageNet 1K, far outperforming CMT, Next-ViT, SpectFormer and other state-of-the-art models at lower computational cost. With DctViT-B as the backbone, RetinaNet achieves 46.8% mAP on COCO val2017, improving mAP by 2.5% and 1.1% over the CMT-S and SpectFormer backbones, respectively, at lower computational cost.
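The abstract does not spell out how DAD works, so the following is only a minimal sketch of one generic way a Discrete Cosine Transform can reduce the resolution of a feature map: transform the map, keep the low-frequency coefficients, and invert at the smaller size. It is not the authors' DAD module and it omits the self-attention part entirely. The function names `dct_matrix` and `dct_downsample` are hypothetical, and the single-channel NumPy setting is an assumption made for brevity.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]  # frequency index (rows)
    i = np.arange(n)[None, :]  # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)  # the DC row uses a different scale
    return mat


def dct_downsample(x: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Shrink an (H, W) map to (out_h, out_w) by low-frequency DCT truncation.

    Illustrative assumption only: this is plain frequency cropping,
    not the DAD module described in the paper.
    """
    h, w = x.shape
    # Forward 2D DCT: transform rows and columns.
    coeffs = dct_matrix(h) @ x @ dct_matrix(w).T
    # Keep only the top-left block, which holds the lowest frequencies.
    low = coeffs[:out_h, :out_w]
    # Rescale so that the mean value of the map is preserved.
    low = low * np.sqrt((out_h * out_w) / (h * w))
    # Inverse 2D DCT at the reduced size.
    return dct_matrix(out_h).T @ low @ dct_matrix(out_w)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feature_map = rng.standard_normal((8, 8))
    small = dct_downsample(feature_map, 4, 4)
    print(small.shape)
```

Cropping in the frequency domain discards fine detail while retaining the coarse structure of the map, which is why it can stand in for pooling or strided convolution in a sketch like this. How the paper combines this idea with self-attention is not recoverable from the abstract.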

Source: http://dx.doi.org/10.1016/j.neunet.2024.106139
