GFNet: Global Filter Networks for Visual Recognition.

IEEE Trans Pattern Anal Mach Intell

Published: September 2023

Recent advances in self-attention and pure multi-layer perceptron (MLP) models for vision have shown great potential in achieving promising performance with fewer inductive biases. These models are generally based on learning interactions among spatial locations from raw data. The complexity of self-attention and MLP grows quadratically as the image size increases, which makes these models hard to scale up when high-resolution features are required. In this paper, we present the Global Filter Network (GFNet), a conceptually simple yet computationally efficient architecture that learns long-term spatial dependencies in the frequency domain with log-linear complexity. Our architecture replaces the self-attention layer in vision Transformers with three key operations: a 2D discrete Fourier transform, an element-wise multiplication between frequency-domain features and learnable global filters, and a 2D inverse Fourier transform. Based on this basic design, we develop a series of isotropic models with a Transformer-style simple architecture and CNN-style hierarchical models with better performance. Isotropic GFNet models exhibit favorable accuracy/complexity trade-offs compared to recent vision Transformers and pure MLP models. Hierarchical GFNet models can inherit successful designs in CNNs and be easily scaled up with larger model sizes and more training data, showing strong performance on both image classification (e.g., 85.0% top-1 accuracy on ImageNet-1K without any extra data or supervision, and 87.4% accuracy with ImageNet-21K pre-training) and dense prediction tasks (e.g., 54.3 mIoU on ADE20K val). Our results demonstrate that GFNet can be a very competitive alternative to Transformer-based models and CNNs in terms of efficiency, generalization ability and robustness. Code is available at https://github.com/raoyongming/GFNet.
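The three operations named in the abstract (2D Fourier transform, element-wise multiplication with a filter, 2D inverse transform) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `global_filter`, the (H, W, C) feature layout, the use of the real-input FFT (`rfft2`), the `"ortho"` normalization, and the random filter standing in for learned parameters are all assumptions made for illustration.

```python
import numpy as np

def global_filter(x, filt):
    """Illustrative global filter step (hypothetical helper, not the paper's code).

    x:    real feature map of shape (H, W, C)
    filt: complex filter of shape (H, W // 2 + 1, C); learnable in a real
          model, passed in as a plain array here
    """
    h, w, _ = x.shape
    # 1) 2D discrete Fourier transform over the two spatial axes
    freq = np.fft.rfft2(x, axes=(0, 1), norm="ortho")
    # 2) element-wise multiplication with the global filter
    freq = freq * filt
    # 3) 2D inverse Fourier transform back to the spatial domain
    return np.fft.irfft2(freq, s=(h, w), axes=(0, 1), norm="ortho")

# Usage with random data (assumed sizes: 8x8 spatial grid, 4 channels)
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 4))
filt = rng.standard_normal((8, 5, 4)) + 1j * rng.standard_normal((8, 5, 4))
y = global_filter(x, filt)
print(y.shape)  # (8, 8, 4)
```

Because every frequency bin depends on every spatial location, one multiplication in the frequency domain mixes information across the whole feature map, and the fast Fourier transform keeps the cost at O(HW log(HW)) per channel, which is consistent with the log-linear complexity stated in the abstract.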

Source
http://dx.doi.org/10.1109/TPAMI.2023.3263824

Publication Analysis

Top Keywords

models: 9
global filter: 8
mlp models: 8
vision transformers: 8
fourier transform: 8
gfnet models: 8
gfnet: 5
gfnet global: 4
filter networks: 4
networks visual: 4

