Enhancing Low-Light Images with Kolmogorov-Arnold Networks in Transformer Attention.

Sensors (Basel)

Faculty of Electronics, Telecommunications and Information Technologies, Polytechnic University Timisoara, 300223 Timisoara, Romania.

Published: January 2025

Low-light image enhancement (LLIE) techniques improve the performance of image sensors by enhancing visibility and details in poorly lit environments and have significantly benefited from recent research into Transformer models. This work presents a novel Transformer attention mechanism inspired by the Kolmogorov-Arnold representation theorem, incorporating learnable non-linearity and multivariate function decomposition. This innovative mechanism is the foundation of KAN-T, our proposed Transformer network. By enhancing feature flexibility and enabling the model to capture broader contextual information, KAN-T achieves superior performance. Our comprehensive experiments, both quantitative and qualitative, demonstrate that the proposed method achieves state-of-the-art performance in low-light image enhancement, highlighting its effectiveness and wide-ranging applicability. The code will be released upon publication.
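
For context, the Kolmogorov-Arnold representation theorem mentioned above states that any continuous multivariate function on a bounded domain can be written using only continuous univariate functions and addition. The standard statement is sketched below as background. The abstract does not specify how KAN-T maps this decomposition onto attention, so the formula should not be read as the authors' mechanism; the symbols f, n, \Phi_q, and \phi_{q,p} are notation assumed here, not taken from the source.

```latex
% Kolmogorov-Arnold representation theorem (standard form).
% f is any continuous function on the n-dimensional unit cube.
\[
  f(x_1, \dots, x_n)
  = \sum_{q=0}^{2n} \Phi_q\!\left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right),
  \qquad f : [0,1]^n \to \mathbb{R},
\]
% where each outer function \Phi_q : \mathbb{R} \to \mathbb{R} and each
% inner function \phi_{q,p} : [0,1] \to \mathbb{R} is continuous and univariate.
```

Kolmogorov-Arnold Networks build on this idea by making the univariate functions learnable, which is presumably what the abstract means by "learnable non-linearity and multivariate function decomposition".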

Source
http://dx.doi.org/10.3390/s25020327

Similar Publications

Although the Transformer architecture has established itself as the de facto standard for natural language processing tasks, its applications in computer vision remain limited. In vision, attention is either used in conjunction with convolutional networks or used to replace individual components of convolutional networks while preserving the overall network design. Differences between the two domains, such as the large variation in the scale of visual entities and the much finer granularity of pixels in images compared to words in text, make it difficult to transfer the Transformer from language to vision.

This work introduces a hybrid ensemble framework for the detection and segmentation of colorectal cancer. The framework incorporates both supervised classification and unsupervised clustering methods to produce more interpretable and accurate diagnostic results. The method comprises several steps that integrate CNN models (ADa-22 and AD-22), transformer networks, and an SVM classifier.

Background And Objectives: Hypertensive Retinopathy (HR) is a retinal manifestation resulting from persistently elevated blood pressure. Severity grading of HR is essential for patient risk stratification, effective management, progression monitoring, timely intervention, and minimizing the risk of vision impairment. Computer-aided diagnosis and artificial intelligence (AI) systems play vital roles in the diagnosis and grading of HR.

Background: Food image recognition, a crucial step in computational gastronomy, has diverse applications across nutritional platforms. Convolutional neural networks (CNNs) are widely used for this task due to their ability to capture hierarchical features. However, they struggle with long-range dependencies and global feature extraction, which are vital for distinguishing visually similar foods and for images where the context of the whole dish is crucial. This limitation motivates the use of transformer architectures.

A Dual-Channel and Frequency-Aware Approach for Lightweight Video Instance Segmentation.

Sensors (Basel)

January 2025

The Higher Educational Key Laboratory for Measuring & Control Technology and Instrumentation of Heilongjiang Province, Harbin University of Science and Technology, Harbin 150080, China.

Video instance segmentation, a key technology for intelligent sensing in visual perception, plays an important role in automated surveillance, robotics, and smart cities. These scenarios rely on efficient, real-time target tracking for accurate perception and intelligent analysis of dynamic environments. However, traditional video instance segmentation methods suffer from complex models, high computational overhead, and slow segmentation speeds in time-series feature extraction, especially in resource-constrained environments.
