Most learning-based methods previously used in image dehazing employ a supervised learning strategy, which is time-consuming and requires large-scale datasets that are difficult to obtain. Here, we propose a self-supervised zero-shot dehazing network (SZDNet) based on the dark channel prior, which uses a hazy image generated from the output dehazed image as a pseudo-label to supervise the optimization process of the network. Additionally, we use a novel multichannel quad-tree algorithm to estimate atmospheric light values, which is more accurate than previous methods. Furthermore, the sum of the cosine distance and the mean squared error between the pseudo-label and the input image is applied as a loss function to enhance the quality of the dehazed image. The most significant advantage of SZDNet is that it does not require a large dataset for training before performing the dehazing task. Extensive testing shows promising performance of the proposed method in both qualitative and quantitative evaluations when compared with state-of-the-art methods.
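The two ingredients named in this abstract can be sketched compactly. The quad-tree atmospheric-light search below descends into the brightest quadrant until the region is small, then averages it per channel; the exact descent criterion and the per-channel handling are assumptions, as the paper's "multichannel" variant is not detailed here. The loss is the stated sum of cosine distance and mean squared error between the re-hazed pseudo-label and the hazy input.

```python
import numpy as np

def estimate_atmospheric_light(img, min_size=8):
    # Hypothetical quad-tree search: repeatedly split the image into four
    # quadrants, descend into the brightest one (highest mean intensity),
    # and average the final small patch per channel.
    h, w = img.shape[:2]
    if h <= min_size or w <= min_size:
        return img.reshape(-1, img.shape[-1]).mean(axis=0)
    half_h, half_w = h // 2, w // 2
    quads = [img[:half_h, :half_w], img[:half_h, half_w:],
             img[half_h:, :half_w], img[half_h:, half_w:]]
    brightest = max(quads, key=lambda q: q.mean())
    return estimate_atmospheric_light(brightest, min_size)

def combined_loss(pseudo_label, hazy_input):
    # Sum of cosine distance and MSE between the re-hazed pseudo-label
    # and the original hazy input, as stated in the abstract.
    a = pseudo_label.ravel().astype(np.float64)
    b = hazy_input.ravel().astype(np.float64)
    cos_dist = 1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
    return cos_dist + np.mean((a - b) ** 2)
```

In practice the loss would be computed on network tensors (e.g. in PyTorch) rather than NumPy arrays; this version only illustrates the arithmetic.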
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10102283
DOI: http://dx.doi.org/10.1007/s12200-023-00062-7
Proc IEEE Comput Soc Conf Comput Vis Pattern Recognit
June 2024
Humans effortlessly interpret images by parsing them into part-whole hierarchies; deep learning models excel at learning multi-level feature spaces, but they often lack explicit coding of part-whole relations, a prominent property of medical imaging. To overcome this limitation, we introduce Adam-v2, a new self-supervised learning framework extending Adam [79] by explicitly incorporating part-whole hierarchies into its learning objectives through three key branches: (1) Localizability, acquiring discriminative representations to distinguish different anatomical patterns; (2) Composability, learning each anatomical structure in a parts-to-whole manner; and (3) Decomposability, comprehending each anatomical structure in a whole-to-parts manner. Experimental results across 10 tasks, compared to 11 baselines in zero-shot, few-shot transfer, and full fine-tuning settings, showcase Adam-v2's superior performance over large-scale medical models and existing SSL methods across diverse downstream tasks.
Proc SPIE Int Soc Opt Eng
February 2024
Department of Computer Science, Johns Hopkins University, Baltimore, MD 21286, USA.
Magnetic resonance images are often acquired as several 2D slices and stacked into a 3D volume, yielding a lower through-plane resolution than in-plane resolution. Many super-resolution (SR) methods have been proposed to address this, including those that use the inherent high-resolution (HR) in-plane signal as HR data to train deep neural networks. Techniques with this approach are generally both self-supervised and internally trained, so no external training data is required.
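The self-supervised idea described above can be illustrated with a toy pair-generation step: treat each high-resolution in-plane slice as ground truth and simulate the low through-plane resolution by block-averaging along one in-plane axis. The anisotropy simulation (simple block averaging, `factor=4`) is an assumption for illustration; actual methods typically model the slice profile more carefully.

```python
import numpy as np

def make_training_pairs(volume, factor=4):
    # Sketch: each HR in-plane (axial) slice serves as ground truth;
    # a matching LR input is simulated by averaging blocks of `factor`
    # columns, mimicking the coarse through-plane sampling.
    # Assumes the slice width is divisible by `factor`.
    pairs = []
    for k in range(volume.shape[2]):
        hr = volume[:, :, k].astype(np.float64)
        lr = hr.reshape(hr.shape[0], -1, factor).mean(axis=2)
        pairs.append((lr, hr))
    return pairs
```

A network trained on such internally generated (LR, HR) pairs can then be applied along the low-resolution through-plane direction, which is why no external training data are needed.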
Magn Reson Imaging
February 2025
Department of Computer Science and Technology, Zhejiang Normal University, Jinhua, Zhejiang 321004, China.
Purpose: This study introduces GraFMRI, a novel framework designed to address the challenges of reconstructing high-quality MRI images from undersampled k-space data. Traditional methods often suffer from noise amplification and loss of structural detail, leading to suboptimal image quality. GraFMRI leverages Graph Neural Networks (GNNs) to transform multi-modal MRI data (T1, T2, PD) into a graph-based representation, enabling the model to capture intricate spatial relationships and inter-modality dependencies.
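The graph-based representation mentioned above can be made concrete with a toy construction: one node per voxel per modality, with edges linking spatial neighbours within a modality and the same voxel across modalities. This structure is an assumption for illustration; GraFMRI's actual graph (node features, patch granularity, edge weights) is not specified in the abstract.

```python
import numpy as np

def build_mri_graph(t1, t2, pd):
    # Toy sketch: nodes are individual voxels of the T1, T2, and PD
    # images; edges capture 4-neighbour spatial adjacency within each
    # modality and inter-modality links at the same voxel location.
    h, w = t1.shape
    idx = lambda m, i, j: (m * h + i) * w + j  # flat node index
    feats, edges = [], []
    for m, img in enumerate((t1, t2, pd)):
        for i in range(h):
            for j in range(w):
                feats.append(img[i, j])
                if i + 1 < h:
                    edges.append((idx(m, i, j), idx(m, i + 1, j)))
                if j + 1 < w:
                    edges.append((idx(m, i, j), idx(m, i, j + 1)))
                if m < 2:  # link to the same voxel in the next modality
                    edges.append((idx(m, i, j), idx(m + 1, i, j)))
    return np.array(feats), edges
```

A GNN operating on such a graph can propagate information both spatially and across modalities, which is the stated motivation for the graph formulation.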
Med Image Anal
January 2025
ÉTS Montréal, Québec, Canada; Centre de Recherche du Centre Hospitalier de l'Université de Montréal (CR-CHUM), Québec, Canada.
Foundation vision-language models are currently transforming computer vision, and are on the rise in medical imaging, fueled by their very promising generalization capabilities. However, the initial attempts to transfer this new paradigm to medical imaging have shown less impressive performance than that observed in other domains, due to the significant domain shift and the complex, expert domain knowledge inherent to medical-imaging tasks. Motivated by the need for domain-expert foundation models, we present FLAIR, a pre-trained vision-language model for universal retinal fundus image understanding.
IEEE Trans Image Process
October 2024
Recently, there has been increased research interest in embodied artificial intelligence (EAI), in which an agent learns to perform a specific task while dynamically interacting with the surrounding 3D environment. In this setting, a new challenge is that many unseen objects may appear due to the increased number of object categories in 3D scenes. This makes it necessary to develop models with strong zero-shot generalization to new objects.