The composed image retrieval (CIR) task aims to retrieve the desired target image for a given multimodal query, i.e., a reference image with its corresponding modification text. Existing efforts face two key limitations: 1) they ignore the multiple query-target matching factors; 2) they ignore the potential unlabeled reference-target image pairs in existing benchmark datasets. Addressing these two limitations is non-trivial due to the following challenges: 1) how to effectively model the multiple matching factors in a latent way without direct supervision signals; 2) how to fully utilize the potential unlabeled reference-target image pairs to improve the generalization ability of the CIR model. To address these challenges, we first propose a CLIP-Transformer based muLtI-factor Matching Network (LIMN), which consists of three key modules: disentanglement-based latent factor tokens mining, dual aggregation-based matching token learning, and dual query-target matching modeling. We then design an iterative dual self-training paradigm that further enhances the performance of LIMN by fully utilizing the potential unlabeled reference-target image pairs in a weakly-supervised manner. We denote LIMN enhanced with this iterative dual self-training paradigm as LIMN+. Extensive experiments on four datasets, FashionIQ, Shoes, CIRR, and Fashion200K, show that our proposed LIMN and LIMN+ significantly surpass the state-of-the-art baselines.
DOI: http://dx.doi.org/10.1109/TPAMI.2023.3346434
Prediction-powered inference (PPI) [1] and its subsequent development, PPI++ [2], offer a novel approach to standard statistical estimation that leverages machine learning systems to enrich unlabeled data with predictions. We apply this paradigm to clinical trials. The predictions come from disease progression models, which produce prognostic scores for all participants as a function of baseline covariates.
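The core idea of prediction-powered inference can be illustrated for the simplest case, estimating a mean. The following is a minimal sketch, not the method of the article: it assumes a small labeled sample with observed outcomes and model predictions, a larger unlabeled sample with predictions only, and a normal-approximation confidence interval. The function name `ppi_mean` and all parameter names are hypothetical.

```python
from math import sqrt
from statistics import fmean, variance


def ppi_mean(y_labeled, preds_labeled, preds_unlabeled, z=1.96):
    """Prediction-powered estimate of a mean (illustrative sketch).

    y_labeled:       observed outcomes on the labeled sample
    preds_labeled:   model predictions for the same labeled sample
    preds_unlabeled: model predictions on the (larger) unlabeled sample
    z:               normal quantile for the interval (1.96 ~ 95%)
    """
    n, big_n = len(y_labeled), len(preds_unlabeled)
    # Rectifier: how far the predictions are off, measured where labels exist.
    residuals = [y - f for y, f in zip(y_labeled, preds_labeled)]
    # Mean of the predictions, corrected by the average prediction error.
    estimate = fmean(preds_unlabeled) + fmean(residuals)
    # The two samples are assumed independent, so their variances add.
    se = sqrt(variance(preds_unlabeled) / big_n + variance(residuals) / n)
    return estimate, (estimate - z * se, estimate + z * se)


# Toy numbers, invented purely for illustration.
estimate, interval = ppi_mean(
    y_labeled=[1.0, 2.0, 3.0],
    preds_labeled=[1.5, 2.5, 3.5],
    preds_unlabeled=[2.0, 3.0, 4.0, 3.0],
)
print(estimate, interval)
```

When the predictions are accurate, the residuals have low variance and the interval is driven mainly by the large unlabeled sample, which is where the gain over using the labeled data alone comes from.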
Nat Commun
January 2025
Biosensor National Special Laboratory, Department of Biomedical Engineering, Zhejiang University, Hangzhou, China.
Cellular glycosylation is crucial for cell recognition, signal transduction, and the development of various diseases, especially in tumor initiation, progression, and metastasis. Current glycosylation profiling methods normally involve laborious sample processing and labeling and lack in-situ quantitative analysis. Here, we present a direct optical method to investigate and quantify the glycan expression on single cells based on lectin-glycan kinetic quantification with plasmonic imaging.
J Clin Invest
January 2025
Department of Nuclear Medicine, Peking University Cancer Hospital & Institute, Beijing, China.
Background: B7-H3 (CD276) is notably overexpressed in a variety of human malignant tumor cells, with extremely high expression rates. The development of a radiotracer that targets B7-H3 may provide a universal tumor-specific imaging agent and allow the noninvasive assessment of the whole-body distribution of B7-H3-expressing lesions.
Methods: We enhanced and optimized the structure of an affibody (ABY) that targets B7-H3 to create the radiotracer [68Ga]Ga-B7H3-BCH, and then conducted both foundational experiments and clinical translational studies.
Front Artif Intell
January 2025
College of Computer and Control Engineering, Northeast Forestry University, Harbin, China.
Introduction: In clinical practice, the echocardiogram is the most widely used tool for diagnosing heart disease. Different heart diseases are diagnosed from different views of the echocardiogram images, so efficient echocardiogram view classification can help cardiologists diagnose heart disease rapidly. Echocardiogram view classification methods are mainly divided into supervised and semi-supervised methods.
J Imaging Inform Med
January 2025
Key Laboratory of Knowledge Engineering with Big Data (the Ministry of Education of China), Hefei University of Technology, Hefei, 230009, Anhui, China.
With the advancement of deep learning, models have been successfully applied to semi-supervised medical image segmentation, where there are few annotated medical images and a large number of unlabeled ones. A representative approach is the semi-supervised method based on consistency regularization, which improves model training by imposing consistency constraints on perturbed unlabeled data. However, the perturbations in such methods are often designed by hand, which may introduce biases that are unfavorable to model learning in medical image segmentation.
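Consistency regularization can be sketched in a few lines. The example below is a simplified illustration under stated assumptions, not the method of the article: it uses additive Gaussian noise as the hand-designed perturbation, treats an image as a flat list of pixel values, and takes a hypothetical `predict` function that returns one foreground probability per pixel. No labels are needed to compute the loss.

```python
import math
import random


def consistency_loss(predict, x, noise_scale=0.1, rng=None):
    """Mean squared difference between predictions on x and on a noisy copy.

    predict:     hypothetical model, maps a list of pixels to per-pixel scores
    x:           unlabeled input, here a flat list of pixel values
    noise_scale: standard deviation of the Gaussian perturbation (assumed)
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed keeps the sketch reproducible
    # Hand-designed perturbation: additive Gaussian noise on every pixel.
    x_perturbed = [v + rng.gauss(0.0, noise_scale) for v in x]
    p_clean = predict(x)
    p_noisy = predict(x_perturbed)
    # Penalize disagreement between the two predictions.
    return sum((a - b) ** 2 for a, b in zip(p_clean, p_noisy)) / len(p_clean)


def toy_model(pixels):
    """Stand-in for a segmentation network: a per-pixel sigmoid."""
    return [1.0 / (1.0 + math.exp(-v)) for v in pixels]


loss = consistency_loss(toy_model, [0.0, 1.0, 2.0], noise_scale=0.5)
print(loss)
```

In training, this term would be added to the supervised loss on the annotated images. The choice of perturbation (here, the noise distribution and its scale) is exactly the hand-designed component that the abstract identifies as a possible source of bias.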