This paper focuses on referring segmentation, which aims to segment the visual region in an image (or video) that corresponds to a referring expression. Existing methods usually model the interaction between multi-modal features at the decoding end of the network. Specifically, they fuse the visual features of each scale with the language features separately, and thus ignore the correlation between multi-scale features. In this work, we present an encoder fusion network (EFN), which moves multi-modal feature learning from the decoding end to the encoding end so that the language progressively refines the multi-modal features. In EFN, we also adopt a co-attention mechanism to promote the mutual alignment of language and visual information in feature space. In the decoding stage, we propose a boundary enhancement module (BEM) to strengthen the network's attention to the details of the target. For video data, we introduce an asymmetric cross-frame attention module (ACFM), which captures temporal information from the video frames by computing the relationship between each pixel of the current frame and each pooled sub-region of the reference frames. Extensive experiments on referring image and video segmentation datasets show that our method outperforms state-of-the-art methods.
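The cross-frame attention described for video data can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the use of average pooling, the `grid` parameter, the scaled dot-product similarity, and the residual connection are all assumptions made for illustration. The sketch only shows the asymmetry stated in the abstract, where every pixel of the current frame is a query while each reference frame contributes only a small set of pooled sub-region descriptors.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def pool_regions(feat, grid):
    # Average-pool a (C, H, W) feature map into grid x grid sub-regions.
    # Returns an array of shape (grid * grid, C). Assumes H and W are
    # divisible by grid (a simplification made for this sketch).
    c, h, w = feat.shape
    assert h % grid == 0 and w % grid == 0
    pooled = feat.reshape(c, grid, h // grid, grid, w // grid).mean(axis=(2, 4))
    return pooled.reshape(c, grid * grid).T


def asymmetric_cross_frame_attention(current, references, grid=2):
    # current:    (C, H, W) features of the current frame.
    # references: list of (C, H, W) features of the reference frames.
    # Queries are all H*W pixels of the current frame; keys and values are
    # the pooled sub-regions of the reference frames (hypothetical choice).
    c, h, w = current.shape
    queries = current.reshape(c, h * w).T                      # (HW, C)
    keys = np.concatenate(
        [pool_regions(r, grid) for r in references], axis=0
    )                                                          # (R*grid^2, C)
    weights = softmax(queries @ keys.T / np.sqrt(c), axis=-1)  # (HW, R*grid^2)
    attended = weights @ keys                                  # (HW, C)
    out = queries + attended                                   # assumed residual
    return out.T.reshape(c, h, w), weights
```

With three reference frames and `grid=2`, each pixel attends to only 12 descriptors instead of every reference pixel, so the attention cost grows with the number of pooled sub-regions and not with the reference-frame resolution.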

Source
http://dx.doi.org/10.1109/TPAMI.2022.3221387
