IEEE Trans Neural Netw Learn Syst
July 2024
In this article, we address the challenges in unsupervised video object segmentation (UVOS) by proposing an efficient algorithm, termed MTNet, which concurrently exploits motion and temporal cues. Unlike previous methods that focus solely on integrating appearance with motion or on modeling temporal relations, our method combines both aspects within a unified framework. MTNet is devised by effectively merging appearance and motion features during the feature extraction process within encoders, promoting a more complementary representation.
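The per-stage merging of appearance and motion features inside an encoder can be illustrated with a minimal sketch. This is not MTNet's actual architecture; the concatenate-then-project pattern, the function name `fuse_stage`, and the random 1x1 projection weights are all illustrative assumptions.

```python
import numpy as np

def fuse_stage(appearance, motion):
    """Hypothetical per-stage fusion: concatenate appearance and motion
    feature maps along the channel axis, then mix channels with a 1x1
    projection (implemented here as a matrix applied per pixel)."""
    fused = np.concatenate([appearance, motion], axis=0)  # (2C, H, W)
    c = appearance.shape[0]
    # Random channel-mixing weights stand in for learned 1x1 conv weights.
    w = np.random.default_rng(0).standard_normal((c, 2 * c)) / np.sqrt(2 * c)
    return np.einsum('oc,chw->ohw', w, fused)  # back to (C, H, W)

app = np.ones((8, 16, 16))   # appearance features: 8 channels, 16x16
mot = np.ones((8, 16, 16))   # motion features from the same stage
out = fuse_stage(app, mot)
print(out.shape)  # (8, 16, 16)
```

Fusing at each encoder stage, rather than once at the end, is what lets the two streams complement each other at multiple feature scales.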
IEEE Trans Pattern Anal Mach Intell
July 2022
High-cost pixel-level annotations make it appealing to train saliency detection models with weak supervision. However, a single weak supervision source hardly contains enough information to train a well-performing model. To this end, we introduce a unified two-stage framework to learn from category labels, captions, web images, and unlabeled images.
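The idea of pooling several weak sources before training a final model can be sketched as follows. The averaging-and-thresholding scheme is an assumption for illustration only; the paper's actual two-stage framework is more involved.

```python
import numpy as np

# Hypothetical stage-1 outputs: a coarse per-pixel saliency map derived
# from each weak source (e.g. category labels, captions, web images).
sources = [np.random.default_rng(s).random((4, 4)) for s in range(3)]

# Stage 2 would train a single saliency model on a combined pseudo ground
# truth; here we only form that target by averaging and thresholding.
pseudo_gt = (np.mean(sources, axis=0) > 0.5).astype(float)
print(pseudo_gt.shape)  # (4, 4)
```

Each individual source is noisy, but agreement across sources yields a more reliable training signal than any one of them alone.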
Object proposals are a series of candidate segments containing objects of interest, which are taken as preprocessing and widely applied in various vision tasks. However, most existing saliency approaches only utilize the proposals to compute a location prior. In this paper, we naturally take the proposals as the bags of instances of multiple instance learning (MIL), where the instances are the superpixels contained in the proposals, and formulate the saliency detection problem as a MIL task.
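The bag/instance structure maps directly onto standard MIL scoring: each proposal is a bag, each superpixel an instance, and a common convention labels a bag positive if any instance is. The max-pooling rule, threshold, and proposal names below are illustrative assumptions, not the paper's exact formulation.

```python
def bag_score(instance_scores):
    # Standard MIL max-pooling: a proposal (bag) is as salient as its
    # most salient superpixel (instance).
    return max(instance_scores)

# Hypothetical per-superpixel saliency scores for two proposals.
bags = {"proposal_a": [0.1, 0.9, 0.3], "proposal_b": [0.2, 0.1]}
labels = {name: bag_score(s) > 0.5 for name, s in bags.items()}
print(labels)  # {'proposal_a': True, 'proposal_b': False}
```

Treating proposals as bags lets the model exploit proposal-level evidence without requiring every superpixel inside a proposal to be labeled.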
In this paper, we propose a visual saliency detection algorithm that fuses various saliency models via bootstrap learning. First, we construct an original bootstrapping model that combines weak and strong saliency models. In this model, image priors are exploited to generate an original weak saliency model, which provides training samples for a strong model.
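The weak-to-strong bootstrapping loop can be sketched minimally: a noisy prior-based scorer labels its most confident samples, and a strong model is then fit to those pseudo-labels. The synthetic features, confidence threshold, and least-squares "strong model" are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy features for 200 image regions (e.g. color/contrast descriptors).
feats = rng.standard_normal((200, 5))
true_w = np.array([1.5, -2.0, 0.5, 0.0, 1.0])

# Weak model: noisy scores standing in for prior-based saliency.
weak_scores = feats @ true_w + rng.standard_normal(200) * 2.0

# Bootstrap: only confident weak predictions become training samples
# for the strong model (here a least-squares linear scorer).
conf = np.abs(weak_scores) > 2.0
targets = np.sign(weak_scores[conf])          # +1 salient, -1 background
w_strong, *_ = np.linalg.lstsq(feats[conf], targets, rcond=None)
strong_scores = feats @ w_strong              # strong model scores all regions
print(strong_scores.shape)  # (200,)
```

The strong model sees only the weak model's confident decisions, so it can generalize beyond the hand-crafted priors that seeded it.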