This article presents a semisupervised multilabel fully convolutional network (FCN) for hierarchical object parsing of images. We consider each object part (e.g., eye and head) as a class label and learn to assign every image pixel to multiple coherent part labels. Unlike previous methods that treat part labels as independent classes, our method explicitly models the internal relationships between object parts, e.g., that a pixel highly scored for eyes should be highly scored for heads as well. Such relationships directly reflect the structure of the semantic space and thus should be respected while learning the deep representation. We achieve this objective by introducing a multilabel softmax loss function over both labeled and unlabeled images and regularizing it with two pairwise ranking constraints. The first constraint is based on a manifold assumption that image pixels that are visually and spatially close to each other should be collaboratively classified as the same part label. The second constraint enforces that no pixel receives significant scores from two or more labels that semantically conflict with each other. The proposed loss function is differentiable with respect to the network parameters and hence can be optimized by standard stochastic gradient methods. We evaluate the proposed method on two public image data sets for hierarchical object parsing and compare it with alternative parsing methods. Extensive comparisons show that our method achieves state-of-the-art performance while using 50% fewer labeled training samples than the alternatives.
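The structure of the loss described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so the normalized-target softmax cross-entropy, the quadratic smoothness penalty (a stand-in for the first ranking constraint), the product-based conflict penalty (a stand-in for the second), and all names and weights (`pairs`, `weights`, `conflicts`, `lam_smooth`, `lam_conflict`) are assumptions chosen for illustration.

```python
import numpy as np

def softmax(scores):
    # Row-wise softmax over the label axis; scores has shape (num_pixels, num_labels).
    z = scores - scores.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def multilabel_softmax_loss(scores, targets, labeled):
    # Assumed form: cross-entropy against the multi-hot target normalized to sum to 1,
    # averaged over labeled pixels only (labeled is a boolean mask).
    probs = softmax(scores)
    t = targets / np.maximum(targets.sum(axis=1, keepdims=True), 1.0)
    ce = -(t * np.log(probs + 1e-12)).sum(axis=1)
    n = labeled.sum()
    return float(ce[labeled].sum() / n) if n > 0 else 0.0

def smoothness_penalty(scores, pairs, weights):
    # Assumed stand-in for the manifold constraint: neighboring pixel pairs
    # (given as index pairs with affinity weights) should get similar predictions.
    # Applies to labeled and unlabeled pixels alike.
    probs = softmax(scores)
    diff = probs[pairs[:, 0]] - probs[pairs[:, 1]]
    return float((weights * (diff ** 2).sum(axis=1)).mean())

def conflict_penalty(scores, conflicts):
    # Assumed stand-in for the conflict constraint: penalize pixels that give
    # high probability to both labels of a semantically conflicting pair.
    probs = softmax(scores)
    return float(np.mean([(probs[:, a] * probs[:, b]).mean() for a, b in conflicts]))

def total_loss(scores, targets, labeled, pairs, weights, conflicts,
               lam_smooth=0.1, lam_conflict=0.1):
    # Supervised term on labeled pixels plus two regularizers on all pixels.
    return (multilabel_softmax_loss(scores, targets, labeled)
            + lam_smooth * smoothness_penalty(scores, pairs, weights)
            + lam_conflict * conflict_penalty(scores, conflicts))

# Toy example: 4 pixels, 3 hypothetical labels (0 = head, 1 = eye, 2 = leg).
rng = np.random.default_rng(0)
scores = rng.normal(size=(4, 3))
targets = np.array([[1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 0, 0]], dtype=float)
labeled = np.array([True, True, True, False])   # last pixel is unlabeled
pairs = np.array([[0, 1], [2, 3]])              # neighboring pixel indices
weights = np.array([1.0, 0.5])                  # visual/spatial affinities
conflicts = [(1, 2)]                            # eye and leg cannot co-occur
loss = total_loss(scores, targets, labeled, pairs, weights, conflicts)
```

In this sketch every term is a smooth function of the scores, which mirrors the abstract's point that the regularized loss stays differentiable and can be trained with stochastic gradient methods; an actual network would compute the same quantities in an autodiff framework.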
DOI: http://dx.doi.org/10.1109/TNNLS.2019.2931183
Sci Rep
January 2025
Biotechnology Major, Sangmyung University, Seoul, 03016, South Korea.
Numerous studies have proven the potential of deep learning models for classifying wildlife. Such models can reduce the workload of experts by automating species classification to monitor wild populations and global trade. Although deep learning models typically perform better with more input data, the available wildlife data are ordinarily limited, specifically for rare or endangered species.
Sensors (Basel)
January 2025
College of Computer and Information Sciences, Fujian Agriculture and Forestry University, Fuzhou 350002, China.
Compared with conventional targets, small objects often face challenges such as smaller size, lower resolution, weaker contrast, and more background interference, making their detection more difficult. To address this issue, this paper proposes an improved small object detection method based on the YOLO11 model, PC-YOLO11s. The core innovation of PC-YOLO11s lies in the optimization of the detection network structure, which includes the following aspects: First, PC-YOLO11s adjusts the hierarchical structure of the detection network and adds a P2 layer specifically for small object detection.
Neural Netw
January 2025
Institute of Cognitive Sciences and Technologies, National Research Council, Padova, Italy.
By dynamic planning, we refer to the ability of the human brain to infer and impose motor trajectories related to cognitive decisions. A recent paradigm, active inference, brings fundamental insights into the adaptation of biological organisms, constantly striving to minimize prediction errors to restrict themselves to life-compatible states. Over the past years, many studies have shown how human and animal behaviors could be explained in terms of active inference - either as discrete decision-making or continuous motor control - inspiring innovative solutions in robotics and artificial intelligence.
Sensors (Basel)
January 2025
School of Geosciences, Yangtze University, Wuhan 430100, China.
Roadside tree segmentation and parameter extraction play an essential role in completing the virtual simulation of road scenes. Point cloud data of roadside trees collected by LiDAR provide important data support for achieving assisted autonomous driving. Due to the interference from trees and other ground objects in street scenes caused by mobile laser scanning, there may be a small number of missing points in the roadside tree point cloud, which makes under-segmentation and over-segmentation common in the roadside tree segmentation process.
Sensors (Basel)
December 2024
Master's Program in Information and Computer Science, Doshisha University, Kyoto 610-0394, Japan.
The semantic segmentation of bone structures demands pixel-level classification accuracy to create reliable bone models for diagnosis. While Convolutional Neural Networks (CNNs) are commonly used for segmentation, they often struggle with complex shapes due to their focus on texture features and limited ability to incorporate positional information. As orthopedic surgery increasingly requires precise automatic diagnosis, we explored SegFormer, an enhanced Vision Transformer model that better handles spatial awareness in segmentation tasks.