In this paper, we introduce a novel panoptic segmentation method called the Mask-Pyramid Network. Existing Mask RCNN-based methods first generate a large number of box proposals and then filter them at each feature level, which requires a lot of computational resources, while most of the box proposals are suppressed and discarded in the Non-Maximum Suppression process. Additionally, for panoptic segmentation, it is a problem to properly fuse the semantic segmentation results with the Mask RCNN-produced instance segmentation results.
View Article and Find Full Text PDFHuman action recognition methods in videos based on deep convolutional neural networks usually use random cropping or its variants for data augmentation. However, this traditional data augmentation approach may generate many non-informative samples (video patches covering only a small part of the foreground or only the background) that are not related to a specific action. These samples can be regarded as noisy samples with incorrect labels, which reduces the overall action recognition performance.
View Article and Find Full Text PDFDeep reinforcement learning (DRL) has been utilized in numerous computer vision tasks, such as object detection, autonomous driving, etc. However, relatively few DRL methods have been proposed in the area of image segmentation, particularly in left ventricle segmentation. Reinforcement learning-based methods in earlier works often rely on learning proper thresholds to perform segmentation, and the segmentation results are inaccurate due to the sensitivity of the threshold.
View Article and Find Full Text PDF