AI Article Synopsis

  • RGB-D indoor scene parsing is a difficult task in computer vision due to the complexity and disorder of indoor environments.
  • Researchers introduced a new model called FASFLNet, which combines feature adaptive selection and fusion using a lightweight backbone network (MobileNetV2) for better efficiency and performance.
  • FASFLNet effectively uses additional depth information for enhanced feature extraction and integrates features at multiple layers to improve accuracy, outperforming existing models in experiments on prominent datasets.

Article Abstract

RGB-D indoor scene parsing is a challenging task in computer vision. Conventional scene-parsing approaches based on manual feature extraction have proved inadequate in this area because indoor scenes are both unordered and complex. This study proposes a feature adaptive selection and fusion lightweight network (FASFLNet) for RGB-D indoor scene parsing that is both efficient and accurate. The proposed FASFLNet uses a lightweight classification network (MobileNetV2) as the backbone for feature extraction. This lightweight backbone ensures that FASFLNet is not only highly efficient but also performs well in feature extraction. The additional information provided by depth images (specifically, spatial information such as the shape and scale of objects) is used in FASFLNet as supplemental information for feature-level adaptive fusion between the RGB and depth streams. Furthermore, during decoding, the features of different layers are fused in a top-down manner and integrated at different layers for final pixel-level classification, resulting in an effect similar to that of pyramid supervision. Experimental results obtained on the NYU V2 and SUN RGB-D datasets indicate that the proposed FASFLNet outperforms existing state-of-the-art models while remaining both highly efficient and accurate.
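The abstract does not give implementation details, but the core idea of feature-level adaptive fusion between the RGB and depth streams can be illustrated as a gated per-channel blend. The sketch below is a minimal, hypothetical illustration (not the authors' code): the gate scores, which a real network would predict from the depth features, are taken here as given inputs.

```python
import math

def sigmoid(x):
    """Logistic function that turns a raw channel score into a gate in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def adaptive_fuse(rgb_feats, depth_feats, gate_scores):
    """Blend per-channel RGB and depth feature values with adaptive gates.

    rgb_feats, depth_feats: per-channel feature values (same length).
    gate_scores: per-channel raw scores; in a real network these would be
    predicted from the streams themselves. Each score is squashed to a
    weight alpha, and the fused channel is alpha * rgb + (1 - alpha) * depth,
    so the network can lean on whichever modality is more informative.
    """
    fused = []
    for r, d, s in zip(rgb_feats, depth_feats, gate_scores):
        alpha = sigmoid(s)
        fused.append(alpha * r + (1.0 - alpha) * d)
    return fused
```

A score of 0 yields an even blend of the two streams, while a strongly positive or negative score selects almost entirely the RGB or depth feature, respectively.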


Source
http://dx.doi.org/10.1364/OE.480252

Publication Analysis

Top Keywords

rgb-d indoor (12)
indoor scene (12)
scene parsing (12)
feature extraction (12)
feature adaptive (8)
adaptive selection (8)
selection fusion (8)
fusion lightweight (8)
lightweight network (8)
efficient accurate (8)

Similar Publications

Over recent years, automated Human Activity Recognition (HAR) has attracted considerable research attention due to its widespread application in surveillance systems, healthcare environments, and many other settings. This has led researchers to develop coherent and robust systems that perform HAR efficiently. Although many efficient systems have been developed to date, several issues remain to be addressed.


Lightweight Two-Layer Control Architecture for Human-Following Robot.

Sensors (Basel)

December 2024

Department of Computer and Decision Sciences, Faculty of Mines, Universidad Nacional de Colombia, Medellín 050034, Colombia.

(1) Background: Human detection and tracking are critical tasks for assistive autonomous robots, particularly in ensuring safe and efficient human-robot interaction in indoor environments. The increasing need for personal assistance among the elderly and people with disabilities has led to the development of innovative robotic systems. (2) Methods: This research presents a lightweight two-layer control architecture for a human-following robot, integrating a fuzzy behavior-based control system with low-level embedded controllers.


DDNet: Depth Dominant Network for Semantic Segmentation of RGB-D Images.

Sensors (Basel)

October 2024

Division of Science, Engineering and Health Studies, School of Professional Education and Executive Development, The Hong Kong Polytechnic University, Hong Kong 999077, China.

Article Synopsis
  • Convolutional neural networks (CNNs) are commonly used for indoor scene parsing and object segmentation in color images, but they struggle with the lack of geometric and context information from RGB data alone.
  • This study introduces a new network called the Depth Dominant Network (DDNet) that emphasizes the utilization of depth map context, leveraging the geometric information found in depth images for better segmentation results.
  • DDNet features a dual-branch CNN design that prioritizes depth information for segmentation, while also incorporating RGB data to enrich the depth features, demonstrating superior performance on various RGB-D semantic segmentation benchmarks.

The paper presents a vision-based obstacle avoidance strategy for lightweight self-driving cars that can run on a CPU-only device using a single RGB-D camera. The method consists of two steps: visual perception and path planning. The visual perception part uses ORBSLAM3 enhanced with optical flow to estimate the car's poses and extract rich texture information from the scene.


Neural Colour Correction for Indoor 3D Reconstruction Using RGB-D Data.

Sensors (Basel)

June 2024

Institute of Electronics and Informatics Engineering of Aveiro (IEETA), Intelligent System Associate Laboratory (LASI), University of Aveiro, 3810-193 Aveiro, Portugal.

With the rise in popularity of different human-centred applications using 3D reconstruction data, the problem of generating photo-realistic models has become an important task. In a multiview acquisition system, particularly for large indoor scenes, the acquisition conditions will differ along the environment, causing colour differences between captures and unappealing visual artefacts in the produced models. We propose a novel neural-based approach to colour correction for indoor 3D reconstruction.

