We present Affordance Recognition with One-Shot Human Stances (AROS), a one-shot learning approach that uses an explicit representation of interactions between highly articulated human poses and 3D scenes. The approach is one-shot since it does not require iterative training or retraining to add new affordance instances. Furthermore, only one or a small handful of examples of the target pose are needed to describe the interactions.
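The one-shot claim can be illustrated with a toy register-and-match loop: a new affordance instance is simply stored as a descriptor and later retrieved by nearest-neighbour comparison, with no training step. The sketch below is only an illustration of that idea, assuming a flattened placeholder descriptor and Euclidean matching; the class and method names are hypothetical and this is not the AROS representation itself.

```python
# Minimal sketch of the "one-shot" idea: new affordance instances are added to
# a library of interaction descriptors and matched by nearest neighbour, so no
# iterative training or retraining is needed. The descriptor here is a
# placeholder vector, not the AROS pose-scene representation.
import numpy as np

class OneShotAffordanceLibrary:
    def __init__(self):
        self.labels = []          # affordance names, e.g. "sitting"
        self.descriptors = []     # one descriptor per exemplar

    def add_example(self, label, descriptor):
        """Register a single exemplar; no training step is involved."""
        self.labels.append(label)
        self.descriptors.append(np.asarray(descriptor, dtype=float))

    def query(self, descriptor):
        """Return the label of the closest stored exemplar."""
        d = np.asarray(descriptor, dtype=float)
        dists = [np.linalg.norm(d - ref) for ref in self.descriptors]
        return self.labels[int(np.argmin(dists))]

# Usage: one example per affordance is enough to start querying.
lib = OneShotAffordanceLibrary()
lib.add_example("sitting", np.random.rand(64))
lib.add_example("riding", np.random.rand(64))
print(lib.query(np.random.rand(64)))
```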
Many types of Convolutional Neural Network (CNN) models and training methods have been proposed in recent years, aiming to improve efficiency on embedded and edge devices with limited computation and memory resources. The wide variety of architectures makes this a complex task that has to balance generality with efficiency. Among the most interesting camera-sensor architectures are Pixel Processor Arrays (PPAs).
Vision processing for control of agile autonomous robots requires low-latency computation within a limited power and space budget. This is challenging for conventional computing hardware. Parallel processor arrays (PPAs) are a new class of vision sensor devices that exploit advances in semiconductor technology, embedding a processor within each pixel of the image sensor array.
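As a rough illustration of why a per-pixel processor helps, the sketch below simulates a whole-frame operation (frame differencing followed by thresholding, a common low-latency motion cue) carried out elementwise across the array, with only a compact summary leaving the "sensor". NumPy's vectorised operations stand in for the in-pixel parallelism; the resolution, threshold, and summary statistics are illustrative assumptions, not details of any particular PPA device.

```python
# Minimal sketch of the PPA idea: every pixel has its own small processor, so
# whole-frame operations happen in parallel near the sensor rather than being
# streamed out to an external processor. NumPy elementwise ops stand in for
# the per-pixel parallelism; values below are illustrative.
import numpy as np

H, W = 256, 256                       # sensor resolution (illustrative)
prev = np.random.randint(0, 256, (H, W), dtype=np.uint8)
curr = np.random.randint(0, 256, (H, W), dtype=np.uint8)

# Each "pixel processor" compares its own current and previous intensity...
diff = np.abs(curr.astype(np.int16) - prev.astype(np.int16))
motion_mask = diff > 30               # ...and flags local motion.

# Only a compact summary (e.g. the count and centroid of moving pixels) would
# need to leave the sensor, which is where the latency/power savings come from.
ys, xs = np.nonzero(motion_mask)
if xs.size:
    print("moving pixels:", xs.size, "centroid:", (xs.mean(), ys.mean()))
```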
Environments in which Global Positioning System (GPS), or more generally Global Navigation Satellite System (GNSS), signals are denied or degraded pose problems for the guidance, navigation, and control of autonomous systems. This can make operating in hostile, GNSS-impaired environments, such as indoors or in urban and natural canyons, extremely difficult or impossible. Pixel Processor Array (PPA) cameras, in conjunction with other on-board sensors, can be used to address this problem, aiding in tracking, localization, and control.
Agents that need to act on their surroundings can significantly benefit from the perception of their interaction possibilities or affordances. In this paper we combine the benefits of the Interaction Tensor, a straightforward geometrical representation that captures multiple object-scene interactions, with deep learning saliency for fast parsing of affordances in the environment. Our approach works with visually perceived 3D point clouds and makes it possible to query a 3D scene for locations that support affordances such as sitting or riding, as well as interactions for everyday objects, such as where to hang an umbrella or place a mug.
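To make the querying idea concrete, the sketch below scores sampled candidate points in a synthetic point cloud with a crude geometric "sittable" test (a roughly flat local patch with free space above it). The sampling strategy, the scoring function, and all thresholds are hypothetical stand-ins; the paper itself relies on the Interaction Tensor descriptor and deep-learning saliency rather than this toy test.

```python
# Minimal sketch of querying a 3D point cloud for affordance locations:
# candidate points are sampled (here uniformly; saliency would focus the
# search) and each candidate is scored with a simple geometric test that
# stands in for the Interaction Tensor.
import numpy as np

def sittable_score(cloud, p, radius=0.25, clearance=0.6):
    """Crude support test: flat local patch at p with little above it."""
    d_xy = np.linalg.norm(cloud[:, :2] - p[:2], axis=1)
    patch = cloud[d_xy < radius]
    if len(patch) < 20:
        return 0.0
    flatness = 1.0 / (1.0 + patch[:, 2].std())           # flatter is better
    above = cloud[(d_xy < radius) & (cloud[:, 2] > p[2] + 0.05)
                  & (cloud[:, 2] < p[2] + clearance)]
    free_space = 1.0 / (1.0 + len(above))                 # emptier is better
    return flatness * free_space

# Usage: score a handful of sampled candidates and keep the best one.
cloud = np.random.rand(5000, 3) * [4.0, 4.0, 2.0]         # synthetic scene (m)
candidates = cloud[np.random.choice(len(cloud), 50, replace=False)]
scores = [sittable_score(cloud, p) for p in candidates]
print("best candidate location:", candidates[int(np.argmax(scores))])
```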
Today, the workflows involved in industrial assembly and production activities are becoming increasingly complex. Performing these workflows efficiently and safely is demanding on workers, particularly when it comes to infrequent or repetitive tasks. This burden on the workers can be eased by introducing smart assistance systems.