Publications by authors named "Juergen Gall"

Rethinking 3-D LiDAR Point Cloud Segmentation.

IEEE Trans Neural Netw Learn Syst

December 2021

Many point-based semantic segmentation methods have been designed for indoor scenarios, but they struggle when applied to point clouds captured by a light detection and ranging (LiDAR) sensor in an outdoor environment. To make these methods efficient and robust enough to handle LiDAR data, we introduce the general concept of reformulating 3-D point-based operations such that they can operate in the projection space. We show by means of three point-based methods that the reformulated versions are between 300 and 400 times faster and achieve higher accuracy. Furthermore, we demonstrate that reformulating 3-D point-based operations makes it possible to design new architectures that unify the benefits of point-based and image-based methods.
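
To make the idea of a "projection space" concrete, the following is a minimal sketch of a spherical (range-image) projection, which is one common way to map a LiDAR scan onto a 2-D grid. It is not taken from the paper: the function name `spherical_projection`, the image size, and the field-of-view values are illustrative assumptions.

```python
import math


def spherical_projection(points, height=64, width=2048,
                         fov_up_deg=3.0, fov_down_deg=-25.0):
    """Map 3-D points (x, y, z) to pixels of a height x width range image.

    Returns a dict {(row, col): (range, point_index)} that keeps only the
    closest point per pixel. All defaults are assumed values, not
    parameters reported in the abstract.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down
    image = {}
    for idx, (x, y, z) in enumerate(points):
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # a point at the sensor origin has no direction
        yaw = math.atan2(y, x)    # horizontal angle in [-pi, pi]
        pitch = math.asin(z / r)  # vertical angle
        u = 0.5 * (1.0 - yaw / math.pi) * width        # column coordinate
        v = (1.0 - (pitch - fov_down) / fov) * height  # row coordinate
        col = min(width - 1, max(0, int(math.floor(u))))
        row = min(height - 1, max(0, int(math.floor(v))))
        # Several points can fall into one pixel; keep the nearest one.
        if (row, col) not in image or r < image[(row, col)][0]:
            image[(row, col)] = (r, idx)
    return image
```

Once the points are arranged on such a grid, the neighbors of a point can be read from adjacent pixels instead of being searched for in 3-D, which is the kind of saving a projection-based formulation can offer.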

Action segmentation is the task of predicting the actions for each frame of a video. As obtaining the full annotation of videos for action segmentation is expensive, weakly supervised approaches that can learn only from transcripts are appealing. In this paper, we propose a novel end-to-end approach for weakly supervised action segmentation based on a two-branch neural network.

Timely action to prevent major crop losses caused by a lack of nutrients, and hence to increase the potential yield throughout the growing season while avoiding excess fertilization with its detrimental environmental consequences, requires early, non-invasive, and on-site detection of nutrient deficiency. Current non-invasive methods for assessing the nutrient status of crops deal in most cases with nitrogen (N) deficiency only. Moreover, optical sensors used to diagnose N deficiency, such as chlorophyll meters or canopy reflectance sensors, do not monitor N itself but instead measure changes in leaf spectral properties that may or may not be caused by N deficiency. In this work, we study how well nutrient deficiency symptoms can be recognized in RGB images of sugar beets.

With the success of deep learning in classifying short trimmed videos, more attention has been focused on temporally segmenting and classifying activities in long untrimmed videos. State-of-the-art approaches for action segmentation utilize several layers of temporal convolution and temporal pooling. Despite the capabilities of these approaches in capturing temporal dependencies, their predictions suffer from over-segmentation errors.
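
As a small illustration of what an over-segmentation error looks like, the sketch below collapses frame-wise labels into segments and counts the very short ones. The helper names `to_segments` and `count_short_segments`, the action labels, and the length threshold are hypothetical and are not part of the paper's method or evaluation.

```python
from itertools import groupby


def to_segments(frame_labels):
    """Collapse per-frame labels into (label, start_frame, length) segments."""
    segments = []
    start = 0
    for label, run in groupby(frame_labels):
        length = len(list(run))
        segments.append((label, start, length))
        start += length
    return segments


def count_short_segments(frame_labels, min_length):
    """Count segments shorter than min_length frames (assumed threshold)."""
    return sum(1 for _, _, length in to_segments(frame_labels)
               if length < min_length)


# A clean prediction and a noisy one that flickers between two actions.
clean = ["pour"] * 5 + ["stir"] * 5
noisy = ["pour"] * 3 + ["stir"] + ["pour"] + ["stir"] * 5
```

Here `clean` yields two segments, while `noisy` yields four, two of which last a single frame. Such spurious short segments can occur even when most frames are labeled correctly.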

Action recognition has become a rapidly developing research field within the last decade. However, with the increasing demand for large-scale data, relying on hand-annotated training data becomes more and more impractical. One way to avoid frame-based human annotation is to use action order information to learn the respective action classes.

Since annotating and curating large datasets is very expensive, there is a need to transfer the knowledge from existing annotated datasets to unlabelled data. Data that is relevant for a specific application, however, usually differs from publicly available datasets since it is sampled from a different domain. While domain adaptation methods compensate for such a domain shift, they assume that all categories in the target domain are known and match the categories in the source domain.

Large image datasets such as ImageNet or open-ended photo websites like Flickr are revealing new challenges to image classification that were not apparent in smaller, fixed sets. In particular, the efficient handling of dynamically growing datasets, where not only the amount of training data but also the number of classes increases over time, is a relatively unexplored problem. In this challenging setting, we study how two variants of Random Forests (RF) perform under four strategies to incorporate new classes without retraining the RFs from scratch.

In this work, we address the problem of estimating 2D human pose from still images. Articulated body pose estimation is challenging due to the large variation in body poses and appearances of the different body parts. Recent methods that rely on the pictorial structure framework have been shown to be very successful in solving this task.

Capturing the skeleton motion and detailed time-varying surface geometry of multiple, closely interacting people is a very challenging task, even in a multicamera setup, due to frequent occlusions and ambiguities in feature-to-person assignments. To address this task, we propose a framework that exploits multiview image segmentation. To this end, a probabilistic shape and appearance model is employed to segment the input images and to assign each pixel uniquely to one person.

The paper introduces Hough forests, which are random forests adapted to perform a generalized Hough transform in an efficient way. Compared to previous Hough-based systems such as implicit shape models, Hough forests improve the performance of the generalized Hough transform for object detection on a categorical level. At the same time, their flexibility permits extensions of the Hough transform to new domains such as object tracking and action recognition.
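
The voting step of a generalized Hough transform can be sketched as follows. This is only an illustration of vote accumulation, not the Hough forest itself: in place of trained trees, each patch carries a hand-written list of offsets to a possible object center, and the names `hough_vote` and `best_center` as well as the cell size are assumptions.

```python
from collections import Counter


def hough_vote(patches, cell=4):
    """Accumulate weighted votes for object centers on a coarse grid.

    patches: list of (x, y, offsets), where offsets is a list of
    (dx, dy, weight) displacements from the patch to a possible center.
    In a learned system these offsets would come from a trained model;
    here they are supplied directly (assumption for illustration).
    """
    acc = Counter()
    for x, y, offsets in patches:
        for dx, dy, weight in offsets:
            cx, cy = x + dx, y + dy
            acc[(cx // cell, cy // cell)] += weight
    return acc


def best_center(acc, cell=4):
    """Return the center of the grid cell with the highest vote and its score."""
    (gx, gy), score = max(acc.items(), key=lambda item: item[1])
    return (gx * cell + cell // 2, gy * cell + cell // 2), score


# Three patches; most of their votes agree on a center near (20, 20).
patches = [
    (10, 10, [(10, 10, 1.0)]),
    (30, 10, [(-10, 10, 1.0)]),
    (20, 30, [(0, -10, 0.5), (5, 5, 0.5)]),
]
acc = hough_vote(patches)
```

Calling `best_center(acc)` on this toy input selects the cell that collected votes from all three patches, which is how consistent local evidence is turned into a detection hypothesis.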

In this paper, we propose the combined use of complementary concepts for 3D tracking: region fitting on one side and dense optical flow as well as tracked SIFT features on the other. Both concepts are chosen such that they can compensate for the shortcomings of each other. While tracking by the object region can prevent the accumulation of errors, optical flow and SIFT can handle larger transformations.
