Publications by authors named "Alberto Del Bimbo"

Effective modeling of human interactions is of utmost importance when forecasting behaviors such as future trajectories. Each individual, through their motion, influences surrounding agents, since everyone obeys unwritten social rules such as collision avoidance or group following. In this paper we model such interactions, which constantly evolve through time, by looking at the problem from an algorithmic point of view, i.

Compatible features enable the direct comparison of old and new learned features, allowing them to be used interchangeably over time. In visual search systems, this eliminates the need to extract new features from the gallery set when the representation model is upgraded with novel data. This is of great value in real applications, as re-indexing the gallery set can be computationally expensive when the gallery set is large, or even infeasible due to privacy or other concerns of the application.
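
To make the practical benefit concrete, the following is a minimal sketch of a search in which a query feature from an upgraded model is compared directly against gallery features stored by an older model. It assumes that both models already produce compatible features of the same dimension and that cosine similarity is the comparison measure. The function name `rank_gallery` is hypothetical, and the sketch does not show how compatibility is obtained during training.

```python
import numpy as np

def rank_gallery(query_feat, gallery_feats):
    """Rank gallery items by cosine similarity to the query, best match first.

    query_feat:    (D,) feature from the NEW model (assumption).
    gallery_feats: (N, D) features stored earlier by the OLD model (assumption).
    No re-extraction of the gallery is performed: the stored features are
    used as they are, which is only meaningful if the two models are compatible.
    """
    q = np.asarray(query_feat, dtype=float)
    g = np.asarray(gallery_feats, dtype=float)
    q = q / np.linalg.norm(q)
    g = g / np.linalg.norm(g, axis=1, keepdims=True)
    similarities = g @ q
    # argsort is ascending, so negate to get the most similar item first.
    return np.argsort(-similarities)
```

If the features were not compatible, the same call would still run but the ranking would be meaningless, which is why the gallery would otherwise have to be re-indexed.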

Action understanding is a fundamental computer vision branch for several applications, ranging from surveillance to robotics. Most works deal with localizing and recognizing the action in both time and space, without providing a characterization of its evolution. Recent works have addressed the prediction of action progress, which is an estimate of how far the action has advanced as it is performed.
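
As an illustration of what "action progress" can mean in practice, the sketch below computes a progress value for a frame inside an annotated action interval. It assumes that progress grows linearly with time between the start and end frames, which is a simplifying assumption and not necessarily the definition used in the paper. The name `linear_progress` is hypothetical.

```python
def linear_progress(t, start, end):
    """Fraction of the action completed at frame t, clipped to [0, 1].

    Assumes (hypothetically) that the action advances linearly between
    its annotated start and end frames.
    """
    if end <= start:
        raise ValueError("end must be greater than start")
    fraction = (t - start) / (end - start)
    return min(1.0, max(0.0, fraction))
```

A predictor of action progress would be trained to output such a value from the frames observed so far, without knowing `end` in advance.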

We propose an automatic method to estimate self-reported pain based on facial landmarks extracted from videos. For each video sequence, we decompose the face into four different regions and the pain intensity is measured by modeling the dynamics of facial movement using the landmarks of these regions. A formulation based on Gram matrices is used for representing the trajectory of landmarks on the Riemannian manifold of symmetric positive semi-definite matrices of fixed rank.
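
The Gram matrix representation can be sketched as follows. For a frame with n landmarks in d dimensions stacked into an n x d matrix Z, the Gram matrix Z Z^T is symmetric, positive semi-definite, and of rank at most d, so a video becomes a sequence of such matrices. The centering step and the function names `landmark_gram` and `gram_trajectory` are assumptions made for illustration; the paper's exact preprocessing is not given in the abstract.

```python
import numpy as np

def landmark_gram(landmarks):
    """Gram matrix of one frame's landmarks.

    landmarks: (n, d) array, one row per landmark.
    Returns an (n, n) symmetric positive semi-definite matrix of rank <= d.
    """
    Z = np.asarray(landmarks, dtype=float)
    # Centering removes translation; this is an assumption of the sketch.
    Z = Z - Z.mean(axis=0)
    return Z @ Z.T

def gram_trajectory(frames):
    """Map a sequence of landmark frames (T, n, d) to Gram matrices (T, n, n)."""
    return np.stack([landmark_gram(f) for f in frames])
```

Comparing or modeling such trajectories then requires a metric on the manifold of fixed-rank positive semi-definite matrices, which this sketch does not cover.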

Estimating the 3D shape of objects from monocular images is a well-established and challenging task in the computer vision field. Further challenges arise when highly deformable objects, such as human faces or bodies, are considered. In this work, we address the problem of estimating the 3D shape of a human body from single images.

We propose an automatic method for pain intensity measurement from video. For each video, pain intensity is measured from the dynamics of facial movement, tracked through 66 facial points. A Gram matrix formulation is used to represent the trajectories of the facial points on the Riemannian manifold of symmetric positive semi-definite matrices of fixed rank.

The 3D Morphable Model (3DMM) is a powerful statistical tool for representing 3D face shapes. To build a 3DMM, a training set of face scans in full point-to-point correspondence is required, and its modeling capabilities directly depend on the variability contained in the training data. Thus, to increase the descriptive power of the 3DMM, establishing a dense correspondence across heterogeneous scans with sufficient diversity in terms of identities, ethnicities, or expressions becomes essential.
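
The role of point-to-point correspondence can be seen in a minimal linear shape model. The sketch below assumes a PCA-based construction, which is a common way to build a 3DMM but is not confirmed by the abstract as the authors' choice. All function names are hypothetical. Because every scan has the same vertices in the same order, scans can be flattened into vectors and averaged or decomposed directly.

```python
import numpy as np

def build_linear_shape_model(scans, n_components):
    """Fit a mean shape and principal deformation directions.

    scans: (N, V, 3) array of N scans with V vertices in full correspondence.
    Returns (mean, components) with shapes (3V,) and (n_components, 3V).
    """
    X = np.asarray(scans, dtype=float).reshape(len(scans), -1)
    mean = X.mean(axis=0)
    # Rows of vt are orthonormal directions of decreasing variance.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(mean, components, shape):
    """Coefficients of a (V, 3) shape in the model basis."""
    return components @ (np.asarray(shape, dtype=float).reshape(-1) - mean)

def synthesize(mean, components, coeffs):
    """Reconstruct a (V, 3) shape from model coefficients."""
    return (mean + coeffs @ components).reshape(-1, 3)
```

With N training scans the centered data has rank at most N - 1, so the model cannot express more variability than the training set contains, which is the dependence on training data noted above.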

Neural networks are widely used as a model for classification in a large variety of tasks. Typically, a learnable transformation (i.e.

Facial Action Units (AUs) correspond to the deformation/contraction of individual facial muscles or their combinations. As such, each AU affects just a small portion of the face, with deformations that are asymmetric in many cases. Generating and analyzing AUs in 3D is particularly relevant for the potential applications it can enable.

Pedestrians and drivers are expected to safely navigate complex urban environments along with several non-cooperating agents. Autonomous vehicles will soon replicate this capability. Each agent acquires a representation of the world from an egocentric perspective and must make decisions ensuring safety for itself and others.

Face recognition "in the wild" has been revolutionized by the deployment of deep learning based approaches. In fact, it has been extensively demonstrated that Deep Convolutional Neural Networks (DCNNs) are powerful enough to overcome most of the limits that affected face recognition algorithms based on hand-crafted features. These include variations in illumination, pose, expression and occlusion, to mention some.

Object detection is one of the most important tasks in computer vision. It is usually performed by evaluating a subset of the possible locations in an image that are more likely to contain the object of interest. Exhaustive approaches have now been superseded by object proposal methods.

In this paper, we present the ALIEN tracking method that exploits oversampling of local invariant representations to build a robust object/context discriminative classifier. To this end, we use multiple instances of scale invariant local features weakly aligned along the object template. This allows taking into account the 3D shape deviations from planarity and their interactions with shadows, occlusions, and sensor quantization for which no invariant representations can be defined.

In this paper we introduce a method for person re-identification based on discriminative, sparse basis expansions of targets in terms of a labeled gallery of known individuals. We propose an iterative extension to sparse discriminative classifiers capable of ranking many candidate targets. The approach makes use of soft- and hard- re-weighting to redistribute energy among the most relevant contributing elements and to ensure that the best candidates are ranked at each iteration.

Photographic documents, in both digital and printed format, play a fundamental role in crime scene analysis. Photos are crucial for reconstructing what happened and for freezing the scene with all the objects and evidence present. Consequently, assessing the authenticity of such images is of paramount importance, so that malicious counterfeiting does not lead to a wrong evaluation of the circumstances.

In this paper, we present a novel and original framework, which we dub mesh-local binary pattern (mesh-LBP), for computing local binary-like patterns on a triangular-mesh manifold. This framework can be adapted to all the LBP variants employed in 2D image analysis. As such, it allows extending the related techniques to mesh surfaces.

Recognizing human actions in 3-D video sequences is an important open problem that is currently at the heart of many research domains including surveillance, natural interfaces and rehabilitation. However, the design and development of models for action recognition that are both accurate and efficient is a challenging task due to the variability of the human pose, clothing and appearance. In this paper, we propose a new framework to extract a compact representation of a human action captured through a depth sensor, and enable accurate action recognition.

In this paper, we present a novel method to improve the flexibility of descriptor matching for image recognition by using local multiresolution pyramids in feature space. We propose that image patches be represented at multiple levels of descriptor detail and that these levels be defined in terms of local spatial pooling resolution. Preserving multiple levels of detail in local descriptors is a way of hedging one's bets on which levels will be most relevant for matching during learning and recognition.
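
One way to picture levels defined by spatial pooling resolution is sketched below. It assumes a patch descriptor stored as a G x G grid of per-cell histograms (for example 4 x 4 cells with 8 bins, as in SIFT-like descriptors) and builds coarser levels by summing 2 x 2 blocks of neighbouring cells. The grid layout, the sum pooling, and the name `pooling_pyramid` are assumptions of this sketch and are not taken from the paper.

```python
import numpy as np

def pooling_pyramid(cells):
    """Return descriptor levels from finest to coarsest.

    cells: (G, G, B) array of per-cell histograms, with G a power of two.
    Each coarser level halves the grid by summing 2x2 blocks of cells.
    """
    level = np.asarray(cells, dtype=float)
    grid = level.shape[0]
    if level.shape[1] != grid or grid & (grid - 1) != 0:
        raise ValueError("expected a square grid whose side is a power of two")
    levels = [level]
    while level.shape[0] > 1:
        g = level.shape[0] // 2
        # Axes 1 and 3 index the position inside each 2x2 block.
        level = level.reshape(g, 2, g, 2, -1).sum(axis=(1, 3))
        levels.append(level)
    return levels
```

Keeping all levels lets a matcher fall back on a coarser, more tolerant description when the finest level is too specific, which is the bet-hedging idea described above.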

We contribute, through this paper, to the design of a novel variational framework able to match and recognize multiple instances of multiple reference logos in image archives. Reference logos and test images are seen as constellations of local features (interest points, regions, etc.) and matched by minimizing an energy function mixing: 1) a fidelity term that measures the quality of feature matching, 2) a neighborhood criterion that captures feature co-occurrence/geometry, and 3) a regularization term that controls the smoothness of the matching solution.
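
The three-term structure described above can be written schematically as follows. The symbols are hypothetical: f denotes a candidate matching between reference and test features, and the weights alpha and beta are assumed trade-off parameters. The abstract names the terms but gives neither their exact form nor how they are mixed, so this is only a sketch of a weighted-sum energy.

```latex
\[
  E(f) \;=\;
  \underbrace{E_{\mathrm{fid}}(f)}_{\text{quality of feature matching}}
  \;+\; \alpha\,
  \underbrace{E_{\mathrm{nbr}}(f)}_{\text{feature co-occurrence/geometry}}
  \;+\; \beta\,
  \underbrace{E_{\mathrm{reg}}(f)}_{\text{smoothness of the solution}},
  \qquad
  f^{\star} = \arg\min_{f} E(f).
\]
```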

In this paper, we present a novel approach to 3D face matching that shows high effectiveness in distinguishing facial differences between distinct individuals from differences induced by nonneutral expressions within the same individual. The approach takes into account geometrical information of the 3D face and encodes the relevant information into a compact representation in the form of a graph. Nodes of the graph represent equal width isogeodesic facial stripes.

Identifying correspondences between trajectory segments observed from nonsynchronized cameras is important for reconstructing the complete trajectory of moving targets in a large scene. Such a reconstruction can be obtained from motion data by comparing the trajectory segments and estimating both the spatial and temporal alignments. Exhaustive testing of all possible correspondences of trajectories over a temporal window is only viable in cases with a limited number of moving targets and large view overlaps.

Image analysis and computer vision can be effectively employed to recover the three-dimensional structure of imaged objects, together with their surface properties. In this paper, we address the problem of metric reconstruction and texture acquisition from a single uncalibrated view of a surface of revolution (SOR). Geometric constraints induced in the image by the symmetry properties of the SOR structure are exploited to perform self-calibration of a natural camera, 3D metric reconstruction, and texture acquisition.
