IEEE Trans Pattern Anal Mach Intell
November 2024
A recent trend in Non-Rigid Structure-from-Motion (NRSfM) is to express local, differential constraints between pairs of images, from which the surface normal at any point can be obtained by solving a system of polynomial equations. While this approach is more successful than its counterparts relying on global constraints, the resulting methods face two main problems: First, most of the equation systems they formulate are of high degree and must be solved using computationally expensive polynomial solvers. Some methods use polynomial reduction strategies to simplify the system, but this adds some phantom solutions.
IEEE Trans Pattern Anal Mach Intell
June 2024
In this work, we tackle the task of estimating the 6D pose of an object from point cloud data. While recent learning-based approaches have shown remarkable success on synthetic datasets, we have observed them to fail in the presence of real-world data. We investigate the root causes of these failures and identify two main challenges: The sensitivity of the widely-used SVD-based loss function to the range of rotation between the two point clouds, and the difference in feature distributions between the source and target point clouds.
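The abstract above does not spell out the SVD step it refers to. As a rough illustration only, the following sketch assumes the common Kabsch-style alignment, in which a rotation and translation between two sets of corresponding 3D points are recovered from the SVD of their cross-covariance matrix. The function name `kabsch` and the synthetic test data are hypothetical and are not taken from the paper.

```python
import numpy as np

def kabsch(source, target):
    """Estimate R, t such that target_i ~= R @ source_i + t (rows are 3D points).

    Assumes known one-to-one correspondences; this is the classical
    Kabsch procedure, used here only as an illustrative stand-in.
    """
    src_c = source.mean(axis=0)
    tgt_c = target.mean(axis=0)
    # 3x3 cross-covariance of the centered point sets
    H = (source - src_c).T @ (target - tgt_c)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so that R is a proper rotation (det = +1)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = tgt_c - R @ src_c
    return R, t

# Synthetic check: rotate about the z-axis by 0.3 rad and translate.
angle = 0.3
R_true = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0,            0.0,           1.0]])
t_true = np.array([0.5, -1.0, 2.0])
rng = np.random.default_rng(0)
source = rng.standard_normal((10, 3))
target = source @ R_true.T + t_true
R_est, t_est = kabsch(source, target)
```

On noise-free data with exact correspondences, the recovered pose should match the ground truth up to floating-point error; real point clouds offer neither guarantee.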
While adversarial training and its variants have been shown to be the most effective algorithms for defending against adversarial attacks, their extremely slow training process makes them hard to scale to large datasets like ImageNet. The key idea of recent works to accelerate adversarial training is to substitute multi-step attacks (e.g.
IEEE Trans Pattern Anal Mach Intell
May 2023
In this article, we propose an unsupervised feature extraction method to capture temporal information in monocular videos, where we detect and encode the subject of interest in each frame and leverage contrastive self-supervised (CSS) learning to extract rich latent vectors. Instead of simply treating the latent features of nearby frames as positive pairs and those of temporally distant ones as negative pairs, as in other CSS approaches, we explicitly disentangle each latent vector into a time-variant component and a time-invariant one. We then show that applying the contrastive loss only to the time-variant features and encouraging a gradual transition on them between nearby and distant frames, while also reconstructing the input, extracts rich temporal features that are well suited for human pose estimation.
IEEE Trans Pattern Anal Mach Intell
December 2022
IEEE Trans Neural Netw Learn Syst
June 2023
Training certifiable neural networks enables us to obtain models with robustness guarantees against adversarial attacks. In this work, we introduce a framework to obtain a provable adversarial-free region in the neighborhood of the input data by a polyhedral envelope, which yields more fine-grained certified robustness than existing methods. We further introduce polyhedral envelope regularization (PER) to encourage larger adversarial-free regions and thus improve the provable robustness of the models.
IEEE Trans Pattern Anal Mach Intell
November 2022
Weight sharing promises to make neural architecture search (NAS) tractable even on commodity hardware. Existing methods in this space rely on a diverse set of heuristics to design and train the shared-weight backbone network, a.k.
Modern methods for counting people in crowded scenes rely on deep networks to estimate people densities in individual images. As such, only very few take advantage of temporal consistency in video sequences, and those that do only impose weak smoothness constraints across consecutive frames. In this paper, we advocate estimating people flows across image locations between consecutive images and inferring the people densities from these flows instead of directly regressing them.
Eigendecomposition of symmetric matrices is at the heart of many computer vision algorithms. However, the derivatives of the eigenvectors tend to be numerically unstable, whether using the SVD to compute them analytically or using the Power Iteration (PI) method to approximate them. This instability arises in the presence of eigenvalues that are close to each other.
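One way to see why close eigenvalues cause trouble is the standard first-order perturbation result for a symmetric matrix with distinct eigenvalues: the derivative of eigenvector v_i contains terms scaled by 1 / (λ_i - λ_j) for every j ≠ i. The sketch below only evaluates these scaling factors for a hypothetical 3x3 diagonal matrix. It is an illustration of the instability, not the method proposed in the paper, and the helper names `gap_factors` and `max_factor` are made up for this example.

```python
import numpy as np

def gap_factors(eigvals):
    """Return K with K[i, j] = 1 / (lambda_i - lambda_j) for i != j, and 0 on the diagonal.

    These factors scale the analytic eigenvector derivatives of a
    symmetric matrix, assuming all eigenvalues are distinct.
    """
    diff = eigvals[:, None] - eigvals[None, :]
    K = np.zeros_like(diff)
    off_diagonal = ~np.eye(len(eigvals), dtype=bool)
    K[off_diagonal] = 1.0 / diff[off_diagonal]
    return K

def max_factor(gap):
    """Largest gradient scaling factor when two eigenvalues are `gap` apart."""
    M = np.diag([1.0, 1.0 + gap, 3.0])  # hypothetical test matrix
    eigvals = np.linalg.eigvalsh(M)
    return np.abs(gap_factors(eigvals)).max()

well_separated = max_factor(1.0)   # eigenvalues 1, 2, 3
nearly_equal = max_factor(1e-6)    # eigenvalues 1, 1 + 1e-6, 3
```

As the gap shrinks, the largest factor grows like 1 / gap, so a tiny perturbation of the input can produce a very large gradient.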
IEEE Trans Pattern Anal Mach Intell
January 2022
IEEE Trans Pattern Anal Mach Intell
September 2021
Many classical Computer Vision problems, such as essential matrix computation and pose estimation from 3D to 2D correspondences, can be tackled by solving a linear least-squares problem, which can be done by finding the eigenvector corresponding to the smallest, or zero, eigenvalue of a matrix representing a linear system. Incorporating this in deep learning frameworks would allow us to explicitly encode known notions of geometry, instead of having the network implicitly learn them from data. However, performing eigendecomposition within a network requires the ability to differentiate this operation.
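The link between homogeneous least squares and eigendecomposition mentioned above can be sketched as follows: the unit vector x that minimizes ||Ax|| is the eigenvector of AᵀA associated with its smallest eigenvalue. This is a minimal NumPy illustration on synthetic data, not the differentiable formulation the paper develops. The function name `smallest_eigvec` and the constructed matrix are assumptions for the example.

```python
import numpy as np

def smallest_eigvec(A):
    """Return (x, lam): the unit x minimizing ||A x|| and the matching eigenvalue of A^T A."""
    # eigh returns eigenvalues of a symmetric matrix in ascending order
    eigvals, eigvecs = np.linalg.eigh(A.T @ A)
    return eigvecs[:, 0], eigvals[0]

# Build a synthetic system whose exact solution is x_true (a unit vector).
x_true = np.array([1.0, -2.0, 2.0]) / 3.0
rng = np.random.default_rng(0)
B = rng.standard_normal((8, 3))
# Remove the component along x_true from every row, so that A @ x_true = 0
A = B - np.outer(B @ x_true, x_true)

x_est, lam = smallest_eigvec(A)
```

The solution is defined only up to sign, so `x_est` may equal either `x_true` or `-x_true`; comparisons should use the absolute value of their dot product.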
We present an Unsupervised Domain Adaptation strategy to compensate for domain shifts on Electron Microscopy volumes. Our method aggregates visual correspondences (motifs that are visually similar across different acquisitions) to infer changes in the parameters of pretrained models and enable them to operate on new data. In particular, we examine the annotations of an existing acquisition to determine pivot locations that characterize the reference segmentation, and use a patch matching algorithm to find their candidate visual correspondences in a new volume.
The performance of a classifier trained on data coming from a specific domain typically degrades when applied to a related but different one. While annotating many samples from the new domain would address this issue, it is often too expensive or impractical. Domain Adaptation has therefore emerged as a solution to this problem; it leverages annotated data from a source domain, in which it is abundant, to train a classifier to operate in a target domain, in which it is either sparse or even lacking altogether.
IEEE Trans Pattern Anal Mach Intell
April 2019
Multi-label submodular Markov Random Fields (MRFs) have been shown to be solvable using max-flow based on an encoding of the labels proposed by Ishikawa, in which each variable X is represented by l nodes (where l is the number of labels) arranged in a column. However, this method in general requires 2l² edges for each pair of neighbouring variables. This makes it inapplicable to realistic problems with many variables and labels, due to excessive memory requirements.
IEEE Trans Pattern Anal Mach Intell
June 2018
Pixel-level annotations are expensive and time-consuming to obtain. Hence, weak supervision using only image tags could have a significant impact in semantic segmentation. Recently, CNN-based methods have proposed to fine-tune pre-trained networks using image tags.
IEEE Trans Pattern Anal Mach Intell
January 2018
Representing images and videos with Symmetric Positive Definite (SPD) matrices, and considering the Riemannian geometry of the resulting space, has been shown to yield high discriminative power in many visual recognition tasks. Unfortunately, computation on the Riemannian manifold of SPD matrices, especially of high-dimensional ones, comes at a high cost that limits the applicability of existing techniques. In this paper, we introduce algorithms able to handle high-dimensional SPD matrices by constructing a lower-dimensional SPD manifold.
IEEE Trans Pattern Anal Mach Intell
December 2015
In this paper, we develop an approach to exploiting kernel methods with manifold-valued data. In many computer vision problems, the data can be naturally represented as points on a Riemannian manifold. Due to the non-Euclidean geometry of Riemannian manifolds, usual Euclidean computer vision and machine learning algorithms yield inferior results on such data.
This paper tackles the problem of reconstructing the shape of a smooth mirror surface from a single image. In particular, we consider the case where the camera is observing the reflection of a static reference target in the unknown mirror. We first study the reconstruction problem given dense correspondences between 3D points on the reference target and image locations.
IEEE Trans Pattern Anal Mach Intell
December 2014
Low-dimensional representations are key to the success of many video classification algorithms. However, the commonly-used dimensionality reduction techniques fail to account for the fact that only part of the signal is shared across all the videos in one class. As a consequence, the resulting representations contain instance-specific information, which introduces noise in the classification process.
IEEE Trans Pattern Anal Mach Intell
June 2012
Most recent approaches to monocular nonrigid 3D shape recovery rely on exploiting point correspondences and work best when the whole surface is well textured. The alternative is to rely on either contours or shading information, which has only been demonstrated in very restrictive settings. Here, we propose a novel approach to monocular deformable shape recovery that can operate under complex lighting and handle partially textured surfaces.
IEEE Trans Pattern Anal Mach Intell
May 2011
Recovering the 3D shape of a nonrigid surface from a single viewpoint is known to be both ambiguous and challenging. Resolving the ambiguities typically requires prior knowledge about the most likely deformations that the surface may undergo. It often takes the form of a global deformation model that can be learned from training data.
IEEE Trans Pattern Anal Mach Intell
August 2007
Three-dimensional detection and shape recovery of a nonrigid surface from video sequences require deformation models to effectively take advantage of potentially noisy image data. Here, we introduce an approach to creating such models for deformable 3D surfaces. We exploit the fact that the shape of an inextensible triangulated mesh can be parameterized in terms of a small subset of the angles between its facets.