Publications by authors named "Gerard Medioni"

We propose a method designed to push the frontiers of unconstrained face recognition in the wild with an emphasis on extreme out-of-plane pose variations. Existing methods either expect a single model to learn pose invariance by training on massive amounts of data or else normalize images by aligning faces to a single frontal pose. Contrary to these, our method is designed to explicitly tackle pose variations.

This paper concerns the problem of facial landmark detection. We provide a unique new analysis of the features produced at intermediate layers of a convolutional neural network (CNN) trained to regress facial landmark coordinates. This analysis shows that while being processed by the CNN, face images can be partitioned in an unsupervised manner into subsets containing faces in similar poses (i.

Geographic atrophy (GA) is a manifestation of the advanced or late stage of age-related macular degeneration (AMD). AMD is the leading cause of blindness in people over the age of 65 in the Western world. The purpose of this study is to develop a fully automated supervised pixel classification approach for segmenting GA, including uni- and multifocal patches, in fundus autofluorescence (FAF) images.

We address the problem of structure learning of human motion in order to recognize actions from a continuous monocular motion sequence of an arbitrary person from an arbitrary viewpoint. Human motion sequences are represented by multivariate time series in the joint-trajectories space. Under this structured time series framework, we first propose Kernelized Temporal Cut (KTC), an extension of previous works on change-point detection by incorporating Hilbert space embedding of distributions, to handle the nonparametric and high dimensionality issues of human motions.
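The idea of detecting change points by comparing distributions through their Hilbert space embeddings can be illustrated with a small sketch. This is not the KTC algorithm from the paper; it is a generic kernel two-sample statistic (the biased squared maximum mean discrepancy with a Gaussian kernel) scanned over candidate cut positions. The function names, the bandwidth, and the toy 2-D "joint trajectory" frames are all assumptions made for illustration.

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of the squared MMD between two samples."""
    def mean_kernel(a, b):
        total = sum(gaussian_kernel(p, q, bandwidth) for p in a for q in b)
        return total / (len(a) * len(b))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)

def best_cut(frames, min_size=2, bandwidth=1.0):
    """Return (index, score) of the split that maximizes the squared MMD
    between frames[:index] and frames[index:]."""
    best_index, best_score = None, -1.0
    for t in range(min_size, len(frames) - min_size + 1):
        score = mmd_squared(frames[:t], frames[t:], bandwidth)
        if score > best_score:
            best_index, best_score = t, score
    return best_index, best_score

# Hypothetical sequence: four frames of one motion, then four of another.
frames = [(0.0, 0.1), (0.1, 0.0), (0.0, 0.0), (0.1, 0.1),
          (3.0, 3.1), (3.1, 3.0), (3.0, 3.0), (3.1, 3.1)]
cut, score = best_cut(frames)
print(cut, round(score, 3))
```

Because the statistic depends only on kernel evaluations, it needs no parametric model of the motion and applies unchanged to high-dimensional frames. The exhaustive scan over split positions is quadratic in the window length and is meant only to show the principle.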

Purpose: Geographic atrophy (GA) is the atrophic late-stage manifestation of age-related macular degeneration (AMD), which may result in severe vision loss and blindness. The purpose of this study was to develop a reliable, effective approach for GA segmentation in both spectral-domain optical coherence tomography (SD-OCT) and fundus autofluorescence (FAF) images using a level set-based approach and to compare the segmentation performance in the two modalities.

Methods: To identify GA regions in SD-OCT images, three retinal surfaces were first segmented in volumetric SD-OCT images using a double-surface graph search scheme.

Existing hierarchical techniques that decompose an image into a smooth image and high-frequency components using Gaussian or bilateral filters suffer from halo effects, whereas techniques based on weighted least squares extract low-contrast features as details. Other techniques require multiple images and are not tolerant to noise. We use a single image to enhance sharpness based on a hierarchical framework using a modified Laplacian pyramid.
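For readers unfamiliar with the underlying decomposition, here is a minimal sketch of a standard (unmodified) Laplacian pyramid on a 1-D signal, with a per-level gain applied to the detail bands on reconstruction. The paper's modification is not described in this abstract and is not reproduced here; the filter taps, function names, and gains are assumptions chosen for illustration.

```python
def blur(signal):
    """3-tap [1, 2, 1] / 4 smoothing with edge replication."""
    n = len(signal)
    return [(signal[max(i - 1, 0)] + 2 * signal[i] + signal[min(i + 1, n - 1)]) / 4.0
            for i in range(n)]

def downsample(signal):
    """Smooth, then keep every second sample."""
    return blur(signal)[::2]

def upsample(signal, length):
    """Linearly interpolate a coarse signal back to `length` samples."""
    out = []
    for i in range(length):
        pos = i / 2.0
        lo = min(int(pos), len(signal) - 1)
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - lo
        out.append((1 - frac) * signal[lo] + frac * signal[hi])
    return out

def build_pyramid(signal, levels):
    """Return (detail bands from finest to coarsest, coarse base signal)."""
    details, current = [], list(signal)
    for _ in range(levels):
        coarse = downsample(current)
        predicted = upsample(coarse, len(current))
        details.append([c - p for c, p in zip(current, predicted)])
        current = coarse
    return details, current

def reconstruct(details, base, gains):
    """Rebuild the signal, scaling each detail band by its gain."""
    current = list(base)
    for detail, gain in zip(reversed(details), reversed(gains)):
        predicted = upsample(current, len(detail))
        current = [p + gain * d for p, d in zip(predicted, detail)]
    return current

signal = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]      # a step edge
details, base = build_pyramid(signal, levels=2)
identity = reconstruct(details, base, gains=[1.0, 1.0])  # recovers the input
sharpened = reconstruct(details, base, gains=[2.0, 1.0]) # boosts finest band
```

With unit gains the reconstruction returns the input, which is the defining property of the pyramid. Doubling the finest band steepens the step but also overshoots on both sides of it, so this plain linear gain should not be read as the artifact-free enhancement the abstract describes.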

Retinal prostheses for the blind have demonstrated the ability to provide the sensation of light in otherwise blind individuals. However, visual task performance in these patients remains poor relative to someone with normal vision. Computer vision algorithms for navigation and object detection were evaluated for their ability to improve task performance.

We prove a closed-form solution to tensor voting (CFTV): Given a point set in any dimension, our closed-form solution provides an exact, continuous, and efficient algorithm for computing a structure-aware tensor that simultaneously achieves salient structure detection and outlier attenuation. Using CFTV, we prove the convergence of tensor voting on a Markov random field (MRF), termed MRFTV, where the structure-aware tensor at each input site reaches a stationary state upon convergence in structure propagation. We then embed the structure-aware tensor into expectation maximization (EM) for optimizing a single linear structure to achieve efficient and robust parameter estimation.

We present a lightweight, inexpensive, low-power wearable system for assisting the visually impaired in performing routine mobility tasks. Our system extends the range of the white cane by providing the user with vibro-tactile cues corresponding to the location of obstacles and a safe path for traversal through a cluttered environment. The presented approach keeps cognitive load to a minimum and, while being autonomous, adapts to the changing mobility requirements of a navigating user.

We study the problem of image denoising where images are assumed to be samples from low dimensional (sub)manifolds. We propose the algorithm of locally linear denoising. The algorithm approximates manifolds with locally linear patches by constructing nearest neighbor graphs.

We propose a framework for tracking multiple targets, where the input is a set of candidate regions in each frame, as obtained from a state-of-the-art background segmentation module, and the goal is to recover trajectories of targets over time. Due to occlusions by targets and static objects, as well as noisy segmentation and false alarms, one foreground region may not correspond faithfully to one target. Therefore, the one-to-one assumption used in most data association algorithms is not always satisfied.

We present a novel local spatiotemporal approach to produce motion segmentation and dense temporal trajectories from an image sequence. A common representation of image sequences is a 3D spatiotemporal volume, (x,y,t), and its corresponding mathematical formalism is the fiber bundle. However, directly enforcing the spatiotemporal smoothness constraint is difficult in the fiber bundle representation.

This study presents methods for 2-D registration of retinal image sequences and 3-D shape inference from fluorescein images. The Y-feature is a robust geometric entity that is largely invariant across modalities as well as across the temporal grey level variations induced by the propagation of the dye in the vessels. We first present a Y-feature extraction method that finds a set of Y-feature candidates using local image gradient information.

We present a method for detecting motion regions in video sequences observed by a moving camera, in the presence of strong parallax due to static 3D structures. The proposed method classifies each image pixel into planar background, parallax or motion regions by sequentially applying 2D planar homographies, the epipolar constraint and a novel geometric constraint, called "structure consistency constraint". The structure consistency constraint, as the main contribution of this paper, is derived from the relative camera poses among three frames and implemented within the "Plane+Parallax" framework.
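The first two stages of this sequential classification can be sketched as follows. This is a simplified illustration, not the paper's implementation: the structure consistency constraint is omitted because the abstract does not define it, and the homography `H`, fundamental matrix `F`, tolerance, and point coordinates are hypothetical values (a horizontally translating camera with an identity background homography).

```python
import math

def apply_homography(H, p):
    """Map a 2-D point through a 3x3 homography (nested lists)."""
    x, y = p
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

def epipolar_distance(F, p1, p2):
    """Distance from p2 to the epipolar line F * p1 in the second image.
    Assumes the line is finite (its first two coefficients are not both 0)."""
    x, y = p1
    a = F[0][0] * x + F[0][1] * y + F[0][2]
    b = F[1][0] * x + F[1][1] * y + F[1][2]
    c = F[2][0] * x + F[2][1] * y + F[2][2]
    return abs(a * p2[0] + b * p2[1] + c) / math.hypot(a, b)

def classify(p1, p2, H, F, tol=0.5):
    """Label a correspondence p1 -> p2 by testing constraints in sequence."""
    if math.dist(apply_homography(H, p1), p2) <= tol:
        return "background"   # explained by the dominant plane
    if epipolar_distance(F, p1, p2) <= tol:
        return "parallax"     # off-plane, but consistent with camera motion
    return "motion"           # violates the epipolar constraint

# Hypothetical setup: identity homography, camera translating along x,
# so epipolar lines are horizontal (F is the cross-product matrix of (1, 0, 0)).
H = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
F = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]

labels = [classify((10.0, 20.0), (10.2, 20.0), H, F),
          classify((10.0, 20.0), (14.0, 20.0), H, F),
          classify((10.0, 20.0), (14.0, 25.0), H, F)]
print(labels)
```

A point that moves independently along its own epipolar line passes the second test and is labeled as parallax here, which shows why two-view constraints alone are not sufficient and a further constraint across additional frames is needed.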

We present a method for the 3-D shape reconstruction of the retinal fundus from stereo paired images. Detection of retinal elevation plays a critical role in the diagnosis and management of many retinal diseases. However, since the shape of the ocular fundus is nearly planar, its 3-D depth range is very narrow.

We address the fundamental problem of matching between two static images. The remaining challenges are related to occlusion and lack of texture. Our approach addresses these difficulties within a perceptual organization framework, considering both binocular and monocular cues.

The three-dimensional (3-D) reconstruction of generalized cylinders (GCs) is an important research field in computer vision. One of the main difficulties is that some contour features in images cannot be reconstructed by traditional stereovision because they do not correspond to reflectance discontinuities of surfaces in space. In this paper, we present a novel, parametric approach for the 3-D reconstruction of circular generalized cylinders (CGCs) from only the limb edges of CGCs in two images.

Most approaches for motion analysis and interpretation rely on restrictive parametric models and involve iterative methods which depend heavily on initial conditions and are subject to instability. Further difficulties are encountered in image regions where motion is not smooth, typically around motion boundaries. This work addresses the problem of visual motion analysis and interpretation by formulating it as an inference of motion layers from a noisy and possibly sparse point set in a 4D space.

We address the problem of simultaneous two-view epipolar geometry estimation and motion segmentation from nonstatic scenes. Given a set of noisy image pairs containing matches of n objects, we propose an unconventional, efficient, and robust method, 4D tensor voting, for estimating the unknown n epipolar geometries, and segmenting the static and motion matching pairs into n independent motions. By considering the 4D isotropic and orthogonal joint image space, only two tensor voting passes are needed, and a very high noise-to-signal ratio (up to five) can be tolerated.

Most computer vision applications require the reliable detection of boundaries. In the presence of outliers, missing data, orientation discontinuities, and occlusion, this problem is particularly challenging. We propose to address it by complementing the tensor voting framework, which was limited to second order properties, with first order representation and voting.
