IEEE Trans Pattern Anal Mach Intell
July 2024
We introduce Ego4D, a massive-scale egocentric video dataset and benchmark suite. It offers 3,670 hours of daily-life activity video spanning hundreds of scenarios (household, outdoor, workplace, leisure, etc.) captured by 931 unique camera wearers from 74 worldwide locations and 9 different countries.
Deep learning has been effective for histology image analysis in digital pathology. However, many current deep learning approaches require large, strongly or weakly labeled images and regions of interest, which can be time-consuming and resource-intensive to obtain. To address this challenge, we present HistoPerm, a view generation method for representation learning using joint embedding architectures that enhances representation learning for histology images.
IEEE Trans Pattern Anal Mach Intell
December 2022
Few-shot learning aims to recognize novel classes from a few examples. Although significant progress has been made in the image domain, few-shot video classification is relatively unexplored. We argue that previous methods underestimate the importance of video feature learning and propose to learn spatiotemporal features using a 3D CNN.
Developing deep learning models to analyze histology images has been computationally challenging, as the massive size of the images causes excessive strain on all parts of the computing pipeline. This paper proposes a novel deep learning-based methodology for improving the computational efficiency of histology image classification. The proposed approach is robust when used with images that have reduced input resolution, and it can be trained effectively with limited labeled data.
Context: Histopathological characterization of colorectal polyps is critical for determining the risk of colorectal cancer and future rates of surveillance for patients. However, this characterization is a challenging task and suffers from significant inter- and intra-observer variability.
Aims: We built an automatic image analysis method that can accurately classify different types of colorectal polyps on whole-slide images to help pathologists with this characterization and diagnosis.
IEEE Trans Pattern Anal Mach Intell
June 2015
Traditional Web search engines do not use the images in the HTML pages to find relevant documents for a given query. Instead, they typically operate by computing a measure of agreement between the keywords provided by the user and only the text portion of each page. In this paper we study whether the content of the pictures appearing in a Web page can be used to enrich the semantic description of an HTML document and consequently boost the performance of a keyword-based search engine.
IEEE Trans Pattern Anal Mach Intell
October 2014
This paper describes compact image descriptors enabling accurate object categorization with linear classification models, which offer the advantage of being efficient to both train and test. The shared property of our descriptors is the use of classifiers to produce the features of each image. Intuitively, these classifiers evaluate the presence of a set of basis classes inside the image.
IEEE Trans Pattern Anal Mach Intell
February 2013
In this paper, we present a new approach for establishing correspondences between sparse image features related by an unknown nonrigid mapping and corrupted by clutter and occlusion, such as points extracted from images of different instances of the same object category. We formulate this matching task as an energy minimization problem by defining an elaborate objective function of the appearance and the spatial arrangement of the features. Optimization of this energy is an instance of graph matching, which is in general an NP-hard problem.
IEEE Trans Pattern Anal Mach Intell
May 2008
This paper describes methods for recovering time-varying shape and motion of non-rigid 3D objects from uncalibrated 2D point tracks. For example, given a video recording of a talking person, we would like to estimate the 3D shape of the face at each instant, and learn a model of facial deformation. Time-varying shape is modeled as a rigid transformation combined with a non-rigid deformation.