In this article, we introduce SMPLicit, a novel generative model that jointly represents body pose, shape, and clothing geometry; and LayerNet, a deep network that, given a single image of a person, simultaneously performs detailed 3D reconstruction of body and clothes. In contrast to existing learning-based approaches that require training specific models for each type of garment, SMPLicit can represent different garment topologies in a unified manner (e.g. from sleeveless tops to hoodies and open jackets), while controlling other properties like garment size or tightness/looseness. LayerNet follows a coarse-to-fine multi-stage strategy, first predicting smooth cloth geometries from SMPLicit, which are then refined by an image-guided displacement network that gracefully fits the body, recovering high-frequency details and wrinkles. LayerNet achieves competitive 3D reconstruction accuracy against the current 'garment-agnostic' state of the art for images of people in upright poses and controlled environments, and consistently surpasses these methods on challenging body poses and uncontrolled settings. Furthermore, the semantically rich output of our approach is suitable for performing virtual try-on directly in 3D, a task which, so far, has only been addressed in the 2D domain.
DOI: http://dx.doi.org/10.1109/TPAMI.2023.3332677
IEEE Trans Pattern Anal Mach Intell
February 2024
IEEE Comput Graph Appl
November 2023
We have collected Rebecca, a novel nighttime scene dataset of 600 real images captured at night with pixel-level semantic annotations; such data are currently scarce, and the dataset can serve as a new benchmark. We also propose a one-stage layered network, named LayerNet, that combines local features rich in appearance information from the shallow layers, global features abundant in semantic information from the deep layers, and middle-level features in between, by explicitly modeling the multistage features of objects at nighttime. A multihead decoder and a well-designed hierarchical module are utilized to extract and fuse features of different depths.