Publications by authors named "Yebin Liu"

Article Synopsis
  • This study introduces a new method for compressing dense light field images captured by Plenoptic 2.0 cameras, using advanced statistical models like the 5-D Epanechnikov Kernel.
  • To address limitations in traditional modeling techniques, the researchers propose a novel 5-D Epanechnikov Mixture-of-Experts approach that uses Gaussian Initialization, which performs better than existing models like 5-D Gaussian Mixture Regression.
  • Experimental results show that this new compression method produces higher-quality rendered images than High Efficiency Video Coding (HEVC) and JPEG 2000, especially at low bitrates below 0.06 bits per pixel (bpp). (A sketch of the Epanechnikov kernel itself follows below.)
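The Epanechnikov kernel named above has a standard closed form, K(u) = 0.75(1 - u^2) for |u| <= 1 and 0 otherwise. The following is a minimal NumPy sketch of that kernel and of a simple 5-D product-form evaluation; it illustrates the kernel only, not the paper's mixture-of-experts compression pipeline, and the function names are hypothetical.

    import numpy as np

    def epanechnikov_1d(u):
        # Standard Epanechnikov kernel: K(u) = 0.75 * (1 - u^2) for |u| <= 1, else 0.
        u = np.asarray(u, dtype=float)
        return np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u ** 2), 0.0)

    def epanechnikov_product_5d(x, center, bandwidth):
        # Product-form 5-D kernel weight of a sample x relative to one kernel center
        # (illustrative only; the paper's 5-D mixture-of-experts model is not reproduced here).
        u = (np.asarray(x, dtype=float) - center) / bandwidth
        return float(np.prod(epanechnikov_1d(u)))

    # Example: weight of a 5-D light field sample relative to a kernel at the origin.
    w = epanechnikov_product_5d([0.1, 0.2, 0.0, 0.3, 0.1], np.zeros(5), np.ones(5))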

We present a general, fast, and practical solution for interpolating novel views of diverse real-world scenes given a sparse set of nearby views. Existing generic novel view synthesis methods rely on time-consuming scene geometry pre-computation or redundant sampling of the entire space for neural volumetric rendering, limiting the overall efficiency. Instead, we incorporate learned MVS priors into the neural volume rendering pipeline while improving the rendering efficiency by reducing sampling points under the guidance of depth probability distributions.
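As a rough illustration of the sampling idea described above, the sketch below (NumPy, hypothetical function names) contrasts dense uniform sampling along a ray with sampling concentrated around a per-ray depth estimate, here assumed to be a Gaussian with mean mu and standard deviation sigma; the actual MVS-prior integration in the paper is not reproduced.

    import numpy as np

    def sample_depths_uniform(near, far, n):
        # Baseline: dense, uniform sampling over the whole ray segment.
        return np.linspace(near, far, n)

    def sample_depths_guided(mu, sigma, n, near, far):
        # Concentrate the (fewer) samples where the assumed Gaussian depth
        # probability is high, i.e. within a few standard deviations of mu.
        lo = max(near, mu - 3.0 * sigma)
        hi = min(far, mu + 3.0 * sigma)
        return np.linspace(lo, hi, n)

    # A ray whose depth prior predicts ~2.0 m with 5 cm uncertainty:
    coarse = sample_depths_uniform(0.5, 6.0, 128)                      # 128 samples over the full range
    guided = sample_depths_guided(2.0, 0.05, 16, near=0.5, far=6.0)    # 16 samples near 2.0 m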

Understanding the three-dimensional social behaviors of freely moving large-size mammals is valuable for both agriculture and life science, yet challenging due to occlusions during close interactions. Although existing animal pose estimation methods capture keypoint trajectories, they ignore deformable body surfaces, which contain geometric information essential for predicting social interactions and for dealing with occlusions. In this study, we develop a Multi-Animal Mesh Model Alignment (MAMMAL) system based on an articulated surface mesh model.

Estimating human pose and shape from monocular images is a long-standing problem in computer vision. Since the release of statistical body models, 3D human mesh recovery has been drawing broader attention. With the same goal of obtaining well-aligned and physically plausible mesh results, two paradigms have been developed to overcome challenges in the 2D-to-3D lifting process: i) an optimization-based paradigm, where different data terms and regularization terms are exploited as optimization objectives; and ii) a regression-based paradigm, where deep learning techniques are embraced to solve the problem in an end-to-end fashion.
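A minimal sketch of what an optimization-based objective of this kind typically looks like is given below (Python/NumPy): the data term measures the 2D reprojection error of the model joints and the regularization term is a simple pose prior. The projection function and the specific terms are assumptions for illustration, not the formulation of any particular paper.

    import numpy as np

    def reprojection_error(pose_params, joints_2d, project_fn):
        # Data term: squared distance between projected model joints and detected 2D joints.
        # `project_fn` (hypothetical) maps body-model parameters to 2D joint locations.
        pred_2d = project_fn(pose_params)
        return np.sum((pred_2d - joints_2d) ** 2)

    def pose_prior(pose_params):
        # Regularization term: penalize implausible poses (an L2 prior, for illustration only).
        return np.sum(np.asarray(pose_params) ** 2)

    def objective(pose_params, joints_2d, project_fn, weight=1e-3):
        # Optimization-based paradigm: minimize the data term plus weighted regularization
        # terms, typically with an off-the-shelf nonlinear optimizer.
        return reprojection_error(pose_params, joints_2d, project_fn) + weight * pose_prior(pose_params)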

Recent neural rendering methods have made great progress in generating photorealistic human avatars. However, these methods are generally conditioned only on low-dimensional driving signals (e.g.

We present PyMAF-X, a regression-based approach to recovering a parametric full-body model from a single image. This task is very challenging since minor parametric deviation may lead to noticeable misalignment between the estimated mesh and the input image. Moreover, when integrating part-specific estimations into the full-body model, existing solutions tend to either degrade the alignment or produce unnatural wrist poses.

African swine fever virus (ASFV) infection causes substantial economic losses to the swine industry worldwide, and there are still no safe and effective vaccines or therapeutics available. A granulated (particulate) virus antigen improves the antigen presentation process and elicits a stronger antibody response than a subunit antigen. In this study, the SpyTag peptide-p10 fusion protein was altered and displayed on the surface of the T7 phage to construct an engineered phage (T7-ST).

In this paper, we propose a controllable high-quality free viewpoint video generation method based on the motion graph and neural radiance fields (NeRF). Different from existing pose-driven NeRF or time/structure conditioned NeRF works, we propose to first construct a directed motion graph of the captured sequence. Such a sequence-motion-parameterization strategy not only enables flexible pose control for free viewpoint video rendering but also avoids redundant calculation of similar poses and thus improves the overall reconstruction efficiency.
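A rough sketch of what constructing such a directed motion graph could look like is given below (NumPy): nodes are captured frames, temporal edges follow the capture order, and extra edges connect frames whose pose parameters are closer than a threshold, which is what allows playback to jump between similar poses. The distance measure and threshold are assumptions for illustration, not the paper's construction.

    import numpy as np

    def build_motion_graph(poses, threshold):
        # poses: (T, D) array of per-frame pose parameters.
        T = len(poses)
        edges = {i: [] for i in range(T)}
        for i in range(T):
            if i + 1 < T:
                edges[i].append(i + 1)      # temporal edge from the capture sequence
            for j in range(T):
                if j not in (i, i + 1) and np.linalg.norm(poses[i] - poses[j]) < threshold:
                    edges[i].append(j)      # similarity edge enabling non-sequential pose control
        return edges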

Garment representation, editing and animation are challenging topics in the area of computer vision and graphics. It remains difficult for existing garment representations to achieve smooth and plausible transitions between different shapes and topologies. In this work, we introduce DeepCloth, a unified framework for garment representation, reconstruction, animation and editing.

Typical learning-based light field reconstruction methods rely on constructing a large receptive field by deepening their networks to capture correspondences between input views. In this paper, we propose a spatial-angular attention network to perceive non-local correspondences in the light field and reconstruct a high-angular-resolution light field in an end-to-end manner. Motivated by the non-local attention mechanism (Wang et al.
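For reference, the non-local attention operation that motivates this line of work can be sketched in a few lines of NumPy: every position (e.g. a flattened spatial-angular sample of the light field) attends to every other one, so long-range correspondences are aggregated in a single step rather than through a deep stack of layers. This is a generic sketch of non-local attention, not the paper's spatial-angular architecture.

    import numpy as np

    def nonlocal_attention(x):
        # x: (N, C) features at N positions (e.g. flattened spatial-angular samples).
        scores = x @ x.T / np.sqrt(x.shape[1])            # (N, N) pairwise affinities
        weights = np.exp(scores - scores.max(axis=1, keepdims=True))
        weights /= weights.sum(axis=1, keepdims=True)     # row-wise softmax
        return weights @ x                                # each output aggregates all positions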

In this paper, we propose an efficient method for robust and accurate 3D self-portraits using a single RGBD camera. Our method can generate detailed and realistic 3D self-portraits in seconds and shows the ability to handle subjects wearing extremely loose clothes. To achieve highly efficient and robust reconstruction, we propose PIFusion, which combines learning-based 3D recovery with volumetric non-rigid fusion to generate accurate sparse partial scans of the subject.

Realistic speech-driven 3D facial animation is a challenging problem due to the complex relationship between speech and face. In this paper, we propose a deep architecture, called Geometry-guided Dense Perspective Network (GDPnet), to achieve speaker-independent realistic 3D facial animation. The encoder is designed with dense connections to strengthen feature propagation and encourage the re-use of audio features, and the decoder is integrated with an attention mechanism to adaptively recalibrate point-wise feature responses by explicitly modeling interdependencies between different neuron units.
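The recalibration idea resembles squeeze-and-excitation-style channel gating; a minimal NumPy sketch is below, with the bottleneck weights w1 and w2 as stand-ins for learned parameters. It illustrates the generic mechanism only, not GDPnet's actual decoder.

    import numpy as np

    def recalibrate(features, w1, w2):
        # features: (N, C) point-wise responses; w1: (C, C_mid); w2: (C_mid, C).
        squeeze = features.mean(axis=0)                  # per-channel summary statistics
        hidden = np.maximum(squeeze @ w1, 0.0)           # small bottleneck models channel interdependencies
        gates = 1.0 / (1.0 + np.exp(-(hidden @ w2)))     # sigmoid gates in (0, 1)
        return features * gates                          # rescale (recalibrate) each channel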

3D human reconstruction from a single image is a challenging problem. Existing methods have difficulty inferring 3D clothed human models with consistent topologies across various poses. In this paper, we propose an efficient and effective method using a hierarchical graph transformation network.

Light field (LF) reconstruction is mainly confronted with two challenges: large disparity and the non-Lambertian effect. Typical approaches either address the large-disparity challenge using depth estimation followed by view synthesis, or eschew explicit depth information to enable non-Lambertian rendering, but rarely solve both challenges in a unified framework. In this paper, we revisit the classic LF rendering framework to address both challenges by combining it with advanced deep learning techniques.

Modeling 3D humans accurately and robustly from a single image is very challenging, and the key for such an ill-posed problem is the 3D representation of the human models. To overcome the limitations of regular 3D representations, we propose Parametric Model-Conditioned Implicit Representation (PaMIR), which combines the parametric body model with the free-form deep implicit function. In our PaMIR-based reconstruction framework, a novel deep neural network is proposed to regularize the free-form deep implicit function using the semantic features of the parametric model, which improves the generalization ability under the scenarios of challenging poses and various clothing topologies.
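At query time, a conditioned implicit function of this general kind evaluates an occupancy value for a 3D point from both pixel-aligned image features and features tied to the fitted body model. The sketch below (Python, with hypothetical feature-sampling callables) only conveys that conditioning pattern, not PaMIR's actual network.

    import numpy as np

    def query_occupancy(point, image_feature_at, body_feature_at, mlp):
        # image_feature_at (hypothetical): projects the 3D point into the image and
        #   samples pixel-aligned features.
        # body_feature_at (hypothetical): samples semantic features from a volume
        #   built around the fitted parametric body model.
        feat = np.concatenate([image_feature_at(point), body_feature_at(point), np.asarray(point)])
        return mlp(feat)  # free-form implicit function, regularized by the body-model features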

Human pose transfer, which aims at transferring the appearance of a given person to a target pose, is very challenging and important in many applications. Previous work ignores the guidance of pose features or uses only a local attention mechanism, leading to implausible and blurry results. We propose a new human pose transfer method using a generative adversarial network (GAN) with simplified cascaded blocks.

We introduce MulayCap, a novel human performance capture method using a monocular video camera without the need for pre-scanning. The method uses "multi-layer" representations for geometry reconstruction and texture rendering, respectively. For geometry reconstruction, we decompose the clothed human into multiple geometry layers, namely a body mesh layer and a garment piece layer.

This paper proposes a new method for simultaneous 3D reconstruction and semantic segmentation for indoor scenes. Unlike existing methods that require recording a video using a color camera and/or a depth camera, our method only needs a small number of (e.g.

Noni is the dried, mature fruit of Morinda citrifolia, which is widely distributed across the islands of the southern Pacific Ocean and the Indochina Peninsula in Asia. It is edible and has been used as a natural medicine for thousands of years. Noni has now been legally introduced into China, but there is still no clear standard for its traditional Chinese medicine properties or its clinical application, which greatly limits its use in combination with traditional Chinese medicines in China.

As an emerging imaging modality, transient imaging, which records the transient information of light transport, has significantly shaped our understanding of scenes. Despite the great progress made in the computer vision and optical imaging fields, commonly used multi-frequency time-of-flight (ToF) sensors are still limited by their band-limited modulation frequencies and long acquisition process. To overcome these barriers, more effective image-formation schemes and reconstruction algorithms are highly desired.

While dynamic scene reconstruction has made revolutionary progress, from the earliest setups using a mass of static cameras in studio environments to the latest egocentric or hand-held moving-camera schemes, it is still restricted by recording volume, user comfort, human labor, and expertise. In this paper, a novel solution is proposed: a real-time and robust dynamic fusion scheme using a single flying depth camera, denoted FlyFusion. By proposing a novel topology compactness strategy for effectively regularizing complex topology changes, and the Geometry And Motion Energy (GAME) metric for guiding viewpoint optimization in the volumetric space, FlyFusion enables intelligent viewpoint selection based on the immediate dynamic reconstruction result.

We propose DoubleFusion, a new real-time system that combines volumetric non-rigid reconstruction with data-driven template fitting to simultaneously reconstruct detailed surface geometry, large non-rigid motion and the optimized human body shape from a single depth camera. One of the key contributions of this method is a double-layer representation consisting of a complete parametric body model inside, and a gradually fused detailed surface outside. A pre-defined node graph on the body parameterizes the non-rigid deformations near the body, and a free-form dynamically changing graph parameterizes the outer surface layer far from the body, which allows more general reconstruction.

We propose a method to automatically detect the 3D poses of closely interacting humans from sparse multi-view images at a single time instance. This is a challenging problem due to strong partial occlusion and truncation between humans, and the lack of a tracking process to provide prior pose information. To solve this problem, we first obtain 2D joints in every image using OpenPose, and human semantic segmentation results from Mask R-CNN.

High-quality 4D geometry and texture reconstruction of human activities usually requires multi-view perception via a highly structured multi-camera setup, where both the specifically designed cameras and the tedious pre-calibration restrict the popularity of professional multi-camera systems for daily applications. In this paper, we propose UnstructuredFusion, a practicable real-time markerless human performance capture method using unstructured commercial RGBD cameras. Along with the flexible hardware setup of simply three unstructured RGBD cameras without any careful pre-calibration, the challenging 4D reconstruction from multiple asynchronous videos is solved via three novel technical contributions, i.

Research in light field reconstruction focuses on synthesizing novel views with the assistance of depth information. In this paper, we present a learning-based light field reconstruction approach that fuses a set of sheared epipolar plane images (EPIs). We start by showing that a patch in a sheared EPI will exhibit a clear structure when the shear value equals the depth of that patch.
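Shearing an EPI amounts to resampling each view row with a disparity-proportional shift; when the shear matches the true disparity, the oriented lines in the EPI become axis-aligned, which is the "clear structure" referred to above. A minimal NumPy sketch of the shear operation (not the paper's full fusion network):

    import numpy as np

    def shear_epi(epi, d):
        # epi: (S, U) epipolar plane image with S views and U spatial samples.
        # Resample row s at u + d * (s - s0); lines with slope d become vertical.
        S, U = epi.shape
        s0 = (S - 1) / 2.0
        u = np.arange(U, dtype=float)
        sheared = np.empty_like(epi, dtype=float)
        for s in range(S):
            shift = d * (s - s0)
            sheared[s] = np.interp(u, u - shift, epi[s])  # linear interpolation at shifted positions
        return sheared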
