To enhance matting performance in multi-person dynamic scenarios, we introduce a robust, real-time, high-resolution, and controllable human video matting method that achieves state-of-the-art results on all metrics. Unlike most existing methods, which perform video matting frame by frame on independent images, we design a unified architecture built on a controllable generation model to address the lack of overall semantic information in multi-person videos. Our method, called ControlMatting, uses an independent recurrent architecture to exploit temporal information in videos and achieves significant improvements in temporal coherence and detailed matting quality. ControlMatting adopts a mixed training strategy that combines matting and semantic segmentation datasets, which effectively improves the model's semantic understanding. Furthermore, we propose a novel deep learning-based image filter algorithm that strengthens detail enhancement for both the matting and segmentation objectives. Our experiments show that prior information about the human body, taken from the image itself, can effectively counteract the defective-mask problem caused by complex dynamic scenarios with multiple people.
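The abstract describes the recurrent idea only at a high level. As a rough illustration of how a hidden state carried across frames can provide temporal coherence in video matting, here is a minimal PyTorch-style sketch; the module names (ConvGRUCell, RecurrentMattingSketch) and layer sizes are hypothetical and are not taken from ControlMatting.

```python
# Hypothetical sketch of a recurrent video-matting forward pass (not the
# actual ControlMatting implementation): a convolutional GRU carries a
# hidden state across frames so per-frame alpha predictions stay coherent.
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    def __init__(self, in_ch, hid_ch):
        super().__init__()
        self.gates = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, 3, padding=1)  # update/reset gates
        self.cand = nn.Conv2d(in_ch + hid_ch, hid_ch, 3, padding=1)       # candidate state

    def forward(self, x, h):
        z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).chunk(2, dim=1)
        h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde

class RecurrentMattingSketch(nn.Module):
    def __init__(self, feat_ch=32, hid_ch=32):
        super().__init__()
        self.hid_ch = hid_ch
        self.encoder = nn.Sequential(nn.Conv2d(3, feat_ch, 3, padding=1), nn.ReLU())
        self.gru = ConvGRUCell(feat_ch, hid_ch)
        self.alpha_head = nn.Conv2d(hid_ch, 1, 3, padding=1)  # per-pixel alpha matte

    def forward(self, frames):                       # frames: (T, 3, H, W)
        h = torch.zeros(1, self.hid_ch, *frames.shape[-2:])
        alphas = []
        for t in range(frames.shape[0]):
            feat = self.encoder(frames[t:t + 1])     # encode one frame
            h = self.gru(feat, h)                    # update temporal state
            alphas.append(torch.sigmoid(self.alpha_head(h)))
        return torch.cat(alphas, dim=0)              # (T, 1, H, W) alpha mattes

# Toy usage on an 8-frame clip of 64x64 frames.
mattes = RecurrentMattingSketch()(torch.rand(8, 3, 64, 64))
```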
Download full-text PDF | Source
---|---
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11086136 | PMC
http://dx.doi.org/10.3390/s24092795 | DOI Listing
IEEE Trans Pattern Anal Mach Intell
September 2024
Multi-view multi-human association and tracking (MvMHAT) is an emerging yet important problem in multi-person video surveillance. It aims to track a group of people over time in each view and to identify the same person across different views at the same time, which distinguishes it from previous MOT and multi-camera MOT tasks that consider only over-time human tracking. As a result, videos for MvMHAT require more complex annotations while containing more information for self-learning. In this work, we tackle this problem with an end-to-end neural network trained in a self-supervised manner.
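The excerpt does not spell out the self-supervision signal, but a common ingredient in self-supervised cross-view association is cycle consistency: matching people from view A to view B and back should return each identity to itself. The NumPy sketch below illustrates that idea in isolation; the function names and the mean-squared-error form of the loss are illustrative assumptions, not the paper's actual network or objective.

```python
# Hypothetical cycle-consistency check for cross-view person association
# (illustrative only; not the MvMHAT paper's actual architecture or loss).
import numpy as np

def cosine_sim(a, b):
    """Pairwise cosine similarity between row-wise embeddings a (N, d) and b (M, d)."""
    a = a / np.linalg.norm(a, axis=1, keepdims=True)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a @ b.T

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cycle_consistency_loss(emb_view_a, emb_view_b, temperature=0.1):
    """Match A->B then B->A; the round trip should land on the identity matrix."""
    sim = cosine_sim(emb_view_a, emb_view_b) / temperature
    a_to_b = softmax(sim, axis=1)          # soft assignment A -> B
    b_to_a = softmax(sim.T, axis=1)        # soft assignment B -> A
    round_trip = a_to_b @ b_to_a           # (N, N), ideally close to identity
    target = np.eye(len(emb_view_a))
    return np.mean((round_trip - target) ** 2)

# Toy usage: 4 people seen in two camera views with 16-dim appearance features.
rng = np.random.default_rng(0)
loss = cycle_consistency_loss(rng.normal(size=(4, 16)), rng.normal(size=(4, 16)))
```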
J Clin Monit Comput
October 2024
Department of Convergence Medicine, Asan Medical Institute of Convergence Science and Technology, University of Ulsan College of Medicine, Asan Medical Center, Seoul, 05505, Republic of Korea.
Hand hygiene among anesthesia personnel is important for preventing hospital-acquired infections in operating rooms; however, an efficient monitoring system remains elusive. In this study, we leverage a deep learning approach based on operating room videos to detect alcohol-based hand hygiene actions of anesthesia providers. Videos were collected over a period of four months, from November 2018 to February 2019, in a single operating room.
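The excerpt gives no architectural details, so the following is only a generic sketch of how video-based action detection is often set up: a classifier scores fixed-length sliding windows of frames, and consecutive positive windows are merged into detected hand-hygiene events. All names and thresholds here are hypothetical and are not taken from the study.

```python
# Generic sliding-window action-detection sketch for video (illustrative only;
# the study's actual model and post-processing are not described in this excerpt).
import numpy as np

WINDOW = 16   # frames per clip
STRIDE = 8    # hop between consecutive clips

def score_clips(frames, clip_scorer):
    """Slide a fixed-length window over the video and score each clip.
    frames: (T, H, W, 3) uint8 array; clip_scorer: callable returning a probability."""
    scores = []
    for start in range(0, len(frames) - WINDOW + 1, STRIDE):
        clip = frames[start:start + WINDOW]
        scores.append((start, clip_scorer(clip)))
    return scores

def merge_detections(scores, threshold=0.5):
    """Merge consecutive above-threshold windows into (start_frame, end_frame) events."""
    events, current = [], None
    for start, p in scores:
        if p >= threshold:
            if current is None:
                current = [start, start + WINDOW]
            else:
                current[1] = start + WINDOW
        elif current is not None:
            events.append(tuple(current))
            current = None
    if current is not None:
        events.append(tuple(current))
    return events

# Toy usage with a dummy scorer (a trained video classifier would go here).
dummy_video = np.zeros((128, 64, 64, 3), dtype=np.uint8)
detections = merge_detections(score_clips(dummy_video, lambda clip: 0.0))
```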
Sensors (Basel)
April 2024
School of Information and Communication Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China.
To enhance matting performance in multi-person dynamic scenarios, we introduce a robust, real-time, high-resolution, and controllable human video matting method that achieves state-of-the-art results on all metrics. Unlike most existing methods, which perform video matting frame by frame on independent images, we design a unified architecture built on a controllable generation model to address the lack of overall semantic information in multi-person videos. Our method, called ControlMatting, uses an independent recurrent architecture to exploit temporal information in videos and achieves significant improvements in temporal coherence and detailed matting quality.
J Imaging
December 2023
Alqualsadi Research Team, Rabat IT Center, ENSIAS, Mohammed V University in Rabat, Rabat 10112, Morocco.
Three-dimensional human pose estimation has advanced significantly through the integration of deep learning techniques. This survey provides a comprehensive review of recent 3D human pose estimation methods, focusing on monocular images, videos, and multi-view cameras. Our approach stands out for its systematic literature review methodology, ensuring an up-to-date and meticulous overview.
Neuroimage
August 2023
Neuroscience of Perception and Action Lab, Italian Institute of Technology (IIT), Viale Regina Elena 291, Rome, Italy.
Synchronization of neural activity across brains - Interpersonal Neural Synchrony (INS) - is emerging as a powerful marker of social interaction that predicts success of multi-person coordination, communication, and cooperation. As the origins of INS are poorly understood, we tested whether and how INS might emerge from spontaneous dyadic behavior. We recorded neural activity (EEG) and human behavior (full-body kinematics, eye movements, and facial expressions) while dyads of participants were instructed to look at each other without speaking or making co-verbal gestures.
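The excerpt does not state how INS was quantified. One measure commonly used for inter-brain synchrony in EEG hyperscanning work is the phase-locking value (PLV) between band-filtered signals from the two participants; the sketch below shows that generic measure only and makes no claim about this study's actual analysis pipeline.

```python
# Illustrative phase-locking value (PLV) between two EEG channels, one per
# participant -- a common (but not necessarily this study's) INS measure.
import numpy as np
from scipy.signal import hilbert

def phase_locking_value(x, y):
    """PLV between two equal-length 1-D signals, assumed already band-pass filtered.
    Returns a value in [0, 1]; 1 means a perfectly constant phase difference."""
    phase_x = np.angle(hilbert(x))
    phase_y = np.angle(hilbert(y))
    return np.abs(np.mean(np.exp(1j * (phase_x - phase_y))))

# Toy usage: two noisy 10 Hz signals sampled at 250 Hz for 4 seconds.
t = np.arange(0, 4, 1 / 250)
rng = np.random.default_rng(1)
sig_a = np.sin(2 * np.pi * 10 * t) + 0.5 * rng.normal(size=t.size)
sig_b = np.sin(2 * np.pi * 10 * t + 0.3) + 0.5 * rng.normal(size=t.size)
print(phase_locking_value(sig_a, sig_b))
```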