Recently, much deep-learning-based research has explored the potential quality improvement of compressed videos. These methods mostly utilize either spatial or temporal information to perform frame-level video enhancement. However, they fail to combine different spatial-temporal information, i.e., to adaptively exploit adjacent patches to enhance the current patch, and thus achieve limited enhancement performance, especially on scene-changing and strong-motion videos. To overcome these limitations, we propose a patch-wise spatial-temporal quality enhancement network that first extracts spatial and temporal features, then recalibrates and fuses them. Specifically, we design a temporal and spatial-wise attention-based feature distillation structure that adaptively exploits adjacent patches to distill patch-wise temporal features. To adaptively enhance different patches with spatial and temporal information, a channel and spatial-wise attention fusion block is proposed to achieve patch-wise recalibration and fusion of spatial and temporal features. Experimental results demonstrate that our network achieves a peak signal-to-noise ratio improvement of 0.55-0.69 dB over the compressed videos at different quantization parameters, outperforming the state-of-the-art approach.
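To make the fusion idea concrete, below is a minimal PyTorch sketch of a channel- and spatial-wise attention fusion block of the kind the abstract describes: it concatenates spatial and temporal feature maps, recalibrates them with a channel gate and a spatial gate, and fuses the result with a 1x1 convolution. This is not the authors' implementation; all module names, layer sizes, and the squeeze-and-excitation-style gating are illustrative assumptions.

```python
# Hypothetical sketch of a channel- and spatial-wise attention fusion block.
# Layer choices (reduction ratio, 7x7 spatial gate) are assumptions, not
# taken from the paper.
import torch
import torch.nn as nn

class AttentionFusionBlock(nn.Module):
    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        fused = 2 * channels  # spatial + temporal features, concatenated
        # Channel-wise attention: global pooling -> bottleneck -> sigmoid gate.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(fused, fused // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(fused // reduction, fused, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial-wise attention: per-pixel gate from channel-pooled maps.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )
        # Final fusion back to the original channel count.
        self.fuse = nn.Conv2d(fused, channels, kernel_size=1)

    def forward(self, spatial_feat: torch.Tensor,
                temporal_feat: torch.Tensor) -> torch.Tensor:
        x = torch.cat([spatial_feat, temporal_feat], dim=1)  # (B, 2C, H, W)
        x = x * self.channel_gate(x)                         # channel recalibration
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.amax(dim=1, keepdim=True)], dim=1)
        x = x * self.spatial_gate(pooled)                    # spatial recalibration
        return self.fuse(x)                                  # patch-wise fused features

if __name__ == "__main__":
    block = AttentionFusionBlock(channels=64)
    s = torch.randn(1, 64, 64, 64)  # spatial features of one patch
    t = torch.randn(1, 64, 64, 64)  # distilled temporal features
    print(block(s, t).shape)        # torch.Size([1, 64, 64, 64])
```

Because both gates operate per patch rather than per frame, each patch receives its own recalibration weights, which matches the abstract's motivation for handling scene changes and strong motion.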

Source
http://dx.doi.org/10.1109/TIP.2021.3092949

Publication Analysis

Top Keywords

spatial temporal (20)
temporal features (16)
patch-wise spatial-temporal (8)
spatial-temporal quality (8)
quality enhancement (8)
compressed videos (8)
adaptively utilize (8)
utilize adjacent (8)
adjacent patches (8)
temporal (7)
