Enhancing VVC with Deep Learning based Multi-Frame Post-Processing

About

This paper describes a CNN-based multi-frame post-processing approach built on a perceptually-inspired Generative Adversarial Network architecture, CVEGAN. This method has been integrated with the Versatile Video Coding Test Model (VTM) 15.2 to enhance the visual quality of the final reconstructed content. Evaluation results on the CLIC 2022 validation sequences show consistent coding gains over the original VVC VTM, measured in PSNR at the same bitrates. The integrated codec has been submitted to the Challenge on Learned Image Compression (CLIC) 2022 (video track) under the team name BVI_VC.
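PSNR here is the standard peak signal-to-noise ratio; the snippet below is a minimal per-frame sketch of how it is computed (illustrative only, not code from the paper or the VTM):

```python
import math

def psnr(ref, rec, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between a reference frame and its
    reconstruction, both given as flat sequences of 8-bit pixel values."""
    mse = sum((r - d) ** 2 for r, d in zip(ref, rec)) / len(ref)
    if mse == 0:
        return float("inf")  # identical frames: infinite PSNR
    return 10.0 * math.log10(max_val ** 2 / mse)

# Example: a reconstruction off by one level at every pixel (MSE = 1)
print(round(psnr([100, 120, 130], [101, 119, 131]), 2))  # ~48.13 dB
```

A coding gain at matched bitrate then shows up as a higher PSNR for the post-processed output than for the plain VTM reconstruction.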


Source code

The source code of CVEGAN has been released here

Framework and Model
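As a purely illustrative sketch of the multi-frame idea (enhancing each decoded frame using its temporal neighbours), the data flow might look like the placeholder below. The weighted average stands in for the CNN; the real method applies the trained CVEGAN model here instead.

```python
def multi_frame_postprocess(decoded, t, weights=(0.25, 0.5, 0.25)):
    """Illustrative multi-frame post-processing of frame t.

    `decoded` is a list of frames, each a 2-D list of pixel values.
    Neighbouring frames are clamped at sequence boundaries. The weighted
    average below is a hypothetical placeholder for the CNN enhancement.
    """
    prev = decoded[max(t - 1, 0)]
    curr = decoded[t]
    nxt = decoded[min(t + 1, len(decoded) - 1)]
    h, w = len(curr), len(curr[0])
    return [
        [weights[0] * prev[y][x] + weights[1] * curr[y][x] + weights[2] * nxt[y][x]
         for x in range(w)]
        for y in range(h)
    ]

# Three 2x2 frames; the middle frame is enhanced using both neighbours
frames = [[[0, 0], [0, 0]], [[4, 4], [4, 4]], [[8, 8], [8, 8]]]
print(multi_frame_postprocess(frames, 1))  # [[4.0, 4.0], [4.0, 4.0]]
```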


Results

This method participated in the Challenge on Learned Image Compression (CLIC) at IEEE/CVF CVPR 2022 and ranked in the top six in the video track.


Citation

@misc{MFCNN_PP,
  doi = {10.48550/ARXIV.2205.09458},
  url = {https://arxiv.org/abs/2205.09458},
  author = {Danier, Duolikun and Feng, Chen and Zhang, Fan and Bull, David},
  keywords = {Image and Video Processing (eess.IV), FOS: Electrical engineering, electronic engineering, information engineering},
  title = {Enhancing VVC with Deep Learning based Multi-Frame Post-Processing},
  publisher = {arXiv},
  year = {2022},
  copyright = {Creative Commons Attribution Non Commercial Share Alike 4.0 International}
}