Video-Quality-Assessment-A-Comprehensive-Survey

A collection of papers related to Video Quality Assessment (VQA).

The papers are organized following our survey, "Video Quality Assessment: A Comprehensive Survey". Because this field is developing quickly, we will continue to update both the arXiv paper and this repo.

If you have any suggestions, please let us know by e-mail: qzheng21@m.fudan.edu.cn or tzz@tamu.edu.

If you find our survey useful for your research, please cite the following paper:

@misc{zheng2024videoqualityassessmentcomprehensive,
      title={Video Quality Assessment: A Comprehensive Survey}, 
      author={Qi Zheng and Yibo Fan and Leilei Huang and Tianyu Zhu and Jiaming Liu and Zhijian Hao and Shuo Xing and Chia-Ju Chen and Xiongkuo Min and Alan C. Bovik and Zhengzhong Tu},
      year={2024},
      eprint={2412.04508},
      archivePrefix={arXiv},
      primaryClass={eess.IV},
      url={https://arxiv.org/abs/2412.04508}, 
}

Table of Contents

Video-Quality-Assessment-A-Comprehensive-Survey

Taxonomy of Subjective and Objective Video Quality Assessment

Classification and Evolution of Objective Quality Assessment Models

Application Overview of Video Quality Assessment

Video Quality Assessment Datasets

Legacy Datasets

LIVE-VQA: Study of Subjective and Objective Quality Assessment of Video

CVD2014: CVD2014—A Database for Evaluating No-Reference Video Quality Assessment Algorithms

MCL-V: MCL-V: A streaming video quality assessment database

BVI-HFR: A Study of High Frame Rate Video Formats

LIVE-YT-HFR: Subjective and objective quality assessment of high frame rate videos

BVI-VFI: BVI-VFI: A Video Quality Database for Video Frame Interpolation

User-generated Content (UGC) Datasets

KoNViD-1k: The Konstanz natural video database (KoNViD-1k)

LIVE-VQC: Large-Scale Study of Perceptual Video Quality

YouTube-UGC: YouTube UGC Dataset for Video Compression Research

FlickrVid-150k: No-Reference Video Quality Assessment using Multi-Level Spatially Pooled Features

LSVQ: Patch-VQ: ‘Patching Up’ the Video Quality Problem

Youku-V1K: Perceptual quality assessment of internet videos

PUGCQ: PUGCQ: A Large Scale Dataset for Quality Assessment of Professional User-Generated Content

YT-UGC+: Rich features for perceptual quality assessment of UGC videos

LIVE-YT-Gaming: Subjective and Objective Analysis of Streamed Gaming Videos

Tele-VQA: Telepresence Video Quality Assessment

Maxwell: Towards Explainable In-the-Wild Video Quality Assessment: A Database and a Language-Prompted Approach

DIVIDE-3k: Exploring video quality assessment on user generated contents from aesthetic and technical perspectives

TaoLive: MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos

KVQ: KVQ: Kwai video quality assessment for short-form videos

AI-generated Content (AIGC) Datasets

Chivileva et al.: Measuring the Quality of Text-to-Video Model Outputs: Metrics and Dataset

EvalCrafter: EvalCrafter: Benchmarking and Evaluating Large Video Generation Models

FETV: FETV: A Benchmark for Fine-Grained Evaluation of Open-Domain Text-to-Video Generation

VBench: VBench: Comprehensive Benchmark Suite for Video Generative Models

T2VQA-DB: Subjective-Aligned Dataset and Metric for Text-to-Video Quality Assessment

GAIA: GAIA: Rethinking Action Quality Assessment for AI-Generated Videos

LGVQ: Benchmarking AIGC Video Quality Assessment: A Dataset and Unified Model

Objective Video Quality Assessment Models

Full-Reference Video Quality Assessment

1) Knowledge-driven FR VQA methods

Type i: Pixel-error-based

MSE

PSNR
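MSE and PSNR have no paper linked above, so a minimal sketch of how they are usually computed may help. The function names, the flat-list frame representation, the 8-bit peak value of 255, and the choice to average PSNR per frame are illustrative assumptions, not something prescribed by the survey.

```python
import math


def mse(reference, distorted):
    """Mean squared error between two frames given as flat sequences of pixel values."""
    if len(reference) != len(distorted) or len(reference) == 0:
        raise ValueError("frames must be non-empty and have the same size")
    return sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` is the maximum pixel value (assumed 8-bit)."""
    err = mse(reference, distorted)
    if err == 0:
        return math.inf  # identical frames: PSNR is unbounded
    return 10.0 * math.log10(peak ** 2 / err)


def video_psnr(ref_frames, dist_frames, peak=255.0):
    """Average of per-frame PSNR values (one common pooling convention, assumed here)."""
    if len(ref_frames) != len(dist_frames) or len(ref_frames) == 0:
        raise ValueError("videos must be non-empty and have the same number of frames")
    scores = [psnr(r, d, peak) for r, d in zip(ref_frames, dist_frames)]
    return sum(scores) / len(scores)
```

Both measures compare pixels directly and ignore perceptual effects, which is why the structural, statistical, and learned models listed in the following types were developed.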

Type ii: Structural Similarity-based VQA

SSIM: Image quality assessment: from error visibility to structural similarity

MS-SSIM: Multiscale structural similarity for image quality assessment

CW-SSIM: Complex Wavelet Structural Similarity: A New Image Similarity Index

IW-SSIM: Information Content Weighting for Perceptual Image Quality Assessment

FSIM: FSIM: A Feature Similarity Index for Image Quality Assessment

ESSIM: Edge Strength Similarity for Image Quality Assessment

GMSD: Gradient Magnitude Similarity Deviation: A Highly Efficient Perceptual Image Quality Index

Liu et al.: Image Quality Assessment Based on Gradient Similarity

VSI: VSI: A Visual Saliency-Induced Index for Perceptual Image Quality Assessment

Video SSIM: Video quality assessment based on structural distortion measurement

Wang and Li: Video quality assessment using a statistical model of human visual speed perception

V-SSIM: A structural similarity metric for video based on motion models

MC-SSIM: Efficient Video Quality Assessment Along Temporal Trajectories

Manasa and Channappayya: An Optical Flow-Based Full Reference Video Quality Assessment Algorithm

3D-SSIM: 3D-SSIM for video quality assessment

Type iii: Neurostatistics-based VQA

VIF: Image information and visual quality

MAD: Most apparent distortion: full-reference image quality assessment and the role of strategy

ST-RRED: Video Quality Assessment by Reduced Reference Spatio-Temporal Entropic Differencing

ST-GREED: ST-GREED: Space-Time Generalized Entropic Differences for Frame Rate Dependent Video Quality Prediction

Type iv: Feature fusion-based VQA

VMAF: Toward a practical perceptual video quality metric

ST-VMAF: Spatiotemporal Feature Integration and Model Fusion for Full Reference Video Quality Assessment

E-VMAF: Spatiotemporal Feature Integration and Model Fusion for Full Reference Video Quality Assessment

FUNQUE: FUNQUE: Fusion of Unified Quality Evaluators

FUNQUE+: One Transform to Compute Them All: Efficient Fusion-Based Full-Reference Video Quality Assessment

Type v: Low-level motion feature-based VQA

MOVIE: Motion Tuned Spatio-Temporal Quality Assessment of Natural Videos

ST-MAD: A spatiotemporal most-apparent-distortion model for video quality assessment

AFViQ: Attention Driven Foveated Video Quality Assessment

VQM-VFD: Temporal Video Quality Model Accounting for Variable Frame Delay Distortions

PVM: A Perception-Based Hybrid Model for Video Quality Assessment

FRQM: A frame rate dependent video quality metric based on temporal wavelet decomposition and spatiotemporal pooling

2) Deep Learning-Based FR IQA Methods

Type i: Deep feature-based IQA

Bosse et al.: Deep Neural Networks for No-Reference and Full-Reference Image Quality Assessment

DeepQA: Deep Learning of Human Visual Sensitivity in Image Quality Assessment Framework

Ahn et al.: Deep Learning-Based Distortion Sensitivity Prediction for Full-Reference Image Quality Assessment

SAMScore: SAMScore: A Semantic Structural Similarity Metric for Image Translation Evaluation

SAM-IQA: SAM-IQA: Can Segment Anything Boost Image Quality Assessment?

Type ii: Feature pyramid-based IQA

LPIPS: The Unreasonable Effectiveness of Deep Features as a Perceptual Metric

E-LPIPS: E-LPIPS: Robust Perceptual Image Similarity via Random Transformation Ensembles

DeepWSD: DeepWSD: Projecting Degradations in Perceptual Space to Wasserstein Distance in Deep Feature Space

Tariq et al.: Why Are Deep Representations Good Perceptual Quality Features?

DISTS: Image Quality Assessment: Unifying Structure and Texture Similarity

A-DISTS: Locally Adaptive Structure and Texture Similarity for Image Quality Assessment

TOPIQ: TOPIQ: A Top-Down Approach From Semantics to Distortions for Image Quality Assessment

3) Deep Learning-Based FR VQA Methods

Type i: Temporal pooling-based VQA

FloLPIPS: FloLPIPS: A Bespoke Video Quality Metric for Frame Interpolation

C3DVQA: C3DVQA: Full-Reference Video Quality Assessment with 3D Convolutional Neural Network

DeepVQA: Deep Video Quality Assessor: From Spatio-temporal Visual Sensitivity to A Convolutional Neural Aggregation Network

Sun et al.: Deep Learning Based Full-Reference and No-Reference Quality Assessment Models for Compressed UGC Videos

Type ii: Temporal NN module-based VQA

Chen et al.: Deep Neural Networks for End-to-End Spatiotemporal Video Quality Prediction and Aggregation

DVQM-HT: Deep VQA based on a Novel Hybrid Training Methodology

STRA-VQA: Video Quality Assessment for Spatio-Temporal Resolution Adaptive Coding

No-Reference Video Quality Assessment

1) Knowledge-driven BVQA methods

Type i: Low-level visual feature-based VQA

CPBDM: A No-Reference Image Blur Metric Based on the Cumulative Probability of Blur Detection (CPBD)

LPCM: Image Sharpness Assessment Based on Local Phase Coherence

NJQA: No-Reference Quality Assessment of JPEG Images via a Quality Relevance Map

JPEG-NR: No-reference perceptual quality assessment of JPEG compressed images

TLVQM: Two-Level Approach for No-Reference Consumer Video Quality Assessment

CORNIA: Unsupervised feature learning framework for no-reference image quality assessment

HOSA: Blind Image Quality Assessment Based on High Order Statistics Aggregation

Type ii: Neurostatistics-based VQA

BRISQUE: No-Reference Image Quality Assessment in the Spatial Domain

GM-LOG: Blind Image Quality Assessment Using Joint Statistics of Gradient Magnitude and Laplacian Features

HIGRADE: No-Reference Quality Assessment of Tone-Mapped HDR Pictures

FRIQUEE: Perceptual quality prediction on authentically distorted images using a bag of features approach

V-BLIINDS: Blind Prediction of Natural Video Quality

Li et al.: Spatiotemporal Statistics for Video Quality Assessment

VIDEVAL: UGC-VQA: Benchmarking Blind Video Quality Assessment for User Generated Content

STS-QA: Blind Video Quality Assessment via Space-Time Slice Statistics

FAVER: FAVER: Blind quality prediction of variable frame rate videos

NIQE: Making a “Completely Blind” Image Quality Analyzer

IL-NIQE: A Feature-Enriched Completely Blind Image Quality Evaluator

NPQI: Blind Image Quality Assessment by Natural Scene Statistics and Perceptual Characteristics

SNP-NIQE: Unsupervised Blind Image Quality Evaluation via Statistical Measurements of Structure, Naturalness, and Perception

VIIDEO: A Completely Blind Video Integrity Oracle

STEM: Completely Blind Quality Assessment of User Generated Video Content

TPQI: Exploring the Effectiveness of Video Perceptual Representation in Blind Video Quality Assessment

SLEEQ: A no-reference video quality predictor for compression and scaling artifacts

BPRI: Blind Quality Assessment Based on Pseudo-Reference Image

VIQE: A Completely Blind Video Quality Evaluator

2) Deep Learning-Based BIQA Methods

Type i: CNN feature extraction and FC fusion

Kang et al.: Convolutional Neural Networks for No-Reference Image Quality Assessment

Bosse et al.: A deep neural network for image quality assessment

MEON: End-to-End Blind Image Quality Assessment Using Deep Neural Networks

NIMA: NIMA: Neural Image Assessment

PQR: A Probabilistic Quality Representation Approach to Deep Blind Image Quality Prediction

DB-CNN: Blind Image Quality Assessment Using a Deep Bilinear Convolutional Neural Network

PaQ-2-PiQ: From Patches to Pictures (PaQ-2-PiQ): Mapping the Perceptual Space of Picture Quality

Sun et al.: Blind Quality Assessment for in-the-Wild Images via Hierarchical Feature Fusion and Iterative Mixed Database Training

Type ii: CNN feature extraction and special fusion

QCN: Blind Image Quality Assessment Based on Geometric Order Learning

GraphIQA: GraphIQA: Learning Distortion Graph Representations for Blind Image Quality Assessment

Gao et al.: Image Quality Assessment: From Mean Opinion Score to Opinion Score Distribution

FPR: No-Reference Image Quality Assessment by Hallucinating Pristine Features

HIQA: Hallucinated-IQA: No-Reference Image Quality Assessment via Adversarial Learning

BIECON: Fully Deep Blind Image Quality Predictor

Type iii: Transformer-based IQA

MUSIQ: MUSIQ: Multi-Scale Image Quality Transformer

TRIQ: Transformer For Image Quality Assessment

DEIQT: Data-Efficient Image Quality Assessment with Attention-Panel Decoder

TReS: No-Reference Image Quality Assessment via Transformers, Relative Ranking, and Self-Consistency

MANIQA: MANIQA: Multi-Dimension Attention Network for No-Reference Image Quality Assessment

Type iv: Multi-task learning

Zhang et al.: Continual Learning for Blind Image Quality Assessment

MetaIQA: MetaIQA: Deep Meta-Learning for No-Reference Image Quality Assessment

Su et al.: From Distortion Manifold to Perceptual Quality: a Data Efficient Blind Image Quality Assessment Approach

SLIF: Forgetting to Remember: A Scalable Incremental Learning Framework for Cross-Task Blind Image Quality Assessment

Li et al.: Continual Learning of Blind Image Quality Assessment with Channel Modulation Kernel

Wang et al.: Deep Blind Image Quality Assessment Powered by Online Hard Example Mining

Zhang et al.: Task-Specific Normalization for Continual Learning of Blind Image Quality Models

Type v: Unsupervised and self-supervised learning

CONTRIQUE: Image Quality Assessment Using Contrastive Learning

Shukla et al.: Opinion Unaware Image Quality Assessment via Adversarial Convolutional Variational Autoencoder

Re-IQA: Re-IQA: Unsupervised Learning for Image Quality Assessment in the Wild

Babu et al.: No Reference Opinion Unaware Quality Assessment of Authentically Distorted Images

ARNIQA: ARNIQA: Learning Distortion Manifold for Image Quality Assessment

Zhao et al.: Quality-Aware Pre-Trained Models for Blind Image Quality Assessment

Type vi: Large multimodality model-based IQA

CLIP-IQA: Exploring CLIP for Assessing the Look and Feel of Images

CLIP-IQA+: Exploring CLIP for Assessing the Look and Feel of Images

LIQE: Blind Image Quality Assessment via Vision-Language Correspondence: A Multitask Learning Perspective

TCDs: Towards transparent deep image aesthetics assessment with tag-based content descriptors

Q-Bench: Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-level Vision

Q-Instruct: Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models

Q-Align: Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels

Co-Instruct: Towards Open-Ended Visual Quality Comparison

3) Deep Learning-Based BVQA Methods

Type i: 2D CNNs with simple score/feature averaging

Domonkos Varga: No-Reference Video Quality Assessment Based on the Temporal Pooling of Deep Features

NAVE: A No-Reference Autoencoder Video Quality Metric

CNN-TLVQM: Blind Natural Video Quality Prediction via Statistical Temporal Features and Deep Spatial Features

SIONR: Semantic Information Oriented No-Reference Video Quality Assessment

CenseoQoE: A strong baseline for image and video quality assessment

RAPIQUE: RAPIQUE: Rapid and Accurate Video Quality Prediction of User Generated Content

SWDF-DF-VQA: No-Reference Video Quality Assessment Using Multi-Pooled, Saliency Weighted Deep Features and Decision Fusion

NR-VMAF: No-Reference VMAF: A Deep Neural Network-Based Approach to Blind Video Quality Assessment

Type ii: 2D CNNs with temporal aggregation networks

MLSP-VQA: No-Reference Video Quality Assessment using Multi-Level Spatially Pooled Features

Varga et al.: No-reference video quality assessment via pretrained CNN and LSTM networks

VSFA: Quality Assessment of In-the-Wild Videos

MGQA: Video Quality Assessment for Online Processing: From Spatial to Temporal Sampling

RIRNet: RIRNet: Recurrent-In-Recurrent Network for Video Quality Assessment

STDAM: Perceptual Quality Assessment of Internet Videos

MDTVSFA: Unified Quality Assessment of in-the-Wild Videos with Mixed Datasets Training

AB-VQA: Attention Based Network For No-Reference UGC Video Quality Assessment

Li et al.: Study on no-reference video quality assessment method incorporating dual deep learning networks

GSTVQA: Learning Generalized Spatial-Temporal Deep Feature Representation for No-Reference Video Quality Assessment

2BiVQA: 2BiVQA: Double Bi-LSTM-based Video Quality Assessment of UGC Videos

Type iii: 3D CNNs / Transformation

SACONVA: No-Reference Video Quality Assessment With 3D Shearlet Transform and Convolutional Neural Networks

V-MEON: End-to-End Blind Quality Assessment of Compressed Videos Using Deep Neural Networks

You et al.: Deep Neural Networks for No-Reference Video Quality Assessment

Hou et al.: No-reference video quality evaluation by a deep transfer CNN architecture

Patch-VQ: Patch-VQ: 'Patching Up' the Video Quality Problem

CoINVQ: Rich Features for Perceptual Quality Assessment of UGC Videos

Sun et al.: A Deep Learning based No-reference Quality Assessment Model for UGC Videos

Li et al.: Blindly Assess Quality of In-the-Wild Videos via Quality-Aware Pre-Training and Motion Perception

MD-VQA: MD-VQA: Multi-Dimensional Quality Assessment for UGC Live Videos

UCDA: Unsupervised Curriculum Domain Adaptation for No-Reference Video Quality Assessment

Shen et al.: A Blind Video Quality Assessment Method via Spatiotemporal Pyramid Attention

Type iv: Transformer-based models

StarVQA: StarVQA: Space-Time Attention for Video Quality Assessment

PHIQNet: Long Short-term Convolutional Transformer for No-Reference Video Quality Assessment

DisCoVQA: DisCoVQA: Temporal Distortion-Content Transformers for Video Quality Assessment

FAST-VQA: FAST-VQA: Efficient End-to-End Video Quality Assessment with Fragment Sampling

SAMA: Scaling and Masking: A New Paradigm of Data Sampling for Image and Video Quality Assessment

DOVER: Exploring Video Quality Assessment on User Generated Contents from Aesthetic and Technical Perspectives

SSL-VQA: Knowledge Guided Semi-supervised Learning for Quality Assessment of User Generated Videos

Type v: Large multimodality model-based VQA

PTM-VQA: PTM-VQA: Efficient Video Quality Assessment Leveraging Diverse PreTrained Models from the Wild

COVER: COVER: A Comprehensive Video Quality Evaluator

KSVQE: KVQ: Kwai Video Quality Assessment for Short-form Videos

Wen et al.: Modular Blind Video Quality Assessment

BUONA-VISTA: Exploring Opinion-unaware Video Quality Assessment with Semantic Affinity Criterion

MaxVQA: Towards Explainable In-the-Wild Video Quality Assessment: A Database and a Language-Prompted Approach

ZE-FESG: ZE-FESG: A Zero-Shot Feature Extraction Method Based on Semantic Guidance for No-Reference Video Quality Assessment

Q-Align: Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels

LMM-VQA: LMM-VQA: Advancing Video Quality Assessment with Large Multimodal Models