videoscore2

Modules

fastvideo.eval.metrics.videoscore2.metric

VideoScore2 — VLM-based video quality scoring.

Uses a Qwen2.5-VL model fine-tuned to score generated videos on three dimensions: visual quality, text-to-video alignment, and physical consistency. Each score is derived from the token logits using the upstream ll_based_soft_score_normed weighting and lies on a 1-5 scale.
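To illustrate the general idea of a logit-based soft score, the sketch below takes the logits of the five candidate rating tokens, normalizes them with a softmax restricted to those five candidates, and returns the probability-weighted rating. This is an assumption about the style of weighting, not a copy of upstream's ll_based_soft_score_normed; the function name soft_score and its dictionary input are hypothetical, and the exact formula in vs2_inference.py may differ.

```python
import math


def soft_score(candidate_logits: dict[int, float]) -> float:
    """Probability-weighted rating over candidate scores (hypothetical helper).

    candidate_logits maps each candidate rating (1..5) to the logit of its
    token at the position where the model emits the score.
    """
    if not candidate_logits:
        raise ValueError("candidate_logits must not be empty")

    # Subtract the max logit before exponentiating for numerical stability.
    max_logit = max(candidate_logits.values())
    weights = {s: math.exp(l - max_logit) for s, l in candidate_logits.items()}
    total = sum(weights.values())

    # Expected rating under the softmax restricted to the candidates.
    return sum(score * w / total for score, w in weights.items())


if __name__ == "__main__":
    # The model leans towards "4", with some mass on "3" and "5".
    logits = {1: -3.0, 2: -1.0, 3: 1.5, 4: 3.0, 5: 1.0}
    print(round(soft_score(logits), 4))
```

A soft score of this kind stays within the 1-5 range and moves continuously as the model's confidence shifts between adjacent ratings, which a hard argmax over the rating tokens cannot do.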

Reference: TIGER-AI-Lab/VideoScore2 (vs2_inference.py).

Classes

fastvideo.eval.metrics.videoscore2.metric.VideoScore2Metric
VideoScore2Metric(model_name: str = 'TIGER-Lab/VideoScore2', infer_fps: float = 2.0, max_tokens: int = 1024, temperature: float = 0.7, do_sample: bool = True)

Bases: BaseMetric

VideoScore2: VLM-based video quality scoring (3 dimensions).

Requires sample["text_prompt"] for text-to-video alignment. Supports batched generation for GPU efficiency.
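The two requirements above can be sketched as a small pre-processing step: check that every sample carries a non-empty "text_prompt", then group samples into fixed-size batches. The helper iter_batches, the batch_size argument, and the "video" key are hypothetical and only illustrate the pattern; they are not part of the VideoScore2Metric API shown on this page.

```python
from typing import Any, Iterator


def iter_batches(
    samples: list[dict[str, Any]],
    batch_size: int,
) -> Iterator[list[dict[str, Any]]]:
    """Validate samples and yield them in batches (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # Text-to-video alignment cannot be scored without a prompt,
    # so fail early instead of partway through generation.
    for index, sample in enumerate(samples):
        if not sample.get("text_prompt"):
            raise KeyError(f"sample {index} is missing 'text_prompt'")

    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]


if __name__ == "__main__":
    # "video" is a placeholder key; the real sample layout may differ.
    samples = [
        {"video": None, "text_prompt": "a red ball rolling down a ramp"},
        {"video": None, "text_prompt": "a cat jumping onto a table"},
        {"video": None, "text_prompt": "waves breaking on a beach"},
    ]
    for batch in iter_batches(samples, batch_size=2):
        print([s["text_prompt"] for s in batch])
```

Validating every prompt before the first batch is generated avoids spending GPU time on a run that would fail on a later sample.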

Source code in fastvideo/eval/metrics/videoscore2/metric.py
def __init__(
    self,
    model_name: str = "TIGER-Lab/VideoScore2",
    infer_fps: float = 2.0,
    max_tokens: int = 1024,
    temperature: float = 0.7,
    do_sample: bool = True,
) -> None:
    super().__init__()
    self._model_name = model_name
    self.infer_fps = infer_fps
    self.max_tokens = max_tokens
    self.temperature = temperature
    self.do_sample = do_sample
    self._model: Any = None
    self._processor: Any = None
    self._tokenizer: Any = None

Functions