metric

AudioBox Aesthetics (CE, CU, PC, PQ).

Thin wrapper around Meta's audiobox_aesthetics predictor. It returns four per-clip dimensions: CE (Content Enjoyment), CU (Content Usefulness), PC (Production Complexity), and PQ (Production Quality). score exposes PQ, the dimension V2A papers typically report. The remaining three are surfaced under details.

The earlier Verse-Bench combined score (CE + CU + PQ + (11 − PC)) / 4 is non-standard and is deliberately not used.
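For reference, the arithmetic of that legacy combined score can be sketched as below, next to the PQ-only value this metric reports. This is a minimal illustration, not code from this module: the function name `verse_bench_combined` is hypothetical, and the input values are made up for the example rather than taken from any model output.

```python
def verse_bench_combined(ce: float, cu: float, pc: float, pq: float) -> float:
    """Legacy combined score: (CE + CU + PQ + (11 - PC)) / 4.

    Hypothetical helper for illustration only; PC enters inverted as (11 - PC).
    """
    return (ce + cu + pq + (11.0 - pc)) / 4.0


# Illustrative values only, not real predictor outputs.
example = {"CE": 6.0, "CU": 7.0, "PC": 3.0, "PQ": 8.0}

combined = verse_bench_combined(
    example["CE"], example["CU"], example["PC"], example["PQ"]
)
primary = example["PQ"]  # the single dimension this metric exposes as its score

print(f"combined (legacy, not used): {combined}")
print(f"primary (PQ): {primary}")
```

The combined form folds four differently scaled judgments into one number, so a change in it cannot be attributed to any one dimension. Reporting PQ alone keeps the score directly comparable with figures that report PQ.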

Classes

fastvideo.eval.metrics.audio.audiobox_aesthetics.metric.AudioBoxAestheticsMetric

AudioBoxAestheticsMetric()

Bases: BaseMetric

AudioBox Aesthetics: PQ as the primary score, CE/CU/PC/PQ in details.

Source code in fastvideo/eval/metrics/audio/audiobox_aesthetics/metric.py
def __init__(self) -> None:
    super().__init__()
    self._predictor: Any = None

Functions