desync
¶
Modules¶
fastvideo.eval.metrics.audio.desync.metric
¶
Synchformer DeSync — average audio-video desynchronization in seconds.
Per-sample. Ports hkchengrex/av-benchmark's DeSync against the
24-01-04T16-39-21 Synchformer checkpoint (the one MMAudio reports
against). argmax over the 21-class grid in [-2, +2] seconds for
the first 14 and last 14 segments separately; lower is better, 0 = ideal.
Synchformer's transformer carries a fixed positional embedding sized for
~14 segments per modality, so keep input clips in the 3-10 s range —
shorter clips raise in _segment_video and longer clips ignore the
middle.
Classes¶
fastvideo.eval.metrics.audio.desync.metric.DeSyncMetric
¶
DeSyncMetric(src_fps: float | None = None)
Bases: BaseMetric
Per-sample Synchformer DeSync magnitude in seconds (lower is better).