metric

Fréchet Audio Distance over PaSST embeddings (FD_PaSST).

Corpus-vs-corpus Fréchet distance between the Gaussian moments (mean and covariance) of two sets of 768-dimensional PaSST embeddings. This is a 1:1 port of av_bench.metrics.fad.compute_fd.
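For orientation, the Fréchet distance between two Gaussians is usually written as ||mu1 - mu2||^2 + tr(S1) + tr(S2) - 2 tr((S1 S2)^(1/2)). The sketch below illustrates that textbook formula with NumPy only. It is not the av_bench or FastVideo code; the helper names `gaussian_moments` and `frechet_distance` are hypothetical, and the eigenvalue-based trace is one possible way to evaluate the matrix square root term, chosen here to avoid extra dependencies.

```python
import numpy as np


def gaussian_moments(embeddings):
    """Return (mean, covariance) of an (n_samples, dim) embedding array."""
    x = np.asarray(embeddings, dtype=np.float64)
    # rowvar=False: each row is one sample, each column one feature.
    return x.mean(axis=0), np.cov(x, rowvar=False)


def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2)."""
    diff = mu1 - mu2
    # tr((S1 S2)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of S1 @ S2. For positive semi-definite covariances these
    # eigenvalues are real and non-negative up to round-off, so we drop
    # the imaginary part and clip small negatives to zero.
    eigvals = np.linalg.eigvals(sigma1 @ sigma2)
    tr_covmean = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    return float(
        diff @ diff + np.trace(sigma1) + np.trace(sigma2) - 2.0 * tr_covmean
    )
```

With this sketch, two identical embedding sets give a distance of approximately zero, and two unit-covariance Gaussians whose means differ by a unit vector give a distance of 1.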

References are supplied either per sample, via reference_audio / role="reference", or once from a .pt cache whose path is read from $FASTVIDEO_FAD_REF_FEATURES. The metric is skipped when either side has fewer than two finite embeddings.
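The skip rule can be pictured as a pre-check on both embedding buffers. The following is a minimal sketch under assumptions: `finite_rows` and `has_enough` are hypothetical helpers, not functions from this module, and the threshold of two presumably reflects that a sample covariance is undefined for a single embedding.

```python
import numpy as np


def finite_rows(buf):
    """Keep only the embeddings whose entries are all finite (no NaN/inf)."""
    rows = [np.asarray(e, dtype=np.float64) for e in buf]
    return [r for r in rows if np.isfinite(r).all()]


def has_enough(buf, min_count=2):
    """True if the buffer holds at least `min_count` finite embeddings."""
    return len(finite_rows(buf)) >= min_count


def should_skip(gen_buf, ref_buf):
    """Skip when either side has fewer than two finite embeddings."""
    return not (has_enough(gen_buf) and has_enough(ref_buf))
```

For example, a generated buffer with three embeddings, one of which contains a NaN, still passes the check, while a reference buffer with a single embedding causes the metric to be skipped.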

Classes

fastvideo.eval.metrics.audio.frechet_distance.metric.FrechetAudioDistanceMetric

FrechetAudioDistanceMetric()

Bases: BaseMetric

Corpus-vs-corpus Fréchet Audio Distance with PaSST embeddings.

Source code in fastvideo/eval/metrics/audio/frechet_distance/metric.py
def __init__(self) -> None:
    super().__init__()
    self._model: Any = None
    self._gen_buf: list[np.ndarray] = []
    self._ref_buf: list[np.ndarray] = []
    self._cached_ref_path: str | None = os.environ.get(REF_FEATURES_ENV)
    self._cached_ref_mu: np.ndarray | None = None
    self._cached_ref_sigma: np.ndarray | None = None
    self._n_cached_ref: int = 0

Functions