extractors
Pluggable video feature extractors for FVD.
Three extractors, sharing the _BaseExtractor contract:
- i3d: Kinetics-400 I3D (TorchScript, flateon/FVD-I3D-torchscript). The standard FVD feature space used in the literature.
- clip: CLIP ViT-B/32 per-frame embeddings, mean-pooled over time. Captures semantic / content quality.
- videomae: VideoMAE-base last-hidden-state, mean-pooled over patch tokens. Captures structural / motion quality.
The contract is intentionally narrow: each extractor takes a
(B, T, C, H, W) float tensor in [0, 1] and returns (B, D) numpy
features. Preprocessing (resize, normalize, layout) is the extractor's job;
its callers should not care.
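To make the input and output shapes concrete, here is a minimal sketch of a toy extractor that follows the contract described above. It is not one of the three real extractors: MeanColorExtractor is a hypothetical class, and exposing the extractor through __call__ is an assumption, since this page does not show which method _BaseExtractor actually defines.

```python
import numpy as np
import torch


class MeanColorExtractor:
    """Hypothetical toy extractor: features are per-channel means, so D == C."""

    def __call__(self, videos: torch.Tensor) -> np.ndarray:
        # Check the contract: (B, T, C, H, W) float tensor with values in [0, 1].
        if videos.ndim != 5:
            raise ValueError(f"expected (B, T, C, H, W), got {tuple(videos.shape)}")
        if not videos.is_floating_point():
            raise TypeError("expected a float tensor")
        if videos.min() < 0 or videos.max() > 1:
            raise ValueError("expected values in [0, 1]")

        # Any preprocessing (resize, normalize, layout) would happen here,
        # inside the extractor, so callers never need to know about it.
        feats = videos.mean(dim=(1, 3, 4))  # pool over T, H, W -> (B, C)

        # The contract returns NumPy features, not a tensor.
        return feats.detach().cpu().numpy()


videos = torch.rand(2, 16, 3, 64, 64)  # B=2, T=16, C=3, H=W=64
features = MeanColorExtractor()(videos)
```

Here features is a NumPy array of shape (2, 3). A real extractor would return a much larger D, but the caller-side view is the same.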
Functions
fastvideo.eval.metrics.common.fvd.extractors.load_extractor
load_extractor(name: str, device: device) -> _BaseExtractor
Instantiate the named extractor on device. Raises ValueError on unknown names.
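The name-to-extractor dispatch with a ValueError on unknown names can be sketched with a small registry. This is an illustration only and may not match how the library implements load_extractor: _ToyExtractor, _REGISTRY, and load_extractor_sketch are hypothetical names, and the device is a plain string here to keep the example dependency-free.

```python
from typing import Callable, Dict


class _ToyExtractor:
    """Hypothetical placeholder for a real extractor; it only records its settings."""

    def __init__(self, name: str, device: str) -> None:
        self.name = name
        self.device = device


# Map each supported name to a factory that builds the extractor on a device.
_REGISTRY: Dict[str, Callable[[str], _ToyExtractor]] = {
    "i3d": lambda device: _ToyExtractor("i3d", device),
    "clip": lambda device: _ToyExtractor("clip", device),
    "videomae": lambda device: _ToyExtractor("videomae", device),
}


def load_extractor_sketch(name: str, device: str) -> _ToyExtractor:
    """Instantiate the named extractor on device, or raise ValueError."""
    try:
        factory = _REGISTRY[name]
    except KeyError:
        raise ValueError(
            f"unknown extractor {name!r}; expected one of {sorted(_REGISTRY)}"
        ) from None
    return factory(device)


extractor = load_extractor_sketch("clip", "cpu")
```

Listing the valid names in the error message is a design choice of this sketch; it makes a typo such as "i3D" easy to spot at the call site.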