datasets
Prompt-corpus datasets for end-to-end benchmark evaluation.
Public API mirrors :mod:fastvideo.eval (metrics side):
from fastvideo.eval.datasets import (
    PromptDataset, Sample,
    register_dataset, get_dataset, list_datasets,
)
A dataset is an iterable of plain dicts (one per sample). Built-in
datasets self-register at import time. To add one, drop a module into
this package that subclasses :class:PromptDataset and decorates with
@register_dataset("name") — auto-discovery picks it up.
Classes
fastvideo.eval.datasets.PromptDataset
fastvideo.eval.datasets.Sample
Bases: TypedDict
Documented schema for a row yielded by :class:PromptDataset.
Only prompt is required. Extra keys beyond these are forwarded to
the runner's eval-kwargs builder verbatim, so action-conditioned or
audio-bearing benchmarks can add their own fields without changing
the base class.
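A minimal sketch of what such a schema could look like, assuming the well-known keys named elsewhere on this page (n_samples, dimensions, auxiliary_info). The class names and value types below are hypothetical stand-ins, not the library's actual definition:

```python
from typing import Any, TypedDict


class _RequiredSampleKeys(TypedDict):
    # The only key the documentation marks as required.
    prompt: str


class SampleSketch(_RequiredSampleKeys, total=False):
    # Optional well-known keys; the value types are assumptions.
    n_samples: int
    dimensions: list[str]
    auxiliary_info: dict[str, Any]


row: SampleSketch = {"prompt": "a red car driving through rain"}

# TypedDict adds no runtime validation, so a benchmark can attach its own
# fields to the plain dict. "audio_path" is a hypothetical extra key.
extended = {**row, "audio_path": "clip.wav"}
```

Because a TypedDict is only a static annotation, the extra key is invisible to the schema but still travels with the row at runtime.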
fastvideo.eval.datasets.VBenchPromptDataset
Bases: PromptDataset
VBench prompts filtered by evaluation dimension.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dimensions | list[str] \| str | List of dimension names, or … | 'all' |
| full_info_path | str \| Path \| None | Optional override for … | None |
A prompt that belongs to several requested dimensions is yielded once;
its dimensions list carries all matches so the scorer can route.
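The yield-once behaviour can be sketched as follows. This is an illustration under assumptions, not the code in vbench.py: the function name is hypothetical, entries are assumed to carry prompt_en and a dimension list, and a string other than "all" is treated as a single dimension name.

```python
def filter_by_dimensions(entries, dimensions="all"):
    """Yield one row per entry that matches at least one requested dimension."""
    if dimensions == "all":
        wanted = None  # no filtering
    elif isinstance(dimensions, str):
        wanted = {dimensions}  # assumption: a bare string names one dimension
    else:
        wanted = set(dimensions)

    for entry in entries:
        matched = [d for d in entry["dimension"] if wanted is None or d in wanted]
        if matched:
            # One row per prompt; every matching dimension rides along.
            yield {"prompt": entry["prompt_en"], "dimensions": matched}


# Made-up entries for illustration only.
entries = [
    {"prompt_en": "a person walking", "dimension": ["subject_consistency", "motion_smoothness"]},
    {"prompt_en": "a red bird", "dimension": ["color"]},
]
rows = list(filter_by_dimensions(entries, ["subject_consistency", "motion_smoothness"]))
```

Here the first prompt matches both requested dimensions but appears in rows only once, with both names in its dimensions list.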
Source code in fastvideo/eval/datasets/vbench.py
Functions
fastvideo.eval.datasets.get_dataset
Instantiate a registered dataset by name.
Source code in fastvideo/eval/datasets/registry.py
fastvideo.eval.datasets.list_datasets
fastvideo.eval.datasets.register_dataset
register_dataset(name: str)
Decorator to register a prompt-dataset class.
Usage::
    @register_dataset("vbench")
    class VBenchPromptDataset(PromptDataset):
        ...
Source code in fastvideo/eval/datasets/registry.py
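For readers unfamiliar with the pattern, a name-keyed registry of this kind can be sketched in a few lines. The functions below are self-contained stand-ins that reuse the documented names for readability; the real implementation in registry.py may differ, for example in how it handles duplicate or unknown names.

```python
_DATASETS: dict[str, type] = {}


def register_dataset(name: str):
    """Return a class decorator that stores the class under ``name``."""
    def decorator(cls):
        _DATASETS[name] = cls
        return cls  # the class itself is returned unchanged
    return decorator


def get_dataset(name: str, **kwargs):
    """Instantiate the class registered under ``name``."""
    if name not in _DATASETS:
        raise KeyError(f"unknown dataset {name!r}; available: {list_datasets()}")
    return _DATASETS[name](**kwargs)


def list_datasets() -> list[str]:
    """Return the registered names in sorted order."""
    return sorted(_DATASETS)


@register_dataset("toy")  # "toy" is a hypothetical dataset name
class ToyDataset:
    def __init__(self, limit=None):
        rows = [{"prompt": "a cat"}, {"prompt": "a dog"}]
        self._rows = rows if limit is None else rows[:limit]

    def __iter__(self):
        return iter(self._rows)


dataset = get_dataset("toy", limit=1)
```

Keyword arguments passed to get_dataset are forwarded to the class constructor in this sketch, which is one plausible reading of "instantiate a registered dataset by name".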
Modules
fastvideo.eval.datasets.base
Prompt-corpus datasets.
A :class:PromptDataset is an iterable of sample dicts describing the
prompts and conditions for a benchmark. Each sample is a plain dict —
no dataclass, no schema enforcement — that flows directly into both
generation (VideoGenerator.generate_video(**sample)) and scoring
(Evaluator.evaluate(**eval_kwargs)). The runner picks well-known
keys (prompt, n_samples, dimensions, auxiliary_info,
...) and passes the rest through.
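The split between well-known keys and pass-through extras might look like the sketch below. The helper name is hypothetical, and the key tuple lists only the keys named above; the source's own list ends in an ellipsis, so the real set may be larger.

```python
# Only the keys the documentation names explicitly; the real list may be longer.
WELL_KNOWN_KEYS = ("prompt", "n_samples", "dimensions", "auxiliary_info")


def split_sample(sample: dict) -> tuple[dict, dict]:
    """Separate runner-handled keys from benchmark-specific extras."""
    known = {k: sample[k] for k in WELL_KNOWN_KEYS if k in sample}
    extras = {k: v for k, v in sample.items() if k not in WELL_KNOWN_KEYS}
    return known, extras


sample = {"prompt": "a ball rolling", "n_samples": 2, "reference": "take1.mp4"}
known, extras = split_sample(sample)
```

In this example reference is not a well-known key, so it lands in extras and would be forwarded untouched.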
This matches the surrounding FastVideo style:
- :class:fastvideo.dataset.validation_dataset.ValidationDataset yields dicts.
- :meth:fastvideo.VideoGenerator.generate_video consumes **kwargs.
- :meth:fastvideo.eval.Evaluator.evaluate consumes **kwargs.
To add a new benchmark:
- Subclass :class:PromptDataset, populate self._rows with dicts in __init__.
- Decorate with @register_dataset("my_bench").
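The subclassing step can be sketched as follows. PromptDatasetSketch is a stand-in base written for this example, assuming the real base class simply iterates over self._rows; the decorator is left as a comment because it requires the real package.

```python
from typing import Any, Iterator


class PromptDatasetSketch:
    """Stand-in base: an iterable over the dicts stored in ``self._rows``."""

    def __init__(self) -> None:
        self._rows: list[dict[str, Any]] = []

    def __iter__(self) -> Iterator[dict[str, Any]]:
        return iter(self._rows)

    def __len__(self) -> int:
        return len(self._rows)


# With the real package this class would carry @register_dataset("my_bench").
class MyBenchDataset(PromptDatasetSketch):
    def __init__(self, prompts: list[str]) -> None:
        super().__init__()
        # One plain dict per sample; only "prompt" is required.
        self._rows = [{"prompt": p} for p in prompts]


bench = MyBenchDataset(["a kite in the wind", "waves on a beach"])
prompts = [row["prompt"] for row in bench]
```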
Convention for auxiliary_info: a flat dict of metric-keyed values
(e.g. {"color": "red"}). Benchmarks with nested aux schemas (VBench's
{dim: {key: val}}) flatten at load time so every consumer sees the
same shape.
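One way to flatten a nested aux schema at load time is sketched below. The function name is hypothetical, and the collision policy (a later dimension overwrites an earlier one on a shared key) is an assumption; the source does not say how clashes are resolved.

```python
def flatten_auxiliary_info(nested: dict) -> dict:
    """Merge {dimension: {key: value}} into a single flat {key: value} dict."""
    flat = {}
    for per_dimension in nested.values():
        # Assumption: on a key clash, the later dimension wins.
        flat.update(per_dimension)
    return flat


flat = flatten_auxiliary_info({"color": {"color": "red"}})
```

The nested input here follows the {dim: {key: val}} shape described above and flattens to {"color": "red"}.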
Classes
fastvideo.eval.datasets.base.PromptDataset
fastvideo.eval.datasets.base.Sample
Bases: TypedDict
Documented schema for a row yielded by :class:PromptDataset.
Only prompt is required. Extra keys beyond these are forwarded to
the runner's eval-kwargs builder verbatim, so action-conditioned or
audio-bearing benchmarks can add their own fields without changing
the base class.
fastvideo.eval.datasets.physics_iq
Physics-IQ benchmark prompt corpus.
Yields one sample dict per take-1 scenario, paired with its take-2
reference and both takes' real motion masks. Each row drops straight
into :meth:fastvideo.eval.Evaluator.evaluate for the physics_iq
metric:
{
    "prompt": <description>,
    "reference": "<take-1 mp4>",
    "reference_take2": "<take-2 mp4>",
    "reference_mask": "<take-1 mask mp4>",
    "reference_take2_mask": "<take-2 mask mp4>",
    "scenario": <scenario_id>,
    "view": <camera view>,
    "auxiliary_info": { ... metadata ... },
}
Self-contained dataset: the manifest CSV is vendored under
fastvideo/eval/metrics/physics_iq/_vendored/descriptions.csv;
per-scenario videos/masks/switch-frames auto-fetch on first use from the public
DeepMind bucket into ${FASTVIDEO_EVAL_CACHE}/datasets/physics_iq/.
Pass auto_download=False (or dataset_root= pointing at a
pre-downloaded copy) to opt out of network fetches.
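The cache location can be resolved along these lines. The helper name is hypothetical, and the fallback used when FASTVIDEO_EVAL_CACHE is unset is an assumption made for this sketch, since the source does not state the default.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def physics_iq_cache_dir(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return <cache root>/datasets/physics_iq for the given environment."""
    env = os.environ if env is None else env
    # Hypothetical fallback; the library's actual default is not documented here.
    fallback = str(Path.home() / ".cache" / "fastvideo_eval")
    base = env.get("FASTVIDEO_EVAL_CACHE") or fallback
    return Path(base) / "datasets" / "physics_iq"


cache_dir = physics_iq_cache_dir({"FASTVIDEO_EVAL_CACHE": "/data/cache"})
```

Passing the environment as a mapping keeps the helper testable without mutating os.environ.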
Classes
fastvideo.eval.datasets.physics_iq.PhysicsIQPromptDataset
PhysicsIQPromptDataset(dataset_root: str | Path | None = None, *, fps: int = _DEFAULT_FPS, limit: int | None = None, generated_dir: str | Path | None = None, auto_download: bool = True)
Bases: PromptDataset
Physics-IQ benchmark prompt corpus.
Self-contained: get_dataset("physics_iq") works with no kwargs.
The manifest CSV is vendored next to the metric, and per-scenario
assets auto-fetch on first miss from the public bucket into
${FASTVIDEO_EVAL_CACHE}/datasets/physics_iq/.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dataset_root | str \| Path \| None | Path to a pre-downloaded copy of the Physics-IQ release. Defaults to … | None |
| fps | int | Target frame rate. The release ships at 30 FPS; other rates transcode once on first access into … | _DEFAULT_FPS |
| limit | int \| None | Optional truncation for quick smoke runs. Apply this kwarg (not a post-construction slice) so we only fetch the assets for the scenarios actually requested. | None |
| generated_dir | str \| Path \| None | Optional directory of pre-generated videos; attaches each manifest row's expected output path to the sample dict under … | None |
| auto_download | bool | When True (the default), missing testing videos, masks, and switch frames are fetched from the public bucket into … | True |
Source code in fastvideo/eval/datasets/physics_iq.py
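Why limit should be a constructor kwarg rather than a slice applied afterwards can be shown with a small sketch. The function and the fake fetcher are hypothetical; they only illustrate that truncating the manifest before fetching avoids downloading assets that are never used.

```python
def select_then_fetch(manifest_rows, fetch, limit=None):
    """Truncate the manifest first, then fetch assets for what remains."""
    selected = manifest_rows if limit is None else manifest_rows[:limit]
    return [fetch(row) for row in selected]


fetched_ids = []


def fake_fetch(scenario_id):
    # Stand-in for a network download; records which scenarios were touched.
    fetched_ids.append(scenario_id)
    return scenario_id


rows = select_then_fetch(["s1", "s2", "s3"], fake_fetch, limit=2)
```

Only the first two scenarios reach fake_fetch, whereas building the full dataset and slicing it afterwards would have fetched all three.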
fastvideo.eval.datasets.physics_iq.PhysicsIQScenario
dataclass
PhysicsIQScenario(scenario_id: str, view: str, scenario_name: str, take1_video_path: str, take2_video_path: str, switch_frame_path: str, caption: str, expected_gen_filename: str, generated_video_path: str | None = None, take1_mask_path: str | None = None, take2_mask_path: str | None = None)
One row of the Physics-IQ manifest, fully resolved on disk.
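How a resolved scenario might map onto the sample dict described for this module is sketched below. The field list copies the signature above, but the field-to-key mapping and the contents of auxiliary_info are assumptions; ScenarioSketch and to_sample are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScenarioSketch:
    scenario_id: str
    view: str
    scenario_name: str
    take1_video_path: str
    take2_video_path: str
    switch_frame_path: str
    caption: str
    expected_gen_filename: str
    generated_video_path: Optional[str] = None
    take1_mask_path: Optional[str] = None
    take2_mask_path: Optional[str] = None


def to_sample(s: ScenarioSketch) -> dict:
    """Build a sample dict; the mapping below is assumed, not confirmed."""
    return {
        "prompt": s.caption,
        "reference": s.take1_video_path,
        "reference_take2": s.take2_video_path,
        "reference_mask": s.take1_mask_path,
        "reference_take2_mask": s.take2_mask_path,
        "scenario": s.scenario_id,
        "view": s.view,
        # Hypothetical metadata; the source only says "metadata".
        "auxiliary_info": {"scenario_name": s.scenario_name},
    }


scenario = ScenarioSketch(
    scenario_id="0001", view="center", scenario_name="ball-drop",
    take1_video_path="t1.mp4", take2_video_path="t2.mp4",
    switch_frame_path="switch.jpg", caption="A ball drops onto a table.",
    expected_gen_filename="0001_center.mp4",
)
sample = to_sample(scenario)
```

All values in the example instance are made up for illustration.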
Functions
fastvideo.eval.datasets.registry
Registry for prompt-corpus datasets, mirroring :mod:fastvideo.eval.registry.
Functions
fastvideo.eval.datasets.registry.get_dataset
Instantiate a registered dataset by name.
Source code in fastvideo/eval/datasets/registry.py
fastvideo.eval.datasets.registry.list_datasets
fastvideo.eval.datasets.vbench
VBench prompt corpus.
Single source of truth: upstream's VBench_full_info.json (946 entries,
each with prompt_en, a dimension list, optional auxiliary_info
keyed by dimension).
Classes
fastvideo.eval.datasets.vbench.VBenchPromptDataset
Bases: PromptDataset
VBench prompts filtered by evaluation dimension.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| dimensions | list[str] \| str | List of dimension names, or … | 'all' |
| full_info_path | str \| Path \| None | Optional override for … | None |
A prompt that belongs to several requested dimensions is yielded once;
its dimensions list carries all matches so the scorer can route.