metric
Compares optical flow extracted from a generated video against optical flow synthesized analytically from per-frame actions.
The reference flow is not observed from a ground-truth video; it is
predicted from the action stream via a third-person camera-kinematics
model (Longuet-Higgins linearization + off-pivot translation correction;
no depth). Observed flow comes from the same ptlflow model used by
gt_optical_flow, and the two flows are compared using the same metric
set as gt_optical_flow, so scores are directly comparable across the two metrics.
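To illustrate why no depth is needed for the rotational part of such a model, here is a minimal sketch of the rotation-only term of the Longuet-Higgins motion field. It is not the metric's implementation: the function name `rotational_flow`, the `focal_px` parameter, and the axis and sign conventions are assumptions, and the off-pivot translation correction is omitted entirely.

```python
import numpy as np


def rotational_flow(height, width, focal_px, omega):
    """Per-pixel flow (in pixels) induced by a small camera rotation.

    omega = (wx, wy, wz): rotation about the camera x, y, z axes in
    radians per frame. Depth does not appear, because rotation-induced
    flow is independent of scene depth.
    """
    wx, wy, wz = omega
    # Normalized image coordinates with the principal point at the centre.
    ys, xs = np.mgrid[0:height, 0:width].astype(np.float64)
    x = (xs - (width - 1) / 2.0) / focal_px
    y = (ys - (height - 1) / 2.0) / focal_px
    # Rotational part of the Longuet-Higgins motion field (unit focal length).
    u = wx * x * y - wy * (1.0 + x**2) + wz * y
    v = wx * (1.0 + y**2) - wy * x * y - wz * x
    # Scale back to pixel units; the result has shape (H, W, 2).
    return np.stack([u, v], axis=-1) * focal_px


flow = rotational_flow(5, 5, 100.0, (0.0, 0.01, 0.0))
```

Under these conventions a pure rotation about the y axis produces a nearly uniform horizontal flow, with a magnitude at the image centre of `focal_px` times the rotation angle.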
Required sample keys
video
(B, T, C, H, W) float in [0, 1].
actions
dict (or list of dicts of length B) with two keys, each mapping to an np.ndarray:
* ``keyboard`` of shape ``(T, 6)`` — ``[W, S, A, D, turn_left, turn_right]``
* ``mouse`` of shape ``(T, 2)`` — ``[pitch, yaw]``
calibration
Either a path to a ThirdPersonCalibration JSON file, or a dict
of fitted parameters. May also be set once at construction time via
calibration_path= and reused across samples.
Optional sample keys
mouse_pitch_sign
+1 (default) or -1 if the dataset's mouse-pitch sign is
flipped (mhuo's data carries this in metadata).
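A sample carrying the keys described above might be assembled as in the following sketch. The shapes and column orders come from the descriptions above; everything else is an assumption made for illustration: the video is built as a NumPy array (the metric may expect a different container, such as a tensor), the mouse units are unspecified, and `calibration.json` is a hypothetical path.

```python
import numpy as np

B, T, C, H, W = 1, 8, 3, 64, 64

# Keyboard columns follow the documented order:
# [W, S, A, D, turn_left, turn_right].
keyboard = np.zeros((T, 6), dtype=np.float32)
keyboard[:, 0] = 1.0          # hold W on every frame
keyboard[T // 2:, 5] = 1.0    # add turn_right in the second half

# Mouse columns are [pitch, yaw]; the units used here are an assumption.
mouse = np.zeros((T, 2), dtype=np.float32)
mouse[:, 1] = 0.05

rng = np.random.default_rng(0)
sample = {
    # Random frames in [0, 1) stand in for a generated video.
    "video": rng.random((B, T, C, H, W), dtype=np.float32),
    "actions": {"keyboard": keyboard, "mouse": mouse},
    "calibration": "calibration.json",  # hypothetical path
    "mouse_pitch_sign": 1,              # optional; +1 is the default
}
```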
Classes
fastvideo.eval.metrics.optical_flow.synthetic_optical_flow.metric.SyntheticOpticalFlowMetric
SyntheticOpticalFlowMetric(model_name: str = 'dpflow', ckpt: str = 'things', calibration_path: str | Path | None = None, min_mag: float = 0.5, max_mag_pct: float = 80.0, grid_size: int = 8)
Bases: BaseMetric
Action-driven synthetic flow vs. video-extracted observed flow.
Pass calibration_path at construction to bind the calibration
once across all samples; otherwise supply sample["calibration"]
per call. Missing actions or calibration produce a skipped result
(score=None) rather than raising.
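Because skipped samples yield score=None instead of raising, code that aggregates results needs to filter them out. The sketch below shows one way to do that; the `Result` dataclass and `mean_score` helper are hypothetical stand-ins, not part of the library, and the real result type may carry additional fields.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Result:
    """Hypothetical stand-in for a metric result; only `score` is modelled."""
    score: Optional[float]


def mean_score(results: Sequence[Result]) -> Optional[float]:
    """Average the scores of non-skipped results, or None if all were skipped."""
    scores = [r.score for r in results if r.score is not None]
    if not scores:
        return None
    return sum(scores) / len(scores)


# The middle result represents a sample that was skipped.
average = mean_score([Result(0.5), Result(None), Result(1.5)])
```

Filtering on `score is not None` rather than on truthiness keeps a legitimate score of 0.0 in the average.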