checkpoint
Classes
fastvideo.train.utils.checkpoint.CheckpointManager
CheckpointManager(*, method: Any, dataloader: Any, output_dir: str, config: CheckpointConfig, callbacks: Any | None = None, raw_config: dict[str, Any] | None = None)
Role-based checkpoint manager for training runtime.
- Checkpoint policy lives in YAML (via TrainingArgs fields).
- Resume path is typically provided via CLI (--resume-from-checkpoint).
Source code in fastvideo/train/utils/checkpoint.py
Functions
fastvideo.train.utils.checkpoint.CheckpointManager.load_metadata
staticmethod
Read metadata.json from a checkpoint dir.
Source code in fastvideo/train/utils/checkpoint.py
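As a rough illustration of what reading the metadata file involves, the sketch below loads metadata.json from a checkpoint directory using only the standard library. The helper name `read_checkpoint_metadata` and the `step` key are hypothetical and are not part of FastVideo; the real load_metadata may validate its input or return something different.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def read_checkpoint_metadata(checkpoint_dir: str) -> dict[str, Any]:
    """Hypothetical stand-in: parse the metadata.json found in checkpoint_dir."""
    metadata_path = Path(checkpoint_dir) / "metadata.json"
    if not metadata_path.is_file():
        raise FileNotFoundError(f"no metadata.json in {checkpoint_dir}")
    with metadata_path.open("r", encoding="utf-8") as f:
        return json.load(f)


# Round trip against a throwaway directory; the "step" key is made up.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "metadata.json").write_text(
        json.dumps({"step": 1000}), encoding="utf-8"
    )
    loaded_metadata = read_checkpoint_metadata(tmp)
```

Failing loudly when the file is missing is a design choice of this sketch: a resume that silently proceeds without metadata is usually harder to debug than one that stops early.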
fastvideo.train.utils.checkpoint.CheckpointManager.load_rng_snapshot
load_rng_snapshot(checkpoint_path: str) -> None
Restore per-rank RNG state from the snapshot file.
Must be called AFTER dcp.load and after iter(dataloader), so that no later operation can clobber the restored state.
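The ordering requirement can be illustrated with a small sketch. It uses only Python's built-in `random` module as a stand-in for the per-rank RNG state; which generators the real snapshot covers is not stated here, and `consume_rng` is a hypothetical placeholder for any step that draws random numbers, such as creating a shuffled iterator.

```python
import random

# State captured at "checkpoint time", plus the draws a faithful resume should reproduce.
random.seed(1234)
saved_state = random.getstate()
expected_draws = [random.random() for _ in range(3)]


def consume_rng() -> None:
    """Hypothetical stand-in for a step that draws random numbers."""
    order = list(range(8))
    random.shuffle(order)


# Wrong order: restore first, then run an RNG-consuming step.
random.setstate(saved_state)
consume_rng()
draws_restore_first = [random.random() for _ in range(3)]

# Right order: run the RNG-consuming step, then restore last.
consume_rng()
random.setstate(saved_state)
draws_restore_last = [random.random() for _ in range(3)]
```

Only the second ordering reproduces `expected_draws`, because nothing advances the generator between the restore and the first draw.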