base¶
Classes¶
fastvideo.train.methods.base.TrainingMethod¶
Bases: Module, ABC
Base training method (algorithm layer).
Subclasses own their role models (student, teacher, critic, …) as
plain attributes and manage their optimizers directly; no RoleManager
or RoleHandle indirection is involved.
The constructor receives role_models (a dict[str, ModelBase])
and a cfg object. It calls init_preprocessors on the student
and builds self.role_modules for FSDP wrapping.
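The constructor contract above can be sketched as follows. This is illustrative only: `ModelBase` is stubbed out, and the `fsdp_modules` helper is an assumed name standing in for however the real models expose their wrappable modules.

```python
from abc import ABC


class TrainingMethod(ABC):
    """Hypothetical sketch of the base-class constructor, not the real code."""

    def __init__(self, role_models: dict, cfg) -> None:
        self.role_models = role_models
        self.cfg = cfg
        # The student initializes its preprocessors once, up front.
        role_models["student"].init_preprocessors(cfg)
        # Collect the modules FSDP will wrap, keyed by role
        # (fsdp_modules is an assumed accessor, for illustration).
        self.role_modules = {
            role: model.fsdp_modules()
            for role, model in role_models.items()
        }
```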
A single shared CUDA RNG generator (cuda_generator) is
created in on_train_start. All torch.randn /
torch.randint calls in methods and models must pass this
generator explicitly instead of relying on global RNG state.
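A minimal sketch of the shared-generator rule: every random draw threads the method's generator through explicitly. The device is "cpu" here only so the snippet runs anywhere; the generator described above lives on CUDA.

```python
import torch


def sample_noise(shape, generator):
    # Draw on the generator's own device; never touch the global RNG.
    return torch.randn(shape, generator=generator, device=generator.device)


gen = torch.Generator(device="cpu")
gen.manual_seed(42)
noise = sample_noise((2, 4), gen)
```

Because all draws go through one seeded generator, reseeding it reproduces the exact noise sequence, which is what makes checkpoint resume deterministic.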
Source code in fastvideo/train/methods/base.py
Functions¶
fastvideo.train.methods.base.TrainingMethod.checkpoint_state¶
Return DCP-ready checkpoint state for all trainable roles.
Keys follow the convention:
roles.<role>.<module>, optimizers.<role>,
schedulers.<role>, random_state.*.
EMA state is managed by the EMACallback and is
checkpointed through the callback state mechanism.
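The key convention can be sketched as a plain dict builder. This illustrates only the naming scheme; the real method assembles these entries from its own roles, optimizers, and schedulers.

```python
def build_checkpoint_keys(roles, optimizers, schedulers, random_state):
    """Illustrative sketch of the DCP key convention described above."""
    state = {}
    # roles.<role>.<module>
    for role, modules in roles.items():
        for name, module in modules.items():
            state[f"roles.{role}.{name}"] = module
    # optimizers.<role> and schedulers.<role>
    for role, opt in optimizers.items():
        state[f"optimizers.{role}"] = opt
    for role, sched in schedulers.items():
        state[f"schedulers.{role}"] = sched
    # random_state.*
    for key, value in random_state.items():
        state[f"random_state.{key}"] = value
    return state
```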
Source code in fastvideo/train/methods/base.py
fastvideo.train.methods.base.TrainingMethod.get_grad_clip_targets¶
Return modules whose gradients should be clipped.
Override in subclasses to add modules or include them conditionally (e.g. the critic, or the student only in some configurations). Default: the student transformer.
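A hypothetical subclass override might look like this. The attribute names (`student_transformer`, `critic`, `train_critic`) are assumptions for illustration, not the real API.

```python
class DistillationMethod:
    """Sketch of a get_grad_clip_targets override; names are illustrative."""

    def __init__(self, student_transformer, critic, train_critic):
        self.student_transformer = student_transformer
        self.critic = critic
        self.train_critic = train_critic

    def get_grad_clip_targets(self):
        # Start from the default (the student transformer) and add
        # the critic only when it is actually being trained.
        targets = [self.student_transformer]
        if self.train_critic:
            targets.append(self.critic)
        return targets
```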
Source code in fastvideo/train/methods/base.py
fastvideo.train.methods.base.TrainingMethod.seed_optimizer_state_for_resume¶
Seed optimizer state so DCP can load saved state.
A fresh optimizer has empty state (exp_avg, exp_avg_sq and step are created only on the first optimizer.step()). DCP needs matching entries to load into; without them, the saved optimizer state is silently dropped.
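One common way to materialize the state is a single no-op step with zero gradients, so Adam creates exp_avg, exp_avg_sq and step for every parameter before DCP loads over them. This is a sketch of the technique, not necessarily how the real helper seeds the entries; with zero gradients and no weight decay, the parameters themselves are left unchanged.

```python
import torch


def seed_optimizer_state(optimizer):
    """Sketch: materialize Adam state via one zero-gradient step."""
    for group in optimizer.param_groups:
        for p in group["params"]:
            if p.grad is None:
                p.grad = torch.zeros_like(p)
    # With all-zero grads (and weight_decay=0) this step is a no-op
    # for the parameters, but it allocates exp_avg / exp_avg_sq / step.
    optimizer.step()
    optimizer.zero_grad(set_to_none=True)
```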