models
¶
Model build plugins for Phase 2/2.9 distillation.
These are "model plugins" selected by recipe.family / roles.<role>.family.
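As a rough illustration of family-based selection, the sketch below resolves one plugin class per role from a recipe-like dictionary. The registry, the `register_family` decorator, the toy classes, and the fallback from `roles.<role>.family` to `recipe.family` are all assumptions made for this example, not FastVideo's actual loader.

```python
from typing import Callable

# Hypothetical registry: family name -> plugin class. Not a FastVideo API.
MODEL_FAMILIES: dict[str, Callable[..., object]] = {}


def register_family(name: str):
    """Register a plugin class under a family name."""
    def decorator(cls):
        MODEL_FAMILIES[name] = cls
        return cls
    return decorator


@register_family("wan")
class ToyWanModel:
    def __init__(self, *, init_from: str, trainable: bool = True) -> None:
        self.init_from = init_from
        self.trainable = trainable


@register_family("hunyuan")
class ToyHunyuanModel(ToyWanModel):
    pass


def build_role_models(recipe: dict) -> dict[str, object]:
    """Build one model instance per role.

    Assumption: a role without its own `family` uses the recipe-level one.
    """
    default_family = recipe["family"]
    models = {}
    for role, cfg in recipe["roles"].items():
        family = cfg.get("family", default_family)
        models[role] = MODEL_FAMILIES[family](
            init_from=cfg["init_from"],
            trainable=cfg.get("trainable", True),
        )
    return models
```

Each role ends up with its own instance, which matches the per-role design described for ModelBase.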
Modules¶
fastvideo.train.models.base
¶
Classes¶
fastvideo.train.models.base.CausalModelBase
¶
Bases: ModelBase
Extension for causal / streaming model plugins.
Cache state is internal to the model instance and keyed by cache_tag (no role handle needed).
Functions¶
fastvideo.train.models.base.CausalModelBase.clear_caches
abstractmethod
¶clear_caches(*, cache_tag: str = 'pos') -> None
fastvideo.train.models.base.CausalModelBase.predict_noise_streaming
abstractmethod
¶predict_noise_streaming(noisy_latents: Tensor, timestep: Tensor, batch: TrainingBatch, *, conditional: bool, cache_tag: str = 'pos', store_kv: bool = False, cur_start_frame: int = 0, cfg_uncond: dict[str, Any] | None = None, attn_kind: Literal['dense', 'vsa'] = 'dense') -> Tensor | None
Streaming predict-noise that may update internal caches.
Source code in fastvideo/train/models/base.py
fastvideo.train.models.base.CausalModelBase.predict_x0_streaming
¶predict_x0_streaming(noisy_latents: Tensor, timestep: Tensor, batch: TrainingBatch, *, conditional: bool, cache_tag: str = 'pos', store_kv: bool = False, cur_start_frame: int = 0, cfg_uncond: dict[str, Any] | None = None, attn_kind: Literal['dense', 'vsa'] = 'dense') -> Tensor | None
Predict x0 streaming via predict_noise_streaming + conversion.
Source code in fastvideo/train/models/base.py
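To make the "cache state keyed by cache_tag" idea concrete, here is a minimal toy sketch. `ToyCausalModel` is hypothetical and works on plain lists of floats. The meaning given to `store_kv` (append the current chunk to the cache) and `cur_start_frame` (a label stored with the chunk) is an assumption for illustration, not the documented behaviour of the real plugins.

```python
from collections import defaultdict


class ToyCausalModel:
    """Toy stand-in that keeps one independent cache per cache_tag."""

    def __init__(self) -> None:
        # cache_tag -> list of (start_frame, frames) entries
        self._caches: dict[str, list] = defaultdict(list)

    def clear_caches(self, *, cache_tag: str = "pos") -> None:
        # Drop only the cache that belongs to this tag.
        self._caches.pop(cache_tag, None)

    def predict_noise_streaming(
        self,
        chunk: list[float],
        *,
        cache_tag: str = "pos",
        store_kv: bool = False,
        cur_start_frame: int = 0,
    ) -> float:
        # Toy "prediction": the current chunk plus everything cached
        # under the same tag, so earlier chunks influence later ones.
        cached = sum(v for _, frames in self._caches[cache_tag] for v in frames)
        out = sum(chunk) + cached
        if store_kv:
            self._caches[cache_tag].append((cur_start_frame, list(chunk)))
        return out
```

Because the state lives on the instance, a caller can run conditional and unconditional passes under different tags without passing any role handle around.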
fastvideo.train.models.base.ModelBase
¶
Bases: ABC
Per-role model instance.
Every role (student, teacher, critic, …) gets its own ModelBase instance. Each instance owns its own transformer and noise_scheduler. Heavyweight resources (VAE, dataloader, RNG seeds) are loaded lazily via init_preprocessors, which the training method calls only on the student.
Attributes¶
fastvideo.train.models.base.ModelBase.device
property
¶The local CUDA device for this rank.
fastvideo.train.models.base.ModelBase.num_train_timesteps
property
¶num_train_timesteps: int
Return the scheduler's training timestep horizon.
Functions¶
fastvideo.train.models.base.ModelBase.add_noise
abstractmethod
¶
fastvideo.train.models.base.ModelBase.backward
abstractmethod
¶
fastvideo.train.models.base.ModelBase.init_preprocessors
¶Load VAE, build dataloader, seed RNGs.
Called only on the student by the method's __init__.
Default is a no-op so teacher/critic instances skip this.
Source code in fastvideo/train/models/base.py
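The lazy-loading contract can be sketched as follows. All class names here are hypothetical stand-ins, and the `preprocessors_ready` flag replaces the real VAE, dataloader, and RNG setup. The only behaviour taken from the description above is that the base hook is a no-op and that the method calls it on the student alone.

```python
from abc import ABC, abstractmethod


class ToyModelBase(ABC):
    """Hypothetical stand-in for a per-role model base class."""

    def __init__(self) -> None:
        self.preprocessors_ready = False

    def init_preprocessors(self) -> None:
        """Default no-op: roles that never train on data skip heavy setup."""

    @abstractmethod
    def predict_noise(self, x: float) -> float:
        ...


class ToyWan(ToyModelBase):
    def init_preprocessors(self) -> None:
        # The real plugin would load a VAE, build a dataloader and seed
        # RNGs here; this sketch only flips a flag.
        self.preprocessors_ready = True

    def predict_noise(self, x: float) -> float:
        return 0.0


class ToyMethod:
    """Hypothetical training method holding one model per role."""

    def __init__(self, *, student: ToyModelBase, teacher: ToyModelBase) -> None:
        self.student = student
        self.teacher = teacher
        # Only the student pays for the heavyweight resources.
        self.student.init_preprocessors()
```

The same plugin class serves both roles. What differs is whether the method invokes the hook.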
fastvideo.train.models.base.ModelBase.on_train_start
¶
fastvideo.train.models.base.ModelBase.predict_noise
abstractmethod
¶predict_noise(noisy_latents: Tensor, timestep: Tensor, batch: TrainingBatch, *, conditional: bool, cfg_uncond: dict[str, Any] | None = None, attn_kind: Literal['dense', 'vsa'] = 'dense') -> Tensor
Predict noise/flow for the given noisy latents.
Source code in fastvideo/train/models/base.py
fastvideo.train.models.base.ModelBase.predict_x0
¶predict_x0(noisy_latents: Tensor, timestep: Tensor, batch: TrainingBatch, *, conditional: bool, cfg_uncond: dict[str, Any] | None = None, attn_kind: Literal['dense', 'vsa'] = 'dense') -> Tensor
Predict x0 via predict_noise + conversion.
Source code in fastvideo/train/models/base.py
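For intuition about the "conversion" step, the sketch below assumes the linear-interpolation flow-matching setup that this page attributes to the Wan and Hunyuan plugins: noisy = (1 - sigma) * x0 + sigma * noise, with the model predicting the flow noise - x0. Under that assumption, x0 = noisy - sigma * flow. The helper names are hypothetical, and the real implementation operates on tensors with scheduler-provided sigmas.

```python
def add_noise_linear(x0: list[float], noise: list[float], sigma: float) -> list[float]:
    """Linear interpolation between clean sample and noise."""
    return [(1.0 - sigma) * a + sigma * n for a, n in zip(x0, noise)]


def flow_target(x0: list[float], noise: list[float]) -> list[float]:
    """Flow (velocity) a perfect model would predict: noise - x0."""
    return [n - a for a, n in zip(x0, noise)]


def x0_from_flow(noisy: list[float], flow: list[float], sigma: float) -> list[float]:
    """Invert the interpolation: x0 = noisy - sigma * flow."""
    return [x - sigma * v for x, v in zip(noisy, flow)]


x0 = [0.5, -1.0, 2.0]
noise = [0.1, 0.3, -0.7]
sigma = 0.25

noisy = add_noise_linear(x0, noise, sigma)
recovered = x0_from_flow(noisy, flow_target(x0, noise), sigma)
```

With an exact flow prediction, `recovered` matches `x0` up to floating-point rounding. A trained model only approximates the flow, so its x0 estimate is approximate as well.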
fastvideo.train.models.base.ModelBase.prepare_batch
abstractmethod
¶prepare_batch(raw_batch: dict[str, Any], *, generator: Generator, latents_source: Literal['data', 'zeros'] = 'data') -> TrainingBatch
Convert a dataloader batch into forward primitives.
fastvideo.train.models.base.ModelBase.shift_and_clamp_timestep
¶
Functions¶
fastvideo.train.models.hunyuan
¶
Hunyuan model plugin package.
Classes¶
Modules¶
fastvideo.train.models.hunyuan.hunyuan
¶
Hunyuan model plugin (per-role instance).
Subclasses WanModel since HunyuanVideo uses the same FlowMatchEulerDiscreteScheduler and linear-interpolation noise schedule. Differences:
- transformer class name
- normalize_dit_input("hunyuan", ...) instead of ("wan", ...)
- forward kwargs: no encoder_attention_mask, no return_dict
- default flow_shift = 7
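To give a feel for what the different flow_shift defaults on this page (3.0 for Wan, 7.0 for Hunyuan) imply, the sketch below applies the static shift formula used by flow-matching Euler schedulers such as the one in diffusers: sigma' = shift * sigma / (1 + (shift - 1) * sigma). That this plugin's scheduler applies exactly this formula is an assumption, and `shift_sigma` is a hypothetical helper.

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Static timestep shift: pushes sigmas toward the high-noise end."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


# Compare how the two defaults warp a uniform sigma grid.
grid = [0.0, 0.25, 0.5, 0.75, 1.0]
wan_sigmas = [shift_sigma(s, 3.0) for s in grid]
hunyuan_sigmas = [shift_sigma(s, 7.0) for s in grid]
```

The endpoints 0 and 1 stay fixed, while interior values move toward 1. A larger shift therefore concentrates more of the schedule on high-noise levels.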
Classes¶
fastvideo.train.models.hunyuan.hunyuan.HunyuanModel
¶HunyuanModel(*, init_from: str, training_config: TrainingConfig, trainable: bool = True, disable_custom_init_weights: bool = False, flow_shift: float = 7.0, enable_gradient_checkpointing_type: str | None = None, transformer_override_safetensor: str | None = None)
Bases: WanModel
HunyuanVideo per-role model.
Inherits most behaviour from WanModel (noise scheduler, timestep sampling, attention metadata, backward). Overrides only the pieces that differ for Hunyuan.
Source code in fastvideo/train/models/hunyuan/hunyuan.py
fastvideo.train.models.hunyuan.hunyuan.HunyuanModel.ensure_negative_conditioning
¶Encode the negative prompt with dual text encoders (LLaMA + CLIP).
Every rank encodes independently to avoid NCCL deadlocks when only a subset of ranks would otherwise participate.
Source code in fastvideo/train/models/hunyuan/hunyuan.py
fastvideo.train.models.hunyuan.hunyuan.HunyuanModel.prepare_batch
¶prepare_batch(raw_batch: dict[str, Any], *, generator: Generator, latents_source: Literal['data', 'zeros'] = 'data') -> TrainingBatch
Same flow as Wan, but uses Hunyuan VAE normalisation.
Source code in fastvideo/train/models/hunyuan/hunyuan.py
fastvideo.train.models.wan
¶
Wan model plugin package.
Classes¶
Modules¶
fastvideo.train.models.wan.wan
¶
Wan model plugin (per-role instance).
Classes¶
fastvideo.train.models.wan.wan.WanModel
¶WanModel(*, init_from: str, training_config: TrainingConfig, trainable: bool = True, disable_custom_init_weights: bool = False, flow_shift: float = 3.0, enable_gradient_checkpointing_type: str | None = None, transformer_override_safetensor: str | None = None)
Bases: ModelBase
Wan per-role model: owns transformer + noise_scheduler.
Source code in fastvideo/train/models/wan/wan.py
Functions¶
fastvideo.train.models.wan.wan_causal
¶
Wan causal model plugin (per-role instance, streaming/cache).
Classes¶
fastvideo.train.models.wan.wan_causal.WanCausalModel
¶WanCausalModel(*, init_from: str, training_config: TrainingConfig, trainable: bool = True, disable_custom_init_weights: bool = False, flow_shift: float = 3.0, enable_gradient_checkpointing_type: str | None = None, transformer_override_safetensor: str | None = None)
Bases: WanModel, CausalModelBase
Wan per-role model with causal/streaming primitives.