cosmos2_5_training_pipeline

Cosmos 2.5 training pipeline (text-to-world, full fine-tuning + LoRA).

Follows the same structure as wan_training_pipeline.py. Key Cosmos 2.5 specifics:

- Text embeddings are (B, seq, 100352) from Reason1 full-concat (28 layers × 3584 dim).
- forward() takes padding_mask (B, 1, H, W) and an optional fps int.
- Normalisation is handled by Cosmos25WanVAEWrapper (handles_latent_norm=True), so _normalize_dit_input is a no-op and stored latents are already normalised.
- Timesteps are sigma values in [0, 1] (flow matching); (B,) is auto-expanded inside the model to (B, 1).
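
The embedding width quoted above follows from concatenating the hidden states of every layer along the feature axis. A minimal sketch of that arithmetic and of the (B,) to (B, 1) timestep expansion, using plain Python lists in place of tensors; the helper names are hypothetical and not part of FastVideo:

```python
NUM_LAYERS = 28    # Reason1 layers concatenated, per the note above
HIDDEN_DIM = 3584  # hidden size per layer, per the note above


def full_concat_dim(num_layers: int, hidden_dim: int) -> int:
    """Feature width after concatenating every layer's hidden state."""
    return num_layers * hidden_dim


def expand_timesteps(sigmas: list) -> list:
    """Illustrative (B,) -> (B, 1) expansion; nested lists stand in for tensors."""
    return [[s] for s in sigmas]


embed_dim = full_concat_dim(NUM_LAYERS, HIDDEN_DIM)
expanded = expand_timesteps([0.1, 0.9])
print(embed_dim)  # 100352
print(expanded)   # [[0.1], [0.9]]
```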

Classes

fastvideo.training.cosmos2_5_training_pipeline.Cosmos25TrainingPipeline

Cosmos25TrainingPipeline(model_path: str, fastvideo_args: TrainingArgs, required_config_modules: list[str] | None = None, loaded_modules: dict[str, Module] | None = None)

Bases: TrainingPipeline

Training pipeline for Cosmos 2.5 (text-to-world).

Supports:

- Full fine-tuning (all transformer parameters)
- LoRA via the inherited LoRAPipeline mechanism (lora_param_names_mapping is set on Cosmos25Transformer3DModel)

Source code in fastvideo/training/training_pipeline.py
def __init__(self,
             model_path: str,
             fastvideo_args: TrainingArgs,
             required_config_modules: list[str] | None = None,
             loaded_modules: dict[str, torch.nn.Module] | None = None) -> None:
    fastvideo_args.inference_mode = False
    self.lora_training = fastvideo_args.lora_training
    if self.lora_training and fastvideo_args.lora_rank is None:
        raise ValueError("lora rank must be set when using lora training")

    set_random_seed(fastvideo_args.seed)  # for lora param init
    super().__init__(model_path, fastvideo_args, required_config_modules, loaded_modules)  # type: ignore
    self.tracker = DummyTracker()
    self.validation_ref_videos_logged = False
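
The constructor above forces training mode and rejects a LoRA run that has no rank before any module is loaded. The same checks can be sketched in isolation as follows; ToyTrainingArgs and prepare_for_training are hypothetical stand-ins for illustration, not FastVideo APIs:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyTrainingArgs:
    """Hypothetical stand-in for TrainingArgs with only the fields used here."""
    lora_training: bool = False
    lora_rank: Optional[int] = None
    inference_mode: bool = True


def prepare_for_training(args: ToyTrainingArgs) -> ToyTrainingArgs:
    """Mirror the constructor's checks: force training mode, require a LoRA rank."""
    args.inference_mode = False
    if args.lora_training and args.lora_rank is None:
        raise ValueError("lora rank must be set when using lora training")
    return args


# A LoRA run with an explicit rank passes the check.
ok = prepare_for_training(ToyTrainingArgs(lora_training=True, lora_rank=16))

# A LoRA run without a rank is rejected up front.
try:
    prepare_for_training(ToyTrainingArgs(lora_training=True))
    failed = False
except ValueError:
    failed = True
```

Failing before super().__init__() runs means a misconfigured job stops without first paying the cost of loading model weights.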

Functions

fastvideo.training.cosmos2_5_training_pipeline.Cosmos25TrainingPipeline.initialize_pipeline
initialize_pipeline(fastvideo_args: FastVideoArgs)

Create the flow-matching scheduler, reading the shift from the pipeline config (5.0 for Cosmos 2.5).

Source code in fastvideo/training/cosmos2_5_training_pipeline.py
def initialize_pipeline(self, fastvideo_args: FastVideoArgs):
    """Create the flow-matching scheduler, reading the shift from the pipeline config (5.0 for Cosmos 2.5)."""
    self.modules["scheduler"] = FlowUniPCMultistepScheduler(shift=fastvideo_args.pipeline_config.flow_shift)
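
To give a feel for what a shift of 5.0 does, the sketch below applies the time-shift formula commonly used by flow-matching schedulers. That FlowUniPCMultistepScheduler uses exactly this form is an assumption here, and shift_sigma is a hypothetical helper rather than a FastVideo function:

```python
def shift_sigma(sigma: float, shift: float) -> float:
    """Common flow-matching time shift for a sigma in [0, 1] (assumed form)."""
    return shift * sigma / (1.0 + (shift - 1.0) * sigma)


# The endpoints are preserved; interior values move toward 1 when shift > 1.
for s in (0.0, 0.25, 0.5, 1.0):
    print(s, round(shift_sigma(s, shift=5.0), 4))
```

Under this form, shift=5.0 maps the midpoint sigma 0.5 to about 0.833, so more of the schedule is spent at high noise levels, while shift=1.0 leaves sigma unchanged.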

fastvideo.training.cosmos2_5_training_pipeline.Cosmos25TrainingPipeline.initialize_validation_pipeline
initialize_validation_pipeline(training_args: TrainingArgs)

Build a full Cosmos2_5Pipeline that reuses the training transformer.

Source code in fastvideo/training/cosmos2_5_training_pipeline.py
def initialize_validation_pipeline(self, training_args: TrainingArgs):
    """Build a full Cosmos2_5Pipeline that reuses the training transformer."""
    logger.info("Initializing Cosmos 2.5 validation pipeline...")
    args_copy = deepcopy(training_args)
    args_copy.inference_mode = True

    validation_pipeline = Cosmos2_5Pipeline.from_pretrained(
        training_args.model_path,
        args=args_copy,
        inference_mode=True,
        loaded_modules={
            "transformer": self.get_module("transformer"),
        },
        tp_size=training_args.tp_size,
        sp_size=training_args.sp_size,
        num_gpus=training_args.num_gpus,
        pin_cpu_memory=training_args.pin_cpu_memory,
        dit_cpu_offload=True,
    )
    self.validation_pipeline = validation_pipeline
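
The method above deep-copies the arguments but passes the transformer by reference, so validation sees the current training weights without a second copy of the model. A minimal sketch of that pattern, with toy classes that are hypothetical stand-ins for the real TrainingArgs, transformer, and Cosmos2_5Pipeline:

```python
from copy import deepcopy


class ToyArgs:
    """Hypothetical stand-in for TrainingArgs."""
    def __init__(self) -> None:
        self.inference_mode = False


class ToyTransformer:
    """Hypothetical stand-in for the training transformer."""
    def __init__(self) -> None:
        self.weight = 1.0


class ToyValidationPipeline:
    """Hypothetical pipeline that accepts pre-loaded modules instead of loading its own."""
    def __init__(self, args: ToyArgs, loaded_modules: dict) -> None:
        self.args = args
        self.modules = dict(loaded_modules)  # new dict, same module objects


training_args = ToyArgs()
transformer = ToyTransformer()

# Independent copy: flipping inference_mode must not leak into training.
args_copy = deepcopy(training_args)
args_copy.inference_mode = True

pipeline = ToyValidationPipeline(args_copy, {"transformer": transformer})

# Simulate an optimizer step; the validation pipeline sees the update.
transformer.weight = 2.0
print(training_args.inference_mode)            # False
print(pipeline.modules["transformer"].weight)  # 2.0
```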

Functions