tests ¶

Modules¶

fastvideo.tests.api ¶

Modules¶

fastvideo.tests.api.test_compat_translation ¶

Tests for fastvideo.api.compat translation helpers covering the typed CompileConfig + PipelineSelection.vae_tiling surfaces promoted in PR 6.

Classes¶

fastvideo.tests.api.test_compat_translation.TestCompileConfigRoundTrip ¶

Reconstructing FastVideoArgs.torch_compile_kwargs from a typed CompileConfig drops typed fields that are None and merges extras.

fastvideo.tests.api.test_compat_translation.TestLegacyLtx2VaeTilingTranslation ¶

ltx2_vae_tiling flat kwarg promotes to generator.pipeline.vae_tiling; reverse direction emits the legacy name back to FastVideoArgs.

fastvideo.tests.api.test_compat_translation.TestLegacyTextEncoderCompileTranslation ¶

enable_torch_compile_text_encoder flat kwarg promotes to generator.engine.compile.text_encoder_enabled; reverse direction emits the legacy name back onto the FastVideoArgs kwargs dict so realtime-runtime consumers can read it before FastVideoArgs filters unknown fields.

fastvideo.tests.api.test_compat_translation.TestLegacyTorchCompileKwargsTranslation ¶

Legacy torch_compile_kwargs={...} gets split across the four first-class CompileConfig fields, and anything unknown falls into extras.
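
As a rough illustration of the split and the reverse reconstruction these tests exercise, the sketch below uses a stand-in dataclass. The four field names (backend, mode, fullgraph, dynamic) are borrowed from the keyword arguments of torch.compile and are an assumption; the real CompileConfig fields are not listed on this page, and both helper functions are hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import Any, Optional


@dataclass
class CompileConfigSketch:
    """Hypothetical stand-in for CompileConfig; field names are assumed."""
    backend: Optional[str] = None
    mode: Optional[str] = None
    fullgraph: Optional[bool] = None
    dynamic: Optional[bool] = None
    extras: dict = field(default_factory=dict)


# Names of the typed fields, derived from the dataclass itself.
_TYPED = tuple(f.name for f in fields(CompileConfigSketch) if f.name != "extras")


def split_legacy_kwargs(legacy: dict) -> CompileConfigSketch:
    """Route known keys to typed fields; everything else lands in extras."""
    typed = {k: v for k, v in legacy.items() if k in _TYPED}
    extras = {k: v for k, v in legacy.items() if k not in _TYPED}
    return CompileConfigSketch(**typed, extras=extras)


def to_legacy_kwargs(cfg: CompileConfigSketch) -> dict:
    """Rebuild the flat dict: drop None typed fields, then merge extras."""
    out: dict = {}
    for name in _TYPED:
        value: Any = getattr(cfg, name)
        if value is not None:
            out[name] = value
    out.update(cfg.extras)
    return out


legacy = {
    "mode": "max-autotune",
    "fullgraph": True,
    "options": {"triton.cudagraphs": True},  # not a typed field, goes to extras
}
cfg = split_legacy_kwargs(legacy)
round_tripped = to_legacy_kwargs(cfg)
```

Deriving the typed key set from the dataclass keeps the split and the reconstruction in agreement when a field is added later.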

fastvideo.tests.api.test_ltx2_continuation ¶

Tests for the typed LTX-2 continuation state.

Covers:

- round-trip through ContinuationState (inline and blob-backed)
- payload is JSON-serializable (Dynamo RPC / HTTP client constraint)
- kind / schema_version validation on deserialization
- compat-layer validation (known kinds, payload shape)
- round-trip through request_to_sampling_param attaches the state to the resulting SamplingParam without losing fidelity

Classes¶

fastvideo.tests.api.test_ltx2_continuation.TestBlobIndirection ¶

Large tensors live in the BlobStore instead of the payload.

Functions¶

fastvideo.tests.api.test_ltx2_continuation.TestBlobIndirection.test_blob_id_held_when_store_unavailable ¶

test_blob_id_held_when_store_unavailable()

Deserializing without a blob store preserves the blob id so the caller can fetch it later.

Source code in fastvideo/tests/api/test_ltx2_continuation.py

def test_blob_id_held_when_store_unavailable(self):
    """Deserializing without a blob store preserves the blob id so
    the caller can fetch it later."""
    blob_store = InMemoryBlobStore()
    envelope = _make_typed_state().to_continuation_state(
        blob_store=blob_store,
        inline_threshold_bytes=0,
    )
    blob_id_video = envelope.payload["video"]["blob_id"]
    blob_id_audio = envelope.payload["audio"]["blob_id"]

    restored = LTX2ContinuationState.from_continuation_state(envelope)
    assert restored.video_frames is None
    assert restored.video_frames_blob_id == blob_id_video
    assert restored.audio_latents is None
    assert restored.audio_latents_blob_id == blob_id_audio
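
The indirection this test checks can be sketched with the standard library alone. Everything below is a hypothetical stand-in: the store class, the "inline" key, and the hex encoding are assumptions for illustration. Only the "blob_id" payload key and the inline_threshold_bytes parameter name appear in the test above.

```python
import uuid


class InMemoryBlobStoreSketch:
    """Hypothetical blob store: maps generated ids to raw bytes."""

    def __init__(self):
        self._blobs = {}

    def put(self, data: bytes) -> str:
        blob_id = uuid.uuid4().hex
        self._blobs[blob_id] = data
        return blob_id

    def get(self, blob_id: str) -> bytes:
        return self._blobs[blob_id]


def encode_field(data: bytes, store=None, inline_threshold_bytes=1024) -> dict:
    """Keep small payloads inline; push larger ones to the store."""
    if store is not None and len(data) > inline_threshold_bytes:
        return {"blob_id": store.put(data)}
    return {"inline": data.hex()}


def decode_field(entry: dict, store=None):
    """Return (data, blob_id); data is None when the blob cannot be fetched yet."""
    if "inline" in entry:
        return bytes.fromhex(entry["inline"]), None
    blob_id = entry["blob_id"]
    if store is None:
        return None, blob_id  # hold the id so the caller can fetch it later
    return store.get(blob_id), blob_id


store = InMemoryBlobStoreSketch()
entry = encode_field(b"\x00" * 16, store=store, inline_threshold_bytes=0)
data, held_id = decode_field(entry)            # no store available
fetched, _ = decode_field(entry, store=store)  # store available
```

A threshold of 0 forces every non-empty field through the store, which is how the test above guarantees that both entries carry a blob id.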

fastvideo.tests.api.test_ltx2_continuation.TestCompatLayerWireUp ¶

The public compat layer accepts request.state without reverting to NotImplementedError and attaches it to the SamplingParam path.

fastvideo.tests.api.test_ltx2_continuation.TestRoundTrip ¶

Round-trip through ContinuationState preserves all fields.

Functions¶

fastvideo.tests.api.test_ltx2_continuation.TestRoundTrip.test_bf16_audio_latents_preserved ¶

test_bf16_audio_latents_preserved()

safetensors serialization must preserve bf16 dtype (numpy has no bf16, so a raw-bytes path would silently promote).

Source code in fastvideo/tests/api/test_ltx2_continuation.py

def test_bf16_audio_latents_preserved(self):
    """safetensors serialization must preserve bf16 dtype (numpy
    has no bf16, so a raw-bytes path would silently promote)."""
    state = LTX2ContinuationState(
        segment_index=0,
        audio_latents=torch.randn(1, 4, 16, 64, dtype=torch.bfloat16),
    )
    envelope = state.to_continuation_state()
    restored = LTX2ContinuationState.from_continuation_state(envelope)
    assert restored.audio_latents is not None
    assert restored.audio_latents.dtype == torch.bfloat16
    torch.testing.assert_close(
        restored.audio_latents, state.audio_latents)

fastvideo.tests.api.test_ltx2_continuation.TestValidation ¶

Invalid payloads error cleanly.
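
One way to picture "erroring cleanly" is a validator that raises a specific ValueError per failure mode rather than failing later with an unrelated exception. This is a minimal sketch: the kind name, the version number, and the envelope layout are assumptions, not the library's actual schema.

```python
import json

KNOWN_KINDS = {"ltx2"}        # hypothetical kind name
SUPPORTED_SCHEMA_VERSION = 1  # hypothetical version number


def validate_envelope(envelope: dict) -> dict:
    """Raise ValueError with a specific message for each malformed case."""
    kind = envelope.get("kind")
    if kind not in KNOWN_KINDS:
        raise ValueError(f"Unknown continuation kind: {kind!r}")
    version = envelope.get("schema_version")
    if version != SUPPORTED_SCHEMA_VERSION:
        raise ValueError(f"Unsupported schema_version: {version!r}")
    payload = envelope.get("payload")
    if not isinstance(payload, dict):
        raise ValueError("payload must be a dict")
    try:
        json.dumps(payload)  # the payload has to survive a JSON transport
    except TypeError as exc:
        raise ValueError(f"payload is not JSON-serializable: {exc}") from exc
    return envelope


def error_for(envelope: dict):
    """Return the validation error message, or None if the envelope is valid."""
    try:
        validate_envelope(envelope)
    except ValueError as exc:
        return str(exc)
    return None


good = {"kind": "ltx2", "schema_version": 1, "payload": {"segment_index": 0}}
bad_kind = {"kind": "bogus", "schema_version": 1, "payload": {}}
bad_version = {"kind": "ltx2", "schema_version": 99, "payload": {}}
bad_payload = {"kind": "ltx2", "schema_version": 1, "payload": {"x": b"raw"}}
```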

Modules¶

fastvideo.tests.api.test_ltx2_gpu_pool_translation ¶

gpu_pool-style flat-kwarg integration tests.

Mirrors the load_kwargs dict that the FastVideo-internal ui/ltx2-streaming/server/gpu_pool.py passes to VideoGenerator.from_pretrained(**load_kwargs) and asserts that the public typed GeneratorConfig surface (introduced across PRs 0-6) can represent it end-to-end, with no fields silently falling through to pipeline.experimental.

This is the parity guard PR 7.6 depends on: the public gpu_pool upstream must be able to construct a typed GeneratorConfig without knowing any legacy LTX-2 kwarg name, and downstream Dynamo (FastVideoArgGroup) must be able to do the same.

Classes¶

fastvideo.tests.api.test_ltx2_gpu_pool_translation.TestCompileExtrasPreserved ¶

Additional torch.compile kwargs beyond the four typed fields round-trip through CompileConfig.extras.

fastvideo.tests.api.test_ltx2_gpu_pool_translation.TestGpuPoolForwardTranslation ¶

gpu_pool flat kwargs -> typed GeneratorConfig.

Functions¶

fastvideo.tests.api.test_ltx2_gpu_pool_translation.TestGpuPoolForwardTranslation.test_no_experimental_leakage ¶

test_no_experimental_leakage(config) -> None

Every gpu_pool kwarg should have a typed home — nothing should silently fall through to pipeline.experimental.

Source code in fastvideo/tests/api/test_ltx2_gpu_pool_translation.py

def test_no_experimental_leakage(self, config) -> None:
    """Every gpu_pool kwarg should have a typed home — nothing should
    silently fall through to ``pipeline.experimental``."""
    assert config.pipeline.experimental == {}

fastvideo.tests.api.test_ltx2_gpu_pool_translation.TestGpuPoolReverseTranslation ¶

typed GeneratorConfig -> FastVideoArgs kwargs reproduces the original gpu_pool flat-kwarg shape.

This is what lets PR 7.6 wire the public gpu_pool through generator_config_to_fastvideo_args without the runtime noticing.

Functions¶

fastvideo.tests.api.test_ltx2_gpu_pool_translation.TestGpuPoolReverseTranslation.test_no_stray_refine_dict ¶

test_no_stray_refine_dict(args_kwargs) -> None

preset_overrides.refine must flatten to ltx2_refine_* kwargs rather than landing as a nested refine kwarg that FastVideoArgs doesn't understand.

Source code in fastvideo/tests/api/test_ltx2_gpu_pool_translation.py

def test_no_stray_refine_dict(self, args_kwargs) -> None:
    """preset_overrides.refine must flatten to ltx2_refine_* kwargs
    rather than landing as a nested ``refine`` kwarg that
    FastVideoArgs doesn't understand."""
    assert "refine" not in args_kwargs

fastvideo.tests.api.test_ltx2_gpu_pool_translation.TestRefineFlattenCoversAllTypedFields ¶

Every field on LTX2Refine{Preset,Stage}Override must survive the round-trip through preset_overrides.refine back to ltx2_refine_* kwargs. Guards against the hardcoded-key-tuple regression where image_crf / video_position_offset_sec silently dropped.
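
The hardcoded-key-tuple regression mentioned above is easy to avoid by enumerating dataclass fields at flatten time. The sketch below assumes a stand-in override class: image_crf and video_position_offset_sec are named in the text, while num_inference_steps, the field types, and the function name are illustrative assumptions.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class RefineOverrideSketch:
    """Hypothetical stand-in for the typed refine override."""
    num_inference_steps: Optional[int] = None  # illustrative field
    image_crf: Optional[int] = None
    video_position_offset_sec: Optional[float] = None


def flatten_refine(refine: RefineOverrideSketch, prefix: str = "ltx2_refine_") -> dict:
    """Emit one prefixed kwarg per set field.

    Fields are enumerated from the dataclass, so a newly added field
    cannot be silently dropped by a stale key list.
    """
    return {
        f"{prefix}{f.name}": getattr(refine, f.name)
        for f in fields(refine)
        if getattr(refine, f.name) is not None
    }


flat = flatten_refine(
    RefineOverrideSketch(image_crf=29, video_position_offset_sec=0.5)
)
```

The result contains only prefixed flat keys, so no nested refine entry can reach the legacy argument parser.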

fastvideo.tests.api.test_ltx2_stage_overrides ¶

Tests for typed LTX-2 stage override dataclasses.

Classes¶

fastvideo.tests.api.test_ltx2_stage_overrides.TestStageOverridesMirrorPresetSchema ¶

The ltx2_two_stage preset's refine stage schema must list exactly the LTX2RefineStageOverride field names.

fastvideo.tests.api.test_presets ¶

Classes¶

fastvideo.tests.api.test_presets.TestPresetCountIntegrity ¶

Functions¶

fastvideo.tests.api.test_presets.TestPresetCountIntegrity.test_total_preset_count ¶

test_total_preset_count() -> None

At least the baseline 37 presets from 13 families are registered.

Source code in fastvideo/tests/api/test_presets.py

def test_total_preset_count(self) -> None:
    """At least the baseline 37 presets from 13 families are registered."""
    import fastvideo.registry  # noqa: F401
    names = get_all_preset_names()
    assert len(names) >= 37

fastvideo.tests.api.test_presets.TestPresetDefaultTypes ¶

Preset default values must match the types on SamplingParam. Assigning None to a typed-str field (e.g. negative_prompt) breaks downstream stages that assert the runtime type; see the CFG branch in pipelines/stages/text_encoding.py:81.

Functions¶

fastvideo.tests.api.test_presets.TestPresetDefaultTypes.test_ltx2_cfg_defaults_are_off ¶

test_ltx2_cfg_defaults_are_off() -> None

SamplingParam's LTX-2 CFG class defaults must be 1.0 (CFG off). ForwardBatch.__post_init__ force-enables do_classifier_free_guidance when either ltx2_cfg_scale_video or ltx2_cfg_scale_audio is != 1.0, so any non-1.0 default silently forces CFG on for every model family that doesn't explicitly override these fields. Guard against the regression that surfaced as the TurboDiffusion I2V SSIM crash (text_encoding.py:81 assertion on negative_prompt).

Source code in fastvideo/tests/api/test_presets.py

def test_ltx2_cfg_defaults_are_off(self) -> None:
    """SamplingParam's LTX-2 CFG class defaults must be 1.0 (CFG
    off). ``ForwardBatch.__post_init__`` force-enables
    ``do_classifier_free_guidance`` when either
    ``ltx2_cfg_scale_video`` or ``ltx2_cfg_scale_audio`` is != 1.0,
    so any non-1.0 default silently forces CFG on for every model
    family that doesn't explicitly override these fields. Guard
    against the regression that surfaced as the TurboDiffusion I2V
    SSIM crash (``text_encoding.py:81`` assertion on
    ``negative_prompt``)."""
    from fastvideo.api.sampling_param import SamplingParam
    sp = SamplingParam()
    assert sp.ltx2_cfg_scale_video == 1.0
    assert sp.ltx2_cfg_scale_audio == 1.0
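
To see why a non-1.0 class default leaks into every model family, consider this minimal sketch of the force-enable behaviour the docstring describes. The class is a hypothetical stand-in, not the real ForwardBatch; only the three attribute names are taken from the text above.

```python
from dataclasses import dataclass


@dataclass
class BatchSketch:
    """Hypothetical stand-in for the __post_init__ behaviour described above."""
    ltx2_cfg_scale_video: float = 1.0
    ltx2_cfg_scale_audio: float = 1.0
    do_classifier_free_guidance: bool = False

    def __post_init__(self):
        # Any LTX-2 scale other than 1.0 switches CFG on, whatever the model family.
        if self.ltx2_cfg_scale_video != 1.0 or self.ltx2_cfg_scale_audio != 1.0:
            self.do_classifier_free_guidance = True


neutral = BatchSketch()                        # defaults of 1.0 leave CFG off
leaky = BatchSketch(ltx2_cfg_scale_video=3.0)  # a non-1.0 value forces CFG on
```

With CFG forced on, a pipeline that never set negative_prompt would reach the CFG branch anyway, which matches the crash the test guards against.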

fastvideo.tests.api.test_presets.TestWanPresets ¶

Verify the Wan presets registered from registry.py.

fastvideo.tests.api.test_schema_parity_inventory ¶

Modules¶

fastvideo.tests.conftest ¶

Functions¶

fastvideo.tests.conftest.distributed_setup ¶

distributed_setup()

Fixture to set up and tear down the distributed environment for tests.

This ensures proper cleanup even if tests fail.

Source code in fastvideo/tests/conftest.py

@pytest.fixture(scope="function")
def distributed_setup():
    """
    Fixture to set up and tear down the distributed environment for tests.

    This ensures proper cleanup even if tests fail.
    """
    torch.manual_seed(42)
    np.random.seed(42)
    maybe_init_distributed_environment_and_model_parallel(1, 1)
    yield

    cleanup_dist_env_and_memory()

fastvideo.tests.performance ¶

Modules¶

fastvideo.tests.performance.test_inference_performance ¶

Config-driven inference performance tests.

Benchmark configs live in .buildkite/performance-benchmarks/tests/*.json. Each JSON file defines model params, generation kwargs, run config, and per-device thresholds. This test module auto-discovers all configs and parametrizes a single test function over them.
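
The auto-discovery step can be pictured as follows. This is a sketch under stated assumptions: the function names and the layout of the thresholds table (a per-device entry with a "default" fallback) are hypothetical, since the JSON schema is not shown on this page. The keys benchmark_id and run_config, and the two threshold names, do appear in the test source.

```python
import json
import tempfile
from pathlib import Path


def discover_benchmark_configs(config_dir: Path) -> list:
    """Load every *.json file in config_dir, sorted for a stable test order."""
    configs = []
    for path in sorted(config_dir.glob("*.json")):
        with path.open() as f:
            configs.append(json.load(f))
    return configs


def pick_thresholds(cfg: dict, device_name: str) -> dict:
    """Hypothetical lookup: per-device entry if present, else the 'default' entry."""
    table = cfg["thresholds"]
    return table.get(device_name, table["default"])


# Demo with a throwaway directory standing in for the real config folder.
with tempfile.TemporaryDirectory() as tmp:
    sample = {
        "benchmark_id": "demo-model",
        "run_config": {"required_gpus": 1, "num_warmup_runs": 1},
        "thresholds": {
            "default": {"max_generation_time_s": 60.0, "max_peak_memory_mb": 40000},
            "NVIDIA L40S": {"max_generation_time_s": 45.0, "max_peak_memory_mb": 40000},
        },
    }
    (Path(tmp) / "demo.json").write_text(json.dumps(sample))
    configs = discover_benchmark_configs(Path(tmp))

ids = [c["benchmark_id"] for c in configs]  # usable as pytest parametrize ids
l40s = pick_thresholds(configs[0], "NVIDIA L40S")
other = pick_thresholds(configs[0], "Some Other GPU")
```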

Functions¶

fastvideo.tests.performance.test_inference_performance.test_inference_performance ¶

test_inference_performance(cfg)

Measure generation latency and peak GPU memory, assert against device-aware thresholds.

Source code in fastvideo/tests/performance/test_inference_performance.py

@pytest.mark.parametrize(
    "cfg",
    _BENCHMARK_CONFIGS,
    ids=[c["benchmark_id"] for c in _BENCHMARK_CONFIGS],
)
def test_inference_performance(cfg):
    """Measure generation latency and peak GPU memory,
    assert against device-aware thresholds."""
    run_config = cfg.get("run_config", {})
    required_gpus = run_config.get("required_gpus", 1)
    available = torch.cuda.device_count()
    if available < required_gpus:
        pytest.skip(f"Need {required_gpus} GPUs, only {available} available")

    model_info = cfg["model"]
    init_kwargs = dict(cfg.get("init_kwargs", {}))
    gen_kwargs = dict(cfg.get("generation_kwargs", {}))
    prompts = cfg.get("test_prompts", ["A cinematic video."])
    prompt = prompts[0]

    num_warmup = run_config.get("num_warmup_runs", 1)
    num_measure = run_config.get("num_measurement_runs", 3)
    thresholds = _get_thresholds(cfg)

    # Remap JSON keys to VideoGenerator kwargs
    text_enc_prec = init_kwargs.pop("text_encoder_precisions", None)
    if text_enc_prec is not None:
        init_kwargs["text_encoder_precisions"] = tuple(text_enc_prec)

    # Output directory for generated videos
    script_dir = os.path.dirname(os.path.abspath(__file__))
    output_dir = os.path.join(script_dir, "generated_videos",
                              cfg["benchmark_id"])
    os.makedirs(output_dir, exist_ok=True)
    gen_kwargs["output_path"] = output_dir

    generator = None
    try:
        generator = VideoGenerator.from_pretrained(
            model_path=model_info["model_path"],
            **init_kwargs,
        )

        for i in range(num_warmup):
            logger.info("Warmup run %d/%d", i + 1, num_warmup)
            _run_generation(generator, prompt, gen_kwargs)

        times = []
        peak_memories = []
        for i in range(num_measure):
            logger.info("Measurement run %d/%d", i + 1, num_measure)
            elapsed, peak_mb = _run_generation(generator, prompt, gen_kwargs)
            logger.info("  Time: %.2fs, Peak memory: %.0fMB", elapsed, peak_mb)
            times.append(elapsed)
            peak_memories.append(peak_mb)
    finally:
        _shutdown_executor(generator)

    avg_time = sum(times) / len(times)
    max_peak_memory = max(peak_memories)
    device_name = torch.cuda.get_device_name()

    results = {
        "benchmark_id": cfg["benchmark_id"],
        "model_short_name": model_info.get("model_short_name", ""),
        "device": device_name,
        "num_gpus": init_kwargs.get("num_gpus", 1),
        "num_warmup_runs": num_warmup,
        "num_measurement_runs": num_measure,
        "avg_generation_time_s": round(avg_time, 3),
        "individual_times_s": [round(t, 3) for t in times],
        "max_peak_memory_mb": round(max_peak_memory, 1),
        "individual_peak_memories_mb": [round(m, 1) for m in peak_memories],
        "thresholds": thresholds,
        "commit": os.environ.get("BUILDKITE_COMMIT", ""),
        "pr_number": os.environ.get("BUILDKITE_PULL_REQUEST", ""),
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }

    logger.info(
        "Performance results: avg_time=%.2fs, "
        "max_peak_memory=%.0fMB", avg_time, max_peak_memory)
    _write_results(results)

    max_time = thresholds["max_generation_time_s"]
    max_mem = thresholds["max_peak_memory_mb"]

    assert avg_time <= max_time, (
        f"Average generation time {avg_time:.2f}s exceeds "
        f"threshold {max_time:.1f}s for {device_name}")

    assert max_peak_memory <= max_mem, (
        f"Peak memory {max_peak_memory:.0f}MB exceeds "
        f"threshold {max_mem:.0f}MB for {device_name}")

fastvideo.tests.ssim ¶

Modules¶

fastvideo.tests.ssim.conftest ¶

Functions¶

fastvideo.tests.ssim.conftest.pytest_collection_modifyitems ¶

pytest_collection_modifyitems(config, items)

Optionally keep only tests with a matching model_id parameter.

Source code in fastvideo/tests/ssim/conftest.py

def pytest_collection_modifyitems(config, items):
    """Optionally keep only tests with a matching model_id parameter."""
    model_id = os.environ.get("FASTVIDEO_SSIM_MODEL_ID")
    if not model_id:
        return

    selected = []
    deselected = []
    for item in items:
        callspec = getattr(item, "callspec", None)
        if callspec is None:
            deselected.append(item)
            continue
        if callspec.params.get("model_id") == model_id:
            selected.append(item)
        else:
            deselected.append(item)

    if deselected:
        config.hook.pytest_deselected(items=deselected)
    items[:] = selected
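
The selection logic of the hook can be exercised without pytest by passing in lightweight stand-ins for collected items. The helper names and the fake items below are hypothetical; real pytest items carry far more state.

```python
from types import SimpleNamespace


def filter_by_model_id(items, model_id):
    """Split items into (selected, deselected) using the same rule as the hook."""
    if not model_id:
        return list(items), []
    selected, deselected = [], []
    for item in items:
        callspec = getattr(item, "callspec", None)
        if callspec is not None and callspec.params.get("model_id") == model_id:
            selected.append(item)
        else:
            # Unparametrized tests and non-matching model ids are both dropped.
            deselected.append(item)
    return selected, deselected


def fake_item(name, **params):
    """Hypothetical stand-in for a collected pytest item."""
    item = SimpleNamespace(name=name)
    if params:
        item.callspec = SimpleNamespace(params=params)
    return item


items = [
    fake_item("test_a[model-x]", model_id="model-x"),
    fake_item("test_a[model-y]", model_id="model-y"),
    fake_item("test_unparametrized"),
]
selected, deselected = filter_by_model_id(items, "model-x")
unfiltered, _ = filter_by_model_id(items, "")
```

Note that tests without a callspec are deselected whenever the filter is active, so setting FASTVIDEO_SSIM_MODEL_ID also hides unparametrized tests in this directory.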

fastvideo.tests.ssim.reference_videos_cli ¶

Functions¶

fastvideo.tests.ssim.reference_videos_cli.ensure_reference_videos_available ¶

ensure_reference_videos_available(*, local_dir: Path | None = None, repo_id: str | None = None, repo_type: str | None = None, quality_tier: str = DEFAULT_OUTPUT_QUALITY_TIER) -> bool

Return True if downloaded from HF, False if already present locally.

Source code in fastvideo/tests/ssim/reference_videos_cli.py

def ensure_reference_videos_available(
    *,
    local_dir: Path | None = None,
    repo_id: str | None = None,
    repo_type: str | None = None,
    quality_tier: str = DEFAULT_OUTPUT_QUALITY_TIER,
) -> bool:
    """Return True if downloaded from HF, False if already present locally."""
    if quality_tier not in QUALITY_TIERS:
        raise ValueError(f"Unsupported quality tier: {quality_tier}")
    target_dir = local_dir or _ssim_dir()
    lock_path = target_dir / ".reference_videos_download.lock"
    with _exclusive_download_lock(lock_path):
        if _has_local_reference_videos(target_dir, quality_tier):
            print(f"Reference videos ({quality_tier}) already available at {target_dir}")
            return False

        resolved_repo_id = repo_id or _default_repo_id()
        resolved_repo_type = repo_type or _default_repo_type()
        if not resolved_repo_id:
            raise RuntimeError(
                f"No local reference videos found and no HF repo configured.\nSet {HF_REPO_ENV_KEY} or pass --repo-id."
            )

        print(f"Repo ID: {resolved_repo_id}")
        print(f"Quality tier: {quality_tier}")
        print(f"No local {quality_tier} reference videos found under {target_dir}. Starting download...")
        try:
            download_reference_videos(
                repo_id=resolved_repo_id,
                repo_type=resolved_repo_type,
                local_dir=target_dir,
                quality_tiers=[quality_tier],
            )
            print(f"Download completed for {quality_tier} reference videos.")
        except Exception as exc:
            print(f"ERROR: Failed to download {quality_tier} reference videos from {resolved_repo_id}.")
            print(
                f"Suggested command to retry: "
                f"python fastvideo/tests/ssim/reference_videos_cli.py download "
                f"--quality-tier {quality_tier}"
            )
            raise

        if not _has_local_reference_videos(target_dir, quality_tier):
            raise RuntimeError(f"HF download completed but no {quality_tier} *_reference_videos content found.")
        return True
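
The function above follows a lock, check, download, re-check pattern so that concurrent test workers do not download the same files twice. A minimal sketch of that pattern is shown below. The lock helper here uses fcntl.flock (POSIX only) as an assumption; the body of the real _exclusive_download_lock is not shown on this page, and the file names and function names are illustrative.

```python
import fcntl
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def exclusive_lock(lock_path: Path):
    """Hold an exclusive advisory lock on lock_path for the duration of the block."""
    lock_path.parent.mkdir(parents=True, exist_ok=True)
    with open(lock_path, "w") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until other holders release
        try:
            yield
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


def ensure_available(target_dir: Path, download) -> bool:
    """Return True if download() ran, False if the content was already present."""
    with exclusive_lock(target_dir / ".download.lock"):
        marker = target_dir / "data.bin"  # illustrative file name
        if marker.exists():
            return False
        download(marker)
        if not marker.exists():
            raise RuntimeError("download completed but no content found")
        return True


def fetch(path: Path) -> None:
    """Stand-in for the real download call."""
    path.write_bytes(b"video")


with tempfile.TemporaryDirectory() as tmp:
    first = ensure_available(Path(tmp), fetch)   # nothing local yet: downloads
    second = ensure_available(Path(tmp), fetch)  # already present: skips
```

Performing the presence check inside the lock is what prevents two workers from both deciding that a download is needed.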

fastvideo.tests.ssim.test_gamecraft_similarity ¶

SSIM regression test for HunyuanGameCraft (T2V and I2V).

Generates a video with deterministic seed and camera trajectory, then compares against a device-specific reference video via MS-SSIM.

Reference videos must be pre-generated and stored under

reference_videos//_reference_videos/HunyuanGameCraft/ /

To create initial reference videos, run this test once and copy the generated videos into the appropriate reference folder.

Functions¶

fastvideo.tests.ssim.test_gamecraft_similarity.test_gamecraft_i2v_similarity ¶

test_gamecraft_i2v_similarity(prompt, ATTENTION_BACKEND, model_id)

Generate an I2V video with GameCraft and compare to reference via SSIM.

Source code in fastvideo/tests/ssim/test_gamecraft_similarity.py

@pytest.mark.parametrize("prompt", I2V_TEST_PROMPTS)
@pytest.mark.parametrize("ATTENTION_BACKEND", ["FLASH_ATTN"])
@pytest.mark.parametrize("model_id", list(I2V_MODEL_TO_PARAMS.keys()))
def test_gamecraft_i2v_similarity(prompt, ATTENTION_BACKEND, model_id):
    """Generate an I2V video with GameCraft and compare to reference via SSIM."""
    os.environ["FASTVIDEO_ATTENTION_BACKEND"] = ATTENTION_BACKEND

    script_dir = os.path.dirname(os.path.abspath(__file__))
    output_dir = build_generated_output_dir(
        script_dir,
        device_reference_folder,
        model_id,
        ATTENTION_BACKEND,
    )
    output_video_name = f"{prompt[:100].strip()}.mp4"

    os.makedirs(output_dir, exist_ok=True)

    params_map = select_ssim_params(
        I2V_MODEL_TO_PARAMS,
        FULL_QUALITY_I2V_MODEL_TO_PARAMS,
    )
    BASE_PARAMS = params_map[model_id]
    num_inference_steps = BASE_PARAMS["num_inference_steps"]

    # Build camera trajectory
    camera_states = _create_camera_trajectory(
        action=BASE_PARAMS["action"],
        height=BASE_PARAMS["height"],
        width=BASE_PARAMS["width"],
        num_frames=BASE_PARAMS["num_frames"],
        action_speed=BASE_PARAMS["action_speed"],
        dtype=torch.bfloat16,
    )

    init_kwargs = {
        "num_gpus": BASE_PARAMS["num_gpus"],
        "use_fsdp_inference": True,
        "dit_cpu_offload": True,
        "vae_cpu_offload": True,
        "text_encoder_cpu_offload": True,
        "pin_cpu_memory": True,
    }

    generation_kwargs = {
        "num_inference_steps": num_inference_steps,
        "output_path": output_dir,
        "image_path": BASE_PARAMS["image_path"],
        "height": BASE_PARAMS["height"],
        "width": BASE_PARAMS["width"],
        "num_frames": BASE_PARAMS["num_frames"],
        "guidance_scale": BASE_PARAMS["guidance_scale"],
        "seed": BASE_PARAMS["seed"],
        "fps": 24,
        "camera_states": camera_states,
        "negative_prompt": BASE_PARAMS.get("negative_prompt", ""),
        "save_video": True,
    }

    generator: VideoGenerator | None = None
    try:
        generator = VideoGenerator.from_pretrained(
            model_path=BASE_PARAMS["model_path"], **init_kwargs
        )
        generator.generate_video(prompt, **generation_kwargs)
    finally:
        _shutdown_executor(generator)

    assert os.path.exists(output_dir), (
        f"Output video was not generated at {output_dir}"
    )

    # Compare to reference
    reference_folder = build_reference_folder_path(
        script_dir,
        device_reference_folder,
        model_id,
        ATTENTION_BACKEND,
    )

    if not os.path.exists(reference_folder):
        logger.error("Reference folder missing")
        raise FileNotFoundError(
            f"Reference video folder does not exist: {reference_folder}"
        )

    reference_video_name = None
    for filename in os.listdir(reference_folder):
        if filename.endswith(".mp4") and prompt[:100].strip() in filename:
            reference_video_name = filename
            break

    if not reference_video_name:
        logger.error(
            f"Reference video not found for prompt: {prompt} "
            f"with backend: {ATTENTION_BACKEND}"
        )
        raise FileNotFoundError("Reference video missing")

    reference_video_path = os.path.join(reference_folder, reference_video_name)
    generated_video_path = os.path.join(output_dir, output_video_name)

    logger.info(
        f"Computing SSIM between {reference_video_path} and {generated_video_path}"
    )
    ssim_values = compute_video_ssim_torchvision(
        reference_video_path, generated_video_path, use_ms_ssim=True
    )

    mean_ssim = ssim_values[0]
    logger.info(f"SSIM mean value: {mean_ssim}")
    logger.info(f"Writing SSIM results to directory: {output_dir}")

    success = write_ssim_results(
        output_dir,
        ssim_values,
        reference_video_path,
        generated_video_path,
        num_inference_steps,
        prompt,
    )

    if not success:
        logger.error("Failed to write SSIM results to file")

    min_acceptable_ssim = 0.93
    assert mean_ssim >= min_acceptable_ssim, (
        f"SSIM value {mean_ssim} is below threshold {min_acceptable_ssim} "
        f"for {model_id} with backend {ATTENTION_BACKEND}"
    )

fastvideo.tests.ssim.test_gamecraft_similarity.test_gamecraft_t2v_similarity ¶

test_gamecraft_t2v_similarity(prompt, ATTENTION_BACKEND, model_id)

Generate a T2V video with GameCraft and compare to reference via SSIM.

Source code in fastvideo/tests/ssim/test_gamecraft_similarity.py

@pytest.mark.parametrize("prompt", TEST_PROMPTS)
@pytest.mark.parametrize("ATTENTION_BACKEND", ["FLASH_ATTN"])
@pytest.mark.parametrize("model_id", list(MODEL_TO_PARAMS.keys()))
def test_gamecraft_t2v_similarity(prompt, ATTENTION_BACKEND, model_id):
    """Generate a T2V video with GameCraft and compare to reference via SSIM."""
    os.environ["FASTVIDEO_ATTENTION_BACKEND"] = ATTENTION_BACKEND

    script_dir = os.path.dirname(os.path.abspath(__file__))
    output_dir = build_generated_output_dir(
        script_dir,
        device_reference_folder,
        model_id,
        ATTENTION_BACKEND,
    )
    output_video_name = f"{prompt[:100].strip()}.mp4"

    os.makedirs(output_dir, exist_ok=True)

    params_map = select_ssim_params(
        MODEL_TO_PARAMS,
        FULL_QUALITY_MODEL_TO_PARAMS,
    )
    BASE_PARAMS = params_map[model_id]
    num_inference_steps = BASE_PARAMS["num_inference_steps"]

    # Build camera trajectory
    camera_states = _create_camera_trajectory(
        action=BASE_PARAMS["action"],
        height=BASE_PARAMS["height"],
        width=BASE_PARAMS["width"],
        num_frames=BASE_PARAMS["num_frames"],
        action_speed=BASE_PARAMS["action_speed"],
        dtype=torch.bfloat16,
    )

    init_kwargs = {
        "num_gpus": BASE_PARAMS["num_gpus"],
        "use_fsdp_inference": True,
        "dit_cpu_offload": True,
        "vae_cpu_offload": True,
        "text_encoder_cpu_offload": True,
        "pin_cpu_memory": True,
    }

    generation_kwargs = {
        "num_inference_steps": num_inference_steps,
        "output_path": output_dir,
        "height": BASE_PARAMS["height"],
        "width": BASE_PARAMS["width"],
        "num_frames": BASE_PARAMS["num_frames"],
        "guidance_scale": BASE_PARAMS["guidance_scale"],
        "seed": BASE_PARAMS["seed"],
        "fps": 24,
        "camera_states": camera_states,
        "negative_prompt": BASE_PARAMS.get("negative_prompt", ""),
        "save_video": True,
    }

    generator: VideoGenerator | None = None
    try:
        generator = VideoGenerator.from_pretrained(
            model_path=BASE_PARAMS["model_path"], **init_kwargs
        )
        generator.generate_video(prompt, **generation_kwargs)
    finally:
        _shutdown_executor(generator)

    assert os.path.exists(output_dir), (
        f"Output video was not generated at {output_dir}"
    )

    # Compare to reference
    reference_folder = build_reference_folder_path(
        script_dir,
        device_reference_folder,
        model_id,
        ATTENTION_BACKEND,
    )

    if not os.path.exists(reference_folder):
        logger.error("Reference folder missing")
        raise FileNotFoundError(
            f"Reference video folder does not exist: {reference_folder}"
        )

    reference_video_name = None
    for filename in os.listdir(reference_folder):
        if filename.endswith(".mp4") and prompt[:100].strip() in filename:
            reference_video_name = filename
            break

    if not reference_video_name:
        logger.error(
            f"Reference video not found for prompt: {prompt} "
            f"with backend: {ATTENTION_BACKEND}"
        )
        raise FileNotFoundError("Reference video missing")

    reference_video_path = os.path.join(reference_folder, reference_video_name)
    generated_video_path = os.path.join(output_dir, output_video_name)

    logger.info(
        f"Computing SSIM between {reference_video_path} and {generated_video_path}"
    )
    ssim_values = compute_video_ssim_torchvision(
        reference_video_path, generated_video_path, use_ms_ssim=True
    )

    mean_ssim = ssim_values[0]
    logger.info(f"SSIM mean value: {mean_ssim}")
    logger.info(f"Writing SSIM results to directory: {output_dir}")

    success = write_ssim_results(
        output_dir,
        ssim_values,
        reference_video_path,
        generated_video_path,
        num_inference_steps,
        prompt,
    )

    if not success:
        logger.error("Failed to write SSIM results to file")

    min_acceptable_ssim = 0.93
    assert mean_ssim >= min_acceptable_ssim, (
        f"SSIM value {mean_ssim} is below threshold {min_acceptable_ssim} "
        f"for {model_id} with backend {ATTENTION_BACKEND}"
    )
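
Both GameCraft tests locate their reference video by matching the first 100 characters of the prompt against the file names in the reference folder. The sketch below isolates that lookup. The function name is hypothetical, and the sorted() call is an addition for deterministic results that the test source does not have.

```python
import os
import tempfile


def find_reference_video(reference_folder: str, prompt: str):
    """Return the first .mp4 whose name contains the prompt's first 100 characters."""
    key = prompt[:100].strip()
    for filename in sorted(os.listdir(reference_folder)):
        if filename.endswith(".mp4") and key in filename:
            return filename
    return None


with tempfile.TemporaryDirectory() as tmp:
    # Illustrative file names; real reference folders are device-specific.
    for name in ("A castle on a hill.mp4", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    found = find_reference_video(tmp, "A castle on a hill")
    missing = find_reference_video(tmp, "A river at dawn")
```

Because the match is a substring test, two prompts that share their first 100 characters would resolve to the same reference file.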

fastvideo.tests.ssim.test_gen3c_similarity ¶

SSIM regression test for GEN3C video generation.

Compares newly generated GEN3C videos against device-specific reference videos using MS-SSIM to detect quality regressions across code changes.

Usage:

Requires 1+ GPU and reference videos.

pytest fastvideo/tests/ssim/test_gen3c_similarity.py -v

Environment variables

GEN3C_MODEL_PATH - Diffusers-format GEN3C model path/repo id. Default: FastVideo/GEN3C-Cosmos-7B-Diffusers (local converted path also supported)

Functions¶

fastvideo.tests.ssim.test_gen3c_similarity.test_gen3c_inference_similarity ¶

test_gen3c_inference_similarity(prompt, ATTENTION_BACKEND, model_id)

Generate a GEN3C video and compare against the reference using MS-SSIM.

Source code in fastvideo/tests/ssim/test_gen3c_similarity.py

@pytest.mark.skipif(
    device_reference_folder is None,
    reason=f"No reference videos for device {device_name}",
)
@pytest.mark.parametrize("prompt", TEST_PROMPTS)
@pytest.mark.parametrize("ATTENTION_BACKEND", ["TORCH_SDPA"])
@pytest.mark.parametrize("model_id", list(MODEL_TO_PARAMS.keys()))
def test_gen3c_inference_similarity(prompt, ATTENTION_BACKEND, model_id):
    """
    Generate a GEN3C video and compare against the reference using MS-SSIM.
    """
    os.environ["FASTVIDEO_ATTENTION_BACKEND"] = ATTENTION_BACKEND

    script_dir = os.path.dirname(os.path.abspath(__file__))
    base_output_dir = os.path.join(script_dir, "generated_videos", model_id)
    output_dir = os.path.join(base_output_dir, ATTENTION_BACKEND)
    output_video_name = CANDIDATE_VIDEO_NAME
    os.makedirs(output_dir, exist_ok=True)

    BASE_PARAMS = MODEL_TO_PARAMS[model_id]
    num_inference_steps = BASE_PARAMS["num_inference_steps"]
    model_path = BASE_PARAMS["model_path"]

    # Guard common misconfigurations to keep CI behavior explicit.
    if model_path.lower() == "nvidia/gen3c-cosmos-7b":
        pytest.skip(
            "nvidia/GEN3C-Cosmos-7B is the official raw checkpoint repo, not Diffusers format. "
            "Use GEN3C_MODEL_PATH=FastVideo/GEN3C-Cosmos-7B-Diffusers or a local converted path."
        )

    local_like = model_path.startswith(("/", "./", "../"))
    if local_like and not os.path.exists(model_path):
        pytest.skip(
            f"Local GEN3C model path not found: {model_path}. "
            "Set GEN3C_MODEL_PATH to a valid local path or HF Diffusers repo id."
        )

    if os.path.exists(model_path):
        model_index_path = os.path.join(model_path, "model_index.json")
        if not os.path.exists(model_index_path):
            pytest.skip(
                f"GEN3C_MODEL_PATH is not Diffusers-format (missing model_index.json): {model_path}"
            )

    init_kwargs = {
        "num_gpus": BASE_PARAMS["num_gpus"],
        "sp_size": BASE_PARAMS["sp_size"],
        "tp_size": BASE_PARAMS["tp_size"],
    }
    if "flow_shift" in BASE_PARAMS:
        init_kwargs["flow_shift"] = BASE_PARAMS["flow_shift"]

    generation_kwargs = {
        "num_inference_steps": num_inference_steps,
        "output_path": os.path.join(output_dir, output_video_name),
        "height": BASE_PARAMS["height"],
        "width": BASE_PARAMS["width"],
        "num_frames": BASE_PARAMS["num_frames"],
        "guidance_scale": BASE_PARAMS["guidance_scale"],
        "embedded_cfg_scale": BASE_PARAMS["embedded_cfg_scale"],
        "seed": BASE_PARAMS["seed"],
        "image_path": BASE_PARAMS["image_path"],
        "fps": BASE_PARAMS["fps"],
    }

    if not os.path.exists(generation_kwargs["image_path"]):
        pytest.skip(
            f"GEN3C test image not found: {generation_kwargs['image_path']}. "
            "Set GEN3C_TEST_IMAGE_PATH to a valid local image."
        )

    # Keep local reruns deterministic: remove prior candidate outputs so
    # VideoGenerator does not auto-suffix (_1, _2, ...).
    stale_pattern = os.path.join(output_dir, "gen3c_ssim_candidate*.mp4")
    for stale_video in glob.glob(stale_pattern):
        os.remove(stale_video)

    generator = VideoGenerator.from_pretrained(
        model_path=model_path, **init_kwargs
    )
    generator.generate_video(prompt, **generation_kwargs)

    if isinstance(generator.executor, MultiprocExecutor):
        generator.executor.shutdown()

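    # Check the generated file itself; output_dir always exists after makedirs.
    assert os.path.exists(os.path.join(output_dir, output_video_name)), (
        f"Output not generated at {os.path.join(output_dir, output_video_name)}"
    )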

    reference_folder = os.path.join(
        script_dir, device_reference_folder, model_id, ATTENTION_BACKEND
    )
    if not os.path.exists(reference_folder):
        raise FileNotFoundError(
            f"Reference video folder does not exist: {reference_folder}"
        )

    reference_video_path = os.path.join(reference_folder, BASELINE_VIDEO_NAME)
    if not os.path.exists(reference_video_path):
        raise FileNotFoundError(
            f"Reference video not found: {reference_video_path}"
        )

    generated_video_path = os.path.join(output_dir, output_video_name)

    logger.info(f"Computing SSIM: {reference_video_path} vs {generated_video_path}")
    ssim_values = compute_video_ssim_torchvision(
        reference_video_path, generated_video_path, use_ms_ssim=True
    )

    mean_ssim = ssim_values[0]
    logger.info(f"GEN3C SSIM mean: {mean_ssim}")

    write_ssim_results(
        output_dir,
        ssim_values,
        reference_video_path,
        generated_video_path,
        num_inference_steps,
        prompt,
    )

    # GEN3C SSIM threshold for stable L40S reference comparisons.
    min_acceptable_ssim = 0.93
    assert mean_ssim >= min_acceptable_ssim, (
        f"SSIM {mean_ssim:.4f} < {min_acceptable_ssim} for {model_id} / {ATTENTION_BACKEND}"
    )

fastvideo.tests.ssim.test_lingbot_similarity ¶

SSIM-based similarity test for LingBotWorld I2V with camera control.

Camera trajectory is loaded from LingBot example npy files (poses.npy/intrinsics.npy), matching the official script workflow.

Note: num_inference_steps is reduced to 4 for faster CI.

Classes¶

Functions¶

fastvideo.tests.ssim.test_longcat_similarity ¶

SSIM-based similarity tests for LongCat video generation.

Tests three LongCat modes:

- T2V (Text-to-Video): 480p video from text prompt
- I2V (Image-to-Video): 480p video from image + text prompt
- VC (Video Continuation): 480p video continuation from input video + text prompt

Sampling parameters are derived from:

- examples/inference/basic/basic_longcat_t2v.py
- examples/inference/basic/basic_longcat_i2v.py
- examples/inference/basic/basic_longcat_vc.py

Note: num_inference_steps is reduced for CI speed (4 steps vs 50 in examples).

Classes¶

Functions¶

fastvideo.tests.ssim.test_longcat_similarity.test_longcat_i2v_similarity ¶

test_longcat_i2v_similarity(prompt: str, ATTENTION_BACKEND: str)

Test LongCat I2V inference and compare output to reference videos using SSIM.

Parameters derived from examples/inference/basic/basic_longcat_i2v.py
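The test names its output after the first 100 characters of the prompt and then looks for a reference `.mp4` whose filename contains the same prefix. A minimal sketch of that lookup follows. `find_reference_video` is a hypothetical helper name, the temporary folder stands in for the real reference directory, and the `sorted` call is an addition here to make the match order deterministic.

```python
import os
import tempfile


def find_reference_video(reference_folder: str, prompt: str):
    """Return the first .mp4 whose name contains the truncated prompt, else None."""
    key = prompt[:100].strip()
    for filename in sorted(os.listdir(reference_folder)):
        if filename.endswith(".mp4") and key in filename:
            return filename
    return None


with tempfile.TemporaryDirectory() as folder:
    # Create empty placeholder files instead of real videos.
    for name in ("A cat walking on grass.mp4", "notes.txt"):
        open(os.path.join(folder, name), "w").close()

    found = find_reference_video(folder, "A cat walking on grass")
    missing = find_reference_video(folder, "A dog running")
    print(found, missing)
```

Because the match is a substring test, two prompts that share their first 100 characters would resolve to the same reference file.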

Source code in fastvideo/tests/ssim/test_longcat_similarity.py

@pytest.mark.parametrize("prompt", I2V_TEST_PROMPTS)
@pytest.mark.parametrize("ATTENTION_BACKEND", ["FLASH_ATTN"])
def test_longcat_i2v_similarity(prompt: str, ATTENTION_BACKEND: str):
    """
    Test LongCat I2V inference and compare output to reference videos using SSIM.

    Parameters derived from examples/inference/basic/basic_longcat_i2v.py
    """
    os.environ["FASTVIDEO_ATTENTION_BACKEND"] = ATTENTION_BACKEND

    params = select_ssim_params(LONGCAT_I2V_PARAMS, LONGCAT_I2V_FULL_QUALITY_PARAMS)

    script_dir = os.path.dirname(os.path.abspath(__file__))
    model_id = "LongCat-Video-I2V"

    output_dir = build_generated_output_dir(
        script_dir,
        device_reference_folder,
        model_id,
        ATTENTION_BACKEND,
    )
    output_video_name = f"{prompt[:100].strip()}.mp4"
    os.makedirs(output_dir, exist_ok=True)

    # Get image path for this prompt
    prompt_idx = I2V_TEST_PROMPTS.index(prompt)
    image_path = _resolve_asset_path(I2V_IMAGE_PATHS[prompt_idx])

    init_kwargs = {
        "num_gpus": params["num_gpus"],
        "use_fsdp_inference": True,
        "dit_cpu_offload": True,
        "vae_cpu_offload": True,
        "text_encoder_cpu_offload": True,
        "enable_bsa": False,
    }

    generation_kwargs = {
        "output_path": output_dir,
        "image_path": image_path,
        "height": params["height"],
        "width": params["width"],
        "num_frames": params["num_frames"],
        "num_inference_steps": params["num_inference_steps"],
        "guidance_scale": params["guidance_scale"],
        "fps": params["fps"],
        "seed": params["seed"],
        "negative_prompt": params["negative_prompt"],
    }

    generator = VideoGenerator.from_pretrained(
        model_path=params["model_path"], **init_kwargs
    )
    generator.generate_video(prompt, **generation_kwargs)
    generator.shutdown()

    generated_video_path = os.path.join(output_dir, output_video_name)
    assert os.path.exists(generated_video_path), (
        f"Output video was not generated at {generated_video_path}"
    )

    # Find reference video
    reference_folder = build_reference_folder_path(
        script_dir,
        device_reference_folder,
        model_id,
        ATTENTION_BACKEND,
    )
    if not os.path.exists(reference_folder):
        raise FileNotFoundError(
            f"Reference video folder does not exist: {reference_folder}"
        )

    reference_video_name = None
    for filename in os.listdir(reference_folder):
        if filename.endswith(".mp4") and prompt[:100].strip() in filename:
            reference_video_name = filename
            break

    if not reference_video_name:
        raise FileNotFoundError(
            f"Reference video not found for prompt: {prompt[:50]}... with backend: {ATTENTION_BACKEND}"
        )

    reference_video_path = os.path.join(reference_folder, reference_video_name)

    logger.info(f"Computing SSIM between {reference_video_path} and {generated_video_path}")
    ssim_values = compute_video_ssim_torchvision(
        reference_video_path, generated_video_path, use_ms_ssim=True
    )

    mean_ssim = ssim_values[0]
    logger.info(f"SSIM mean value: {mean_ssim}")

    write_ssim_results(
        output_dir, ssim_values, reference_video_path, generated_video_path,
        params["num_inference_steps"], prompt
    )

    min_acceptable_ssim = 0.90
    assert mean_ssim >= min_acceptable_ssim, (
        f"SSIM value {mean_ssim} is below threshold {min_acceptable_ssim} "
        f"for {model_id} with backend {ATTENTION_BACKEND}"
    )

fastvideo.tests.ssim.test_longcat_similarity.test_longcat_t2v_similarity ¶

test_longcat_t2v_similarity(prompt: str, ATTENTION_BACKEND: str)

Test LongCat T2V inference and compare output to reference videos using SSIM.

Parameters derived from examples/inference/basic/basic_longcat_t2v.py

Source code in fastvideo/tests/ssim/test_longcat_similarity.py

@pytest.mark.parametrize("prompt", T2V_TEST_PROMPTS)
@pytest.mark.parametrize("ATTENTION_BACKEND", ["FLASH_ATTN"])
def test_longcat_t2v_similarity(prompt: str, ATTENTION_BACKEND: str):
    """
    Test LongCat T2V inference and compare output to reference videos using SSIM.

    Parameters derived from examples/inference/basic/basic_longcat_t2v.py
    """
    os.environ["FASTVIDEO_ATTENTION_BACKEND"] = ATTENTION_BACKEND

    params = select_ssim_params(LONGCAT_T2V_PARAMS, LONGCAT_T2V_FULL_QUALITY_PARAMS)

    script_dir = os.path.dirname(os.path.abspath(__file__))
    model_id = "LongCat-Video-T2V"

    output_dir = build_generated_output_dir(
        script_dir,
        device_reference_folder,
        model_id,
        ATTENTION_BACKEND,
    )
    output_video_name = f"{prompt[:100].strip()}.mp4"
    os.makedirs(output_dir, exist_ok=True)

    init_kwargs = {
        "num_gpus": params["num_gpus"],
        "use_fsdp_inference": True,
        "dit_cpu_offload": True,
        "vae_cpu_offload": True,
        "text_encoder_cpu_offload": True,
        "enable_bsa": False,
    }

    generation_kwargs = {
        "output_path": output_dir,
        "height": params["height"],
        "width": params["width"],
        "num_frames": params["num_frames"],
        "num_inference_steps": params["num_inference_steps"],
        "guidance_scale": params["guidance_scale"],
        "fps": params["fps"],
        "seed": params["seed"],
        "negative_prompt": params["negative_prompt"],
    }

    generator = VideoGenerator.from_pretrained(
        model_path=params["model_path"], **init_kwargs
    )
    generator.generate_video(prompt, **generation_kwargs)
    generator.shutdown()

    generated_video_path = os.path.join(output_dir, output_video_name)
    assert os.path.exists(generated_video_path), (
        f"Output video was not generated at {generated_video_path}"
    )

    # Find reference video
    reference_folder = build_reference_folder_path(
        script_dir,
        device_reference_folder,
        model_id,
        ATTENTION_BACKEND,
    )
    if not os.path.exists(reference_folder):
        raise FileNotFoundError(
            f"Reference video folder does not exist: {reference_folder}"
        )

    reference_video_name = None
    for filename in os.listdir(reference_folder):
        if filename.endswith(".mp4") and prompt[:100].strip() in filename:
            reference_video_name = filename
            break

    if not reference_video_name:
        raise FileNotFoundError(
            f"Reference video not found for prompt: {prompt[:50]}... with backend: {ATTENTION_BACKEND}"
        )

    reference_video_path = os.path.join(reference_folder, reference_video_name)

    logger.info(f"Computing SSIM between {reference_video_path} and {generated_video_path}")
    ssim_values = compute_video_ssim_torchvision(
        reference_video_path, generated_video_path, use_ms_ssim=True
    )

    mean_ssim = ssim_values[0]
    logger.info(f"SSIM mean value: {mean_ssim}")

    write_ssim_results(
        output_dir, ssim_values, reference_video_path, generated_video_path,
        params["num_inference_steps"], prompt
    )

    min_acceptable_ssim = 0.90
    assert mean_ssim >= min_acceptable_ssim, (
        f"SSIM value {mean_ssim} is below threshold {min_acceptable_ssim} "
        f"for {model_id} with backend {ATTENTION_BACKEND}"
    )

fastvideo.tests.ssim.test_longcat_similarity.test_longcat_vc_similarity ¶

test_longcat_vc_similarity(prompt: str, ATTENTION_BACKEND: str)

Test LongCat VC (Video Continuation) inference and compare output to reference videos using SSIM.

Parameters derived from examples/inference/basic/basic_longcat_vc.py

Source code in fastvideo/tests/ssim/test_longcat_similarity.py

@pytest.mark.parametrize("prompt", VC_TEST_PROMPTS)
@pytest.mark.parametrize("ATTENTION_BACKEND", ["FLASH_ATTN"])
def test_longcat_vc_similarity(prompt: str, ATTENTION_BACKEND: str):
    """
    Test LongCat VC (Video Continuation) inference and compare output to reference videos using SSIM.

    Parameters derived from examples/inference/basic/basic_longcat_vc.py
    """
    os.environ["FASTVIDEO_ATTENTION_BACKEND"] = ATTENTION_BACKEND

    params = select_ssim_params(LONGCAT_VC_PARAMS, LONGCAT_VC_FULL_QUALITY_PARAMS)

    script_dir = os.path.dirname(os.path.abspath(__file__))
    model_id = "LongCat-Video-VC"

    output_dir = build_generated_output_dir(
        script_dir,
        device_reference_folder,
        model_id,
        ATTENTION_BACKEND,
    )
    output_video_name = f"{prompt[:100].strip()}.mp4"
    os.makedirs(output_dir, exist_ok=True)

    # Get video path for this prompt
    prompt_idx = VC_TEST_PROMPTS.index(prompt)
    video_path = _resolve_asset_path(VC_VIDEO_PATHS[prompt_idx])

    if not os.path.exists(video_path):
        pytest.skip(f"Input video not found at {video_path}")

    init_kwargs = {
        "num_gpus": params["num_gpus"],
        "use_fsdp_inference": False,
        "dit_cpu_offload": False,
        "vae_cpu_offload": True,
        "text_encoder_cpu_offload": True,
        "pin_cpu_memory": False,
        "enable_bsa": False,
    }

    generation_kwargs = {
        "output_path": output_dir,
        "video_path": video_path,
        "num_cond_frames": params["num_cond_frames"],
        "height": params["height"],
        "width": params["width"],
        "num_frames": params["num_frames"],
        "num_inference_steps": params["num_inference_steps"],
        "guidance_scale": params["guidance_scale"],
        "fps": params["fps"],
        "seed": params["seed"],
        "negative_prompt": params["negative_prompt"],
    }

    generator = VideoGenerator.from_pretrained(
        model_path=params["model_path"], **init_kwargs
    )
    generator.generate_video(prompt, **generation_kwargs)
    generator.shutdown()

    generated_video_path = os.path.join(output_dir, output_video_name)
    assert os.path.exists(generated_video_path), (
        f"Output video was not generated at {generated_video_path}"
    )

    # Find reference video
    reference_folder = build_reference_folder_path(
        script_dir,
        device_reference_folder,
        model_id,
        ATTENTION_BACKEND,
    )
    if not os.path.exists(reference_folder):
        raise FileNotFoundError(
            f"Reference video folder does not exist: {reference_folder}"
        )

    reference_video_name = None
    for filename in os.listdir(reference_folder):
        if filename.endswith(".mp4") and prompt[:100].strip() in filename:
            reference_video_name = filename
            break

    if not reference_video_name:
        raise FileNotFoundError(
            f"Reference video not found for prompt: {prompt[:50]}... with backend: {ATTENTION_BACKEND}"
        )

    reference_video_path = os.path.join(reference_folder, reference_video_name)

    logger.info(f"Computing SSIM between {reference_video_path} and {generated_video_path}")
    ssim_values = compute_video_ssim_torchvision(
        reference_video_path, generated_video_path, use_ms_ssim=True
    )

    mean_ssim = ssim_values[0]
    logger.info(f"SSIM mean value: {mean_ssim}")

    write_ssim_results(
        output_dir, ssim_values, reference_video_path, generated_video_path,
        params["num_inference_steps"], prompt
    )

    min_acceptable_ssim = 0.90
    assert mean_ssim >= min_acceptable_ssim, (
        f"SSIM value {mean_ssim} is below threshold {min_acceptable_ssim} "
        f"for {model_id} with backend {ATTENTION_BACKEND}"
    )

fastvideo.tests.ssim.test_ltx2_similarity ¶

SSIM-based similarity test for LTX-2 distilled text-to-video.

Parameters are derived from examples/inference/basic/basic_ltx2_distilled.py, with resolution and num_inference_steps reduced to keep GPU CI runtime bounded. The full-quality variant (enabled via --ssim-full-quality) falls back to the ltx2_distilled preset defaults.
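The split between a reduced CI configuration and a full-quality one could be sketched roughly as follows. This is a hypothetical illustration: `pick_params`, its `full_quality` argument, and the example values are assumptions, and the sketch does not show how the test suite actually resolves `--ssim-full-quality`.

```python
from typing import Any


def pick_params(
    ci_params: dict[str, Any],
    full_quality_params: dict[str, Any],
    full_quality: bool = False,
) -> dict[str, Any]:
    """Return a copy of the parameter set to run (hypothetical helper)."""
    chosen = full_quality_params if full_quality else ci_params
    # Copy so callers can tweak values without mutating the shared preset.
    return dict(chosen)


# Illustrative values only; the real presets live in the test module.
CI_PARAMS = {"height": 256, "width": 384, "num_inference_steps": 4}
FULL_QUALITY_PARAMS = {"height": 512, "width": 768, "num_inference_steps": 8}

params = pick_params(CI_PARAMS, FULL_QUALITY_PARAMS, full_quality=False)
print(params["num_inference_steps"])
```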

Classes¶

Functions¶

fastvideo.tests.ssim.test_matrixgame_similarity ¶

Classes¶

Functions¶

fastvideo.tests.ssim.test_matrixgame_similarity.test_matrixgame_similarity ¶

test_matrixgame_similarity(prompt, ATTENTION_BACKEND, model_id)

Test that runs inference with different parameters and compares the output to reference videos using SSIM.

Source code in fastvideo/tests/ssim/test_matrixgame_similarity.py

@pytest.mark.parametrize("prompt", TEST_PROMPTS)
@pytest.mark.parametrize("ATTENTION_BACKEND", ["FLASH_ATTN"])
@pytest.mark.parametrize("model_id", list(MODEL_TO_PARAMS.keys()))
def test_matrixgame_similarity(prompt, ATTENTION_BACKEND, model_id):
    """
    Test that runs inference with different parameters and compares the output
    to reference videos using SSIM.
    """
    os.environ["FASTVIDEO_ATTENTION_BACKEND"] = ATTENTION_BACKEND

    script_dir = os.path.dirname(os.path.abspath(__file__))

    output_dir = build_generated_output_dir(
        script_dir,
        device_reference_folder,
        model_id,
        ATTENTION_BACKEND,
    )
    output_video_name = "output.mp4"

    os.makedirs(output_dir, exist_ok=True)

    params_map = select_ssim_params(
        MODEL_TO_PARAMS,
        FULL_QUALITY_MODEL_TO_PARAMS,
    )
    BASE_PARAMS = params_map[model_id]
    num_inference_steps = BASE_PARAMS["num_inference_steps"]

    # Create action conditions for MatrixGame
    actions = create_action_presets(
        BASE_PARAMS["num_frames"], keyboard_dim=BASE_PARAMS["keyboard_dim"], seed=BASE_PARAMS["seed"]
    )
    latent_frames = (BASE_PARAMS["num_frames"] - 1) // 4 + 1
    grid_sizes = torch.tensor([latent_frames, 44, 80])

    init_kwargs = {
        "num_gpus": BASE_PARAMS["num_gpus"],
        "use_fsdp_inference": True,
        "dit_layerwise_offload": False,
        "dit_cpu_offload": False,
        "vae_cpu_offload": False,
        "text_encoder_cpu_offload": True,
        "pin_cpu_memory": True,
    }

    generation_kwargs = {
        "num_inference_steps": num_inference_steps,
        "output_path": output_dir,
        "image_path": TEST_IMAGE_PATHS[0],
        "height": BASE_PARAMS["height"],
        "width": BASE_PARAMS["width"],
        "num_frames": BASE_PARAMS["num_frames"],
        "seed": BASE_PARAMS["seed"],
        "mouse_cond": actions["mouse"].unsqueeze(0),
        "keyboard_cond": actions["keyboard"].unsqueeze(0),
        "grid_sizes": grid_sizes,
        "save_video": True,
    }

    generator = VideoGenerator.from_pretrained(model_path=BASE_PARAMS["model_path"], **init_kwargs)
    generator.generate_video(prompt, **generation_kwargs)

    if isinstance(generator.executor, MultiprocExecutor):
        generator.executor.shutdown()

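    # Check the generated file itself; output_dir always exists after makedirs.
    assert os.path.exists(os.path.join(output_dir, output_video_name)), (
        f"Output video was not generated at {os.path.join(output_dir, output_video_name)}"
    )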

    reference_folder = build_reference_folder_path(
        script_dir,
        device_reference_folder,
        model_id,
        ATTENTION_BACKEND,
    )

    if not os.path.exists(reference_folder):
        logger.error("Reference folder missing")
        raise FileNotFoundError(f"Reference video folder does not exist: {reference_folder}")

    # Use the first .mp4 in the reference folder; the filename is not matched against the prompt
    reference_video_name = None

    for filename in os.listdir(reference_folder):
        if filename.endswith(".mp4"):
            reference_video_name = filename
            break

    if not reference_video_name:
        logger.error(f"Reference video not found for model: {model_id} with backend: {ATTENTION_BACKEND}")
        raise FileNotFoundError("Reference video missing")

    reference_video_path = os.path.join(reference_folder, reference_video_name)
    generated_video_path = os.path.join(output_dir, output_video_name)

    logger.info(f"Computing SSIM between {reference_video_path} and {generated_video_path}")
    ssim_values = compute_video_ssim_torchvision(reference_video_path, generated_video_path, use_ms_ssim=True)

    mean_ssim = ssim_values[0]
    logger.info(f"SSIM mean value: {mean_ssim}")
    logger.info(f"Writing SSIM results to directory: {output_dir}")

    success = write_ssim_results(
        output_dir,
        ssim_values,
        reference_video_path,
        generated_video_path,
        num_inference_steps,
        prompt,
    )

    if not success:
        logger.error("Failed to write SSIM results to file")

    min_acceptable_ssim = 0.98
    assert mean_ssim >= min_acceptable_ssim, (
        f"SSIM value {mean_ssim} is below threshold {min_acceptable_ssim} for {model_id} with backend {ATTENTION_BACKEND}"
    )

fastvideo.tests.utils ¶

Functions¶

fastvideo.tests.utils.compare_folders ¶

compare_folders(reference_folder, generated_folder, use_ms_ssim=True)

Compare videos with the same filename between reference_folder and generated_folder

Example usage:
    results = compare_folders(reference_folder, generated_folder,
                          args.use_ms_ssim)
    for video_name, ssim_value in results.items():
        if ssim_value is not None:
            print(
                f"{video_name}: {ssim_value[0]:.4f}, Min SSIM: {ssim_value[1]:.4f}, Max SSIM: {ssim_value[2]:.4f}"
            )
        else:
            print(f"{video_name}: Error during comparison")

    valid_ssims = [v for v in results.values() if v is not None]
    if valid_ssims:
        avg_ssim = np.mean([v[0] for v in valid_ssims])
        print(f"\nAverage SSIM across all videos: {avg_ssim:.4f}")
    else:
        print("\nNo valid SSIM values to average")

Source code in fastvideo/tests/utils.py

def compare_folders(reference_folder, generated_folder, use_ms_ssim=True):
    """
    Compare videos with the same filename between reference_folder and generated_folder

    Example usage:
        results = compare_folders(reference_folder, generated_folder,
                              args.use_ms_ssim)
        for video_name, ssim_value in results.items():
            if ssim_value is not None:
                print(
                    f"{video_name}: {ssim_value[0]:.4f}, Min SSIM: {ssim_value[1]:.4f}, Max SSIM: {ssim_value[2]:.4f}"
                )
            else:
                print(f"{video_name}: Error during comparison")

        valid_ssims = [v for v in results.values() if v is not None]
        if valid_ssims:
            avg_ssim = np.mean([v[0] for v in valid_ssims])
            print(f"\nAverage SSIM across all videos: {avg_ssim:.4f}")
        else:
            print("\nNo valid SSIM values to average")
    """

    reference_videos = [f for f in os.listdir(reference_folder) if f.endswith(".mp4")]

    results = {}

    for video_name in reference_videos:
        ref_path = os.path.join(reference_folder, video_name)
        gen_path = os.path.join(generated_folder, video_name)

        if os.path.exists(gen_path):
            print(f"\nComparing {video_name}...")
            try:
                ssim_value = compute_video_ssim_torchvision(ref_path, gen_path, use_ms_ssim)
                results[video_name] = ssim_value
            except Exception as e:
                print(f"Error comparing {video_name}: {e}")
                results[video_name] = None
        else:
            print(f"\nSkipping {video_name} - no matching file in generated folder")

    return results

fastvideo.tests.utils.compute_video_ssim_torchvision ¶

compute_video_ssim_torchvision(video1_path, video2_path, use_ms_ssim=True)

Compute SSIM between two videos.

Parameters:

Name	Description	Default
`video1_path`	Path to the first video.	required
`video2_path`	Path to the second video.	required
`use_ms_ssim`	Whether to use Multi-Scale Structural Similarity (MS-SSIM) instead of SSIM.	`True`
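Per the source listing for this function, both videos are trimmed to the shorter frame count, each frame pair is scored, and the result is a `(mean, min, max)` tuple, or `(0, 0, 0)` when no frames were scored. A dependency-free sketch of that aggregation is shown here. `summarize_frame_scores` and `toy_similarity` are hypothetical stand-ins, and the toy metric is not SSIM.

```python
from statistics import fmean


def toy_similarity(frame_a, frame_b):
    """Stand-in metric: 1 minus mean absolute difference of values in [0, 1]."""
    return 1.0 - fmean(abs(a - b) for a, b in zip(frame_a, frame_b))


def summarize_frame_scores(frames_a, frames_b, metric=toy_similarity):
    """Score aligned frame pairs and return (mean, min, max)."""
    # zip stops at the shorter sequence, mirroring the min_frames truncation.
    scores = [metric(a, b) for a, b in zip(frames_a, frames_b)]
    if not scores:
        return 0, 0, 0
    return fmean(scores), min(scores), max(scores)


# Each "frame" is a flat list of pixel intensities in [0, 1].
video_a = [[0.0, 0.5, 1.0], [0.2, 0.2, 0.2], [1.0, 1.0, 1.0]]
video_b = [[0.0, 0.5, 1.0], [0.2, 0.2, 0.8]]  # one frame shorter

mean_score, min_score, max_score = summarize_frame_scores(video_a, video_b)
print(f"mean={mean_score:.2f} min={min_score:.2f} max={max_score:.2f}")
```

Callers in the SSIM tests read the mean as `ssim_values[0]` and compare it against a per-model threshold.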

Source code in fastvideo/tests/utils.py

def compute_video_ssim_torchvision(video1_path, video2_path, use_ms_ssim=True):
    """
    Compute SSIM between two videos.

    Args:
        video1_path: Path to the first video.
        video2_path: Path to the second video.
        use_ms_ssim: Whether to use Multi-Scale Structural Similarity (MS-SSIM) instead of SSIM.
    """
    print(f"Computing SSIM between {video1_path} and {video2_path}...")
    if not os.path.exists(video1_path):
        raise FileNotFoundError(f"Video1 not found: {video1_path}")
    if not os.path.exists(video2_path):
        raise FileNotFoundError(f"Video2 not found: {video2_path}")

    frames1 = _read_video_frames(video1_path)
    frames2 = _read_video_frames(video2_path)

    # Ensure same number of frames
    min_frames = min(frames1.shape[0], frames2.shape[0])
    frames1 = frames1[:min_frames]
    frames2 = frames2[:min_frames]

    frames1 = frames1.float() / 255.0
    frames2 = frames2.float() / 255.0

    if torch.cuda.is_available():
        frames1 = frames1.cuda()
        frames2 = frames2.cuda()

    ssim_values = []

    # Process each frame individually
    for i in range(min_frames):
        img1 = frames1[i : i + 1]
        img2 = frames2[i : i + 1]

        with torch.no_grad():
            value = ms_ssim(img1, img2, data_range=1.0) if use_ms_ssim else ssim(img1, img2, data_range=1.0)

            ssim_values.append(value.item())

    if ssim_values:
        mean_ssim = np.mean(ssim_values)
        min_ssim = np.min(ssim_values)
        max_ssim = np.max(ssim_values)
        min_frame_idx = np.argmin(ssim_values)
        max_frame_idx = np.argmax(ssim_values)

        print(f"Mean SSIM: {mean_ssim:.4f}")
        print(f"Min SSIM: {min_ssim:.4f} (at frame {min_frame_idx})")
        print(f"Max SSIM: {max_ssim:.4f} (at frame {max_frame_idx})")

        return mean_ssim, min_ssim, max_ssim
    else:
        print("No SSIM values calculated")
        return 0, 0, 0

fastvideo.tests.utils.write_ssim_results ¶

write_ssim_results(output_dir, ssim_values, reference_path, generated_path, num_inference_steps, prompt)

Write SSIM results to a JSON file in the same directory as the generated videos.
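Based on the source listing for this function, the result file is named `steps{num_inference_steps}_{prompt[:100]}_ssim.json` and holds the three SSIM statistics plus the inputs that produced them. A small sketch of writing and reading back such a file with the standard library follows. The paths and numeric values are made up for illustration.

```python
import json
import os
import tempfile

# Made-up values shaped like the dict that write_ssim_results serializes.
result = {
    "mean_ssim": 0.95,
    "min_ssim": 0.91,
    "max_ssim": 0.99,
    "reference_video": "reference/example.mp4",
    "generated_video": "generated/example.mp4",
    "parameters": {"num_inference_steps": 4, "prompt": "an example prompt"},
}

with tempfile.TemporaryDirectory() as output_dir:
    steps = result["parameters"]["num_inference_steps"]
    prompt = result["parameters"]["prompt"]
    result_file = os.path.join(output_dir, f"steps{steps}_{prompt[:100]}_ssim.json")

    with open(result_file, "w") as f:
        json.dump(result, f, indent=2)

    # Read the file back, as a CI report step might.
    with open(result_file) as f:
        loaded = json.load(f)

print(loaded["mean_ssim"] >= 0.90)
```

Since the prompt prefix goes into the filename unmodified, a prompt containing a path separator would point at a nested path, so it may be worth sanitizing prompts before relying on this naming scheme.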

Source code in fastvideo/tests/utils.py

def write_ssim_results(output_dir, ssim_values, reference_path, generated_path, num_inference_steps, prompt):
    """
    Write SSIM results to a JSON file in the same directory as the generated videos.
    """
    try:
        logger.info(f"Attempting to write SSIM results to directory: {output_dir}")

        if not os.path.exists(output_dir):
            os.makedirs(output_dir, exist_ok=True)

        mean_ssim, min_ssim, max_ssim = ssim_values

        result = {
            "mean_ssim": mean_ssim,
            "min_ssim": min_ssim,
            "max_ssim": max_ssim,
            "reference_video": reference_path,
            "generated_video": generated_path,
            "parameters": {"num_inference_steps": num_inference_steps, "prompt": prompt},
        }

        test_name = f"steps{num_inference_steps}_{prompt[:100]}"
        result_file = os.path.join(output_dir, f"{test_name}_ssim.json")
        logger.info(f"Writing JSON results to: {result_file}")
        with open(result_file, "w") as f:
            json.dump(result, f, indent=2)

        logger.info(f"SSIM results written to {result_file}")
        return True
    except Exception as e:
        logger.error(f"ERROR writing SSIM results: {str(e)}")
        return False