test_inference_performance

Config-driven inference performance tests.

Benchmark configs live in .buildkite/performance-benchmarks/tests/*.json. Each JSON file defines model params, generation kwargs, run config, and per-device thresholds. This test module auto-discovers all configs and parametrizes a single test function over them.
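As a rough illustration of the auto-discovery step described above, the sketch below globs a directory for JSON files and loads each one into a dict. It is not the module's actual implementation: `load_benchmark_configs` is a hypothetical helper, and the fallback to the file stem when `benchmark_id` is missing is an assumption made only for this example.

```python
import json
from pathlib import Path

# Directory named in the module description above.
CONFIG_DIR = Path(".buildkite/performance-benchmarks/tests")


def load_benchmark_configs(config_dir=CONFIG_DIR):
    """Return one dict per *.json file in config_dir, sorted by file name.

    Hypothetical helper for illustration; the real module may load
    and validate its configs differently.
    """
    configs = []
    for path in sorted(Path(config_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            cfg = json.load(f)
        # Assumption: fall back to the file stem if "benchmark_id" is absent.
        cfg.setdefault("benchmark_id", path.stem)
        configs.append(cfg)
    return configs
```

A list built this way can be passed to `pytest.mark.parametrize` with `ids` taken from each config's `benchmark_id`, which is the pattern the source code on this page uses with `_BENCHMARK_CONFIGS`.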

Functions:

fastvideo.tests.performance.test_inference_performance.test_inference_performance

test_inference_performance(cfg)

Measure generation latency, peak GPU memory, and component-level timings (text encoder, DiT, VAE decode). Assert each against device-aware thresholds.
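One way to picture the device-aware assertion is a lookup of per-device limits followed by a comparison of each measured metric. The sketch below assumes a threshold layout keyed by device name with an optional `default` entry; that layout, the metric names, and `check_thresholds` itself are hypothetical and are not taken from the actual config schema.

```python
def check_thresholds(measured, thresholds_by_device, device_name):
    """Return a list of violation messages; an empty list means all pass.

    Hypothetical helper: assumes thresholds are upper bounds keyed by
    device name, with an optional "default" entry as a fallback.
    """
    limits = thresholds_by_device.get(
        device_name, thresholds_by_device.get("default", {})
    )
    violations = []
    for metric, limit in limits.items():
        value = measured.get(metric)
        if value is None:
            violations.append(f"{metric}: no measurement recorded")
        elif value > limit:
            violations.append(f"{metric}: {value} exceeds limit {limit}")
    return violations


# Example with made-up numbers: latency is over the limit, memory is within it.
thresholds = {"A100": {"latency_s": 10.0, "peak_memory_mb": 20000}}
measured = {"latency_s": 12.0, "peak_memory_mb": 18000}
problems = check_thresholds(measured, thresholds, "A100")
```

Collecting every violation before asserting, rather than failing on the first one, gives a single test failure that reports all regressed metrics at once.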

Source code in fastvideo/tests/performance/test_inference_performance.py
@pytest.mark.parametrize(
    "cfg",
    _BENCHMARK_CONFIGS,
    ids=[c["benchmark_id"] for c in _BENCHMARK_CONFIGS],
)
def test_inference_performance(cfg):
    """Measure generation latency, peak GPU memory, and component-level timings
    (text encoder, DiT, VAE decode). Assert each against device-aware thresholds.
    """

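    # Force stage logging on for the benchmark run; the previous value
    # (or its absence) is restored in the `finally` block below.
    original_env = os.environ.get("FASTVIDEO_STAGE_LOGGING")
    os.environ["FASTVIDEO_STAGE_LOGGING"] = "1"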
    try:
        _run_benchmark(cfg)
    finally:
        if original_env is None:
            os.environ.pop("FASTVIDEO_STAGE_LOGGING", None)
        else:
            os.environ["FASTVIDEO_STAGE_LOGGING"] = original_env
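
The save-and-restore pattern in the function above can also be factored into a reusable context manager. The following is a minimal sketch of that idea, not code from the FastVideo repository; `temporary_env` is a hypothetical name.

```python
import os
from contextlib import contextmanager


@contextmanager
def temporary_env(name, value):
    """Set os.environ[name] to value, restoring the prior state on exit."""
    original = os.environ.get(name)
    os.environ[name] = value
    try:
        yield
    finally:
        if original is None:
            # The variable was unset before, so remove it again.
            os.environ.pop(name, None)
        else:
            os.environ[name] = original
```

With such a helper, the test body could be written as `with temporary_env("FASTVIDEO_STAGE_LOGGING", "1"): _run_benchmark(cfg)`. Inside pytest, the built-in `monkeypatch.setenv` fixture method offers the same restore-on-teardown behavior.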