OpenAI-compatible HTTP Contract
The stateless FastVideo HTTP server lives at
fastvideo/entrypoints/openai/.
Launch: fastvideo serve --config serve.yaml.
Endpoints
| Method | Path | Description |
|---|---|---|
| POST | /v1/videos/generations | Synchronous video generation |
| GET | /v1/videos | List prior jobs held in the in-memory store |
| GET | /v1/videos/{id} | Job status / result |
| GET | /v1/videos/{id}/content | Download the MP4 once ready |
| POST | /v1/images/generations | Synchronous image generation |
| GET | /v1/models | Enumerate registered models |
| GET | /health | Liveness probe |
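As a rough illustration of how a client might address the routes in the table above, the sketch below builds (but does not send) `urllib.request.Request` objects for three of them. The base URL and the helper names are assumptions made for this example; only the paths and methods come from the table.

```python
from urllib.parse import quote
from urllib.request import Request

# Hypothetical address; use whatever host/port your server is bound to.
BASE_URL = "http://localhost:8000"


def generation_request(body: bytes) -> Request:
    """POST /v1/videos/generations with a JSON-encoded body."""
    return Request(
        f"{BASE_URL}/v1/videos/generations",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def job_status_request(job_id: str) -> Request:
    """GET /v1/videos/{id}; the id is percent-encoded to keep the path well-formed."""
    return Request(f"{BASE_URL}/v1/videos/{quote(job_id, safe='')}", method="GET")


def job_content_request(job_id: str) -> Request:
    """GET /v1/videos/{id}/content, the MP4 download route."""
    return Request(
        f"{BASE_URL}/v1/videos/{quote(job_id, safe='')}/content", method="GET"
    )
```

Passing any of these to `urllib.request.urlopen` would perform the call; keeping construction separate from sending makes the routing easy to check without a running server.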
VideoGenerationsRequest shape
Mirrors the OpenAI POST /v1/videos/generations shape:
{
"prompt": "a fox running through snow",
"size": "1024x1536",
"seconds": 5,
"fps": 24,
"num_frames": 121,
"seed": 42,
"num_inference_steps": 8,
"guidance_scale": 1.0,
"negative_prompt": "blurry, low quality",
"input_reference": "/path/to/init.png"
}
SGLang-compatible extensions carried today:
num_inference_steps, guidance_scale, guidance_scale_2,
true_cfg_scale, negative_prompt, enable_teacache, output_path.
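Because unknown fields are rejected and unset fields carry meaning for the server-side merge, a client may want to send only the fields it actually intends to set. The sketch below is one way to do that; `build_video_body` and `KNOWN_FIELDS` are hypothetical names, and the field list is copied from the example body and extension list above, so the server's real schema may differ.

```python
import json
from typing import Any

# Assumption: field names taken from this page only; not read from the server schema.
KNOWN_FIELDS = {
    "prompt", "size", "seconds", "fps", "num_frames", "seed",
    "num_inference_steps", "guidance_scale", "guidance_scale_2",
    "true_cfg_scale", "negative_prompt", "enable_teacache",
    "output_path", "input_reference",
}


def build_video_body(prompt: str, **overrides: Any) -> str:
    """Serialize a request body containing only explicitly chosen fields."""
    unknown = set(overrides) - KNOWN_FIELDS
    if unknown:
        # Fail locally instead of waiting for a validation error from the server.
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    body: dict[str, Any] = {"prompt": prompt}
    # Drop None so "not specified" never reaches the wire as an explicit value.
    body.update({k: v for k, v in overrides.items() if v is not None})
    return json.dumps(body)
```

For example, `build_video_body("a fox running through snow", seed=42, fps=None)` yields a body with only `prompt` and `seed`, leaving `fps` to the server's defaults.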
Merge precedence
The server builds a GenerationRequest each call using three layers,
highest first:
- Request body (client-explicit): only fields carried in request.model_fields_set (Pydantic v2). Unset fields do not count, even if the Pydantic model has a schema default for them.
- ServeConfig.default_request (operator-explicit): projected via explicit_request_updates(); only fields the operator actually wrote into the YAML count as defaults. Every other field inherits the schema default rather than being pinned.
- Hardcoded fallback, e.g. fps = 24.
The gate matters: both surfaces carry schema defaults. Without
model_fields_set / explicit-path tracking, schema defaults would
masquerade as intent and silently shadow the other side.
See video_api.py::_build_generation_kwargs
for the canonical implementation; the per-request assembly lives there,
not in pipeline code.
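The layering described above can be sketched with plain dictionaries. This is not the code in `video_api.py`; `merge_generation_kwargs` and `HARDCODED_FALLBACK` are hypothetical names, and the explicit-field set stands in for what Pydantic v2 exposes as `model_fields_set`.

```python
from typing import Any, Mapping

# Assumption: the source names only fps = 24 as a hardcoded fallback.
HARDCODED_FALLBACK: dict[str, Any] = {"fps": 24}


def merge_generation_kwargs(
    request_values: Mapping[str, Any],
    request_fields_set: set[str],
    operator_explicit: Mapping[str, Any],
) -> dict[str, Any]:
    """Merge three layers; later writes win, so the lowest layer goes first."""
    merged = dict(HARDCODED_FALLBACK)      # lowest: hardcoded fallback
    merged.update(operator_explicit)       # middle: fields the operator wrote in YAML
    for name in request_fields_set:        # highest: fields the client actually sent
        merged[name] = request_values[name]
    return merged


# The request model reports fps=24 as a schema default, but the client never set it,
# so the operator's fps=16 is kept; the client-set seed overrides the operator's.
merged = merge_generation_kwargs(
    request_values={"prompt": "a fox", "fps": 24, "seed": 42},
    request_fields_set={"prompt", "seed"},
    operator_explicit={"fps": 16, "seed": 7},
)
```

Iterating over the explicit set rather than over every request value is the gate: a schema default that the client never wrote cannot overwrite an operator default.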
Continuation state
The stateless surface accepts an opaque ContinuationState round-trip.
Clients that want continuation pass the prior state blob back on the
next request, and receive a new one on the response when
request.output.return_state = true.
Shape:
Payload is always JSON-serializable. Large tensors may live in an
opaque blob-store reference the client simply round-trips; see
LTX2ContinuationState.
Continuation is not yet wired all the way through to
generator.generate_video(...) — PR 7.6 (GPU pool upstream) is the
pipeline-level consumer. PR 7 locked the envelope so this surface is
stable ahead of that plumbing.
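A client-side round-trip might look like the sketch below. It is illustrative only: the `"state"` key used for both the request and the response is a placeholder invented for this example (the source does not name it here), and `send` is any caller-supplied function that posts a body and returns the decoded JSON response. Only `output.return_state` comes from the text above.

```python
from typing import Any, Callable, Optional

# Hypothetical key name for the opaque blob; check the real schema before use.
STATE_KEY = "state"

JsonDict = dict[str, Any]


def generate_segments(
    send: Callable[[JsonDict], JsonDict], prompts: list[str]
) -> list[JsonDict]:
    """Generate consecutive segments, passing each returned state to the next call."""
    state: Optional[Any] = None
    responses: list[JsonDict] = []
    for prompt in prompts:
        body: JsonDict = {"prompt": prompt, "output": {"return_state": True}}
        if state is not None:
            # The blob is opaque: pass it back unchanged, never inspect or edit it.
            body[STATE_KEY] = state
        response = send(body)
        state = response.get(STATE_KEY)
        responses.append(response)
    return responses
```

Treating the state as an opaque value keeps the client independent of whatever the payload contains, including any blob-store references.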
Error codes
| HTTP | Condition |
|---|---|
| 400 Bad Request | Parse/validation failure (unknown field, type mismatch, incompatible preset/state) |
| 404 Not Found | GET /v1/videos/{id} for an unknown job |
| 409 Conflict | Job id already exists |
| 500 Internal Server Error | Pipeline raised; body mirrors upstream OpenAI error envelope |
| 503 Service Unavailable | No generator loaded, or shutdown in progress |
Errors include a JSON body with
{"error": {"type": "...", "message": "..."}} matching the OpenAI
Python SDK's expectation.
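For clients that do not use the OpenAI SDK, the envelope can be unpacked by hand. The sketch below assumes only the `{"error": {"type", "message"}}` shape shown above; `FastVideoAPIError` and `parse_error` are hypothetical names, and the fallback for non-conforming bodies is a choice made for this example.

```python
import json


class FastVideoAPIError(Exception):
    """Hypothetical client-side exception carrying the decoded envelope."""

    def __init__(self, status: int, error_type: str, message: str) -> None:
        super().__init__(f"{status} {error_type}: {message}")
        self.status = status
        self.error_type = error_type
        self.message = message


def parse_error(status: int, body: bytes) -> FastVideoAPIError:
    """Turn an error response into an exception without raising while parsing."""
    try:
        error = json.loads(body)["error"]
        return FastVideoAPIError(status, error["type"], error["message"])
    except (ValueError, KeyError, TypeError):
        # Body was not JSON or lacked the envelope; keep the raw text for debugging.
        return FastVideoAPIError(status, "unknown", body.decode("utf-8", "replace"))
```

A caller would typically `raise parse_error(status, body)` for any non-2xx response and branch on `status` or `error_type` higher up.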
What does not cross this boundary
- Flat legacy kwargs (ltx2_refine_enabled, torch_compile_kwargs, etc.): these are init-time, configured via ServeConfig.generator, never per-request.
- Private Dreamverse-only fields: those live in a private adapter on the Dreamverse side; the public FastVideo surface never promises backward compatibility for them.
- Raw tensor payloads (ltx2_audio_clean_latent et al.): these are derived by the pipeline from ContinuationState, never shipped as request fields.