gpu_pool
GPU pool manager for the streaming server.
Replaces the single-generator path in PR 7.5 with a typed pool abstraction. Three classes ship here:
- :class:InProcessGpuPool wraps one in-process VideoGenerator; it is used by tests and single-GPU dev deployments.
- :class:SubprocessGpuPool runs one multiprocessing.Process per GPU, each executing :func:worker_main against a GeneratorConfig. Jobs are dispatched via multiprocessing.Queue.
- :class:GpuPool (abstract) is the interface both implement.
Session-to-GPU binding lives in the pool so continuation state stays
on the GPU that generated the previous segment (matching the internal
gpu_pool.py's per-GPU cache behavior). Cross-GPU handoff is
supported via :class:SessionStore snapshot + hydrate, which
serializes the state before the migration and rehydrates it on the
new worker.
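The snapshot + hydrate handoff described above can be sketched roughly as follows. This is a toy model, not the real :class:SessionStore API: `ToySessionStore`, `ToyWorker`, `migrate`, and the shape of the state dict are all hypothetical names chosen for illustration, and `pickle` stands in for whatever serialization the real store uses.

```python
from __future__ import annotations

import pickle
from dataclasses import dataclass, field


@dataclass
class ToySessionStore:
    """Hypothetical stand-in for SessionStore: holds serialized state per session."""

    blobs: dict[str, bytes] = field(default_factory=dict)

    def snapshot(self, session_id: str, state: dict) -> None:
        # Serialize on the source worker so only plain bytes leave it.
        self.blobs[session_id] = pickle.dumps(state)

    def hydrate(self, session_id: str) -> dict:
        # Deserialize on the destination worker.
        return pickle.loads(self.blobs[session_id])


@dataclass
class ToyWorker:
    """Hypothetical worker with a per-GPU cache of continuation state."""

    gpu_id: int
    cache: dict[str, dict] = field(default_factory=dict)


def migrate(session_id: str, src: ToyWorker, dst: ToyWorker,
            store: ToySessionStore) -> None:
    """Move a session's continuation state from src to dst through the store."""
    store.snapshot(session_id, src.cache.pop(session_id))
    dst.cache[session_id] = store.hydrate(session_id)


store = ToySessionStore()
gpu0, gpu1 = ToyWorker(gpu_id=0), ToyWorker(gpu_id=1)
gpu0.cache["sess-a"] = {"segment_index": 3, "seed": 1234}
migrate("sess-a", gpu0, gpu1, store)
```

After `migrate`, the state exists only in the destination worker's cache, which mirrors the idea that a session's continuation state lives on exactly one GPU at a time.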
Typed config: workers start from a :class:GeneratorConfig (no flat
LTX-2 kwargs), satisfying the PR 6 + PR 7 contracts that the public
surface doesn't reintroduce the legacy kwarg bag.
Classes
fastvideo.entrypoints.streaming.gpu_pool.GpuPool
Bases: ABC
Abstract GPU pool.
acquire binds a session to a worker and holds that binding
across segments so continuation state can stay hot. run submits
a single GenerationRequest for a bound session.
acquire and release are independent of run: a session can run many segments on one acquired worker, and it must call release on disconnect.
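A minimal sketch of the acquire / run / release lifecycle follows. The method signatures here are assumptions made for illustration (the real :class:GpuPool signatures are not shown on this page), and `ToyPool`, `OneWorkerPool`, and `serve_session` are hypothetical names. Requests are plain strings rather than GenerationRequest objects.

```python
from __future__ import annotations

import asyncio
from abc import ABC, abstractmethod


class ToyPool(ABC):
    """Hypothetical interface mirroring the acquire / run / release split."""

    @abstractmethod
    async def acquire(self, session_id: str) -> int: ...

    @abstractmethod
    async def run(self, session_id: str, request: str) -> str: ...

    @abstractmethod
    async def release(self, session_id: str) -> None: ...


class OneWorkerPool(ToyPool):
    """Toy implementation that binds every session to GPU 0."""

    def __init__(self) -> None:
        self.bindings: dict[str, int] = {}

    async def acquire(self, session_id: str) -> int:
        self.bindings[session_id] = 0
        return 0

    async def run(self, session_id: str, request: str) -> str:
        gpu_id = self.bindings[session_id]  # KeyError if never acquired
        return f"gpu{gpu_id}:{request}"

    async def release(self, session_id: str) -> None:
        self.bindings.pop(session_id, None)


async def serve_session(pool: ToyPool, session_id: str,
                        requests: list[str]) -> list[str]:
    # Acquire once, run many segments, always release (e.g. on disconnect).
    await pool.acquire(session_id)
    try:
        return [await pool.run(session_id, r) for r in requests]
    finally:
        await pool.release(session_id)


pool = OneWorkerPool()
results = asyncio.run(serve_session(pool, "sess-a", ["seg-0", "seg-1"]))
```

Putting release in a `finally` block is one way to guarantee the binding is dropped even if a segment fails partway through the session.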
fastvideo.entrypoints.streaming.gpu_pool.InProcessGpuPool
InProcessGpuPool(generator: _GeneratorLike, *, gpu_id: int = 0, session_store: SessionStore | None = None)
Bases: GpuPool
Single-process pool backed by one :class:_GeneratorLike.
This is what PR 7.5's server uses by default; PR 7.6 adds the real
SubprocessGpuPool alternative but keeps this one for tests and
small deployments.
Source code in fastvideo/entrypoints/streaming/gpu_pool.py
fastvideo.entrypoints.streaming.gpu_pool.PoolAcquireTimeout
Bases: RuntimeError
Raised when acquire times out waiting for a free worker.
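One plausible way to produce this error is to wrap the wait for a free worker in `asyncio.wait_for`. The sketch below is an assumption about the mechanism, not the actual implementation: `ToyFreeList` and its `timeout_s` parameter are hypothetical, and the exception is redefined locally so the example is self-contained.

```python
import asyncio


class PoolAcquireTimeout(RuntimeError):
    """Local stand-in with the same name and base class as the documented error."""


class ToyFreeList:
    """Hypothetical free-worker list backed by an asyncio.Queue of GPU ids."""

    def __init__(self, gpu_ids):
        self._free = asyncio.Queue()
        for gpu_id in gpu_ids:
            self._free.put_nowait(gpu_id)

    async def acquire(self, timeout_s: float) -> int:
        try:
            # Wait for a free worker, but give up after timeout_s seconds.
            return await asyncio.wait_for(self._free.get(), timeout=timeout_s)
        except asyncio.TimeoutError:
            raise PoolAcquireTimeout(
                f"no free worker within {timeout_s}s") from None

    def release(self, gpu_id: int) -> None:
        self._free.put_nowait(gpu_id)


async def demo():
    free = ToyFreeList([0])
    first = await free.acquire(timeout_s=0.05)
    try:
        await free.acquire(timeout_s=0.05)  # the only worker is busy
        timed_out = False
    except PoolAcquireTimeout:
        timed_out = True
    free.release(first)
    second = await free.acquire(timeout_s=0.05)  # succeeds after release
    return first, timed_out, second


outcome = asyncio.run(demo())
```

Callers can catch the exception to reject or queue the session instead of letting the connection hang.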
fastvideo.entrypoints.streaming.gpu_pool.PoolAssignment
dataclass
The worker a session is currently bound to.
fastvideo.entrypoints.streaming.gpu_pool.SubprocessGpuPool
SubprocessGpuPool(generator_config: GeneratorConfig, *, pool_config: GpuPoolConfig, warmup_config: WarmupConfig | None = None, session_store: SessionStore | None = None, worker_factory: WorkerFactory | None = None)
Bases: GpuPool
One multiprocessing.Process per GPU.
Each worker boots :class:fastvideo.VideoGenerator from a typed
:class:GeneratorConfig inside the child process, after
CUDA_VISIBLE_DEVICES is set, and consumes jobs from a multiprocessing Queue.
This is the production shape: the parent process stays CPU-only, and
GPU state never crosses process boundaries. Continuation state is
serialized through :class:SessionStore for cross-GPU handoff.
PR 7.6 ships this as an opt-in; PR 7.5's in-process pool remains the default until nightly runs validate the subprocess path.
Source code in fastvideo/entrypoints/streaming/gpu_pool.py
Functions
fastvideo.entrypoints.streaming.gpu_pool.SubprocessGpuPool.start
async
Spawn worker processes and wait for each to report ready.
Source code in fastvideo/entrypoints/streaming/gpu_pool.py
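The "spawn, then wait for ready" pattern can be sketched as below. This is illustrative only: threads stand in for the worker processes so the example stays lightweight, and the idea that each worker reports readiness by putting its id on a queue is an assumption. `toy_worker`, `start_workers`, and `ready_timeout_s` are hypothetical names.

```python
import asyncio
import queue
import threading


def toy_worker(worker_id: str, ready_queue: queue.Queue) -> None:
    # Stand-in for a worker process: model loading would happen here,
    # after which the worker reports that it is ready.
    ready_queue.put(worker_id)


async def start_workers(worker_ids, ready_timeout_s: float = 5.0):
    """Start every worker, then wait until each one has reported ready."""
    ready_queue: queue.Queue = queue.Queue()
    for worker_id in worker_ids:
        threading.Thread(
            target=toy_worker, args=(worker_id, ready_queue), daemon=True
        ).start()

    ready = set()
    while len(ready) < len(worker_ids):
        # Queue.get blocks, so run it in a helper thread to keep the
        # event loop responsive; queue.Empty propagates on timeout.
        worker_id = await asyncio.to_thread(
            ready_queue.get, True, ready_timeout_s)
        ready.add(worker_id)
    return ready


ready_workers = asyncio.run(start_workers(["gpu0", "gpu1"]))
```

Waiting through `asyncio.to_thread` (Python 3.9+) keeps the blocking queue read off the event loop, which matters if the server is already accepting connections while workers boot.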
Functions
fastvideo.entrypoints.streaming.gpu_pool.worker_main
worker_main(*, gpu_id: int, worker_id: str, generator_config: GeneratorConfig, warmup_config: WarmupConfig, job_queue: Queue, result_queue: Queue, shutdown_event: Any) -> None
Per-worker subprocess entry.
Runs inside the child spawned by SubprocessGpuPool. Blocking
VideoGenerator construction + generation happens here, not in
the parent's event loop.
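The general shape of such a worker entry point might look like the sketch below. It is not the real worker_main: `ToyGenerator` is a hypothetical stand-in for VideoGenerator, `model_name` replaces the typed GeneratorConfig, and the `("ready", worker_id)` and `(job_id, output)` message formats are assumptions. For the demo the loop runs in a thread with `queue.Queue`; in a real pool the same function would be the target of a multiprocessing.Process with multiprocessing queues.

```python
import os
import queue
import threading


class ToyGenerator:
    """Hypothetical stand-in for VideoGenerator; the blocking load goes here."""

    def __init__(self, model_name: str) -> None:
        self.model_name = model_name

    def generate(self, prompt: str) -> str:
        return f"{self.model_name}:{prompt}"


def toy_worker_main(*, gpu_id, worker_id, model_name,
                    job_queue, result_queue, shutdown_event) -> None:
    # Pin the device before the generator is constructed. In a real child
    # process this only affects that process's environment.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    generator = ToyGenerator(model_name)
    result_queue.put(("ready", worker_id))

    while not shutdown_event.is_set():
        try:
            # Short timeout so the shutdown event is checked regularly.
            job_id, prompt = job_queue.get(timeout=0.1)
        except queue.Empty:
            continue
        result_queue.put((job_id, generator.generate(prompt)))


jobs, results, stop = queue.Queue(), queue.Queue(), threading.Event()
worker = threading.Thread(
    target=toy_worker_main,
    kwargs=dict(gpu_id=0, worker_id="w0", model_name="toy-model",
                job_queue=jobs, result_queue=results, shutdown_event=stop),
    daemon=True,
)
worker.start()
ready_msg = results.get(timeout=5)
jobs.put(("job-1", "a red fox"))
job_result = results.get(timeout=5)
stop.set()
worker.join(timeout=5)
```

Polling the job queue with a timeout, rather than blocking indefinitely, is one simple way to let the worker notice the shutdown event without a sentinel job.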