registry
Replica registry + health-check loop.
The registry tracks the set of known backend replicas and their live health. The router consults it for "pick a backend for this session" decisions, and a background task updates it from periodic HTTP probes.
State machine per replica:

    HEALTHY ──(N consecutive failures)──▶ UNHEALTHY
       ▲                                      │
       └──────(M consecutive successes)───────┘

Where N = RouterConfig.failure_threshold and
M = RouterConfig.recovery_threshold.
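The transitions above can be sketched as a pair of streak counters. This is a minimal illustration, not the module's actual Replica class: the name ReplicaHealth, its fields, and the choice to start in the HEALTHY state are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class ReplicaHealth:
    """Hypothetical per-replica state; not the registry's real Replica type."""

    failure_threshold: int   # N in the diagram
    recovery_threshold: int  # M in the diagram
    healthy: bool = True     # assumed initial state
    consecutive_failures: int = 0
    consecutive_successes: int = 0

    def record(self, ok: bool) -> bool:
        """Feed one probe outcome and return the resulting health flag."""
        if ok:
            self.consecutive_successes += 1
            self.consecutive_failures = 0  # a success breaks the failure streak
            if (not self.healthy
                    and self.consecutive_successes >= self.recovery_threshold):
                self.healthy = True
        else:
            self.consecutive_failures += 1
            self.consecutive_successes = 0  # a failure breaks the success streak
            if (self.healthy
                    and self.consecutive_failures >= self.failure_threshold):
                self.healthy = False
        return self.healthy
```

Requiring a streak in both directions means a single dropped probe does not evict a replica, and a single lucky probe does not readmit one.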
Attributes
fastvideo.entrypoints.streaming.router.registry.HttpProbe
module-attribute
HttpProbe = Any
Structural alias for health-probe callables. The concrete signature is
async def __call__(url: str, *, timeout: float) -> tuple[float, str | None].
typing.Callable cannot express keyword-only parameters, so the alias is
left as Any and relies on duck typing as a pragmatic compromise.
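Because any callable with that shape is accepted, a test double is easy to write. The sketch below is illustrative only: make_fake_probe and the URLs are hypothetical, and reading the returned tuple as (latency in seconds, error message or None) is an assumption that the description above does not spell out.

```python
from __future__ import annotations

import asyncio


def make_fake_probe(failing_urls: set[str]):
    """Build a deterministic probe matching the documented call shape."""

    async def fake_probe(url: str, *, timeout: float) -> tuple[float, str | None]:
        # Assumed meaning of the tuple: (latency_seconds, error_or_None).
        if url in failing_urls:
            return (timeout, "simulated failure")
        return (0.001, None)

    return fake_probe


probe = make_fake_probe({"http://replica-b:8000"})
result_ok = asyncio.run(probe("http://replica-a:8000", timeout=1.0))
result_bad = asyncio.run(probe("http://replica-b:8000", timeout=1.0))
```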
Classes
fastvideo.entrypoints.streaming.router.registry.ReplicaRegistry
ReplicaRegistry(replicas: list[ReplicaEndpoint])
Stateful map of replica URL → Replica.
Selection favors primary replicas when healthy; otherwise the first
healthy non-primary is returned. When none are healthy, the
registry returns None so the router can reject incoming
sessions with gpu_unavailable.
Source code in fastvideo/entrypoints/streaming/router/registry.py
Functions
fastvideo.entrypoints.streaming.router.registry.ReplicaRegistry.select
Pick the best healthy replica.
Priority order:
- The first healthy primary (insertion order).
- The first healthy non-primary (insertion order).
- None when nothing is healthy.
This MVP picks the first match within each tier; it does NOT load-balance across multiple healthy replicas of the same tier. Round-robin and weighted distribution are deferred until a real N-way active deployment exists.
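The priority order can be sketched as two passes over the replicas. FakeReplica and the free function below are hypothetical stand-ins for illustration; the real method lives on ReplicaRegistry and its internal fields are not shown here.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class FakeReplica:
    """Hypothetical stand-in for the registry's Replica record."""

    url: str
    primary: bool
    healthy: bool


def select(replicas: list[FakeReplica]) -> FakeReplica | None:
    """First healthy primary, else first healthy non-primary, else None."""
    for want_primary in (True, False):
        for replica in replicas:  # list order stands in for insertion order
            if replica.healthy and replica.primary is want_primary:
                return replica
    return None
```

With two healthy primaries, the earlier one always wins, which is the "no load balancing within a tier" behavior described above.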
Functions
fastvideo.entrypoints.streaming.router.registry.run_health_check_loop
async
run_health_check_loop(registry: ReplicaRegistry, config: RouterConfig, *, stop_event: Event, http_get: HttpProbe | None = None) -> None
Poll all replicas' health endpoints in parallel on a fixed interval.
http_get is pluggable so unit tests can inject a deterministic
probe without hitting the network. The default builds a single
httpx.AsyncClient shared across the loop's lifetime so the
common case (steady polling against a stable replica set) reuses
TCP/TLS connections instead of paying handshake cost per probe.
Probes within one polling cycle run concurrently via asyncio.gather
so a slow replica doesn't push the cycle past
health_check_interval_seconds.
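The polling pattern described above might look roughly like the following. This is a simplified sketch, not the actual implementation: poll_until_stopped, the on_result callback, and the use of return_exceptions=True are assumptions, and the real loop takes a ReplicaRegistry and RouterConfig rather than bare arguments.

```python
from __future__ import annotations

import asyncio


async def poll_until_stopped(urls, probe, *, interval: float, timeout: float,
                             stop_event: asyncio.Event, on_result) -> None:
    """Probe every URL concurrently each cycle until stop_event is set."""
    while not stop_event.is_set():
        # All probes of one cycle run together, so the cycle takes about as
        # long as the slowest probe rather than the sum of all probes.
        results = await asyncio.gather(
            *(probe(url, timeout=timeout) for url in urls),
            return_exceptions=True,  # a raising probe must not kill the loop
        )
        for url, result in zip(urls, results):
            on_result(url, result)
        try:
            # Wait out the interval, but wake immediately on shutdown.
            await asyncio.wait_for(stop_event.wait(), timeout=interval)
        except asyncio.TimeoutError:
            pass
```

Waiting on the stop event instead of calling asyncio.sleep lets shutdown take effect without waiting out the remainder of the interval.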