flux_2

Classes

fastvideo.configs.pipelines.flux_2.Flux2KleinEncoderArchConfig dataclass

Flux2KleinEncoderArchConfig(stacked_params_mapping: list[tuple[str, str, str]] = list(), architectures: list[str] = (lambda: [])(), _supported_attention_backends: tuple[AttentionBackendEnum, ...] = (FLASH_ATTN, TORCH_SDPA), output_hidden_states: bool = True, use_return_dict: bool = True)

Bases: EncoderArchConfig

Encoder architecture config for Flux2 Klein (Qwen3). output_hidden_states defaults to True because the Klein text postprocessing reads the hidden states of layers 9, 18, and 27.

fastvideo.configs.pipelines.flux_2.Flux2KleinPipelineConfig dataclass

Flux2KleinPipelineConfig(model_path: str = '', pipeline_config_path: str | None = None, embedded_cfg_scale: float | None = None, flow_shift: float | None = None, flow_shift_sr: float | None = None, disable_autocast: bool = False, scheduler_step_in_fp32: bool = True, is_causal: bool = False, dit_config: DiTConfig = Flux2Config(), dit_precision: str = 'bf16', upsampler_config: UpsamplerConfig = UpsamplerConfig(), upsampler_precision: str = 'fp32', vae_config: VAEConfig = Flux2VAEConfig(), vae_precision: str = 'fp32', vae_tiling: bool = False, vae_sp: bool = False, image_encoder_config: EncoderConfig = EncoderConfig(), image_encoder_precision: str = 'fp32', text_encoder_configs: tuple[EncoderConfig, ...] = (lambda: (Qwen3TextConfig(),))(), text_encoder_precisions: tuple[str, ...] = (lambda: ('bf16',))(), preprocess_text_funcs: tuple[Callable[[str], str], ...] = (lambda: (preprocess_text,))(), postprocess_text_funcs: tuple[Callable[[BaseEncoderOutput], Tensor], ...] = (lambda: (flux2_klein_postprocess_text,))(), dmd_denoising_steps: list[int] | None = None, ti2v_task: bool = False, lucy_edit_task: bool = False, boundary_ratio: float | None = None, flux2_text_encoder_type: str = 'qwen3', text_encoder_out_layers: tuple[int, ...] = (9, 18, 27))

Bases: Flux2PipelineConfig

Configuration for Flux2 Klein (distilled, 4-step, no guidance).
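
Comparing the two constructor signatures on this page, the Klein config changes these defaults of Flux2PipelineConfig: embedded_cfg_scale (4.0 to None), text_encoder_configs (Mistral3TextConfig to Qwen3TextConfig), postprocess_text_funcs (default_postprocess_text to flux2_klein_postprocess_text), flux2_text_encoder_type ('mistral3' to 'qwen3'), and text_encoder_out_layers ((10, 20, 30) to (9, 18, 27)). The sketch below illustrates that override pattern with stand-in dataclasses. BaseConfigSketch, KleinConfigSketch, and changed_defaults are hypothetical names made up for this example; they are not part of fastvideo, and only three of the fields are mirrored.

```python
from dataclasses import dataclass, fields
from typing import Optional, Tuple


@dataclass
class BaseConfigSketch:
    """Hypothetical stand-in for Flux2PipelineConfig (three fields only)."""
    embedded_cfg_scale: Optional[float] = 4.0
    flux2_text_encoder_type: str = "mistral3"
    text_encoder_out_layers: Tuple[int, ...] = (10, 20, 30)


@dataclass
class KleinConfigSketch(BaseConfigSketch):
    """Hypothetical stand-in for Flux2KleinPipelineConfig."""
    # Same fields as the base class, with the Klein defaults from this page.
    embedded_cfg_scale: Optional[float] = None
    flux2_text_encoder_type: str = "qwen3"
    text_encoder_out_layers: Tuple[int, ...] = (9, 18, 27)


def changed_defaults(base, derived):
    """Map each field whose value differs to a (base, derived) pair."""
    return {
        f.name: (getattr(base, f.name), getattr(derived, f.name))
        for f in fields(base)
        if getattr(base, f.name) != getattr(derived, f.name)
    }


diff = changed_defaults(BaseConfigSketch(), KleinConfigSketch())
for name, (base_value, klein_value) in diff.items():
    print(f"{name}: {base_value!r} -> {klein_value!r}")
```

With fastvideo installed, the same comparison could presumably be run on the real classes via dataclasses.fields, since both are documented as dataclasses; that is untested here.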

fastvideo.configs.pipelines.flux_2.Flux2KleinTextEncoderConfig dataclass

Flux2KleinTextEncoderConfig(arch_config: EncoderArchConfig = Flux2KleinEncoderArchConfig(), prefix: str = '', quant_config: QuantizationConfig | None = None, lora_config: Any | None = None)

Bases: EncoderConfig

Text encoder config for Flux2 Klein (Qwen3).

fastvideo.configs.pipelines.flux_2.Flux2PipelineConfig dataclass

Flux2PipelineConfig(model_path: str = '', pipeline_config_path: str | None = None, embedded_cfg_scale: float | None = 4.0, flow_shift: float | None = None, flow_shift_sr: float | None = None, disable_autocast: bool = False, scheduler_step_in_fp32: bool = True, is_causal: bool = False, dit_config: DiTConfig = Flux2Config(), dit_precision: str = 'bf16', upsampler_config: UpsamplerConfig = UpsamplerConfig(), upsampler_precision: str = 'fp32', vae_config: VAEConfig = Flux2VAEConfig(), vae_precision: str = 'fp32', vae_tiling: bool = False, vae_sp: bool = False, image_encoder_config: EncoderConfig = EncoderConfig(), image_encoder_precision: str = 'fp32', text_encoder_configs: tuple[EncoderConfig, ...] = (lambda: (Mistral3TextConfig(),))(), text_encoder_precisions: tuple[str, ...] = (lambda: ('bf16',))(), preprocess_text_funcs: tuple[Callable[[str], str], ...] = (lambda: (preprocess_text,))(), postprocess_text_funcs: tuple[Callable[[BaseEncoderOutput], Tensor], ...] = (lambda: (default_postprocess_text,))(), dmd_denoising_steps: list[int] | None = None, ti2v_task: bool = False, lucy_edit_task: bool = False, boundary_ratio: float | None = None, flux2_text_encoder_type: str = 'mistral3', text_encoder_out_layers: tuple[int, ...] = (10, 20, 30))

Bases: PipelineConfig

Configuration for Flux2 image generation pipeline.

Methods:

fastvideo.configs.pipelines.flux_2.Flux2PipelineConfig.default_postprocess_text staticmethod
default_postprocess_text(outputs: BaseEncoderOutput) -> Tensor

Default text postprocessing for Flux2: returns the encoder's last_hidden_state unchanged.

Source code in fastvideo/configs/pipelines/flux_2.py
@staticmethod
def default_postprocess_text(outputs: BaseEncoderOutput) -> torch.Tensor:
    """Default text postprocessing for Flux2."""
    return outputs.last_hidden_state

Functions:

fastvideo.configs.pipelines.flux_2.flux2_klein_postprocess_text

flux2_klein_postprocess_text(outputs: BaseEncoderOutput) -> Tensor

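Klein text postprocessing: stacks the hidden states of layers 9, 18, and 27 (Qwen3) and flattens them into one tensor of shape (batch_size, seq_len, 3 * hidden_dim). Raises ValueError if the encoder output carries no hidden states.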

Source code in fastvideo/configs/pipelines/flux_2.py
def flux2_klein_postprocess_text(outputs: BaseEncoderOutput) -> torch.Tensor:
    """Klein postprocess: hidden states from layers 9, 18, 27 (Qwen3)."""
    hidden_states_layers: list[int] = [9, 18, 27]
    if outputs.hidden_states is None:
        raise ValueError("Flux2 Klein requires output_hidden_states=True from text encoder")
    out = torch.stack([outputs.hidden_states[k] for k in hidden_states_layers], dim=1)
    batch_size, num_channels, seq_len, hidden_dim = out.shape
    prompt_embeds = out.permute(0, 2, 1, 3).reshape(batch_size, seq_len, num_channels * hidden_dim)
    return prompt_embeds
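
The stack, permute, and reshape sequence above amounts to concatenating the three selected layers along the feature dimension for each token. The sketch below checks that on dummy tensors. FakeEncoderOutput and stack_layers are hypothetical names for this example; FakeEncoderOutput mimics only the hidden_states attribute of BaseEncoderOutput, and the tensor sizes are arbitrary.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

import torch


@dataclass
class FakeEncoderOutput:
    """Hypothetical stand-in for BaseEncoderOutput; only the field used here."""
    hidden_states: Optional[Tuple[torch.Tensor, ...]] = None


def stack_layers(outputs, layers=(9, 18, 27)):
    """Same stack/permute/reshape steps as flux2_klein_postprocess_text."""
    if outputs.hidden_states is None:
        raise ValueError("hidden states are required")
    # (batch, num_layers, seq_len, hidden_dim)
    out = torch.stack([outputs.hidden_states[k] for k in layers], dim=1)
    batch_size, num_layers, seq_len, hidden_dim = out.shape
    # Move the layer axis next to the feature axis, then merge the two.
    return out.permute(0, 2, 1, 3).reshape(batch_size, seq_len, num_layers * hidden_dim)


batch_size, seq_len, hidden_dim = 2, 5, 8  # arbitrary sizes for the demo
# 28 dummy tensors so that indices 9, 18, and 27 all exist.
hidden_states = tuple(torch.randn(batch_size, seq_len, hidden_dim) for _ in range(28))
prompt_embeds = stack_layers(FakeEncoderOutput(hidden_states=hidden_states))

# Reference: concatenate the same layers along the last (feature) dimension.
reference = torch.cat([hidden_states[k] for k in (9, 18, 27)], dim=-1)

print(tuple(prompt_embeds.shape))
print(torch.equal(prompt_embeds, reference))
```

Note that the function documented above hardcodes the layer list [9, 18, 27]; the layers parameter in this sketch is an addition for illustration only.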