qwen3
Qwen3 text encoder configuration for FastVideo diffusion models (e.g. Flux2 Klein).
Classes
fastvideo.configs.models.encoders.qwen3.Qwen3TextArchConfig
dataclass
Qwen3TextArchConfig(stacked_params_mapping: list[tuple[str, str, str | int]] = (lambda: [('.qkv_proj', '.q_proj', 'q'), ('.qkv_proj', '.k_proj', 'k'), ('.qkv_proj', '.v_proj', 'v'), ('.gate_up_proj', '.gate_proj', 0), ('.gate_up_proj', '.up_proj', 1)])(), architectures: list[str] = (lambda: [])(), _supported_attention_backends: tuple[AttentionBackendEnum, ...] = (FLASH_ATTN, TORCH_SDPA), output_hidden_states: bool = True, use_return_dict: bool = True, vocab_size: int = 151936, hidden_size: int = 2560, num_hidden_layers: int = 36, num_attention_heads: int = 32, pad_token_id: int = 151643, eos_token_id: int = 151645, text_len: int = 512, hidden_state_skip_layer: int = 0, decoder_start_token_id: int = 0, output_past: bool = True, scalable_attention: bool = True, tie_word_embeddings: bool = True, tokenizer_kwargs: dict[str, Any] = dict(), _fsdp_shard_conditions: list = (lambda: [_is_transformer_layer, _is_embeddings, _is_final_norm])(), require_processor: bool = False, intermediate_size: int = 9728, num_key_value_heads: int = 8, hidden_act: str = 'silu', max_position_embeddings: int = 40960, initializer_range: float = 0.02, rms_norm_eps: float = 1e-06, use_cache: bool = True, bos_token_id: int = 151643, rope_theta: float = 1000000.0, rope_scaling: dict | None = None, attention_bias: bool = False, attention_dropout: float = 0.0, mlp_bias: bool = False, head_dim: int = 128)
Bases: TextEncoderArchConfig
Architecture config for Qwen3 text encoder.
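The defaults in the signature imply concrete projection sizes, and `stacked_params_mapping` describes how separate checkpoint weights fold into fused parameters. The sketch below works through both. It is illustrative only: `resolve_fused_name` is a hypothetical helper (not FastVideo's loader), the checkpoint names in the usage note are made-up Hugging Face-style names, and the assumption that a fused projection concatenates its shards along the output dimension is not confirmed by this page.

```python
# Default values copied from the Qwen3TextArchConfig signature.
HIDDEN_SIZE = 2560
NUM_ATTENTION_HEADS = 32
NUM_KEY_VALUE_HEADS = 8
HEAD_DIM = 128
INTERMEDIATE_SIZE = 9728

# (fused parameter suffix, checkpoint parameter suffix, shard id)
STACKED_PARAMS_MAPPING = [
    (".qkv_proj", ".q_proj", "q"),
    (".qkv_proj", ".k_proj", "k"),
    (".qkv_proj", ".v_proj", "v"),
    (".gate_up_proj", ".gate_proj", 0),
    (".gate_up_proj", ".up_proj", 1),
]


def resolve_fused_name(checkpoint_name):
    """Return (fused_name, shard_id); shard_id is None for unstacked weights.

    Hypothetical helper that only illustrates the mapping.
    """
    for fused, original, shard_id in STACKED_PARAMS_MAPPING:
        if original in checkpoint_name:
            return checkpoint_name.replace(original, fused), shard_id
    return checkpoint_name, None


# Grouped-query attention: 32 query heads share 8 key/value heads.
queries_per_kv = NUM_ATTENTION_HEADS // NUM_KEY_VALUE_HEADS

# head_dim is set explicitly, so the query width (32 * 128 = 4096)
# differs from hidden_size (2560).
q_size = NUM_ATTENTION_HEADS * HEAD_DIM
kv_size = NUM_KEY_VALUE_HEADS * HEAD_DIM

# Assumed layout: fused outputs are the concatenation of their shards.
qkv_out = q_size + 2 * kv_size
gate_up_out = 2 * INTERMEDIATE_SIZE
```

With these definitions, `resolve_fused_name("model.layers.0.self_attn.k_proj.weight")` yields the fused name ending in `.qkv_proj.weight` with shard id `"k"`, while a name such as `model.norm.weight` passes through unchanged with shard id `None`.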
Qwen3 is similar to LLaMA but adds QK-Norm: RMSNorm applied to the query and key projections before attention. Used by Flux2 Klein.
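To make the QK-Norm idea concrete, here is a minimal pure-Python sketch of one attention logit with RMSNorm applied to a single query head and a single key head. It is not FastVideo's implementation: the function names are hypothetical, rotary position embeddings are omitted, and the choice to normalize over the head dimension with `eps` equal to the `rms_norm_eps` default is an assumption.

```python
import math


def rms_norm(vector, weight, eps=1e-6):
    """Scale `vector` to unit root-mean-square, then apply a per-channel weight."""
    mean_square = sum(x * x for x in vector) / len(vector)
    scale = 1.0 / math.sqrt(mean_square + eps)
    return [x * scale * w for x, w in zip(vector, weight)]


def qk_norm_score(q_head, k_head, q_weight, k_weight, eps=1e-6):
    """Scaled dot-product logit for one query/key pair with QK-Norm.

    Hypothetical helper; RoPE is left out to keep the sketch short.
    """
    q = rms_norm(q_head, q_weight, eps)
    k = rms_norm(k_head, k_weight, eps)
    head_dim = len(q_head)
    return sum(a * b for a, b in zip(q, k)) / math.sqrt(head_dim)


ones = [1.0, 1.0, 1.0, 1.0]
base = qk_norm_score([1.0, 2.0, 3.0, 4.0], [4.0, 3.0, 2.0, 1.0], ones, ones)
# Rescaling the query by 10 barely changes the logit, because the norm
# removes the magnitude before the dot product.
scaled = qk_norm_score([10.0, 20.0, 30.0, 40.0], [4.0, 3.0, 2.0, 1.0], ones, ones)
```

The practical effect is that the logit depends on the direction of the query and key vectors and on the learned per-channel weights, not on their raw magnitudes.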
fastvideo.configs.models.encoders.qwen3.Qwen3TextConfig
dataclass
Qwen3TextConfig(arch_config: TextEncoderArchConfig = Qwen3TextArchConfig(), prefix: str = 'qwen3', quant_config: QuantizationConfig | None = None, lora_config: Any | None = None, is_chat_model: bool = True, treat_empty_as_dot: bool = False)
Bases: TextEncoderConfig
Top-level config for Qwen3 text encoder.