FastVideo

hao-ai-lab/FastVideo


audio

July 20, 2026
