
sample

Classes

fastvideo.configs.sample.HunyuanGameCraft129FrameSamplingParam dataclass

HunyuanGameCraft129FrameSamplingParam(data_type: str = 'video', image_path: str | None = None, pil_image: Any | None = None, video_path: str | None = None, mouse_cond: Any | None = None, keyboard_cond: Any | None = None, grid_sizes: Any | None = None, pose: str | None = None, c2ws_plucker_emb: Any | None = None, refine_from: str | None = None, t_thresh: float = 0.5, spatial_refine_only: bool = False, num_cond_frames: int = 0, stage1_video: Any | None = None, prompt: str | list[str] | None = None, negative_prompt: str = '', prompt_path: str | None = None, output_path: str = 'outputs/', output_video_name: str | None = None, num_videos_per_prompt: int = 1, seed: int = 1024, num_frames: int = 129, height: int = 704, width: int = 1280, height_sr: int = 1072, width_sr: int = 1920, fps: int = 24, num_inference_steps: int = 50, num_inference_steps_sr: int = 50, guidance_scale: float = 6.0, guidance_rescale: float = 0.0, boundary_ratio: float | None = None, sigmas: list[float] | None = None, save_video: bool = True, return_frames: bool = True, return_trajectory_latents: bool = False, return_trajectory_decoded: bool = False, camera_states: Any | None = None, camera_trajectory: str | None = None, action_list: list[str] | None = None, action_speed_list: list[float] | None = None, gt_latents: Any | None = None, conditioning_mask: Any | None = None)

Bases: HunyuanGameCraftSamplingParam

Sampling parameters for 129-frame GameCraft generation.

129 video frames -> 33 latent frames. This is the maximum supported by the official implementation.
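The frame arithmetic here is the causal-VAE mapping used throughout this page; a minimal sketch, assuming the 4x temporal compression implied by the 129 -> 33, 65 -> 17, and 33 -> 9 pairs quoted in these docstrings:

def latent_frames(num_video_frames: int, temporal_compression: int = 4) -> int:
    # Causal video VAE: the first frame is encoded on its own, and every
    # subsequent group of `temporal_compression` frames yields one latent frame.
    return (num_video_frames - 1) // temporal_compression + 1

assert latent_frames(129) == 33
assert latent_frames(65) == 17
assert latent_frames(33) == 9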

fastvideo.configs.sample.HunyuanGameCraft65FrameSamplingParam dataclass

HunyuanGameCraft65FrameSamplingParam(data_type: str = 'video', image_path: str | None = None, pil_image: Any | None = None, video_path: str | None = None, mouse_cond: Any | None = None, keyboard_cond: Any | None = None, grid_sizes: Any | None = None, pose: str | None = None, c2ws_plucker_emb: Any | None = None, refine_from: str | None = None, t_thresh: float = 0.5, spatial_refine_only: bool = False, num_cond_frames: int = 0, stage1_video: Any | None = None, prompt: str | list[str] | None = None, negative_prompt: str = '', prompt_path: str | None = None, output_path: str = 'outputs/', output_video_name: str | None = None, num_videos_per_prompt: int = 1, seed: int = 1024, num_frames: int = 65, height: int = 704, width: int = 1280, height_sr: int = 1072, width_sr: int = 1920, fps: int = 24, num_inference_steps: int = 50, num_inference_steps_sr: int = 50, guidance_scale: float = 6.0, guidance_rescale: float = 0.0, boundary_ratio: float | None = None, sigmas: list[float] | None = None, save_video: bool = True, return_frames: bool = True, return_trajectory_latents: bool = False, return_trajectory_decoded: bool = False, camera_states: Any | None = None, camera_trajectory: str | None = None, action_list: list[str] | None = None, action_speed_list: list[float] | None = None, gt_latents: Any | None = None, conditioning_mask: Any | None = None)

Bases: HunyuanGameCraftSamplingParam

Sampling parameters for 65-frame GameCraft generation.

65 video frames -> 17 latent frames (with first frame as key frame). This is useful for longer video generation.

fastvideo.configs.sample.HunyuanGameCraftSamplingParam dataclass

HunyuanGameCraftSamplingParam(data_type: str = 'video', image_path: str | None = None, pil_image: Any | None = None, video_path: str | None = None, mouse_cond: Any | None = None, keyboard_cond: Any | None = None, grid_sizes: Any | None = None, pose: str | None = None, c2ws_plucker_emb: Any | None = None, refine_from: str | None = None, t_thresh: float = 0.5, spatial_refine_only: bool = False, num_cond_frames: int = 0, stage1_video: Any | None = None, prompt: str | list[str] | None = None, negative_prompt: str = '', prompt_path: str | None = None, output_path: str = 'outputs/', output_video_name: str | None = None, num_videos_per_prompt: int = 1, seed: int = 1024, num_frames: int = 33, height: int = 704, width: int = 1280, height_sr: int = 1072, width_sr: int = 1920, fps: int = 24, num_inference_steps: int = 50, num_inference_steps_sr: int = 50, guidance_scale: float = 6.0, guidance_rescale: float = 0.0, boundary_ratio: float | None = None, sigmas: list[float] | None = None, save_video: bool = True, return_frames: bool = True, return_trajectory_latents: bool = False, return_trajectory_decoded: bool = False, camera_states: Any | None = None, camera_trajectory: str | None = None, action_list: list[str] | None = None, action_speed_list: list[float] | None = None, gt_latents: Any | None = None, conditioning_mask: Any | None = None)

Bases: SamplingParam

Sampling parameters for HunyuanGameCraft video generation.

Supports camera/action conditioning via:

- camera_trajectory: Plücker coordinates for camera motion
- action_list: list of actions (e.g., ["forward", "left", "right"])
- action_speed_list: speed multipliers for each action

Default resolution is 704x1280 (same as HunyuanVideo). Default frame count is 33 video frames -> 9 latent frames.
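A minimal construction sketch; the field names come from the signature above, while the prompt and action values are illustrative:

from fastvideo.configs.sample import HunyuanGameCraftSamplingParam

param = HunyuanGameCraftSamplingParam(
    prompt="A knight walks through a medieval courtyard",
    # One speed multiplier per action, matching the fields described above.
    action_list=["forward", "left", "forward"],
    action_speed_list=[1.0, 0.5, 1.0],
)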

fastvideo.configs.sample.SamplingParam dataclass

SamplingParam(data_type: str = 'video', image_path: str | None = None, pil_image: Any | None = None, video_path: str | None = None, mouse_cond: Any | None = None, keyboard_cond: Any | None = None, grid_sizes: Any | None = None, pose: str | None = None, c2ws_plucker_emb: Any | None = None, refine_from: str | None = None, t_thresh: float = 0.5, spatial_refine_only: bool = False, num_cond_frames: int = 0, stage1_video: Any | None = None, prompt: str | list[str] | None = None, negative_prompt: str = 'Bright tones, overexposed, static, blurred details, subtitles, style, works, paintings, images, static, overall gray, worst quality, low quality, JPEG compression residue, ugly, incomplete, extra fingers, poorly drawn hands, poorly drawn faces, deformed, disfigured, misshapen limbs, fused fingers, still picture, messy background, three legs, many people in the background, walking backwards', prompt_path: str | None = None, output_path: str = 'outputs/', output_video_name: str | None = None, num_videos_per_prompt: int = 1, seed: int = 1024, num_frames: int = 125, height: int = 720, width: int = 1280, height_sr: int = 1072, width_sr: int = 1920, fps: int = 24, num_inference_steps: int = 50, num_inference_steps_sr: int = 50, guidance_scale: float = 1.0, guidance_rescale: float = 0.0, boundary_ratio: float | None = None, sigmas: list[float] | None = None, save_video: bool = True, return_frames: bool = True, return_trajectory_latents: bool = False, return_trajectory_decoded: bool = False)

Sampling parameters for video generation.

Functions

fastvideo.configs.sample.SamplingParam.add_cli_args staticmethod
add_cli_args(parser: Any) -> Any

Add CLI arguments for SamplingParam fields

Source code in fastvideo/configs/sample/base.py
@staticmethod
def add_cli_args(parser: Any) -> Any:
    """Add CLI arguments for SamplingParam fields"""
    parser.add_argument(
        "--prompt",
        type=str,
        default=SamplingParam.prompt,
        help="Text prompt for video generation",
    )
    parser.add_argument(
        "--negative-prompt",
        type=str,
        default=SamplingParam.negative_prompt,
        help="Negative text prompt for video generation",
    )
    parser.add_argument(
        "--prompt-path",
        type=str,
        default=SamplingParam.prompt_path,
        help="Path to a text file containing the prompt",
    )
    parser.add_argument(
        "--output-path",
        type=str,
        default=SamplingParam.output_path,
        help="Path to save the generated video",
    )
    parser.add_argument(
        "--output-video-name",
        type=str,
        default=SamplingParam.output_video_name,
        help="Name of the output video",
    )
    parser.add_argument(
        "--num-videos-per-prompt",
        type=int,
        default=SamplingParam.num_videos_per_prompt,
        help="Number of videos to generate per prompt",
    )
    parser.add_argument(
        "--seed",
        type=int,
        default=SamplingParam.seed,
        help="Random seed for generation",
    )
    parser.add_argument(
        "--num-frames",
        type=int,
        default=SamplingParam.num_frames,
        help="Number of frames to generate",
    )
    parser.add_argument(
        "--height",
        type=int,
        default=SamplingParam.height,
        help="Height of generated video",
    )
    parser.add_argument(
        "--width",
        type=int,
        default=SamplingParam.width,
        help="Width of generated video",
    )
    parser.add_argument(
        "--fps",
        type=int,
        default=SamplingParam.fps,
        help="Frames per second for saved video",
    )
    parser.add_argument(
        "--num-inference-steps",
        type=int,
        default=SamplingParam.num_inference_steps,
        help="Number of denoising steps",
    )
    parser.add_argument(
        "--guidance-scale",
        type=float,
        default=SamplingParam.guidance_scale,
        help="Classifier-free guidance scale",
    )
    parser.add_argument(
        "--guidance-rescale",
        type=float,
        default=SamplingParam.guidance_rescale,
        help="Guidance rescale factor",
    )
    parser.add_argument(
        "--boundary-ratio",
        type=float,
        default=SamplingParam.boundary_ratio,
        help="Boundary timestep ratio",
    )
    parser.add_argument(
        "--save-video",
        action="store_true",
        default=SamplingParam.save_video,
        help="Whether to save the video to disk",
    )
    parser.add_argument(
        "--no-save-video",
        action="store_false",
        dest="save_video",
        help="Don't save the video to disk",
    )
    parser.add_argument(
        "--return-frames",
        action="store_true",
        default=False,
        help="Whether to return the raw frames",
    )
    parser.add_argument(
        "--image-path",
        type=str,
        default=SamplingParam.image_path,
        help="Path to input image for image-to-video generation",
    )
    parser.add_argument(
        "--video-path",
        type=str,
        default=SamplingParam.video_path,
        help="Path to input video for video-to-video generation",
    )
    parser.add_argument(
        "--refine-from",
        type=str,
        default=SamplingParam.refine_from,
        help="Path to stage1 video for refinement (LongCat 480p->720p)",
    )
    parser.add_argument(
        "--t-thresh",
        type=float,
        default=SamplingParam.t_thresh,
        help="Threshold for timestep scheduling in refinement (default: 0.5)",
    )
    parser.add_argument(
        "--spatial-refine-only",
        action=StoreBoolean,
        default=SamplingParam.spatial_refine_only,
        help="Only perform spatial super-resolution (no temporal doubling)",
    )
    parser.add_argument(
        "--num-cond-frames",
        type=int,
        default=SamplingParam.num_cond_frames,
        help="Number of conditioning frames for refinement",
    )
    parser.add_argument(
        "--moba-config-path",
        type=str,
        default=None,
        help="Path to a JSON file containing V-MoBA specific configurations.",
    )
    parser.add_argument(
        "--return-trajectory-latents",
        action="store_true",
        default=SamplingParam.return_trajectory_latents,
        help="Whether to return the trajectory",
    )
    parser.add_argument(
        "--return-trajectory-decoded",
        action="store_true",
        default=SamplingParam.return_trajectory_decoded,
        help="Whether to return the decoded trajectory",
    )
    return parser
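A usage sketch for wiring these flags into an argparse-based script (attribute names follow argparse's dash-to-underscore conversion; the prompt and frame count are illustrative):

import argparse

from fastvideo.configs.sample import SamplingParam

parser = argparse.ArgumentParser()
parser = SamplingParam.add_cli_args(parser)
args = parser.parse_args(["--prompt", "a cat playing piano", "--num-frames", "81"])
print(args.prompt, args.num_frames, args.save_video)  # save_video defaults to True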

Modules

fastvideo.configs.sample.base

Classes

fastvideo.configs.sample.base.SamplingParam dataclass

Sampling parameters for video generation. Re-exported as fastvideo.configs.sample.SamplingParam; the full signature, the add_cli_args staticmethod, and its source listing appear above.

fastvideo.configs.sample.hunyuangamecraft

Sampling parameters for HunyuanGameCraft video generation.

GameCraft generates game-like videos with camera/action control. Default parameters are based on the official implementation.

Classes

The hunyuangamecraft module defines HunyuanGameCraftSamplingParam, HunyuanGameCraft65FrameSamplingParam, and HunyuanGameCraft129FrameSamplingParam. All three are re-exported under fastvideo.configs.sample; their signatures and docstrings appear above.

fastvideo.configs.sample.ltx2

Classes

fastvideo.configs.sample.ltx2.LTX2BaseSamplingParam dataclass
LTX2BaseSamplingParam(data_type: str = 'video', image_path: str | None = None, pil_image: Any | None = None, video_path: str | None = None, mouse_cond: Any | None = None, keyboard_cond: Any | None = None, grid_sizes: Any | None = None, pose: str | None = None, c2ws_plucker_emb: Any | None = None, refine_from: str | None = None, t_thresh: float = 0.5, spatial_refine_only: bool = False, num_cond_frames: int = 0, stage1_video: Any | None = None, prompt: str | list[str] | None = None, negative_prompt: str = 'blurry, out of focus, overexposed, underexposed, low contrast, washed out colors, excessive noise, grainy texture, poor lighting, flickering, motion blur, distorted proportions, unnatural skin tones, deformed facial features, asymmetrical face, missing facial features, extra limbs, disfigured hands, wrong hand count, artifacts around text, inconsistent perspective, camera shake, incorrect depth of field, background too sharp, background clutter, distracting reflections, harsh shadows, inconsistent lighting direction, color banding, cartoonish rendering, 3D CGI look, unrealistic materials, uncanny valley effect, incorrect ethnicity, wrong gender, exaggerated expressions, wrong gaze direction, mismatched lip sync, silent or muted audio, distorted voice, robotic voice, echo, background noise, off-sync audio, incorrect dialogue, added dialogue, repetitive speech, jittery movement, awkward pauses, incorrect timing, unnatural transitions, inconsistent framing, tilted camera, flat lighting, inconsistent tone, cinematic oversaturation, stylized filters, or AI artifacts.', prompt_path: str | None = None, output_path: str = 'outputs/', output_video_name: str | None = None, num_videos_per_prompt: int = 1, seed: int = 10, num_frames: int = 121, height: int = 512, width: int = 768, height_sr: int = 1072, width_sr: int = 1920, fps: int = 24, num_inference_steps: int = 40, num_inference_steps_sr: int = 50, guidance_scale: float = 3.0, guidance_rescale: float = 0.0, boundary_ratio: float | None = None, sigmas: list[float] | None = None, save_video: bool = True, return_frames: bool = True, return_trajectory_latents: bool = False, return_trajectory_decoded: bool = False, ltx2_cfg_scale_video: float = 3.0, ltx2_cfg_scale_audio: float = 7.0, ltx2_modality_scale_video: float = 3.0, ltx2_modality_scale_audio: float = 3.0, ltx2_rescale_scale: float = 0.7, ltx2_stg_scale_video: float = 1.0, ltx2_stg_scale_audio: float = 1.0, ltx2_stg_blocks_video: list[int] = (lambda: [29])(), ltx2_stg_blocks_audio: list[int] = (lambda: [29])())

Bases: SamplingParam

Default sampling parameters for LTX-2 base one-stage T2V.

Values follow the official LTX-2 one-stage defaults. Multi-modal CFG params are read by LTX2DenoisingStage.
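A construction sketch, assuming the ltx2_* fields in the signature above are passed through unchanged to the denoising stage; the prompt and scale values are illustrative:

from fastvideo.configs.sample.ltx2 import LTX2BaseSamplingParam

param = LTX2BaseSamplingParam(
    prompt="Waves crash against a rocky shore at dusk",
    # Separate CFG scales per modality, as exposed by the dataclass fields.
    ltx2_cfg_scale_video=4.0,
    ltx2_cfg_scale_audio=8.0,
)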

fastvideo.configs.sample.ltx2.LTX2DistilledSamplingParam dataclass
LTX2DistilledSamplingParam(data_type: str = 'video', image_path: str | None = None, pil_image: Any | None = None, video_path: str | None = None, mouse_cond: Any | None = None, keyboard_cond: Any | None = None, grid_sizes: Any | None = None, pose: str | None = None, c2ws_plucker_emb: Any | None = None, refine_from: str | None = None, t_thresh: float = 0.5, spatial_refine_only: bool = False, num_cond_frames: int = 0, stage1_video: Any | None = None, prompt: str | list[str] | None = None, negative_prompt: str = '', prompt_path: str | None = None, output_path: str = 'outputs/', output_video_name: str | None = None, num_videos_per_prompt: int = 1, seed: int = 10, num_frames: int = 121, height: int = 1024, width: int = 1536, height_sr: int = 1072, width_sr: int = 1920, fps: int = 24, num_inference_steps: int = 8, num_inference_steps_sr: int = 50, guidance_scale: float = 1.0, guidance_rescale: float = 0.0, boundary_ratio: float | None = None, sigmas: list[float] | None = None, save_video: bool = True, return_frames: bool = True, return_trajectory_latents: bool = False, return_trajectory_decoded: bool = False)

Bases: SamplingParam

Default sampling parameters for LTX-2 distilled one-stage T2V.

fastvideo.configs.sample.turbodiffusion

TurboDiffusion sampling parameters.

TurboDiffusion uses RCM (recurrent Consistency Model) scheduler for 1-4 step video generation with no classifier-free guidance.

Classes

fastvideo.configs.sample.turbodiffusion.TurboDiffusionI2V_A14B_SamplingParam dataclass
TurboDiffusionI2V_A14B_SamplingParam(data_type: str = 'video', image_path: str | None = None, pil_image: Any | None = None, video_path: str | None = None, mouse_cond: Any | None = None, keyboard_cond: Any | None = None, grid_sizes: Any | None = None, pose: str | None = None, c2ws_plucker_emb: Any | None = None, refine_from: str | None = None, t_thresh: float = 0.5, spatial_refine_only: bool = False, num_cond_frames: int = 0, stage1_video: Any | None = None, prompt: str | list[str] | None = None, negative_prompt: str | None = None, prompt_path: str | None = None, output_path: str = 'outputs/', output_video_name: str | None = None, num_videos_per_prompt: int = 1, seed: int = 1024, num_frames: int = 81, height: int = 720, width: int = 1280, height_sr: int = 1072, width_sr: int = 1920, fps: int = 16, num_inference_steps: int = 4, num_inference_steps_sr: int = 50, guidance_scale: float = 1.0, guidance_rescale: float = 0.0, boundary_ratio: float | None = None, sigmas: list[float] | None = None, save_video: bool = True, return_frames: bool = True, return_trajectory_latents: bool = False, return_trajectory_decoded: bool = False)

Bases: SamplingParam

Sampling parameters for TurboDiffusion I2V A14B model.

Uses 4-step RCM sampling with dual-model switching (high/low noise).

fastvideo.configs.sample.turbodiffusion.TurboDiffusionT2V_14B_SamplingParam dataclass
TurboDiffusionT2V_14B_SamplingParam(data_type: str = 'video', image_path: str | None = None, pil_image: Any | None = None, video_path: str | None = None, mouse_cond: Any | None = None, keyboard_cond: Any | None = None, grid_sizes: Any | None = None, pose: str | None = None, c2ws_plucker_emb: Any | None = None, refine_from: str | None = None, t_thresh: float = 0.5, spatial_refine_only: bool = False, num_cond_frames: int = 0, stage1_video: Any | None = None, prompt: str | list[str] | None = None, negative_prompt: str | None = None, prompt_path: str | None = None, output_path: str = 'outputs/', output_video_name: str | None = None, num_videos_per_prompt: int = 1, seed: int = 1024, num_frames: int = 81, height: int = 720, width: int = 1280, height_sr: int = 1072, width_sr: int = 1920, fps: int = 16, num_inference_steps: int = 4, num_inference_steps_sr: int = 50, guidance_scale: float = 1.0, guidance_rescale: float = 0.0, boundary_ratio: float | None = None, sigmas: list[float] | None = None, save_video: bool = True, return_frames: bool = True, return_trajectory_latents: bool = False, return_trajectory_decoded: bool = False)

Bases: SamplingParam

Sampling parameters for TurboDiffusion T2V 14B model.

Uses 4-step RCM sampling with guidance_scale=1.0 (no CFG).
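A construction sketch for trading quality for speed within the 1-4 step range described in the module docstring; the prompt and step count are illustrative:

from fastvideo.configs.sample.turbodiffusion import TurboDiffusionT2V_14B_SamplingParam

param = TurboDiffusionT2V_14B_SamplingParam(
    prompt="A drone shot over snow-covered mountains",
    num_inference_steps=2,  # default is 4; guidance_scale stays at 1.0 (no CFG)
)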

fastvideo.configs.sample.turbodiffusion.TurboDiffusionT2V_1_3B_SamplingParam dataclass
TurboDiffusionT2V_1_3B_SamplingParam(data_type: str = 'video', image_path: str | None = None, pil_image: Any | None = None, video_path: str | None = None, mouse_cond: Any | None = None, keyboard_cond: Any | None = None, grid_sizes: Any | None = None, pose: str | None = None, c2ws_plucker_emb: Any | None = None, refine_from: str | None = None, t_thresh: float = 0.5, spatial_refine_only: bool = False, num_cond_frames: int = 0, stage1_video: Any | None = None, prompt: str | list[str] | None = None, negative_prompt: str | None = None, prompt_path: str | None = None, output_path: str = 'outputs/', output_video_name: str | None = None, num_videos_per_prompt: int = 1, seed: int = 1024, num_frames: int = 81, height: int = 480, width: int = 832, height_sr: int = 1072, width_sr: int = 1920, fps: int = 16, num_inference_steps: int = 4, num_inference_steps_sr: int = 50, guidance_scale: float = 1.0, guidance_rescale: float = 0.0, boundary_ratio: float | None = None, sigmas: list[float] | None = None, save_video: bool = True, return_frames: bool = True, return_trajectory_latents: bool = False, return_trajectory_decoded: bool = False)

Bases: SamplingParam

Sampling parameters for TurboDiffusion T2V 1.3B model.

Uses 4-step RCM sampling with guidance_scale=1.0 (no CFG).

fastvideo.configs.sample.wan

Classes

fastvideo.configs.sample.wan.Wan2_1_Fun_1_3B_InP_SamplingParam dataclass
Wan2_1_Fun_1_3B_InP_SamplingParam(data_type: str = 'video', image_path: str | None = None, pil_image: Any | None = None, video_path: str | None = None, mouse_cond: Any | None = None, keyboard_cond: Any | None = None, grid_sizes: Any | None = None, pose: str | None = None, c2ws_plucker_emb: Any | None = None, refine_from: str | None = None, t_thresh: float = 0.5, spatial_refine_only: bool = False, num_cond_frames: int = 0, stage1_video: Any | None = None, prompt: str | list[str] | None = None, negative_prompt: str | None = '色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走', prompt_path: str | None = None, output_path: str = 'outputs/', output_video_name: str | None = None, num_videos_per_prompt: int = 1, seed: int = 1024, num_frames: int = 81, height: int = 480, width: int = 832, height_sr: int = 1072, width_sr: int = 1920, fps: int = 16, num_inference_steps: int = 50, num_inference_steps_sr: int = 50, guidance_scale: float = 6.0, guidance_rescale: float = 0.0, boundary_ratio: float | None = None, sigmas: list[float] | None = None, save_video: bool = True, return_frames: bool = True, return_trajectory_latents: bool = False, return_trajectory_decoded: bool = False)

Bases: SamplingParam

Sampling parameters for Wan2.1 Fun 1.3B InP model.

fastvideo.configs.sample.wan.Wan2_2_Base_SamplingParam dataclass
Wan2_2_Base_SamplingParam(data_type: str = 'video', image_path: str | None = None, pil_image: Any | None = None, video_path: str | None = None, mouse_cond: Any | None = None, keyboard_cond: Any | None = None, grid_sizes: Any | None = None, pose: str | None = None, c2ws_plucker_emb: Any | None = None, refine_from: str | None = None, t_thresh: float = 0.5, spatial_refine_only: bool = False, num_cond_frames: int = 0, stage1_video: Any | None = None, prompt: str | list[str] | None = None, negative_prompt: str | None = '色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走', prompt_path: str | None = None, output_path: str = 'outputs/', output_video_name: str | None = None, num_videos_per_prompt: int = 1, seed: int = 1024, num_frames: int = 125, height: int = 720, width: int = 1280, height_sr: int = 1072, width_sr: int = 1920, fps: int = 24, num_inference_steps: int = 50, num_inference_steps_sr: int = 50, guidance_scale: float = 1.0, guidance_rescale: float = 0.0, boundary_ratio: float | None = None, sigmas: list[float] | None = None, save_video: bool = True, return_frames: bool = True, return_trajectory_latents: bool = False, return_trajectory_decoded: bool = False)

Bases: SamplingParam

Base sampling parameters shared by Wan2.2 models.

fastvideo.configs.sample.wan.Wan2_2_TI2V_5B_SamplingParam dataclass
Wan2_2_TI2V_5B_SamplingParam(data_type: str = 'video', image_path: str | None = None, pil_image: Any | None = None, video_path: str | None = None, mouse_cond: Any | None = None, keyboard_cond: Any | None = None, grid_sizes: Any | None = None, pose: str | None = None, c2ws_plucker_emb: Any | None = None, refine_from: str | None = None, t_thresh: float = 0.5, spatial_refine_only: bool = False, num_cond_frames: int = 0, stage1_video: Any | None = None, prompt: str | list[str] | None = None, negative_prompt: str | None = '色调艳丽,过曝,静态,细节模糊不清,字幕,风格,作品,画作,画面,静止,整体发灰,最差质量,低质量,JPEG压缩残留,丑陋的,残缺的,多余的手指,画得不好的手部,画得不好的脸部,畸形的,毁容的,形态畸形的肢体,手指融合,静止不动的画面,杂乱的背景,三条腿,背景人很多,倒着走', prompt_path: str | None = None, output_path: str = 'outputs/', output_video_name: str | None = None, num_videos_per_prompt: int = 1, seed: int = 1024, num_frames: int = 121, height: int = 704, width: int = 1280, height_sr: int = 1072, width_sr: int = 1920, fps: int = 24, num_inference_steps: int = 50, num_inference_steps_sr: int = 50, guidance_scale: float = 5.0, guidance_rescale: float = 0.0, boundary_ratio: float | None = None, sigmas: list[float] | None = None, save_video: bool = True, return_frames: bool = True, return_trajectory_latents: bool = False, return_trajectory_decoded: bool = False)

Bases: Wan2_2_Base_SamplingParam

Sampling parameters for Wan2.2 TI2V 5B model.
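A construction sketch for the image-conditioned case via the inherited image_path field; the prompt and path are hypothetical:

from fastvideo.configs.sample.wan import Wan2_2_TI2V_5B_SamplingParam

param = Wan2_2_TI2V_5B_SamplingParam(
    prompt="The cat slowly turns its head toward the camera",
    image_path="inputs/cat.png",  # hypothetical input image
)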