# Model Parameter Documentation

## Text to Video

### CogVideoX 5b

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| model | string | "cogvideox-5b" |
#### Optional Arguments

| Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| negative_prompt | string | "" | The prompt or prompts not to guide video generation. Ignored if guidance_scale is less than 1. |
| height | int | 480 | The height in pixels of the generated video. |
| width | int | 720 | The width in pixels of the generated video. |
| num_frames | int | 48 | Number of frames to generate. |
| num_inference_steps | int | 50 | The number of denoising steps. More steps can improve quality but are slower. |
| timesteps | list | | Custom timesteps to use for the denoising process; must be in descending order. |
| guidance_scale | float | 7.0 | Classifier-free diffusion guidance scale. Higher values align the video more closely with the prompt. |
| num_videos_per_prompt | int | 1 | The number of videos to generate for each prompt. Note: Tio Magic Animation Framework currently only supports 1 video output. |
| generator | torch.Generator | | A torch.Generator or List[torch.Generator] to make generation deterministic. |
| latents | torch.FloatTensor | | Pre-generated noisy latents to be used as inputs for generation. |
| prompt_embeds | torch.FloatTensor | | Pre-generated text embeddings, used as an alternative to the 'prompt' argument. |
| negative_prompt_embeds | torch.FloatTensor | | Pre-generated negative text embeddings, used as an alternative to the 'negative_prompt' argument. |
| output_type | str | pil | The output format of the generated video. Choose between 'pil' or 'np.array'. |
| return_dict | bool | True | Whether to return a CogVideoXPipelineOutput object instead of a plain tuple. |
| attention_kwargs | dict | | A kwargs dictionary passed to the AttentionProcessor. |
| callback_on_step_end | Callable | | A function called at the end of each denoising step during inference. |
| callback_on_step_end_tensor_inputs | list | | The list of tensor inputs for the callback_on_step_end function. |
| max_sequence_length | int | 226 | Maximum sequence length in the encoded prompt. |
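For example, a minimal text-to-video request combines the required arguments with a few optional overrides. The dictionary below is only an illustrative sketch: the parameter names and defaults come from the tables above, but the exact submission syntax of the Tio Magic Animation Framework is not specified on this page.

```python
import torch

# Illustrative payload only; names/defaults are from the tables above,
# while the payload shape itself is an assumption.
payload = {
    "prompt": "a paper boat drifting down a rain-soaked street",  # required
    "model": "cogvideox-5b",                                      # required
    "negative_prompt": "blurry, low quality",
    "height": 480,                 # default
    "width": 720,                  # default
    "num_frames": 48,              # default
    "num_inference_steps": 50,     # default
    "guidance_scale": 7.0,         # default
    # A seeded generator makes the run reproducible.
    "generator": torch.Generator("cpu").manual_seed(42),
}
```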
### Pusa V1

Because Pusa V1 is relatively new, there is no central location for its required and optional arguments. Based on its examples, we have added some parameters to the optional arguments section.

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| model | string | "pusa-v1" |

#### Optional Arguments

| Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| negative_prompt | string | "" | The prompt or prompts not to guide video generation. |
### Wan 2.1 Text to Video 14b

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| model | string | "wan2.1-t2v-14b" |

#### Optional Arguments

| Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| negative_prompt | string | "" | The prompt or prompts not to guide video generation. Ignored if guidance_scale is less than 1. |
| height | int | 480 | The height in pixels of the generated video. |
| width | int | 832 | The width in pixels of the generated video. |
| num_frames | int | 81 | Number of frames in the generated video. |
| num_inference_steps | int | 50 | The number of denoising steps. More steps usually lead to higher quality at the expense of slower inference. |
| guidance_scale | float | 5.0 | Guidance scale for classifier-free diffusion. Higher values encourage generation to be closely linked to the text prompt. |
| num_videos_per_prompt | int | 1 | The number of videos to generate for each prompt. Note: Tio Magic Animation Framework currently only supports 1 video output. |
| generator | torch.Generator | | A torch.Generator or List[torch.Generator] to make generation deterministic. |
| latents | torch.FloatTensor | | Pre-generated noisy latents to be used as inputs for generation. |
| prompt_embeds | torch.FloatTensor | | Pre-generated text embeddings, used as an alternative to the 'prompt' argument. |
| output_type | str | np | The output format of the generated video. Choose between 'pil' or 'np.array'. |
| return_dict | bool | True | Whether to return a WanPipelineOutput object instead of a plain tuple. |
| attention_kwargs | dict | | A kwargs dictionary passed to the AttentionProcessor. |
| callback_on_step_end | Callable | | A function called at the end of each denoising step during inference. |
| callback_on_step_end_tensor_inputs | list | | The list of tensor inputs for the callback_on_step_end function. |
| max_sequence_length | int | 512 | Maximum sequence length in the encoded prompt. |
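Of these, `generator` is the one argument with a concrete API behind it: it takes a real torch.Generator, so reproducible runs look the same regardless of how the rest of the payload is assembled.

```python
import torch

# A seeded torch.Generator makes denoising deterministic across runs.
generator = torch.Generator(device="cuda").manual_seed(1234)

# Passed alongside the documented parameters, e.g.:
# {"prompt": ..., "model": "wan2.1-t2v-14b", "generator": generator}
```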
### Wan 2.1 14b Text to Video FusionX

This is a LoRA applied on top of Wan 2.1 Vace 14b. All required and optional arguments are the same as for Wan 2.1 Vace 14b, except for the model string.

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| model | string | "wan2.1-14b-t2v-fusionx" |
### Wan 2.1 Vace 14b

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| model | string | "wan2.1-vace-14b" |

#### Optional Arguments

| Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| negative_prompt | string | "" | The prompt or prompts not to guide video generation. Ignored if guidance_scale is less than 1. |
| height | int | 480 | The height in pixels of the generated video. |
| width | int | 832 | The width in pixels of the generated video. |
| conditioning_scale | float | 1.0 | The scale applied to the control conditioning latent stream. Can be a float, List[float], or torch.Tensor. |
| num_frames | int | 81 | Number of frames in the generated video. |
| num_inference_steps | int | 50 | The number of denoising steps. More steps usually lead to higher quality at the expense of slower inference. |
| guidance_scale | float | 5.0 | Guidance scale for classifier-free diffusion. Higher values encourage generation to be closely linked to the text prompt. |
| num_videos_per_prompt | int | 1 | The number of videos to generate for each prompt. Note: Tio Magic Animation Framework currently only supports 1 video output. |
| generator | torch.Generator | | A torch.Generator or List[torch.Generator] to make generation deterministic. |
| latents | torch.FloatTensor | | Pre-generated noisy latents to be used as inputs for generation. |
| prompt_embeds | torch.FloatTensor | | Pre-generated text embeddings, used as an alternative to the 'prompt' argument. |
| output_type | str | np | The output format of the generated video. Choose between 'pil' or 'np.array'. |
| return_dict | bool | True | Whether to return a WanPipelineOutput object instead of a plain tuple. |
| attention_kwargs | dict | | A kwargs dictionary passed to the AttentionProcessor. |
| callback_on_step_end | Callable | | A function called at the end of each denoising step during inference. |
| callback_on_step_end_tensor_inputs | list | | The list of tensor inputs for the callback_on_step_end function. |
| max_sequence_length | int | 512 | Maximum sequence length in the encoded prompt. |
| flow_shift | float | 3.0 | A value that estimates motion between two frames. A larger flow shift focuses on high motion or transformation. A smaller flow shift focuses on stability. |
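As a hedged illustration of how flow_shift trades motion against stability, the two payload fragments below vary only that value. The dictionary shape is an assumption; the parameter name and its 3.0 default come from the table above.

```python
# Hypothetical payload fragments; only flow_shift differs.
calm_scene = {
    "prompt": "a lantern glowing in a still room",
    "model": "wan2.1-vace-14b",
    "flow_shift": 1.5,  # below the 3.0 default: favors stability
}
action_scene = {
    "prompt": "a skateboarder launching off a ramp",
    "model": "wan2.1-vace-14b",
    "flow_shift": 5.0,  # above the default: favors large motion
}
```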
### Wan 2.1 Vace 14b Phantom FusionX

This is a LoRA applied on top of Wan 2.1 Vace 14b. All required and optional arguments are the same as for Wan 2.1 Vace 14b, except for the model string.

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| model | string | "wan2.1-vace-14b-phantom-fusionx" |
## Image to Video
### CogVideoX 5b Image to Video

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| image | string | Local path or URL to input image. Note: this model only supports 720 x 480 resolution. Unlike our other model implementations, we do not automatically resize the video to match the resolution of the given image. |
| model | string | "cogvideox-5b-image-to-video" |
#### Optional Arguments

| Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| negative_prompt | string | "" | The prompt or prompts not to guide video generation. Ignored if guidance_scale is less than 1. |
| height | int | 480 | The height in pixels of the generated video. |
| width | int | 720 | The width in pixels of the generated video. |
| num_frames | int | 48 | Number of frames to generate. |
| num_inference_steps | int | 50 | The number of denoising steps. More steps can improve quality but are slower. |
| timesteps | list | | Custom timesteps to use for the denoising process; must be in descending order. |
| guidance_scale | float | 7.0 | Classifier-free diffusion guidance scale. Higher values align the video more closely with the prompt. |
| num_videos_per_prompt | int | 1 | The number of videos to generate for each prompt. Note: Tio Magic Animation Framework currently only supports 1 video output. |
| generator | torch.Generator | | A torch.Generator or List[torch.Generator] to make generation deterministic. |
| latents | torch.FloatTensor | | Pre-generated noisy latents to be used as inputs for generation. |
| prompt_embeds | torch.FloatTensor | | Pre-generated text embeddings, used as an alternative to the 'prompt' argument. |
| negative_prompt_embeds | torch.FloatTensor | | Pre-generated negative text embeddings, used as an alternative to the 'negative_prompt' argument. |
| output_type | str | pil | The output format of the generated video. Choose between 'pil' or 'np.array'. |
| return_dict | bool | True | Whether to return a CogVideoXPipelineOutput object instead of a plain tuple. |
| attention_kwargs | dict | | A kwargs dictionary passed to the AttentionProcessor. |
| callback_on_step_end | Callable | | A function called at the end of each denoising step during inference. |
| callback_on_step_end_tensor_inputs | list | | The list of tensor inputs for the callback_on_step_end function. |
| max_sequence_length | int | 226 | Maximum sequence length in the encoded prompt. |
### Framepack I2V HY

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| image | string | Local path or URL to input image. |
| model | string | "framepack-i2v-hy" |
#### Optional Arguments

| Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| prompt_2 | string | "" | A secondary prompt for the second text encoder; defaults to the main prompt if not provided. |
| negative_prompt | string | "" | The prompt or prompts not to guide video generation. Ignored if guidance_scale is less than 1. |
| negative_prompt_2 | string | "" | A secondary negative prompt for the second text encoder. |
| height | int | 720 | The height in pixels of the generated video. |
| width | int | 1280 | The width in pixels of the generated video. |
| num_frames | int | 129 | Number of frames to generate. |
| num_inference_steps | int | 50 | The number of denoising steps. More steps can improve quality but are slower. |
| sigmas | list | | Custom sigmas for the denoising scheduler. |
| true_cfg_scale | float | 1.0 | Enables true classifier-free guidance when > 1.0. |
| guidance_scale | float | 6.0 | Guidance scale to control how closely the video adheres to the prompt. |
| num_videos_per_prompt | int | 1 | The number of videos to generate for each prompt. Note: Tio Magic Animation Framework currently only supports 1 video output. |
| generator | torch.Generator | | A torch.Generator or List[torch.Generator] to make generation deterministic. |
| image_latents | torch.Tensor | | Pre-encoded image latents, bypassing the VAE for the first image. |
| last_image_latents | torch.Tensor | | Pre-encoded image latents, bypassing the VAE for the last image. |
| prompt_embeds | torch.Tensor | | Pre-generated text embeddings, an alternative to 'prompt'. |
| pooled_prompt_embeds | torch.FloatTensor | | Pre-generated pooled text embeddings. |
| negative_prompt_embeds | torch.FloatTensor | | Pre-generated negative text embeddings, an alternative to 'negative_prompt'. |
| output_type | str | pil | The output format of the generated video. Choose between 'pil' or 'np.array'. |
| return_dict | bool | True | Whether to return a HunyuanVideoFramepackPipelineOutput object instead of a plain tuple. |
| attention_kwargs | dict | | A kwargs dictionary passed to the AttentionProcessor. |
| clip_skip | int | | Number of final layers to skip from the CLIP model. |
| callback_on_step_end | Callable | | A function called at the end of each denoising step during inference. |
| callback_on_step_end_tensor_inputs | list | | The list of tensor inputs for the callback_on_step_end function. |
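The two callback parameters follow the diffusers convention, so a per-step hook presumably looks like the sketch below; the exact signature is an assumption based on that convention rather than something this page specifies.

```python
# Assumed diffusers-style per-step callback: it receives the pipeline,
# the step index, the current timestep, and a dict holding the tensors
# named in callback_on_step_end_tensor_inputs, and must return that dict.
def log_step(pipeline, step_index, timestep, callback_kwargs):
    print(f"denoising step {step_index}, timestep {timestep}")
    return callback_kwargs

# e.g. callback_on_step_end=log_step,
#      callback_on_step_end_tensor_inputs=["latents"]
```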
### LTX Video

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| image | string | Local path or URL to input image. |
| model | string | "ltx-video" |
#### Optional Arguments

| Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| negative_prompt | string | "" | The prompt to avoid during video generation. |
| height | int | 512 | The height in pixels of the generated video. |
| width | int | 704 | The width in pixels of the generated video. |
| num_frames | int | 161 | Number of frames to generate. |
| num_inference_steps | int | 50 | The number of denoising steps. More steps can improve quality but are slower. |
| timesteps | list | | Custom timesteps for the denoising process, in descending order. |
| guidance_scale | float | 3.0 | Scale for classifier-free guidance. |
| num_videos_per_prompt | int | 1 | The number of videos to generate for each prompt. Note: Tio Magic Animation Framework currently only supports 1 video output. |
| generator | torch.Generator | | A torch.Generator to make generation deterministic. |
| latents | torch.Tensor | | Pre-generated noisy latents. |
| prompt_embeds | torch.Tensor | | Pre-generated text embeddings, an alternative to 'prompt'. |
| prompt_attention_mask | torch.Tensor | | Pre-generated attention mask for the text embeddings. |
| negative_prompt_embeds | torch.FloatTensor | | Pre-generated negative text embeddings. |
| negative_prompt_attention_mask | torch.FloatTensor | | Pre-generated attention mask for the negative text embeddings. |
| decode_timestep | float | 0.0 | The timestep at which the generated video is decoded. |
| decode_noise_scale | float | None | Interpolation factor between random noise and denoised latents at decode time. |
| output_type | str | pil | The output format of the generated video. Choose between 'pil' or 'np.array'. |
| return_dict | bool | True | Whether to return a LTXPipelineOutput object instead of a plain tuple. |
| attention_kwargs | dict | | A kwargs dictionary passed to the AttentionProcessor. |
| callback_on_step_end | Callable | | A function called at the end of each denoising step during inference. |
| callback_on_step_end_tensor_inputs | list | | The list of tensor inputs for the callback_on_step_end function. |
| max_sequence_length | int | 128 | Maximum sequence length for the prompt. |
### Luma Ray 2

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| image | string | URL to input image. Note that Luma does not accept local files. |
| model | string | "luma-ray-2" |
### Pusa V1

Because Pusa V1 is relatively new, there is no central location for its required and optional arguments. Based on its examples, we have added some parameters to the optional arguments section.

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| image | string | Local path or URL to input image. |
| model | string | "pusa-v1" |

#### Optional Arguments

| Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| negative_prompt | string | "" | The prompt or prompts not to guide video generation. Ignored if guidance_scale is less than 1. |
| cond_position | str | "0" | Comma-separated list of frame indices for conditioning. Any position from 0 to 20 may be used. |
| noise_multipliers | str | "0.0" | Comma-separated noise multipliers for the conditioning frames. A value of 0 uses the conditioning image completely clean; higher values add more noise. For image-to-video, use a value such as 0.2 (anything from 0 to 1). For start-end-frame conditioning, use a pair such as 0.2,0.4 (any values from 0 to 1). |
| lora_alpha | float | 1.0 | A larger alpha brings more temporal consistency (i.e., makes generated frames more like the conditioning part) but may also cause small motion or even collapse. We recommend a value of around 1 to 2. |
| num_inference_steps | int | 30 | The number of denoising steps. More steps can improve quality but are slower. |
| num_frames | int | 81 | Number of frames in the generated video. |
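Putting cond_position and noise_multipliers together, here is a hedged sketch of the two conditioning setups described above. The payload shape is an assumption; the values mirror the table.

```python
# Hypothetical payloads; the cond_position / noise_multipliers semantics
# are taken from the table above.

# Image-to-video: condition only on frame 0, with a little noise.
i2v = {
    "prompt": "the portrait slowly smiles",
    "image": "portrait.png",
    "model": "pusa-v1",
    "cond_position": "0",
    "noise_multipliers": "0.2",
}

# Start-end-frame: condition on frames 0 and 20 with separate multipliers.
start_end = {
    "prompt": "the portrait turns to face the camera",
    "image": "portrait.png",
    "model": "pusa-v1",
    "cond_position": "0,20",
    "noise_multipliers": "0.2,0.4",
}
```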
### Veo 2.0 Generate 001

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| image | string | Local path or URL to input image. |
| model | string | "veo-2.0-generate-001" |

#### Optional Arguments

| Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| negativePrompt | string | "" | Text that describes anything you want to discourage the model from generating. |
| aspectRatio | str | "16:9" | The aspect ratio of the generated videos. Accepts '16:9' (landscape) or '9:16' (portrait). |
| personGeneration | str | "allow_adult" | Controls whether generation of people or faces is allowed. Accepts 'allow_adult' or 'disallow'. |
| numberOfVideos | int | 1 | The number of videos to generate for each prompt. Note: Tio Magic Animation Framework currently only supports 1 video output. |
| durationSeconds | int | 8 | Veo 2 only. Length of each output video in seconds, between 5 and 8. |
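Note that this provider uses camelCase parameter names, unlike the snake_case diffusers-backed models above. A hedged sketch (the payload shape is an assumption):

```python
# Hypothetical payload; note the camelCase keys used by the Veo backend.
payload = {
    "prompt": "aerial shot of a coastline at golden hour",
    "image": "coastline.jpg",
    "model": "veo-2.0-generate-001",
    "aspectRatio": "16:9",
    "personGeneration": "disallow",
    "durationSeconds": 6,  # Veo 2 accepts 5 to 8 seconds
}
```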
### Wan 2.1 I2V 14b 720p

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| image | string | Local path or URL to input image. |
| model | string | "wan2.1-i2v-14b-720p" |
#### Optional Arguments

| Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| negative_prompt | string | "" | The prompt or prompts not to guide video generation. Ignored if guidance_scale is less than 1. |
| height | int | 480 | The height in pixels of the generated video. |
| width | int | 832 | The width in pixels of the generated video. |
| conditioning_scale | float | 1.0 | The scale applied to the control conditioning latent stream. Can be a float, List[float], or torch.Tensor. |
| num_frames | int | 81 | Number of frames in the generated video. |
| num_inference_steps | int | 50 | The number of denoising steps. More steps usually lead to higher quality at the expense of slower inference. |
| guidance_scale | float | 5.0 | Guidance scale for classifier-free diffusion. Higher values encourage generation to be closely linked to the text prompt. |
| num_videos_per_prompt | int | 1 | The number of videos to generate for each prompt. Note: Tio Magic Animation Framework currently only supports 1 video output. |
| generator | torch.Generator | | A torch.Generator or List[torch.Generator] to make generation deterministic. |
| latents | torch.FloatTensor | | Pre-generated noisy latents to be used as inputs for generation. |
| prompt_embeds | torch.FloatTensor | | Pre-generated text embeddings, used as an alternative to the 'prompt' argument. |
| negative_prompt_embeds | torch.Tensor | | Pre-generated negative text embeddings, used as an alternative to the 'negative_prompt' argument. |
| image_embeds | torch.Tensor | | Pre-generated image embeddings, used as an alternative to the 'image' argument. |
| output_type | str | np | The output format of the generated video. Choose between 'pil' or 'np.array'. |
| return_dict | bool | True | Whether to return a WanPipelineOutput object instead of a plain tuple. |
| attention_kwargs | dict | | A kwargs dictionary passed to the AttentionProcessor. |
| callback_on_step_end | Callable | | A function called at the end of each denoising step during inference. |
| callback_on_step_end_tensor_inputs | list | | The list of tensor inputs for the callback_on_step_end function. |
| max_sequence_length | int | 512 | Maximum sequence length in the encoded prompt. |
### Wan 2.1 Vace 14b

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| image | string | Local path or URL to input image. |
| model | string | "wan2.1-vace-14b" |
#### Optional Arguments

| Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| negative_prompt | string | "" | The prompt or prompts not to guide video generation. Ignored if guidance_scale is less than 1. |
| video | list | | The input video (List[PIL.Image.Image]) to be used as a starting point for the generation. Note: this is created in _process_payload for you. |
| mask | list | | The input mask (List[PIL.Image.Image]) that defines which video regions to condition on (black) and which to generate (white). Note: this is created in _process_payload for you. |
| reference_images | list | | A list of one or more reference images (List[PIL.Image.Image]) as extra conditioning for the generation. |
| height | int | 480 | The height in pixels of the generated video. |
| width | int | 832 | The width in pixels of the generated video. |
| num_frames | int | 81 | Number of frames in the generated video. |
| num_inference_steps | int | 50 | The number of denoising steps. More steps usually lead to higher quality at the expense of slower inference. |
| guidance_scale | float | 5.0 | Guidance scale for classifier-free diffusion. Higher values encourage generation to be closely linked to the text prompt. |
| num_videos_per_prompt | int | 1 | The number of videos to generate for each prompt. Note: Tio Magic Animation Framework currently only supports 1 video output. |
| generator | torch.Generator | | A torch.Generator or List[torch.Generator] to make generation deterministic. |
| latents | torch.FloatTensor | | Pre-generated noisy latents to be used as inputs for generation. |
| prompt_embeds | torch.FloatTensor | | Pre-generated text embeddings, used as an alternative to the 'prompt' argument. |
| output_type | str | np | The output format of the generated video. Choose between 'pil' or 'np.array'. |
| return_dict | bool | True | Whether to return a WanPipelineOutput object instead of a plain tuple. |
| attention_kwargs | dict | | A kwargs dictionary passed to the AttentionProcessor. |
| callback_on_step_end | Callable | | A function called at the end of each denoising step during inference. |
| callback_on_step_end_tensor_inputs | list | | The list of tensor inputs for the callback_on_step_end function. |
| max_sequence_length | int | 512 | Maximum sequence length in the encoded prompt. |
| flow_shift | float | 5.0 | A value that estimates motion between two frames. A larger flow shift focuses on high motion or transformation. A smaller flow shift focuses on stability. |
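Since video and mask are built for you, the user-facing conditioning input here is reference_images. A hedged sketch of supplying them, assuming the payload accepts PIL images as the table's List[PIL.Image.Image] type suggests:

```python
from PIL import Image

# Hypothetical payload; 'reference_images' expects List[PIL.Image.Image]
# per the table above. File names are placeholders.
refs = [Image.open(p).convert("RGB")
        for p in ["char_front.png", "char_side.png"]]

payload = {
    "prompt": "the character turns and waves",
    "image": "char_front.png",
    "model": "wan2.1-vace-14b",
    "reference_images": refs,  # extra conditioning
}
```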
### Wan 2.1 Vace 14b I2V FusionX

This is a LoRA applied on top of Wan 2.1 Vace 14b. All required and optional arguments are the same as for Wan 2.1 Vace 14b, except for the model string.

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| image | string | Local path or URL to input image. |
| model | string | "wan2.1-vace-14b-i2v-fusionx" |
## Interpolate

### Wan 2.1 Flf2v 14b 720p

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| first_frame | string | Local path or URL to first frame image. |
| last_frame | string | Local path or URL to last frame image. |
| model | string | "wan2.1-flf2v-14b-720p" |
#### Optional Arguments

| Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| negative_prompt | string | "" | The prompt or prompts not to guide video generation. Ignored if guidance_scale is less than 1. |
| height | int | 480 | The height in pixels of the generated video. |
| width | int | 832 | The width in pixels of the generated video. |
| num_frames | int | 81 | Number of frames in the generated video. |
| num_inference_steps | int | 50 | The number of denoising steps. More steps usually lead to higher quality at the expense of slower inference. |
| guidance_scale | float | 5.0 | Guidance scale for classifier-free diffusion. Higher values encourage generation to be closely linked to the text prompt. |
| num_videos_per_prompt | int | 1 | The number of videos to generate for each prompt. Note: Tio Magic Animation Framework currently only supports 1 video output. |
| generator | torch.Generator | | A torch.Generator or List[torch.Generator] to make generation deterministic. |
| latents | torch.FloatTensor | | Pre-generated noisy latents to be used as inputs for generation. |
| prompt_embeds | torch.FloatTensor | | Pre-generated text embeddings, used as an alternative to the 'prompt' argument. |
| negative_prompt_embeds | torch.Tensor | | Pre-generated negative text embeddings, used as an alternative to the 'negative_prompt' argument. |
| image_embeds | torch.Tensor | | Pre-generated image embeddings, used as an alternative to the 'image' argument. |
| output_type | str | np | The output format of the generated video. Choose between 'pil' or 'np.array'. |
| return_dict | bool | True | Whether to return a WanPipelineOutput object instead of a plain tuple. |
| attention_kwargs | dict | | A kwargs dictionary passed to the AttentionProcessor. |
| callback_on_step_end | Callable | | A function called at the end of each denoising step during inference. |
| callback_on_step_end_tensor_inputs | list | | The list of tensor inputs for the callback_on_step_end function. |
| max_sequence_length | int | 512 | Maximum sequence length in the encoded prompt. |
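A hedged sketch of an interpolation request: the payload shape is an assumption, while the parameter names and defaults come from the tables above.

```python
# Hypothetical payload for first/last-frame interpolation.
payload = {
    "prompt": "the flower bud opens into full bloom",
    "first_frame": "bud.png",    # local path or URL
    "last_frame": "bloom.png",   # local path or URL
    "model": "wan2.1-flf2v-14b-720p",
    "num_frames": 81,            # default
    "guidance_scale": 5.0,       # default
}
```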
### Wan 2.1 Vace 14b

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| first_frame | string | Local path or URL to first frame image. |
| last_frame | string | Local path or URL to last frame image. |
| model | string | "wan2.1-vace-14b" |
#### Optional Arguments

| Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| negative_prompt | string | "" | The prompt or prompts not to guide video generation. Ignored if guidance_scale is less than 1. |
| video | list | | The input video (List[PIL.Image.Image]) to be used as a starting point for the generation. Note: this is created in _process_payload for you. |
| mask | list | | The input mask (List[PIL.Image.Image]) that defines which video regions to condition on (black) and which to generate (white). Note: this is created in _process_payload for you. |
| reference_images | list | | A list of one or more reference images (List[PIL.Image.Image]) as extra conditioning for the generation. |
| conditioning_scale | float | 1.0 | The scale applied to the control conditioning latent stream. Can be a float, List[float], or torch.Tensor. |
| height | int | 480 | The height in pixels of the generated video. |
| width | int | 832 | The width in pixels of the generated video. |
| num_frames | int | 81 | Number of frames in the generated video. |
| num_inference_steps | int | 50 | The number of denoising steps. More steps usually lead to higher quality at the expense of slower inference. |
| guidance_scale | float | 5.0 | Guidance scale for classifier-free diffusion. Higher values encourage generation to be closely linked to the text prompt. |
| num_videos_per_prompt | int | 1 | The number of videos to generate for each prompt. Note: Tio Magic Animation Framework currently only supports 1 video output. |
| generator | torch.Generator | | A torch.Generator or List[torch.Generator] to make generation deterministic. |
| latents | torch.FloatTensor | | Pre-generated noisy latents to be used as inputs for generation. |
| prompt_embeds | torch.FloatTensor | | Pre-generated text embeddings, used as an alternative to the 'prompt' argument. |
| output_type | str | np | The output format of the generated video. Choose between 'pil' or 'np.array'. |
| return_dict | bool | True | Whether to return a WanPipelineOutput object instead of a plain tuple. |
| attention_kwargs | dict | | A kwargs dictionary passed to the AttentionProcessor. |
| callback_on_step_end | Callable | | A function called at the end of each denoising step during inference. |
| callback_on_step_end_tensor_inputs | list | | The list of tensor inputs for the callback_on_step_end function. |
| max_sequence_length | int | 512 | Maximum sequence length in the encoded prompt. |
| flow_shift | float | 5.0 | A value that estimates motion between two frames. A larger flow shift focuses on high motion or transformation. A smaller flow shift focuses on stability. |
### Framepack I2V HY

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| first_frame | string | Local path or URL to first frame image. |
| last_frame | string | Local path or URL to last frame image. |
| model | string | "framepack-i2v-hy" |
#### Optional Arguments

| Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| prompt_2 | string | "" | A secondary prompt for the second text encoder; defaults to the main prompt if not provided. |
| negative_prompt | string | "" | The prompt or prompts not to guide video generation. Ignored if guidance_scale is less than 1. |
| negative_prompt_2 | string | "" | A secondary negative prompt for the second text encoder. |
| height | int | 720 | The height in pixels of the generated video. |
| width | int | 1280 | The width in pixels of the generated video. |
| num_frames | int | 129 | Number of frames to generate. |
| num_inference_steps | int | 50 | The number of denoising steps. More steps can improve quality but are slower. |
| sigmas | list | | Custom sigmas for the denoising scheduler. |
| true_cfg_scale | float | 1.0 | Enables true classifier-free guidance when > 1.0. |
| guidance_scale | float | 6.0 | Guidance scale to control how closely the video adheres to the prompt. |
| num_videos_per_prompt | int | 1 | The number of videos to generate for each prompt. Note: Tio Magic Animation Framework currently only supports 1 video output. |
| generator | torch.Generator | | A torch.Generator or List[torch.Generator] to make generation deterministic. |
| image_latents | torch.Tensor | | Pre-encoded image latents, bypassing the VAE for the first image. |
| last_image_latents | torch.Tensor | | Pre-encoded image latents, bypassing the VAE for the last image. |
| prompt_embeds | torch.Tensor | | Pre-generated text embeddings, an alternative to 'prompt'. |
| pooled_prompt_embeds | torch.FloatTensor | | Pre-generated pooled text embeddings. |
| negative_prompt_embeds | torch.FloatTensor | | Pre-generated negative text embeddings, an alternative to 'negative_prompt'. |
| output_type | str | pil | The output format of the generated video. Choose between 'pil' or 'np.array'. |
| return_dict | bool | True | Whether to return a HunyuanVideoFramepackPipelineOutput object instead of a plain tuple. |
| attention_kwargs | dict | | A kwargs dictionary passed to the AttentionProcessor. |
| clip_skip | int | | Number of final layers to skip from the CLIP model. |
| callback_on_step_end | Callable | | A function called at the end of each denoising step during inference. |
| callback_on_step_end_tensor_inputs | list | | The list of tensor inputs for the callback_on_step_end function. |
### Luma Ray 2

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| first_frame | string | URL to first frame image. |
| last_frame | string | URL to last frame image. |
| model | string | "luma-ray-2" |
## Pose Guidance

### Wan 2.1 Vace 14b

#### Required Arguments

| Name | Type | Description |
| --- | --- | --- |
| prompt | string | Text prompt to guide generation |
| image | string | Path or URL to input image. |
| model | string | "wan2.1-vace-14b" |
#### Optional Arguments

| Name | Type | Default Value | Description |
| --- | --- | --- | --- |
| guiding_video | string | | A video used to guide the pose of the output video. If provided, a pose_video (List[PIL.Image.Image]) is generated from it for the output video. |
| pose_video | string | | A pose skeleton video (List[PIL.Image.Image]) used to guide the pose of the output video. |
| negative_prompt | string | "" | The prompt or prompts not to guide video generation. Ignored if guidance_scale is less than 1. |
| video | list | | The input video (List[PIL.Image.Image]) to be used as a starting point for the generation. Note: this is created in _process_payload for you. |
| mask | list | | The input mask (List[PIL.Image.Image]) that defines which video regions to condition on (black) and which to generate (white). Note: this is created in _process_payload for you. |
| reference_images | list | | A list of one or more reference images (List[PIL.Image.Image]) as extra conditioning for the generation. |
| conditioning_scale | float | 1.0 | The scale applied to the control conditioning latent stream. Can be a float, List[float], or torch.Tensor. |
| height | int | 480 | The height in pixels of the generated video. |
| width | int | 832 | The width in pixels of the generated video. |
| num_frames | int | 81 | Number of frames in the generated video. |
| num_inference_steps | int | 50 | The number of denoising steps. More steps usually lead to higher quality at the expense of slower inference. |
| guidance_scale | float | 5.0 | Guidance scale for classifier-free diffusion. Higher values encourage generation to be closely linked to the text prompt. |
| num_videos_per_prompt | int | 1 | The number of videos to generate for each prompt. Note: Tio Magic Animation Framework currently only supports 1 video output. |
| generator | torch.Generator | | A torch.Generator or List[torch.Generator] to make generation deterministic. |
| latents | torch.FloatTensor | | Pre-generated noisy latents to be used as inputs for generation. |
| prompt_embeds | torch.FloatTensor | | Pre-generated text embeddings, used as an alternative to the 'prompt' argument. |
| output_type | str | np | The output format of the generated video. Choose between 'pil' or 'np.array'. |
| return_dict | bool | True | Whether to return a WanPipelineOutput object instead of a plain tuple. |
| attention_kwargs | dict | | A kwargs dictionary passed to the AttentionProcessor. |
| callback_on_step_end | Callable | | A function called at the end of each denoising step during inference. |
| callback_on_step_end_tensor_inputs | list | | The list of tensor inputs for the callback_on_step_end function. |
| max_sequence_length | int | 512 | Maximum sequence length in the encoded prompt. |
| flow_shift | float | 3.0 | A value that estimates motion between two frames. A larger flow shift focuses on high motion or transformation. A smaller flow shift focuses on stability. |
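To close, a hedged sketch of the two ways to drive pose: supply a normal video via guiding_video and let a pose video be generated from it, or supply a ready-made skeleton via pose_video. The payload shape is an assumption and the file names are placeholders.

```python
# Hypothetical payloads; guiding_video vs. pose_video per the table above.

# Option 1: a normal video; a pose skeleton video is generated from it.
from_reference_motion = {
    "prompt": "the character performs the dance",
    "image": "character.png",
    "model": "wan2.1-vace-14b",
    "guiding_video": "dancer_reference.mp4",
}

# Option 2: a pre-rendered pose skeleton video used directly.
from_pose_skeleton = {
    "prompt": "the character performs the dance",
    "image": "character.png",
    "model": "wan2.1-vace-14b",
    "pose_video": "dance_skeleton.mp4",
}
```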