
Media model reference

Per-slug input schema and pricing for every model usable via `media.run()`. Auto-generated from the catalog.

Every slug here is callable via `media.run(slug, inputs)`. The input fields map 1:1 to the model's own schema: we don't validate their shape, the model does. Cost is debited from your project balance at the listed rate, computed from the inputs you send.

For the SDK signature and patterns, see sdk-media. For the broader catalog and policy, see models.

Index

| Slug | Kind | Price |
| --- | --- | --- |
| openai/gpt-image-2 | image | $0.253 / call |
| openai/gpt-image-2-edit | image | $0.263 / call |
| bytedance/seedream-v4 | image | $0.036 / call |
| bytedance/seedream-v4-edit | image | $0.036 / call |
| google/imagen-4 | image | $0.060 / call |
| kuaishou/kling-v3-image | image | $0.034 / call |
| bytedance/seedance-2-t2v | video | $0.364 / s |
| bytedance/seedance-2-i2v | video | $0.363 / s |
| bytedance/seedance-2-r2v | video | $0.363 / s |
| bytedance/seedance-2-fast-t2v | video | $0.290 / s |
| bytedance/seedance-2-fast-i2v | video | $0.290 / s |
| bytedance/seedance-2-fast-r2v | video | $0.290 / s |
| kuaishou/kling-v3-t2v | video | $0.134 / s |
| kuaishou/kling-v3-i2v | video | $0.134 / s |
| google/veo-3-t2v | video | $0.600 / s |
| google/veo-3-i2v | video | $0.240 / s |
| google/veo-3-fast-t2v | video | $0.300 / s |
| google/veo-3-fast-i2v | video | $0.300 / s |

Images

openai/gpt-image-2

GPT Image 2 · text → image

Priced by a quality × size matrix. Set `quality` to low, medium, or high, and set the size (e.g. 1024x1024) through `image_size`.

Pricing

| Variant | Price |
| --- | --- |
| low · 1024x768 | $0.0060/image |
| low · 1024x1024 | $0.0072/image |
| low · 1024x1536 | $0.0060/image |
| low · 1920x1080 | $0.0060/image |
| low · 2560x1440 | $0.0084/image |
| low · 3840x2160 | $0.014/image |
| medium · 1024x768 | $0.044/image |
| medium · 1024x1024 | $0.064/image |
| medium · 1024x1536 | $0.050/image |
| medium · 1920x1080 | $0.048/image |
| medium · 2560x1440 | $0.067/image |
| medium · 3840x2160 | $0.121/image |
| high · 1024x768 | $0.174/image |
| high · 1024x1024 | $0.253/image |
| high · 1024x1536 | $0.198/image |
| high · 1920x1080 | $0.190/image |
| high · 2560x1440 | $0.266/image |
| high · 3840x2160 | $0.481/image |
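
The matrix above can be turned into a small lookup for estimating cost before a call is sent. This is only a sketch: `GPT_IMAGE_2_PRICES` and `estimate_gpt_image_2_cost` are hypothetical names that are not part of the SDK, only a few rows of the table are copied in, and it assumes the per-image price is simply multiplied by `num_images`.

```python
# Hypothetical helper, not part of the SDK.
# Prices (USD per image) are copied from a few rows of the table above.
GPT_IMAGE_2_PRICES = {
    ("low", "1024x1024"): 0.0072,
    ("medium", "1024x1024"): 0.064,
    ("high", "1024x1024"): 0.253,
    ("high", "3840x2160"): 0.481,
}


def estimate_gpt_image_2_cost(quality: str, size: str, num_images: int = 1) -> float:
    """Estimate the cost in USD, assuming it scales linearly with num_images."""
    try:
        per_image = GPT_IMAGE_2_PRICES[(quality, size)]
    except KeyError:
        raise ValueError(f"no listed price for {quality} / {size}") from None
    return round(per_image * num_images, 4)


print(estimate_gpt_image_2_cost("high", "1024x1024", num_images=2))
```

Extending the dictionary with the remaining rows makes the estimate cover the whole matrix.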

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | | | The prompt for image generation. (max 32,000 chars, min 2 chars) |
| image_size | enum | | "landscape_4_3" | The size of the generated image. Supports preset names, explicit {width, height}, or 'auto' to let the model pick the best size. Concrete sizes must have both dimensions as multiples of 16, max edge 3840px, aspect ratio <= 3:1, total pixels between 655,360 and 8,294,400. Values: "square_hd", "square", "portrait_4_3", "portrait_16_9", "landscape_4_3", "landscape_16_9", "auto" |
| num_images | integer | | 1 | Number of images to generate. (≥ 1, ≤ 4) |
| output_format | enum | | "png" | Output format for the images. Values: "jpeg", "png", "webp" |
| quality | enum | | "high" | Quality for the generated image. Use 'auto' to let the model pick the best quality for the prompt. Values: "auto", "low", "medium", "high" |
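
Because input shape is not validated before the model sees it, the explicit-size rules listed for `image_size` can be checked locally first. The function below is a minimal sketch under two assumptions: `check_image_size` is a hypothetical name, and the pixel bounds are treated as inclusive.

```python
def check_image_size(width: int, height: int) -> list[str]:
    """Return the rules an explicit {width, height} size would violate."""
    problems = []
    if width % 16 or height % 16:
        problems.append("both dimensions must be multiples of 16")
    if max(width, height) > 3840:
        problems.append("longest edge must be at most 3840px")
    if max(width, height) > 3 * min(width, height):
        problems.append("aspect ratio must be at most 3:1")
    # Assumption: both pixel bounds are inclusive.
    if not 655_360 <= width * height <= 8_294_400:
        problems.append("total pixels must be between 655,360 and 8,294,400")
    return problems


print(check_image_size(1024, 1024))
print(check_image_size(1000, 1000))
```

An empty list means the size passes every rule stated in the table.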

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned.
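
Since the signed URL expires after roughly an hour, it is safest to fetch the file soon after the call returns. The sketch below uses only the Python standard library; `download_output` is a hypothetical helper, and how the URL is read from the result object is not specified here, so the function takes the URL as a plain string.

```python
import shutil
import urllib.request
from pathlib import Path


def download_output(output_url: str, dest: Path, timeout: float = 60.0) -> Path:
    """Stream the file behind a signed URL to dest and return dest."""
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    # Copy in chunks so large videos are never held fully in memory.
    with urllib.request.urlopen(output_url, timeout=timeout) as response:
        with open(dest, "wb") as fh:
            shutil.copyfileobj(response, fh)
    return dest
```

For long-lived access, the copy saved at drive_path is the better source, because it does not depend on the signed URL's lifetime.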

Example

```python
result = media.run(
    "openai/gpt-image-2",
    prompt="...",
    image_size="square_hd",
    quality="high",
)
```

openai/gpt-image-2-edit

GPT Image 2 (edit) · image → image (edit)

Image-to-image edit. Priced by a quality × size matrix; provide one or more reference images in `image_urls`.

Pricing

| Variant | Price |
| --- | --- |
| low · 1024x768 | $0.013/image |
| low · 1024x1024 | $0.018/image |
| low · 1024x1536 | $0.022/image |
| low · 1920x1080 | $0.020/image |
| low · 2560x1440 | $0.023/image |
| low · 3840x2160 | $0.029/image |
| medium · 1024x768 | $0.052/image |
| medium · 1024x1024 | $0.073/image |
| medium · 1024x1536 | $0.065/image |
| medium · 1920x1080 | $0.064/image |
| medium · 2560x1440 | $0.082/image |
| medium · 3840x2160 | $0.136/image |
| high · 1024x768 | $0.181/image |
| high · 1024x1024 | $0.263/image |
| high · 1024x1536 | $0.214/image |
| high · 1920x1080 | $0.190/image |
| high · 2560x1440 | $0.281/image |
| high · 3840x2160 | $0.496/image |

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| image_urls | array<string> | | | The URLs of the images to use as a reference for the generation |
| prompt | string | | | The prompt for image generation. (max 32,000 chars, min 2 chars) |
| image_size | enum | | "auto" | The size of the generated image. Use 'auto' to infer from input images. Values: "square_hd", "square", "portrait_4_3", "portrait_16_9", "landscape_4_3", "landscape_16_9", "auto" |
| mask_url | string | | | The URL of the mask image to use for the generation. This indicates what part of the image to edit |
| num_images | integer | | 1 | Number of images to generate. (≥ 1, ≤ 4) |
| output_format | enum | | "png" | Output format for the images. Values: "jpeg", "png", "webp" |
| quality | enum | | "high" | Quality for the generated image. Use 'auto' to let the model pick the best quality for the prompt. Values: "auto", "low", "medium", "high" |

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned.

Example

```python
result = media.run(
    "openai/gpt-image-2-edit",
    prompt="...",
    image_urls=["https://..."],
    image_size="square_hd",
    quality="high",
)
```

bytedance/seedream-v4

Seedream v4 · text → image

Pricing

$0.036 per call

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | | | The text prompt used to generate the image |
| enable_safety_checker | boolean | | true | If set to true, the safety checker will be enabled |
| enhance_prompt_mode | enum | | "standard" | The mode to use for prompt enhancement. Standard mode provides higher quality results but takes longer to generate. Fast mode provides average quality results but takes less time to generate. Values: "standard", "fast" |
| image_size | enum | | {height: 2048, width: 2048} | The size of the generated image. Total pixels must be between 960x960 and 4096x4096. Values: "square_hd", "square", "portrait_4_3", "portrait_16_9", "landscape_4_3", "landscape_16_9", "auto", "auto_2K", "auto_4K" |
| max_images | integer | | 1 | If set to a number greater than one, enables multi-image generation. The model may return up to max_images images per generation, and num_images generations are carried out in total, so the number of images generated is between num_images and max_images*num_images. (≥ 1, ≤ 6) |
| num_images | integer | | 1 | Number of separate model generations to be run with the prompt. (≥ 1, ≤ 6) |
| seed | integer | | | Random seed to control the stochasticity of image generation |
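
The interaction between `num_images` and `max_images` is easier to see as arithmetic. The helper below is a sketch that only restates the bounds given in the table; `seedream_output_range` is a hypothetical name.

```python
def seedream_output_range(num_images: int = 1, max_images: int = 1) -> tuple[int, int]:
    """Return (fewest, most) images a call may produce."""
    for name, value in (("num_images", num_images), ("max_images", max_images)):
        if not 1 <= value <= 6:
            raise ValueError(f"{name} must be between 1 and 6, got {value}")
    # Each of the num_images generations returns between 1 and max_images images.
    return num_images, num_images * max_images


print(seedream_output_range(num_images=2, max_images=3))
```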

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned. Additional metadata available under meta (seed).

Example

```python
result = media.run(
    "bytedance/seedream-v4",
    prompt="...",
    image_size="square_hd",
)
```

bytedance/seedream-v4-edit

Seedream v4 (edit) · image → image (edit)

Edits or composites reference images. Provide `image_urls` (a list of one or more URLs) plus a prompt.

Pricing

$0.036 per call

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| image_urls | array<string> | | | List of URLs of input images for editing. Presently, up to 10 image inputs are allowed. If over 10 images are sent, only the last 10 will be used |
| prompt | string | | | The text prompt used to edit the image |
| enable_safety_checker | boolean | | true | If set to true, the safety checker will be enabled |
| enhance_prompt_mode | enum | | "standard" | The mode to use for prompt enhancement. Standard mode provides higher quality results but takes longer to generate. Fast mode provides average quality results but takes less time to generate. Values: "standard", "fast" |
| image_size | enum | | {height: 2048, width: 2048} | The size of the generated image. The minimum total image area is 921600 pixels. If the size falls below this, the image is scaled up while maintaining the aspect ratio. Values: "square_hd", "square", "portrait_4_3", "portrait_16_9", "landscape_4_3", "landscape_16_9", "auto", "auto_2K", "auto_4K" |
| max_images | integer | | 1 | If set to a number greater than one, enables multi-image generation. The model may return up to max_images images per generation, and num_images generations are carried out in total, so the number of images generated is between num_images and max_images*num_images. The total number of images (image inputs + image outputs) must not exceed 15. (≥ 1, ≤ 6) |
| num_images | integer | | 1 | Number of separate model generations to be run with the prompt. (≥ 1, ≤ 6) |
| seed | integer | | | Random seed to control the stochasticity of image generation |
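
The input and output limits in this table interact, so a pre-flight check can save a failed call. The sketch below makes two assumptions that the table does not spell out: the 15-image cap is checked against the worst case of `max_images * num_images` outputs, and inputs are counted after truncation to the last 10. `plan_seedream_edit` is a hypothetical name.

```python
def plan_seedream_edit(image_urls, num_images: int = 1, max_images: int = 1) -> list[str]:
    """Return the input URLs that would actually be used, or raise on a limit."""
    # Only the last 10 inputs are used when more are sent.
    used = list(image_urls)[-10:]
    # Assumption: the cap applies to the worst-case number of outputs.
    worst_case_outputs = num_images * max_images
    total = len(used) + worst_case_outputs
    if total > 15:
        raise ValueError(f"{total} images (inputs + outputs) exceeds the limit of 15")
    return used


urls = [f"https://example.com/{i}.png" for i in range(12)]
print(len(plan_seedream_edit(urls)))
```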

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned. Additional metadata available under meta (seed).

Example

```python
result = media.run(
    "bytedance/seedream-v4-edit",
    prompt="...",
    image_urls=["https://..."],
    image_size="square_hd",
)
```

google/imagen-4

Imagen 4 · text → image

Pricing

$0.060 per call

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | | | The text prompt to generate an image from. (max 5,000 chars, min 3 chars) |
| aspect_ratio | enum | | "1:1" | The aspect ratio of the generated image. Values: "1:1", "16:9", "9:16", "4:3", "3:4" |
| num_images | integer | | 1 | The number of images to generate. (≥ 1, ≤ 4) |
| output_format | enum | | "png" | The format of the generated image. Values: "jpeg", "png", "webp" |
| resolution | enum | | "1K" | The resolution of the generated image. Values: "1K", "2K" |
| safety_tolerance | enum | | "4" | The safety tolerance level for content moderation. 1 is the most strict (blocks most content), 6 is the least strict. Values: "1", "2", "3", "4", "5", "6" |
| seed | integer | | | The seed for the random number generator |

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned.

Example

```python
result = media.run(
    "google/imagen-4",
    prompt="...",
)
```

kuaishou/kling-v3-image

Kling v3 (image) · text → image

Pricing

$0.034 per call

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | | | Text prompt for image generation. (max 2,500 chars) |
| aspect_ratio | enum | | "16:9" | Aspect ratio of generated images. Values: "16:9", "9:16", "1:1", "4:3", "3:4", "3:2", "2:3", "21:9" |
| elements | array<elementinput> | | | Optional: Elements (characters/objects) to include in the image for face control. Each element can have a frontal image and optionally reference images |
| negative_prompt | string | | | Negative text prompt. It is recommended to supplement negative prompt information through negative sentences directly within positive prompts |
| num_images | integer | | 1 | Number of images to generate. (≥ 1, ≤ 9) |
| output_format | enum | | "png" | The format of the generated image. Values: "jpeg", "png", "webp" |
| resolution | enum | | "1K" | Image generation resolution. 1K: standard, 2K: high-res. Values: "1K", "2K" |

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned.

Example

```python
result = media.run(
    "kuaishou/kling-v3-image",
    prompt="...",
)
```

Videos

bytedance/seedance-2-t2v

Seedance 2.0 — text→video

480p to 1080p text-to-video. Audio included.

Pricing

$0.364 per second of output

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | | | The text prompt used to generate the video |
| aspect_ratio | enum | | "auto" | The aspect ratio of the generated video. Use 16:9 for landscape, 9:16 for portrait/vertical, 1:1 for square, 21:9 for ultrawide cinematic, or auto to let the model decide. Values: "auto", "21:9", "16:9", "4:3", "1:1", "3:4", "9:16" |
| duration | enum | | "auto" | Duration of the video in seconds. Supports 4 to 15 seconds, or auto to let the model decide based on the prompt. Values: "auto", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15" |
| generate_audio | boolean | | true | Whether to generate synchronized audio for the video, including sound effects, ambient sounds, and lip-synced speech. The cost of video generation is the same regardless of whether audio is generated or not |
| resolution | enum | | "720p" | Video resolution - 480p for faster generation, 720p for balance, 1080p for highest quality. Values: "480p", "720p", "1080p" |
| seed | integer | | | Random seed for reproducibility. Note that results may still vary slightly even with the same seed |

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned. Additional metadata available under meta (seed).

Example

```python
result = media.run(
    "bytedance/seedance-2-t2v",
    prompt="...",
    duration="auto",
)
```

bytedance/seedance-2-i2v

Seedance 2.0 — image→video

Pricing

$0.363 per second of output

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| image_url | string | | | The URL of the starting frame image to animate. Supported formats: JPEG, PNG, WebP. Max 30 MB |
| prompt | string | | | The text prompt describing the desired motion and action for the video |
| aspect_ratio | enum | | "auto" | The aspect ratio of the generated video. Use 16:9 for landscape, 9:16 for portrait/vertical, 1:1 for square, 21:9 for ultrawide cinematic, or auto to infer from the input image. Values: "auto", "21:9", "16:9", "4:3", "1:1", "3:4", "9:16" |
| duration | enum | | "auto" | Duration of the video in seconds. Supports 4 to 15 seconds, or auto to let the model decide based on the prompt. Values: "auto", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15" |
| end_image_url | string | | | The URL of the image to use as the last frame of the video. When provided, the generated video will transition from the starting image to this ending image. Supported formats: JPEG, PNG, WebP. Max 30 MB |
| generate_audio | boolean | | true | Whether to generate synchronized audio for the video, including sound effects, ambient sounds, and lip-synced speech. The cost of video generation is the same regardless of whether audio is generated or not |
| resolution | enum | | "720p" | Video resolution - 480p for faster generation, 720p for balance, 1080p for highest quality. Values: "480p", "720p", "1080p" |
| seed | integer | | | Random seed for reproducibility. Note that results may still vary slightly even with the same seed |

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned. Additional metadata available under meta (seed).

Example

```python
result = media.run(
    "bytedance/seedance-2-i2v",
    prompt="...",
    image_url="https://...",
    duration="auto",
)
```

bytedance/seedance-2-r2v

Seedance 2.0 — reference→video

Accepts up to 9 image, 3 video, and 3 audio references. The per-second rate drops by about 40% when a video reference is passed.

Pricing

| Variant | Price |
| --- | --- |
| without video reference | $0.363/s |
| with video reference | $0.218/s |
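
A short calculation shows how the two rates play out over a clip. This is a sketch: `seedance_2_r2v_cost` is a hypothetical helper, the rates are copied from the table above, and it only accepts explicit durations because the length chosen by `"auto"` is not known before the call.

```python
def seedance_2_r2v_cost(seconds: int, has_video_reference: bool) -> float:
    """Estimate the cost in USD of a seedance-2-r2v clip of the given length."""
    if not 4 <= seconds <= 15:
        raise ValueError("explicit durations run from 4 to 15 seconds")
    rate = 0.218 if has_video_reference else 0.363
    return round(rate * seconds, 3)


print(seedance_2_r2v_cost(10, has_video_reference=False))
print(seedance_2_r2v_cost(10, has_video_reference=True))
```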

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | | | The text prompt used to generate the video |
| aspect_ratio | enum | | "auto" | The aspect ratio of the generated video. Use 16:9 for landscape, 9:16 for portrait/vertical, 1:1 for square, 21:9 for ultrawide cinematic, or auto to let the model decide. Values: "auto", "21:9", "16:9", "4:3", "1:1", "3:4", "9:16" |
| audio_urls | array<string> | | | Reference audio to guide video generation. Refer to them in the prompt as @Audio1, @Audio2, etc. Supported formats: MP3, WAV. Up to 3 files, combined duration must not exceed 15 seconds. Max 15 MB per file. If audio is provided, at least one reference image or video is required |
| duration | enum | | "auto" | Duration of the video in seconds. Supports 4 to 15 seconds, or auto to let the model decide based on the prompt. Values: "auto", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15" |
| generate_audio | boolean | | true | Whether to generate synchronized audio for the video, including sound effects, ambient sounds, and lip-synced speech. The cost of video generation is the same regardless of whether audio is generated or not |
| image_urls | array<string> | | | Reference images to guide video generation. Refer to them in the prompt as @Image1, @Image2, etc. Supported formats: JPEG, PNG, WebP. Max 30 MB per image. Up to 9 images. Total files across all modalities must not exceed 12 |
| resolution | enum | | "720p" | Video resolution - 480p for faster generation, 720p for balance, 1080p for highest quality. Values: "480p", "720p", "1080p" |
| seed | integer | | | Random seed for reproducibility. Note that results may still vary slightly even with the same seed |
| video_urls | array<string> | | | Reference videos to guide video generation. Refer to them in the prompt as @Video1, @Video2, etc. Supported formats: MP4, MOV. Up to 3 videos, combined duration must be between 2 and 15 seconds, total size under 50 MB. Each video must be between ~480p (640x640) and ~720p (834x1112) in resolution |
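
The count rules for references can be checked before a call is made. The sketch below covers only the rules that depend on list lengths (per-modality caps, the 12-file total, and audio needing a visual reference); durations, file sizes, and resolutions would require inspecting the files themselves. `check_references` is a hypothetical name.

```python
def check_references(image_urls=(), video_urls=(), audio_urls=()) -> list[str]:
    """Return the count-based rules a set of references would violate."""
    problems = []
    if len(image_urls) > 9:
        problems.append("at most 9 reference images")
    if len(video_urls) > 3:
        problems.append("at most 3 reference videos")
    if len(audio_urls) > 3:
        problems.append("at most 3 reference audio files")
    if len(image_urls) + len(video_urls) + len(audio_urls) > 12:
        problems.append("at most 12 reference files in total")
    if audio_urls and not (image_urls or video_urls):
        problems.append("audio requires at least one reference image or video")
    return problems


print(check_references(audio_urls=["https://example.com/voice.mp3"]))
```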

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned. Additional metadata available under meta (seed).

Example

```python
result = media.run(
    "bytedance/seedance-2-r2v",
    prompt="...",
    image_urls=["https://..."],
    duration="auto",
)
```

bytedance/seedance-2-fast-t2v

Seedance 2.0 Fast — text→video

Pricing

$0.290 per second of output

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | | | The text prompt used to generate the video |
| aspect_ratio | enum | | "auto" | The aspect ratio of the generated video. Use 16:9 for landscape, 9:16 for portrait/vertical, 1:1 for square, 21:9 for ultrawide cinematic, or auto to let the model decide. Values: "auto", "21:9", "16:9", "4:3", "1:1", "3:4", "9:16" |
| duration | enum | | "auto" | Duration of the video in seconds. Supports 4 to 15 seconds, or auto to let the model decide based on the prompt. Values: "auto", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15" |
| generate_audio | boolean | | true | Whether to generate synchronized audio for the video, including sound effects, ambient sounds, and lip-synced speech. The cost of video generation is the same regardless of whether audio is generated or not |
| resolution | enum | | "720p" | Video resolution - 480p for faster generation, 720p for balance. Values: "480p", "720p" |
| seed | integer | | | Random seed for reproducibility. Note that results may still vary slightly even with the same seed |

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned. Additional metadata available under meta (seed).

Example

```python
result = media.run(
    "bytedance/seedance-2-fast-t2v",
    prompt="...",
    duration="auto",
)
```

bytedance/seedance-2-fast-i2v

Seedance 2.0 Fast — image→video

Pricing

$0.290 per second of output

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| image_url | string | | | The URL of the starting frame image to animate. Supported formats: JPEG, PNG, WebP. Max 30 MB |
| prompt | string | | | The text prompt describing the desired motion and action for the video |
| aspect_ratio | enum | | "auto" | The aspect ratio of the generated video. Use 16:9 for landscape, 9:16 for portrait/vertical, 1:1 for square, 21:9 for ultrawide cinematic, or auto to infer from the input image. Values: "auto", "21:9", "16:9", "4:3", "1:1", "3:4", "9:16" |
| duration | enum | | "auto" | Duration of the video in seconds. Supports 4 to 15 seconds, or auto to let the model decide based on the prompt. Values: "auto", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15" |
| end_image_url | string | | | The URL of the image to use as the last frame of the video. When provided, the generated video will transition from the starting image to this ending image. Supported formats: JPEG, PNG, WebP. Max 30 MB |
| generate_audio | boolean | | true | Whether to generate synchronized audio for the video, including sound effects, ambient sounds, and lip-synced speech. The cost of video generation is the same regardless of whether audio is generated or not |
| resolution | enum | | "720p" | Video resolution - 480p for faster generation, 720p for balance. Values: "480p", "720p" |
| seed | integer | | | Random seed for reproducibility. Note that results may still vary slightly even with the same seed |

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned. Additional metadata available under meta (seed).

Example

```python
result = media.run(
    "bytedance/seedance-2-fast-i2v",
    prompt="...",
    image_url="https://...",
    duration="auto",
)
```

bytedance/seedance-2-fast-r2v

Seedance 2.0 Fast — reference→video

Pricing

| Variant | Price |
| --- | --- |
| without video reference | $0.290/s |
| with video reference | $0.174/s |

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | | | The text prompt used to generate the video |
| aspect_ratio | enum | | "auto" | The aspect ratio of the generated video. Use 16:9 for landscape, 9:16 for portrait/vertical, 1:1 for square, 21:9 for ultrawide cinematic, or auto to let the model decide. Values: "auto", "21:9", "16:9", "4:3", "1:1", "3:4", "9:16" |
| audio_urls | array<string> | | | Reference audio to guide video generation. Refer to them in the prompt as @Audio1, @Audio2, etc. Supported formats: MP3, WAV. Up to 3 files, combined duration must not exceed 15 seconds. Max 15 MB per file. If audio is provided, at least one reference image or video is required |
| duration | enum | | "auto" | Duration of the video in seconds. Supports 4 to 15 seconds, or auto to let the model decide based on the prompt. Values: "auto", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15" |
| generate_audio | boolean | | true | Whether to generate synchronized audio for the video, including sound effects, ambient sounds, and lip-synced speech. The cost of video generation is the same regardless of whether audio is generated or not |
| image_urls | array<string> | | | Reference images to guide video generation. Refer to them in the prompt as @Image1, @Image2, etc. Supported formats: JPEG, PNG, WebP. Max 30 MB per image. Up to 9 images. Total files across all modalities must not exceed 12 |
| resolution | enum | | "720p" | Video resolution - 480p for faster generation, 720p for balance. Values: "480p", "720p" |
| seed | integer | | | Random seed for reproducibility. Note that results may still vary slightly even with the same seed |
| video_urls | array<string> | | | Reference videos to guide video generation. Refer to them in the prompt as @Video1, @Video2, etc. Supported formats: MP4, MOV. Up to 3 videos, combined duration must be between 2 and 15 seconds, total size under 50 MB. Each video must be between ~480p (640x640) and ~720p (834x1112) in resolution |

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned. Additional metadata available under meta (seed).

Example

```python
result = media.run(
    "bytedance/seedance-2-fast-r2v",
    prompt="...",
    image_urls=["https://..."],
    duration="auto",
)
```

kuaishou/kling-v3-t2v

Kling v3 Pro — text→video

Set generate_audio: true to enable audio, voice_control: true for voice.

Pricing

| Variant | Price |
| --- | --- |
| audio off | $0.134/s |
| audio on | $0.202/s |
| audio + voice control | $0.235/s |
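
Because `generate_audio` defaults to true, a call that leaves it unset is billed at the "audio on" rate rather than the lowest one. The sketch below maps the two flags to the three rates above; `kling_v3_cost` is a hypothetical helper, and treating `voice_control` without audio as "audio off" is an assumption, since the table does not cover that combination.

```python
def kling_v3_cost(seconds: int, generate_audio: bool = True, voice_control: bool = False) -> float:
    """Estimate the cost in USD of a Kling v3 video of the given length."""
    if not 3 <= seconds <= 15:
        raise ValueError("durations run from 3 to 15 seconds")
    if generate_audio and voice_control:
        rate = 0.235
    elif generate_audio:
        rate = 0.202
    else:
        # Assumption: voice_control has no billing effect when audio is off.
        rate = 0.134
    return round(rate * seconds, 3)


print(kling_v3_cost(5))
print(kling_v3_cost(5, generate_audio=False))
```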

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| aspect_ratio | enum | | "16:9" | The aspect ratio of the generated video frame. Values: "16:9", "9:16", "1:1" |
| cfg_scale | number | | 0.5 | The CFG (Classifier Free Guidance) scale is a measure of how close you want the model to stick to your prompt. (≥ 0, ≤ 1) |
| duration | enum | | "5" | The duration of the generated video in seconds. Values: "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15" |
| generate_audio | boolean | | true | Whether to generate native audio for the video. Supports Chinese and English voice output. Other languages are automatically translated to English. For English speech, use lowercase letters; for acronyms or proper nouns, use uppercase |
| multi_prompt | array<klingv3multipromptelement> | | | List of prompts for multi-shot video generation. If provided, overrides the single prompt and divides the video into multiple shots with specified prompts and durations |
| negative_prompt | string | | "blur, distort, and low quality" | (max 2,500 chars) |
| prompt | string | | | Text prompt for video generation. Either prompt or multi_prompt must be provided, but not both |
| shot_type | enum | | "customize" | The type of multi-shot video generation. 'intelligent' lets the model automatically determine shot structure. Values: "customize", "intelligent" |
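
The "either prompt or multi_prompt, but not both" rule and the string-valued `duration` enum are easy to get wrong, so a local check may help. This is a sketch: `check_kling_v3_inputs` is a hypothetical name, and the shape of each `multi_prompt` element is not checked because it is not documented here.

```python
VALID_DURATIONS = {str(n) for n in range(3, 16)}  # "3" through "15", as strings


def check_kling_v3_inputs(inputs: dict) -> None:
    """Raise ValueError if the inputs break the prompt or duration rules."""
    has_prompt = bool(inputs.get("prompt"))
    has_multi = bool(inputs.get("multi_prompt"))
    if has_prompt == has_multi:
        raise ValueError("provide exactly one of prompt or multi_prompt")
    duration = inputs.get("duration", "5")
    if duration not in VALID_DURATIONS:
        raise ValueError(f"duration must be a string from '3' to '15', got {duration!r}")


check_kling_v3_inputs({"prompt": "a paper boat on a stream", "duration": "5"})
```

Note that an integer such as `5` is rejected, because the enum values are strings.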

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned.

Example

```python
result = media.run(
    "kuaishou/kling-v3-t2v",
    prompt="...",
    duration="5",
)
```

kuaishou/kling-v3-i2v

Kling v3 Pro — image→video

Pricing

| Variant | Price |
| --- | --- |
| audio off | $0.134/s |
| audio on | $0.202/s |
| audio + voice control | $0.235/s |

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| start_image_url | string | | | URL of the image to be used for the video |
| cfg_scale | number | | 0.5 | The CFG (Classifier Free Guidance) scale is a measure of how close you want the model to stick to your prompt. (≥ 0, ≤ 1) |
| duration | enum | | "5" | The duration of the generated video in seconds. Values: "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13", "14", "15" |
| elements | array<klingv3comboelementinput> | | | Elements (characters/objects) to include in the video. Each element can either be an image set (frontal + reference images) or a video. Reference in prompt as @Element1, @Element2, etc |
| end_image_url | string | | | URL of the image to be used for the end of the video |
| generate_audio | boolean | | true | Whether to generate native audio for the video. Supports Chinese and English voice output. Other languages are automatically translated to English. For English speech, use lowercase letters; for acronyms or proper nouns, use uppercase |
| multi_prompt | array<klingv3multipromptelement> | | | List of prompts for multi-shot video generation. If provided, divides the video into multiple shots |
| negative_prompt | string | | "blur, distort, and low quality" | (max 2,500 chars) |
| prompt | string | | | Text prompt for video generation. Either prompt or multi_prompt must be provided, but not both |
| shot_type | enum | | "customize" | The type of multi-shot video generation. 'intelligent' lets the model automatically determine shot structure. Values: "customize", "intelligent" |

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned.

Example

```python
result = media.run(
    "kuaishou/kling-v3-i2v",
    prompt="...",
    start_image_url="https://...",  # the image to animate
    duration="5",
)
```

google/veo-3-t2v

Veo 3 — text→video

Pricing

| Variant | Price |
| --- | --- |
| audio off | $0.600/s |
| audio on | $0.900/s |

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | | | The text prompt describing the video you want to generate. (max 20,000 chars) |
| aspect_ratio | enum | | "16:9" | The aspect ratio of the generated video. Values: "16:9", "9:16" |
| auto_fix | boolean | | true | Whether to automatically attempt to fix prompts that fail content policy or other validation checks by rewriting them |
| duration | enum | | "8s" | The duration of the generated video. Values: "4s", "6s", "8s" |
| generate_audio | boolean | | true | Whether to generate audio for the video |
| negative_prompt | string | | | A negative prompt to guide the video generation |
| resolution | enum | | "720p" | The resolution of the generated video. Values: "720p", "1080p" |
| safety_tolerance | enum | | "4" | The safety tolerance level for content moderation. 1 is the most strict (blocks most content), 6 is the least strict. Values: "1", "2", "3", "4", "5", "6" |
| seed | integer | | | The seed for the random number generator |
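
With the defaults (`duration="8s"`, `generate_audio=true`), a call is billed at the "audio on" rate for eight seconds. The sketch below works that out from the pricing table for this slug; `veo_3_t2v_cost` is a hypothetical helper, and it assumes the billed length equals the requested `duration`.

```python
VEO_3_T2V_RATES = {False: 0.600, True: 0.900}  # USD per second, keyed by generate_audio


def veo_3_t2v_cost(duration: str = "8s", generate_audio: bool = True) -> float:
    """Estimate the cost in USD of a google/veo-3-t2v call."""
    if duration not in ("4s", "6s", "8s"):
        raise ValueError(f"duration must be '4s', '6s', or '8s', got {duration!r}")
    seconds = int(duration[:-1])  # strip the trailing "s"
    return round(VEO_3_T2V_RATES[generate_audio] * seconds, 2)


print(veo_3_t2v_cost())
print(veo_3_t2v_cost("4s", generate_audio=False))
```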

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned.

Example

```python
result = media.run(
    "google/veo-3-t2v",
    prompt="...",
    duration="8s",
)
```

google/veo-3-i2v

Veo 3 — image→video

Pricing

| Variant | Price |
| --- | --- |
| audio off | $0.240/s |
| audio on | $0.480/s |

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| image_url | string | | | URL of the input image to animate. Should be 720p or higher resolution in 16:9 or 9:16 aspect ratio. If the image is not in 16:9 or 9:16 aspect ratio, it will be cropped to fit |
| prompt | string | | | The text prompt describing how the image should be animated. (max 20,000 chars) |
| aspect_ratio | enum | | "auto" | The aspect ratio of the generated video. Values: "auto", "16:9", "9:16" |
| auto_fix | boolean | | false | Whether to automatically attempt to fix prompts that fail content policy or other validation checks by rewriting them |
| duration | enum | | "8s" | The duration of the generated video. Values: "4s", "6s", "8s" |
| generate_audio | boolean | | true | Whether to generate audio for the video |
| negative_prompt | string | | | A negative prompt to guide the video generation |
| resolution | enum | | "720p" | The resolution of the generated video. Values: "720p", "1080p" |
| safety_tolerance | enum | | "4" | The safety tolerance level for content moderation. 1 is the most strict (blocks most content), 6 is the least strict. Values: "1", "2", "3", "4", "5", "6" |
| seed | integer | | | The seed for the random number generator |

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned.

Example

```python
result = media.run(
    "google/veo-3-i2v",
    prompt="...",
    image_url="https://...",
    duration="8s",
)
```

google/veo-3-fast-t2v

Veo 3 Fast — text→video

Pricing

| Variant | Price |
| --- | --- |
| audio off | $0.300/s |
| audio on | $0.480/s |

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| prompt | string | | | The text prompt describing the video you want to generate. (max 20,000 chars) |
| aspect_ratio | enum | | "16:9" | The aspect ratio of the generated video. Values: "16:9", "9:16" |
| auto_fix | boolean | | true | Whether to automatically attempt to fix prompts that fail content policy or other validation checks by rewriting them |
| duration | enum | | "8s" | The duration of the generated video. Values: "4s", "6s", "8s" |
| generate_audio | boolean | | true | Whether to generate audio for the video |
| negative_prompt | string | | | A negative prompt to guide the video generation |
| resolution | enum | | "720p" | The resolution of the generated video. Values: "720p", "1080p" |
| safety_tolerance | enum | | "4" | The safety tolerance level for content moderation. 1 is the most strict (blocks most content), 6 is the least strict. Values: "1", "2", "3", "4", "5", "6" |
| seed | integer | | | The seed for the random number generator |

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned.

Example

```python
result = media.run(
    "google/veo-3-fast-t2v",
    prompt="...",
    duration="8s",
)
```

google/veo-3-fast-i2v

Veo 3 Fast — image→video

Pricing

| Variant | Price |
| --- | --- |
| audio off | $0.300/s |
| audio on | $0.480/s |

Inputs

| Field | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| image_url | string | | | URL of the input image to animate. Should be 720p or higher resolution in 16:9 or 9:16 aspect ratio. If the image is not in 16:9 or 9:16 aspect ratio, it will be cropped to fit |
| prompt | string | | | The text prompt describing how the image should be animated. (max 20,000 chars) |
| aspect_ratio | enum | | "auto" | The aspect ratio of the generated video. Values: "auto", "16:9", "9:16" |
| auto_fix | boolean | | false | Whether to automatically attempt to fix prompts that fail content policy or other validation checks by rewriting them |
| duration | enum | | "8s" | The duration of the generated video. Values: "4s", "6s", "8s" |
| generate_audio | boolean | | true | Whether to generate audio for the video |
| negative_prompt | string | | | A negative prompt to guide the video generation |
| resolution | enum | | "720p" | The resolution of the generated video. Values: "720p", "1080p" |
| safety_tolerance | enum | | "4" | The safety tolerance level for content moderation. 1 is the most strict (blocks most content), 6 is the least strict. Values: "1", "2", "3", "4", "5", "6" |
| seed | integer | | | The seed for the random number generator |

Output

Saved to your project drive at drive_path; a signed output_url (TTL ~1h) is returned.

Example

```python
result = media.run(
    "google/veo-3-fast-i2v",
    prompt="...",
    image_url="https://...",
    duration="8s",
)
```