POST /vendors/klingai/v1/kling-v3-omni/image-to-video/generation
Image to Video Generation
curl --request POST \
  --url https://api.mulerouter.ai/vendors/klingai/v1/kling-v3-omni/image-to-video/generation \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "first_frame": "https://example.com/first-frame.jpg",
  "prompt": "A gentle breeze blows through the scene, camera slowly zooms in",
  "mode": "pro",
  "duration": 5
}
'

Example response:

{
  "task_info": {
    "id": "8e1e315e-b50d-4334-a231-be7d19a372f4",
    "status": "pending",
    "created_at": "2026-03-03T00:00:00Z",
    "updated_at": "2026-03-03T00:00:00Z"
  }
}

Overview

Generate videos from input images using the Kling V3 Omni model. Requires at least one frame image. In addition to text-to-video features, image-to-video supports:
  • First/Last frame control — provide first_frame and last_frame images
  • Reference images — additional image references via images array
  • Element references — reference up to 3 elements when using first/last frame images
  • Multi-shot video — generate multi-scene videos via multi_shot and multi_prompt
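As a minimal sketch of first/last frame control, the request body can be assembled as below. The helper name `build_payload` and the `last-frame.jpg` URL are illustrative assumptions; the field names come from the Body section of this page.

```python
import json


def build_payload(first_frame, prompt, last_frame=None, mode="pro", duration=5):
    """Assemble a request body (hypothetical helper, not part of any SDK)."""
    payload = {
        "first_frame": first_frame,
        "prompt": prompt,
        "mode": mode,
        "duration": duration,
    }
    # last_frame is optional; include it only when a closing frame is wanted.
    if last_frame is not None:
        payload["last_frame"] = last_frame
    return payload


payload = build_payload(
    "https://example.com/first-frame.jpg",
    "A gentle breeze blows through the scene, camera slowly zooms in",
    last_frame="https://example.com/last-frame.jpg",  # assumed example URL
)
print(json.dumps(payload, indent=2))
```

Omitting `last_frame` yields the same body as the curl example above.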

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
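The header format can be sketched with Python's standard library. `build_request` is a hypothetical helper; the URL and header names are taken from the curl example on this page. The request is only constructed here, not sent, so the sketch runs offline.

```python
import json
import urllib.request

API_URL = (
    "https://api.mulerouter.ai/vendors/klingai/v1/"
    "kling-v3-omni/image-to-video/generation"
)


def build_request(token, payload):
    """Build a POST request carrying the Bearer token (hypothetical helper)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = build_request(
    "<token>",
    {"first_frame": "https://example.com/first-frame.jpg", "duration": 5},
)
# urllib.request.urlopen(req) would send it; omitted so nothing hits the network.
print(req.get_method(), req.full_url)
```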

Body

application/json

first_frame
string | null

First frame image (URL or Base64). Sets the opening frame of the generated video.

last_frame
string | null

Last frame image (URL or Base64). Sets the closing frame of the generated video.
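When a frame is supplied as Base64 rather than a URL, the image bytes need encoding first. The sketch below uses plain standard Base64 with no `data:` URI prefix; that choice is an assumption, since this page does not say which form the API expects. `encode_image_bytes` is a hypothetical helper.

```python
import base64


def encode_image_bytes(data: bytes) -> str:
    """Return standard Base64 text for raw image bytes (hypothetical helper)."""
    return base64.b64encode(data).decode("ascii")


# In practice the bytes would come from a file, e.g.
# pathlib.Path("first-frame.jpg").read_bytes(); a short stand-in is used here.
sample = b"\xff\xd8\xff\xe0"  # the first four bytes of a typical JPEG file
first_frame = encode_image_bytes(sample)
payload = {"first_frame": first_frame, "duration": 5}
```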

images
string[]

List of reference image URLs or Base64 strings.

Minimum array length: 1

prompt
string | null

Text prompt to guide video generation from the images.

multi_prompt
object[]

Multi-segment prompts for finer control.

Required array length: 1 - 6 elements

negative_prompt
string | null

Negative prompt to exclude unwanted content.

sound
enum<string>
default: off

Whether to generate sound for the video.

Available options: on, off

mode
enum<string>
default: pro

Generation mode. std for standard quality, pro for higher quality.

Available options: std, pro

aspect_ratio
enum<string> | null

Aspect ratio of the generated video.

Available options: 16:9, 9:16, 1:1

duration
integer
default: 5

Duration of the generated video in seconds (3-15).

multi_shot
boolean
default: false

Whether to enable multi-shot generation.

shot_type
enum<string> | null

Shot type configuration.

Available options: customize, intelligence

elements
object[]

Element list. Max 3 when using first/last frame images.

Maximum array length: 3
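The limits listed in the Body section can be checked client-side before sending a request. This is a sketch only: `validate_payload` is a hypothetical helper, and reading "requires at least one frame image" as "first_frame or last_frame must be set" is an assumption. The server remains the authority on what is accepted.

```python
ENUMS = {
    "sound": {"on", "off"},
    "mode": {"std", "pro"},
    "aspect_ratio": {"16:9", "9:16", "1:1"},
    "shot_type": {"customize", "intelligence"},
}


def validate_payload(payload: dict) -> list:
    """Return a list of problems found; an empty list means none were found."""
    errors = []
    # Assumed reading of "requires at least one frame image".
    if not payload.get("first_frame") and not payload.get("last_frame"):
        errors.append("first_frame or last_frame is required")
    duration = payload.get("duration", 5)
    if not isinstance(duration, int) or not 3 <= duration <= 15:
        errors.append("duration must be an integer from 3 to 15")
    if "images" in payload and len(payload["images"]) < 1:
        errors.append("images must contain at least 1 item")
    if "multi_prompt" in payload and not 1 <= len(payload["multi_prompt"]) <= 6:
        errors.append("multi_prompt must contain 1 to 6 items")
    if len(payload.get("elements", [])) > 3:
        errors.append("elements must contain at most 3 items")
    for key, allowed in ENUMS.items():
        value = payload.get(key)
        # Nullable fields may be None; only explicit values are checked.
        if value is not None and value not in allowed:
            errors.append(f"{key} must be one of {sorted(allowed)}")
    return errors


print(validate_payload({"first_frame": "https://example.com/first-frame.jpg",
                        "mode": "pro", "duration": 5}))
print(validate_payload({"duration": 20, "sound": "loud"}))
```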

Response

Accepted - Task created successfully

task_info
object
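A short sketch of reading `task_info` from the response, using the sample response shown at the top of this page. `parse_task_info` is a hypothetical helper. How a pending task is later retrieved is not covered on this page, so no follow-up call is shown.

```python
import json
from datetime import datetime

sample_response = """
{
  "task_info": {
    "id": "8e1e315e-b50d-4334-a231-be7d19a372f4",
    "status": "pending",
    "created_at": "2026-03-03T00:00:00Z",
    "updated_at": "2026-03-03T00:00:00Z"
  }
}
"""


def parse_timestamp(value: str) -> datetime:
    # Older Python versions reject a trailing "Z", so map it to an explicit offset.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def parse_task_info(body: str) -> dict:
    """Extract task fields from a response body (hypothetical helper)."""
    info = json.loads(body)["task_info"]
    return {
        "id": info["id"],
        "status": info["status"],
        "created_at": parse_timestamp(info["created_at"]),
        "updated_at": parse_timestamp(info["updated_at"]),
    }


task = parse_task_info(sample_response)
print(task["id"], task["status"])
```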