One omni video pipeline for text, image, reference, and edit, with audio and clips up to 15 seconds.
Build cinematic sequences by auto-splitting a prompt or defining up to ten custom shots in one render.
Turn a single image into motion using a first frame and an optional last frame for start-to-end interpolation.
Use video-edit mode on source clips up to 10 seconds while keeping the original sound.
Kling O3 is Kuaishou's omni video model, a single multimodal pipeline that takes text, image, and video inputs in one pass. Instead of juggling separate tools, you pick a mode (text-to-video, image-to-video, reference-guided, or video-edit) and Kling O3 handles the generation from end to end. Optional AI-generated audio adds a soundtrack, and when you edit existing footage the model can keep the original sound.
The model is built for cinematic sequences. Image-to-video accepts a first frame plus an optional last frame for clean start-to-end interpolation, reference mode reads up to seven reference images for guided generation, and multi-shot editing lets you auto-split or define up to ten custom shots. Output runs from 3 seconds up to 15 seconds, delivered as MP4.
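The limits above can be checked before a job is submitted. The following is a minimal sketch, not HexGen's or Kuaishou's API: the `KlingJob` class, its field names, and the `validate` helper are all hypothetical, and only the numeric limits (3 to 15 seconds of output, up to seven reference images, up to ten custom shots, edit sources up to 10 seconds) come from this page. It also assumes the output-duration range applies in every mode.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Limits as stated on this page. All names here are hypothetical.
MIN_SECONDS, MAX_SECONDS = 3, 15
MAX_REFERENCE_IMAGES = 7
MAX_CUSTOM_SHOTS = 10
MAX_EDIT_SOURCE_SECONDS = 10
MODES = {"text-to-video", "image-to-video", "reference-guided", "video-edit"}


@dataclass
class KlingJob:
    """Hypothetical description of one render request."""
    mode: str
    duration_s: float = 5
    reference_images: List[str] = field(default_factory=list)
    shots: List[str] = field(default_factory=list)
    source_clip_s: Optional[float] = None  # length of the clip to edit


def validate(job: KlingJob) -> List[str]:
    """Return a list of problems; an empty list means the job fits the limits."""
    problems = []
    if job.mode not in MODES:
        problems.append(f"unknown mode: {job.mode}")
    if not MIN_SECONDS <= job.duration_s <= MAX_SECONDS:
        problems.append(f"duration must be {MIN_SECONDS}-{MAX_SECONDS} s")
    if len(job.reference_images) > MAX_REFERENCE_IMAGES:
        problems.append(f"at most {MAX_REFERENCE_IMAGES} reference images")
    if len(job.shots) > MAX_CUSTOM_SHOTS:
        problems.append(f"at most {MAX_CUSTOM_SHOTS} custom shots")
    if job.mode == "video-edit":
        if job.source_clip_s is None:
            problems.append("video-edit needs a source clip")
        elif job.source_clip_s > MAX_EDIT_SOURCE_SECONDS:
            problems.append(f"source clip must be {MAX_EDIT_SOURCE_SECONDS} s or shorter")
    return problems


if __name__ == "__main__":
    job = KlingJob(mode="reference-guided", duration_s=8,
                   reference_images=["car_front.png", "car_side.png"])
    print(validate(job))
```

Returning a list of problems instead of raising on the first one lets a form or script report every issue at once.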
Three quality tiers (Std, Pro, and 4K) let you trade cost against sharpness on every job. Pricing is per second of video, so you pay only for the duration you generate. Run any mode directly on HexGen with credits and no separate subscription.
Copy, tweak, and run. Good prompts get you most of the way there.
A neon-lit Tokyo alley at night, rain reflecting on the pavement, camera slowly tracking forward past glowing signs, cinematic 16:9, 10 seconds.
Image-to-video: animate this portrait so the subject turns toward the camera and smiles, soft window light, gentle hair movement, 5 seconds.
Reference-guided: a sleek sports car drifting around a mountain hairpin at sunset, dust trailing behind, dynamic chase camera, 8 seconds.
Pay only for what you render. 1 USD = 1,000 HGcoins. HGcoins never expire and failed runs refund automatically.
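As a rough illustration of per-second pricing, the sketch below converts a clip length into HGcoins and then into USD. Only the exchange rate (1 USD = 1,000 HGcoins) comes from this page. The function names and the per-second rate used in the example are placeholders, since actual rates depend on the tier and are not listed here.

```python
HGCOINS_PER_USD = 1_000  # stated on this page


def clip_cost_hgcoins(duration_s: int, hgcoins_per_second: int) -> int:
    """Cost of one clip under per-second pricing (the rate is supplied by the caller)."""
    return duration_s * hgcoins_per_second


def hgcoins_to_usd(hgcoins: int) -> float:
    """Convert HGcoins to US dollars at the stated exchange rate."""
    return hgcoins / HGCOINS_PER_USD


if __name__ == "__main__":
    # 50 HGcoins per second is a made-up rate for illustration only.
    cost = clip_cost_hgcoins(duration_s=10, hgcoins_per_second=50)
    print(cost, hgcoins_to_usd(cost))
```

Under that made-up rate, halving the duration halves the cost, which is the practical meaning of paying only for what you render.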
Kling O3 is the most versatile model in the Kling lineup, combining four input workflows, multi-shot editing, and optional audio in one pass. Here is how it stacks up against its siblings.
| Model | Quality | Speed | Cost | Choose it when |
|---|---|---|---|---|
| Kling O3 (this model), Kuaishou | Best | Standard | Higher cost | Pick this when you need every workflow in one place: text, image, reference, and video-edit, with multi-shot sequences, optional audio, and a 4K tier. |
| Kuaishou | Best | Fast | Mid cost | Choose this for high-quality Kling generation when you do not need the full omni feature set of O3. |
| Kuaishou | Great | Fastest | Lower cost | Choose this for faster, lower-cost Kling video when an earlier-generation pipeline is enough. |
Kling O3 is an omni video model that generates video from text, an image, reference images, or an existing video clip, all in one pipeline. It supports multi-shot sequences and optional AI-generated audio.