Turn text or up to 7 reference images into a short video with native audio, from xAI's Grok Imagine.
Sample renders. Open Studio to generate your own.
Write a prompt and Grok Imagine Video renders a short clip with synchronized native audio.
Add up to seven reference images to guide motion and style in an image-to-video result.
Frame in portrait, landscape, or square at 480p or 720p for vertical and horizontal feeds.
Grok Imagine Video is xAI's video generation model, available on HexGen as a video pipeline. Start from a written prompt for text-to-video, or add up to seven reference images to drive an image-to-video result. The underlying xAI model generates synchronized native audio in a single pass, so your clip arrives with matching sound rather than a silent render.
Pick the look that fits your project with three generation modes, Fun, Normal, and Spicy, and frame it in any of five aspect ratios covering portrait, landscape, and square (2:3, 3:2, 1:1, 16:9, 9:16). Choose 480p or 720p, then set your clip length anywhere from 6 to 30 seconds in one-second steps.
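The options above can be written down as a small settings check. This is a minimal sketch, not HexGen's API: the `VideoSettings` class, its field names, its defaults, and the `validate` helper are all hypothetical. Only the allowed values (three modes, five aspect ratios, two resolutions, 6 to 30 seconds in whole seconds, up to seven reference images) come from this page.

```python
from dataclasses import dataclass

# Allowed values as described on this page.
MODES = {"fun", "normal", "spicy"}
ASPECT_RATIOS = {"2:3", "3:2", "1:1", "16:9", "9:16"}
RESOLUTIONS = {"480p", "720p"}
MIN_SECONDS, MAX_SECONDS = 6, 30
MAX_REFERENCE_IMAGES = 7


@dataclass(frozen=True)
class VideoSettings:
    """Hypothetical container for the options; not a HexGen API type."""
    mode: str = "normal"
    aspect_ratio: str = "16:9"
    resolution: str = "720p"
    duration_seconds: int = 6
    reference_images: int = 0  # 0 means plain text-to-video


def validate(settings: VideoSettings) -> list:
    """Return a list of problems; an empty list means the settings are valid."""
    problems = []
    if settings.mode.lower() not in MODES:
        problems.append(f"unknown mode: {settings.mode!r}")
    if settings.aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {settings.aspect_ratio!r}")
    if settings.resolution not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {settings.resolution!r}")
    # Clip length moves in one-second steps, so it must be a whole number.
    duration = settings.duration_seconds
    if not isinstance(duration, int) or not MIN_SECONDS <= duration <= MAX_SECONDS:
        problems.append(
            f"duration must be a whole number of seconds from "
            f"{MIN_SECONDS} to {MAX_SECONDS}"
        )
    if not 0 <= settings.reference_images <= MAX_REFERENCE_IMAGES:
        problems.append(f"at most {MAX_REFERENCE_IMAGES} reference images")
    return problems
```

For example, `validate(VideoSettings(aspect_ratio="4:3"))` returns one problem, because 4:3 is not among the five listed ratios.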
On HexGen, pricing is per second of video and tiered by resolution: a per-second base for each resolution multiplied by the duration you select. That keeps short clips affordable and lets you scale up length and resolution only when a project needs it.
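The per-second pricing rule can be sketched as a short calculation. The rates in `EXAMPLE_RATE_HGCOINS_PER_SECOND` are placeholders chosen only to make the arithmetic visible; they are not HexGen's actual prices, and the function names are hypothetical. The conversion of 1 USD to 1,000 HGcoins is the one stated on this page.

```python
HGCOINS_PER_USD = 1_000  # stated on this page: 1 USD = 1,000 HGcoins

# PLACEHOLDER per-second rates, NOT real HexGen prices.
# Replace them with the rates shown in Studio before relying on any result.
EXAMPLE_RATE_HGCOINS_PER_SECOND = {"480p": 10, "720p": 20}


def clip_cost_hgcoins(resolution, duration_seconds,
                      rates=EXAMPLE_RATE_HGCOINS_PER_SECOND):
    """Cost = per-second base rate for the resolution x selected duration."""
    if resolution not in rates:
        raise ValueError(f"no rate for resolution {resolution!r}")
    if not 6 <= duration_seconds <= 30:
        raise ValueError("duration must be between 6 and 30 seconds")
    return rates[resolution] * duration_seconds


def hgcoins_to_usd(hgcoins):
    """Convert HGcoins to US dollars at the stated rate."""
    return hgcoins / HGCOINS_PER_USD


# With the placeholder rates, a 10-second 720p clip costs 20 * 10 HGcoins.
cost = clip_cost_hgcoins("720p", 10)
print(cost, "HGcoins =", hgcoins_to_usd(cost), "USD")
```

Because cost grows linearly with duration, doubling the clip length at the same resolution doubles the price, whatever the real rates are.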
Copy, tweak, and run. A good prompt does most of the work.
A neon-lit city street at night in the rain, camera slowly tracking forward past glowing signs, 16:9, 720p, 10 seconds.
Animate this reference photo of a coffee cup: steam rising and gentle light shifting across the table, 1:1, 6 seconds.
A golden retriever running through a sunlit meadow in slow motion, soft ambient nature sound, 9:16, 480p, 8 seconds.
Pay only for what you generate. 1 USD = 1,000 HGcoins. HGcoins never expire, and failed generations are refunded automatically.
Grok Imagine Video stands out for shipping synchronized native audio and offering Fun, Normal, and Spicy modes. Here is how it sits next to other video models in the catalog.
| Model | Quality | Speed | Cost | Choose when |
|---|---|---|---|---|
| Grok Imagine (xAI, this model) | Very good | Fast | Low cost | Choose this when you want video with native audio, three content modes, and image-to-video from up to seven reference images, at a budget-friendly per-second 480p or 720p price. |
| Kling 2-6 (Kuaishou) | Excellent | Fast | Medium cost | Pick Kling 2-6 when you want a higher-fidelity video render and do not need Grok's Fun/Normal/Spicy modes. |
| Seedance 2-0 (ByteDance) | Excellent | Fast | High cost | Pick Seedance 2-0 for premium video quality when budget is less of a concern. |
Grok Imagine Video is xAI's video model that turns a text prompt, optionally guided by reference images, into a short video clip with synchronized native audio.