Video
xAI

Grok Imagine

Turn a text prompt, optionally guided by up to 7 reference images, into a short video with native audio, using xAI's Grok Imagine.

From 78 HGcoins / generation · pay per generation, no subscription
Examples

Created with Grok Imagine

Sample results. Open Studio to generate your own.

What it's for

Where Grok Imagine shines

Text To Video

Write a prompt and Grok Imagine Video renders a short clip with synchronized native audio.

Image To Video

Add up to seven reference images to guide motion and style in an image-to-video result.

Social Clips

Frame in portrait, landscape, or square at 480p or 720p for vertical and horizontal feeds.

Strengths

  • Handles both text-to-video and image-to-video in one pipeline
  • Accepts up to 7 optional reference images to steer image-to-video
  • Generates synchronized native audio alongside the video in a single pass
  • Three modes (Fun, Normal, Spicy) for different content styles
  • Flexible 6 to 30 second duration in one-second steps
  • Five aspect ratios spanning portrait, landscape, and square

Trade-offs

  • Resolution is capped at 720p, with only 480p and 720p tiers offered
  • Priced per second, so cost scales with duration and the 720p tier costs roughly 1.9x the 480p tier per second
  • A text prompt is always required, while reference images are optional
Specifications

At a glance

Type
Text-to-video and image-to-video
Vendor
xAI
Modes
Fun / Normal / Spicy
Resolution
480p or 720p
Aspect ratios
2:3, 3:2, 1:1, 16:9, 9:16
Duration
6 to 30 seconds

About Grok Imagine

Grok Imagine Video is xAI's video generation model, available on HexGen as a video pipeline. Start from a written prompt for text-to-video, or add up to seven reference images to drive an image-to-video result. The underlying xAI model generates synchronized native audio in a single pass, so your clip arrives with matching sound rather than a silent render.

Pick the look that fits your project with three generation modes, Fun, Normal, and Spicy, and frame it in any of five aspect ratios covering portrait, landscape, and square (2:3, 3:2, 1:1, 16:9, 9:16). Choose 480p or 720p, then set your clip length anywhere from 6 to 30 seconds in one-second steps.
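The limits above can be checked before a request is submitted. The following is a minimal sketch, not HexGen's or xAI's API: `ClipSettings`, `validate`, and the lowercase mode names are hypothetical and exist only for illustration. The allowed values themselves are taken from the specifications on this page.

```python
from dataclasses import dataclass

# Limits copied from the specifications on this page.
MODES = ("fun", "normal", "spicy")  # lowercase spelling is an assumption
ASPECT_RATIOS = ("2:3", "3:2", "1:1", "16:9", "9:16")
RESOLUTIONS = ("480p", "720p")
MIN_SECONDS, MAX_SECONDS = 6, 30
MAX_REFERENCE_IMAGES = 7


@dataclass(frozen=True)
class ClipSettings:
    """Hypothetical container for one generation request (not a real HexGen type)."""
    prompt: str
    mode: str = "normal"
    aspect_ratio: str = "16:9"
    resolution: str = "480p"
    duration_seconds: int = 6
    reference_images: tuple = ()


def validate(settings: ClipSettings) -> list:
    """Return a list of human-readable problems; an empty list means the settings fit the limits."""
    problems = []
    if not settings.prompt.strip():
        problems.append("a text prompt is always required")
    if settings.mode not in MODES:
        problems.append(f"mode must be one of {MODES}")
    if settings.aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"aspect ratio must be one of {ASPECT_RATIOS}")
    if settings.resolution not in RESOLUTIONS:
        problems.append(f"resolution must be one of {RESOLUTIONS}")
    # Duration moves in one-second steps, so only whole seconds are accepted.
    if not isinstance(settings.duration_seconds, int) or not (
        MIN_SECONDS <= settings.duration_seconds <= MAX_SECONDS
    ):
        problems.append(f"duration must be a whole number of seconds from {MIN_SECONDS} to {MAX_SECONDS}")
    if len(settings.reference_images) > MAX_REFERENCE_IMAGES:
        problems.append(f"at most {MAX_REFERENCE_IMAGES} reference images are accepted")
    return problems
```

For example, `validate(ClipSettings(prompt="A neon-lit city street at night", resolution="720p", duration_seconds=10))` returns an empty list, while a 45-second request with an empty prompt would report two problems.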

On HexGen, pricing is per second of video and tiered by resolution: a per-second base for each resolution multiplied by the duration you select. That keeps short clips affordable and lets you scale up length and resolution only when a project needs it.
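The per-second formula can be sketched as follows. The rates below are placeholders, not published HexGen prices: 13 HGcoins per second for 480p is assumed only because it makes a 6-second clip match the 78-HGcoin starting price shown on this page, and 25 for 720p is assumed as roughly 1.9 times that. The function name `estimate_cost` is also hypothetical.

```python
# Hypothetical per-second rates in HGcoins; check the live price list for real values.
ASSUMED_RATE_PER_SECOND = {"480p": 13, "720p": 25}

MIN_SECONDS, MAX_SECONDS = 6, 30


def estimate_cost(resolution: str, duration_seconds: int) -> int:
    """Estimate the cost of one clip: per-second base rate times selected duration."""
    if resolution not in ASSUMED_RATE_PER_SECOND:
        raise ValueError(f"unknown resolution: {resolution!r}")
    if not MIN_SECONDS <= duration_seconds <= MAX_SECONDS:
        raise ValueError(f"duration must be {MIN_SECONDS} to {MAX_SECONDS} seconds")
    return ASSUMED_RATE_PER_SECOND[resolution] * duration_seconds


if __name__ == "__main__":
    # Shortest clip at the lower tier versus a longer clip at the higher tier.
    print(estimate_cost("480p", 6))    # 78 under the assumed rates
    print(estimate_cost("720p", 10))   # 250 under the assumed rates
```

Because cost grows linearly with duration, doubling the clip length doubles the price at either resolution under this model.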

Prompt ideas

Starting points

Copy, tweak, and run. A good prompt gets you most of the way there.

A neon-lit city street at night in the rain, camera slowly tracking forward past glowing signs, 16:9, 720p, 10 seconds.

Animate this reference photo of a coffee cup: steam rising and gentle light shifting across the table, 1:1, 6 seconds.

A golden retriever running through a sunlit meadow in slow motion, soft ambient nature sound, 9:16, 480p, 8 seconds.

Pricing
78
HGcoins / generation · ≈ $0.08

Pay only for what you generate. 1 USD = 1,000 HGcoins. HGcoins never expire, and failed runs are refunded automatically.
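The exchange rate makes the dollar figure easy to verify. A minimal sketch, assuming only the 1 USD = 1,000 HGcoins rate stated above; the helper names are hypothetical:

```python
HGCOINS_PER_USD = 1_000  # rate stated on this page


def hgcoins_to_usd(hgcoins: int) -> float:
    """Convert an HGcoin amount to US dollars."""
    return hgcoins / HGCOINS_PER_USD


def usd_to_hgcoins(usd: float) -> int:
    """Convert a dollar amount to HGcoins, rounded to the nearest whole coin."""
    return round(usd * HGCOINS_PER_USD)


if __name__ == "__main__":
    # The 78-HGcoin starting price is $0.078, which rounds to the $0.08 shown above.
    print(round(hgcoins_to_usd(78), 2))
    print(usd_to_hgcoins(5))
```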

Compare

Grok Imagine vs. other models

Grok Imagine Video stands out for shipping synchronized native audio and offering Fun, Normal, and Spicy modes. Here is how it sits next to other video models in the catalog.

Model | Quality | Speed | Cost | Choose it when
Grok Imagine (xAI, this model) | Very good | Fast | Low cost | Choose this when you want video with native audio, three content modes, and image-to-video from up to seven reference images, at a budget-friendly per-second 480p or 720p price.
Kling 2-6 (Kuaishou) | Best | Fast | Medium cost | Pick Kling 2-6 when you want a higher-fidelity video render and do not need Grok's Fun/Normal/Spicy modes.
Seedance 2-0 (ByteDance) | Best | Fast | High cost | Pick Seedance 2-0 for premium video quality when budget is less of a concern.
In short: choose Grok Imagine when you want video with native audio, three content modes, and image-to-video from up to seven reference images, at a budget-friendly per-second 480p or 720p price. Otherwise, one of the models above will be a better fit.

Frequently asked questions

It is xAI's video model that turns a text prompt, optionally guided by reference images, into a short video clip with synchronized native audio.