Video
Google

Gemini Omni Video

Google's Gemini Omni turns a text prompt, optionally guided by up to 4 reference images, into video at resolutions up to 4K.

From 1050 HGcoins / generation · pay per generation, no subscription
Examples

Created with Gemini Omni Video

Sample results. Open Studio to generate your own.

What it's for

Where Gemini Omni Video shines

Social Clips

Generate short 4- to 10-second clips in vertical 9:16 for mobile feeds or landscape 16:9 for wider placements.

Image To Video

Attach up to 4 reference images to guide the motion and look of a clip generated from existing stills.

4K Hero Shots

Render higher-resolution footage up to 4K from a single text prompt for polished marketing moments.

Strengths

  • Generates video directly from a written text prompt, with the prompt as the required input
  • Supports image-to-video using up to 4 optional reference images
  • Outputs at resolutions up to 4K
  • Selectable clip durations of 4, 6, 8, or 10 seconds
  • Handles both landscape 16:9 and portrait 9:16 aspect ratios
  • Built by Google on the Gemini Omni multimodal foundation

Trade-offs

  • Aspect ratios are limited to 16:9 and 9:16, with no square or other ratios
  • Maximum clip length is 10 seconds
  • Reference inputs accept image files only; no audio or video references are exposed
  • 4K output and longer clips cost more, since price scales with both resolution and duration
  • Native synchronized audio was still in testing at Google's launch, so audio behavior may be limited despite native-audio marketing

Specifications

At a glance

Type: Text-to-video / image-to-video
Vendor: Google (Gemini Omni)
Resolution: 720p, 1080p, 4K
Aspect ratios: 16:9, 9:16
Duration: 4s, 6s, 8s, 10s
Reference images: Optional, up to 4 (images only)
Pricing: Per clip, tiered by resolution and duration (4K costs more)
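
The limits listed above can be checked before a request is submitted. What follows is a minimal Python sketch, assuming a hypothetical `VideoRequest` shape: the class, its field names, and the `validate` helper are illustrative only and are not HexGen's or Google's actual API. Only the allowed values come from this page.

```python
from dataclasses import dataclass, field

# Allowed values taken from the spec list on this page.
RESOLUTIONS = {"720p", "1080p", "4K"}
ASPECT_RATIOS = {"16:9", "9:16"}
DURATIONS_S = {4, 6, 8, 10}
MAX_REFERENCE_IMAGES = 4


@dataclass
class VideoRequest:
    """Hypothetical request shape; not an actual HexGen or Google API."""
    prompt: str
    resolution: str
    aspect_ratio: str
    duration_s: int
    reference_images: list[str] = field(default_factory=list)


def validate(req: VideoRequest) -> list[str]:
    """Return a list of problems; an empty list means the request fits the spec."""
    problems = []
    if not req.prompt.strip():
        problems.append("prompt is required")
    if req.resolution not in RESOLUTIONS:
        problems.append(f"unsupported resolution: {req.resolution}")
    if req.aspect_ratio not in ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {req.aspect_ratio}")
    if req.duration_s not in DURATIONS_S:
        problems.append(f"unsupported duration: {req.duration_s}s")
    if len(req.reference_images) > MAX_REFERENCE_IMAGES:
        problems.append("at most 4 reference images are allowed")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue with a request at once.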

About Gemini Omni Video

Gemini Omni is Google's multimodal video model, built on the Gemini family's multimodal architecture. On HexGen it turns a written prompt into video, and you can optionally guide the result with up to 4 reference images for an image-to-video workflow. The prompt is the required input, so you can start from words alone or pair them with visuals.

You control the output to fit where it will run. Pick a resolution up to 4K, choose landscape 16:9 or portrait 9:16, and set the clip length to 4, 6, 8, or 10 seconds. That makes it a flexible choice for short social clips, vertical mobile content, and higher-resolution hero footage from a single tool.

Gemini Omni is marketed as native-audio video generation. Note that Google's launch announcement described synchronized audio output as still being tested at release, so audio behavior may be limited. Pricing on HexGen is per clip, set by a resolution-by-duration matrix: 720p and 1080p share one tier, 4K sits in a higher tier, and within each tier the price rises with clip length.
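
The resolution-by-duration matrix described above can be sketched as a two-step lookup: map the resolution to its tier, then look up the price for that tier and clip length. This is an illustrative Python sketch, not HexGen's implementation. The tier labels `"standard"` and `"4k"` and the `clip_price` function are assumed names, and the price table must be supplied by the caller because this page does not publish the full matrix.

```python
# Tier grouping as described on this page: 720p and 1080p share a tier,
# 4K sits in a higher one. The tier labels themselves are hypothetical.
TIER_BY_RESOLUTION = {"720p": "standard", "1080p": "standard", "4K": "4k"}


def clip_price(
    resolution: str,
    duration_s: int,
    price_table: dict[tuple[str, int], int],
) -> int:
    """Look up the per-clip price in HGcoins.

    price_table maps (tier, duration in seconds) to a price. It is passed
    in by the caller; no real prices are hard-coded here.
    """
    tier = TIER_BY_RESOLUTION[resolution]
    return price_table[(tier, duration_s)]
```

Because 720p and 1080p resolve to the same tier, they always return the same price for a given duration, which matches the shared-tier behavior described above.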

Prompt ideas

Starting points

Copy, tweak, and run. A good prompt gets you most of the way there.

A neon-lit Tokyo street at night in the rain, camera slowly pushing forward past glowing storefronts, reflections on wet pavement, cinematic 16:9.

Close-up of fresh coffee being poured into a white mug, steam rising in soft morning light, vertical 9:16 for a cafe promo.

A paper boat sailing across a calm pond as autumn leaves drift down, gentle ripples, warm late-afternoon sun, slow dolly shot.

Pricing
1050 HGcoins / generation · ≈ $1.05

Pay only for what you generate. 1 USD = 1,000 HGcoins. HGcoins never expire, and failed runs are refunded automatically.
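
The conversion rate stated above (1 USD = 1,000 HGcoins) makes it easy to translate a price in either direction. The helper names in this small Python sketch are illustrative; only the rate and the 1050 HGcoins figure come from this page.

```python
from decimal import Decimal

# Rate stated on this page: 1 USD = 1,000 HGcoins.
HGCOINS_PER_USD = Decimal(1000)


def hgcoins_to_usd(hgcoins: int) -> Decimal:
    """Convert an HGcoins amount to US dollars."""
    return Decimal(hgcoins) / HGCOINS_PER_USD


def usd_to_hgcoins(usd: str) -> int:
    """Convert a dollar amount given as a string, such as "1.05", to HGcoins."""
    return int(Decimal(usd) * HGCOINS_PER_USD)


print(hgcoins_to_usd(1050))  # prints 1.05
```

`Decimal` is used instead of `float` so that currency amounts do not pick up binary rounding errors.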

Compare

Gemini Omni Video vs. other models

Gemini Omni Video is Google's multimodal video model offering text-to-video and image-to-video with output up to 4K. Here is how it sits next to two other video models in the HexGen catalog.

Model | Quality | Speed | Cost | Pick it when
Gemini Omni Video (this model, Google) | Best | Fast | High cost | Pick this for Google's multimodal pipeline when you want up to 4K output and image-guided video in both landscape and portrait.
Kuaishou | Very good | Fast | Medium cost | A capable Kuaishou video alternative when you want a different motion model in the catalog.
ByteDance | Very good | Fastest | Low cost | Lean toward ByteDance Seedance when speed and lower cost matter more than 4K output.
In short: choose Gemini Omni Video when you want Google's multimodal pipeline with up to 4K output and image-guided video in both landscape and portrait. Otherwise, one of the other models above will be a better fit.

Frequently asked questions

Gemini Omni Video is Google's Gemini Omni multimodal video model. It generates video from a text prompt, which is the required input, and you can also guide the result with up to 4 optional reference images.