What inputs does it need?

A text prompt is always required. For image-to-video, you also supply a single reference image; without one, the model runs as text-to-video.

What resolutions, aspect ratios, and lengths are available?

Output is 720p or 1080p, in 16:9, 9:16, 1:1, 4:3, or 3:4, with clips from 3 to 15 seconds in 1-second steps (5 seconds by default).

How does pricing work?

Pricing is per second of video and scales with the duration you choose. 1080p costs roughly double 720p, so quick drafts are cheaper at 720p.

Does HappyHorse generate audio on HexGen?

No. HexGen does not offer audio output for this model; it produces video only.

How do I run it on HexGen?

Open the model in the video generator, write a prompt, optionally add a reference image for image-to-video, pick your resolution, aspect ratio, and duration, then generate.

Video

Alibaba

HappyHorse

Alibaba's HappyHorse video model: text-to-video and image-to-video up to 1080p, 3 to 15 seconds.

From 1,400 HGcoins / generation · pay per generation, no subscription

Examples

Made with HappyHorse

Sample outputs. Open in Studio to generate your own.

What it's for

Where HappyHorse shines

Cinematic clips

Generate short cinematic video from a written description with no source footage.

Image to motion

Animate a single reference image into video for stronger visual continuity.

Social video

Render vertical 9:16 or square 1:1 clips up to 15 seconds for short-form feeds.

Strengths

Runs both text-to-video and image-to-video from one prompt field
Optional reference image gives image-to-video stronger visual continuity
Outputs 1080p in addition to 720p
Flexible duration from 3 to 15 seconds in 1-second steps
Five aspect ratios cover landscape, portrait, and square framing
Accepts multilingual prompts per the underlying Alibaba model

Trade-offs

A text prompt is always required, and image-to-video needs a reference image or it runs as text-to-video
Only two resolution tiers are offered (720p and 1080p), with a 15-second maximum clip
Reference image input is limited to a single image
Cost grows with duration since pricing is per second, and 1080p costs more than 720p

Specs

At a glance

Type

Video (text-to-video and image-to-video)

Vendor

Alibaba

Resolution

720p or 1080p

Aspect ratios

16:9, 9:16, 1:1, 4:3, 3:4

Duration

3 to 15 seconds (default 5s)

Reference images

Optional, up to 1 (image-to-video)
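
The limits above can be checked before a render is submitted. The following is a minimal sketch in Python, not HexGen's actual API: the `RenderSettings` class, its field names, and the `validate` helper are hypothetical and exist only to illustrate the documented limits (two resolutions, five aspect ratios, 3 to 15 seconds in 1-second steps with a 5-second default, and at most one reference image).

```python
from dataclasses import dataclass

# Limits taken from the spec list above.
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}
MIN_SECONDS, MAX_SECONDS, DEFAULT_SECONDS = 3, 15, 5
MAX_REFERENCE_IMAGES = 1


@dataclass
class RenderSettings:
    """Hypothetical container for one render request (not a HexGen type)."""
    prompt: str
    resolution: str
    aspect_ratio: str
    duration_s: int = DEFAULT_SECONDS
    reference_images: tuple = ()

    def mode(self) -> str:
        # With a reference image the model runs image-to-video,
        # otherwise it runs text-to-video.
        return "image-to-video" if self.reference_images else "text-to-video"


def validate(settings: RenderSettings) -> list:
    """Return a list of human-readable problems; empty means the settings fit the specs."""
    problems = []
    if not settings.prompt.strip():
        problems.append("a text prompt is always required")
    if settings.resolution not in ALLOWED_RESOLUTIONS:
        problems.append("resolution must be 720p or 1080p")
    if settings.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append("aspect ratio must be one of 16:9, 9:16, 1:1, 4:3, 3:4")
    if not (isinstance(settings.duration_s, int)
            and MIN_SECONDS <= settings.duration_s <= MAX_SECONDS):
        problems.append("duration must be a whole number of seconds from 3 to 15")
    if len(settings.reference_images) > MAX_REFERENCE_IMAGES:
        problems.append("at most one reference image is accepted")
    return problems


if __name__ == "__main__":
    draft = RenderSettings("A lone cyclist at dawn", "720p", "16:9")
    print(draft.mode(), validate(draft))
```

Returning a list of problems rather than raising on the first one lets a form show every invalid field at once.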

About HappyHorse

HappyHorse is an AI video model from Alibaba that turns a single prompt into motion. It works two ways from one prompt field: pure text-to-video, or image-to-video when you supply a reference image for stronger visual continuity. It topped the Artificial Analysis Video Arena for both modes before Alibaba was revealed as its creator.

Choose 720p or 1080p output and pick from five aspect ratios that cover landscape, portrait, and square framing (16:9, 9:16, 1:1, 4:3, and 3:4). Clip length is flexible from 3 to 15 seconds in 1-second steps, with a 5-second default, so you can size each render to the platform you are shipping to.

Pricing is per second and scales with the duration you choose, and 1080p costs roughly double 720p. That makes it easy to keep quick drafts cheap at 720p and reserve full resolution for finals. Multilingual prompts are supported per the underlying Alibaba model.
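
To compare the cost of different settings before rendering, the per-second scaling can be sketched as a relative estimate. This is an illustration under stated assumptions, not HexGen's pricing formula: the page gives no per-second rate, so the result is expressed in units of "one second at 720p", and "roughly double" is treated as exactly 2x. The `relative_cost` function is hypothetical.

```python
# Assumption: 1080p is treated as exactly 2x the 720p rate.
# The page only says "roughly double", so this is an approximation.
RESOLUTION_MULTIPLIER = {"720p": 1.0, "1080p": 2.0}


def relative_cost(duration_s: int, resolution: str) -> float:
    """Cost in units of 'one second of 720p video' (not HGcoins or USD)."""
    return duration_s * RESOLUTION_MULTIPLIER[resolution]


if __name__ == "__main__":
    draft = relative_cost(5, "720p")    # default-length draft
    final = relative_cost(15, "1080p")  # longest clip at full resolution
    print(f"final costs about {final / draft:.0f}x the draft")
```

Under these assumptions, a 15-second 1080p final costs about six times a 5-second 720p draft, which is the arithmetic behind drafting at 720p first.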

Prompt ideas

Starting points

Copy, tweak, and run. Good prompts get you most of the way there.

A lone cyclist rides down a misty mountain road at dawn, slow dolly-forward camera, soft golden light breaking through pine trees, 16:9, 8 seconds.

Close-up of steaming ramen on a wooden counter, chopsticks lifting noodles, gentle steam rising, warm neon glow, vertical 9:16, 5 seconds.

Animate this product photo: slow turntable rotation of the sneaker on a clean studio backdrop, subtle rim light sweeping across, square 1:1, 6 seconds.

Pricing

1,400 HGcoins / generation · ≈ $1.40

Pay only for what you render. 1 USD = 1,000 HGcoins. HGcoins never expire and failed runs refund automatically.
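
The conversion stated above (1 USD = 1,000 HGcoins) is simple enough to script when budgeting. A minimal sketch follows; the function names are made up for illustration and are not part of any HexGen tooling.

```python
HGCOINS_PER_USD = 1_000  # exchange rate stated on this page


def hgcoins_to_usd(coins: int) -> float:
    """Convert an HGcoin amount to US dollars."""
    return coins / HGCOINS_PER_USD


def usd_to_hgcoins(usd: float) -> int:
    """Convert US dollars to HGcoins, rounded to the nearest coin."""
    return round(usd * HGCOINS_PER_USD)


if __name__ == "__main__":
    # The listed starting price of 1,400 HGcoins per generation.
    print(f"${hgcoins_to_usd(1400):.2f}")
```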

Compare

HappyHorse vs other models

HappyHorse covers both text-to-video and image-to-video with up to 1080p output and clips as long as 15 seconds. Here is how it sits next to two other video models in the catalog.

Model	Vendor	Quality	Speed	Cost	Choose it when
HappyHorse (this model)	Alibaba	Best	Fast	Mid cost	Pick HappyHorse when you want one model for both text-to-video and image-to-video, 1080p output, and flexible 3 to 15 second clips across five aspect ratios.
Seedance 2.0	ByteDance	Best	Fast	Mid cost	A ByteDance video model and a strong alternative for cinematic clips with believable camera motion.
Kling 2.6	Kuaishou	Great	Fastest	Mid cost	A Kuaishou video model to consider when you want a different motion style for short clips.

Bottom line: pick HappyHorse when you want one model for both text-to-video and image-to-video, 1080p output, and flexible 3 to 15 second clips across five aspect ratios. Otherwise, one of the models above will fit better.

Frequently asked questions

It generates video from a prompt in two modes: text-to-video from a written description, and image-to-video when you add a reference image for stronger visual continuity.