Video
Alibaba

Wan 2.5

Alibaba's Wan 2.5 video model: turn a prompt or start image into cinematic 720p or 1080p clips.

From 750 HGcoins / generation · pay per generation, no subscription

What it's for

Where Wan 2.5 shines

Cinematic clips

Generate short cinematic video from a text prompt at 720p or 1080p in landscape, vertical, or square.

Animate a still

Supply an optional start image to switch into image-to-video and bring a single frame into motion.

Social video

Produce 5s or 10s clips in 9:16, 1:1, or 16:9 to match vertical feeds, square posts, or widescreen players.

Strengths

  • Handles both text-to-video and image-to-video from one pipeline, with image-to-video enabled by an optional start image
  • Two resolution tiers (720p and 1080p) and two clip lengths (5s and 10s) let you balance quality against cost
  • Three aspect ratios cover landscape (16:9), vertical (9:16), and square (1:1) outputs
  • Built by Alibaba, whose Wan 2.5 family is established for cinematic AI video generation

Trade-offs

  • Clip length is fixed at 5 or 10 seconds; longer durations are not offered
  • Resolution is limited to 720p or 1080p, with no 480p or 4K option exposed
  • A text prompt is always required; the start image is optional
  • Cost scales with both resolution and duration, so 1080p at 10s is the most expensive combination
Specs

At a glance

Type: Text-to-video and image-to-video
Vendor: Alibaba
Resolution: 720p or 1080p
Duration: 5s or 10s
Aspect ratios: 16:9, 9:16, 1:1
Reference image: Optional start image (enables image-to-video)
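
The options listed in these specs can be captured in a small validation sketch. Everything named here is hypothetical: `WanSettings`, its fields, and the `validate` helper are illustrative only and are not part of any HexGen or Alibaba API. Only the allowed values (resolutions, durations, aspect ratios, required prompt, optional start image) come from this page.

```python
from dataclasses import dataclass
from typing import List, Optional

# Allowed values taken from the specs on this page.
ALLOWED_RESOLUTIONS = {"720p", "1080p"}
ALLOWED_DURATIONS_S = {5, 10}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "1:1"}


@dataclass
class WanSettings:
    """Hypothetical container for one generation's settings (not a real API type)."""
    prompt: str
    resolution: str = "720p"
    duration_s: int = 5
    aspect_ratio: str = "16:9"
    start_image: Optional[str] = None  # assumed to be a file path or URL

    def mode(self) -> str:
        # Supplying a start image switches the run to image-to-video.
        return "image-to-video" if self.start_image else "text-to-video"


def validate(settings: WanSettings) -> List[str]:
    """Return a list of problems; an empty list means the settings match the specs."""
    problems = []
    if not settings.prompt.strip():
        problems.append("a text prompt is always required")
    if settings.resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"unsupported resolution: {settings.resolution}")
    if settings.duration_s not in ALLOWED_DURATIONS_S:
        problems.append(f"unsupported duration: {settings.duration_s}s")
    if settings.aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        problems.append(f"unsupported aspect ratio: {settings.aspect_ratio}")
    return problems
```

For example, `validate(WanSettings(prompt="A lone lighthouse at dusk", resolution="1080p", duration_s=10))` returns an empty list, while a 4K or 15-second request would be flagged before any HGcoins are spent.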

About Wan 2.5

Wan 2.5 is Alibaba's cinematic AI video model, available on HexGen for both text-to-video and image-to-video generation. Write a prompt to create a clip from scratch, or supply an optional start image to animate it into motion. One pipeline covers both workflows, so you can move between ideas without switching tools.

Choose the look that fits your project. Wan 2.5 outputs at two resolution tiers, 720p and 1080p, in clip lengths of 5 or 10 seconds, across three aspect ratios: 16:9 for landscape, 9:16 for vertical, and 1:1 for square. That makes it a flexible fit for social posts, product teasers, and storyboards alike.

Pricing is a flat credit amount per video, tiered by resolution and duration, so you always know the cost before you run a generation. Pick your resolution and length, write your prompt, add a start image if you want, and generate.

Prompt ideas

Starting points

Copy, tweak, and run. Good prompts get you most of the way there.

A lone lighthouse on a rocky cliff at dusk, waves crashing below, slow cinematic push-in, golden hour light. 1080p, 10s, 16:9.

Close-up of coffee being poured into a glass cup, steam rising, warm cafe lighting, shallow depth of field. 720p, 5s, 1:1.

Animate this start image: city street at night, neon signs flickering, light rain, gentle camera drift forward. 1080p, 9:16.

Pricing
From 750 HGcoins / generation · ≈ $0.75

Pay only for what you render. 1 USD = 1,000 HGcoins. HGcoins never expire and failed runs refund automatically.
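
The conversion above is simple enough to sketch as a budgeting helper. The function names are hypothetical and chosen for illustration; the only figures taken from this page are the 1 USD = 1,000 HGcoins rate and the 750 HGcoins starting price. Prices for the higher resolution and duration tiers are not listed here, so they are not assumed.

```python
HGCOINS_PER_USD = 1000  # from this page: 1 USD = 1,000 HGcoins


def hgcoins_to_usd(hgcoins: int) -> float:
    """Convert an HGcoin amount to US dollars."""
    return hgcoins / HGCOINS_PER_USD


def generations_for_budget(budget_usd: float, cost_hgcoins: int) -> int:
    """How many whole generations a dollar budget covers at a given per-run cost."""
    budget_hgcoins = int(round(budget_usd * HGCOINS_PER_USD))
    return budget_hgcoins // cost_hgcoins


# The starting price listed on this page.
print(hgcoins_to_usd(750))               # 0.75
print(generations_for_budget(10.0, 750)) # 13
```

At the starting price, a $10 top-up covers 13 generations with 250 HGcoins left over; runs at higher tiers will cover fewer.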

Compare

Wan 2.5 vs other models

Wan 2.5 covers both text-to-video and image-to-video with two resolution tiers and two clip lengths. Here is how it stacks up against other video models in the catalog.

Model | Quality | Speed | Cost | Choose it when
Wan 2.5 (this model), Alibaba | Great | Fast | Mid cost | Pick Wan 2.5 when you want flexible cinematic clips from a prompt or a start image, with a clear flat price tiered by resolution and duration.
Kling 2.6, Kuaishou | Best | Fast | Higher cost | Consider Kling 2.6 from Kuaishou for an alternative cinematic video pipeline.
Seedance 2.0, ByteDance | Great | Fastest | Mid cost | Consider Seedance 2.0 from ByteDance as another text-to-video and image-to-video option.
Bottom line: pick Wan 2.5 when you want flexible cinematic clips from a prompt or a start image, with a clear flat price tiered by resolution and duration. Otherwise one of the models above will fit better.

Frequently asked questions

Wan 2.5 generates short video clips from a text prompt (text-to-video) or from an optional start image (image-to-video) at 720p or 1080p, in 5s or 10s lengths.