Adding video: how we picked the launch lineup

We shipped video this week. Eight models, one credits pool, no per-provider contracts to manage. Before we get into what’s in the lineup, a quick note on how we picked — because the “which video models are best” answer is much less obvious than the chat-side equivalent.

The four axes that matter

Chat models compete mostly on intelligence. Video models compete on four axes at once, and almost no model wins all four:

Visual quality — sharpness, motion coherence, how often a hand has five fingers.
Speed and cost — how long until you have something to look at, and what it costs to iterate.
Control — references, seeds, durations, camera moves, lip-sync.
Audio — whether the model emits sound that matches the visuals, or whether you have to bolt foley on after.

A 4K cinematic generator is wrong for an X reply. A fast social-tier model is wrong for a hero shot. So the lineup is plural by design — we want one right tool per job, not one model trying to be everything.

What’s in

HappyHorse 1.0 is the new top of the catalog as of yesterday’s external benchmarks, but the part we cared about isn’t the visual score — it’s that the audio is actually generated jointly with the video instead of being layered in after. Lip-sync works. Music swells in time with motion. We’ll write a dedicated post about it once we’ve used it on more real jobs.

Veo 3.1 is the cinematic option. It's 4K-native, the cleanest text-to-shot we’ve seen, and expensive at roughly 120 credits per 4K hero shot, so we route to it when the prompt looks like a hero shot (“wide-angle, dolly in, golden hour”) and not when it looks like a social clip.
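As a rough illustration of that routing idea, here is a minimal sketch. It assumes a naive keyword match; the cue list, the threshold, and the tier labels are hypothetical stand-ins, not how the production router actually decides.

```python
# Hypothetical cue list: the post only offers "wide-angle, dolly in,
# golden hour" as an example of a hero-shot prompt. The matching rule
# and the min_cues threshold are assumptions made for illustration.
HERO_CUES = ("wide-angle", "dolly in", "golden hour")


def looks_like_hero_shot(prompt: str, min_cues: int = 2) -> bool:
    """Return True when the prompt contains at least min_cues cinematic cues."""
    text = prompt.lower()
    return sum(cue in text for cue in HERO_CUES) >= min_cues


def pick_tier(prompt: str) -> str:
    """Send hero-shot prompts to the cinematic tier, everything else to social."""
    return "cinematic" if looks_like_hero_shot(prompt) else "social"


print(pick_tier("Wide-angle dolly in on a lighthouse at golden hour"))
print(pick_tier("cat falls off the couch, meme caption"))
```

A real router would want more than substring matching, but the shape is the same: classify the prompt first, then spend premium credits only when the prompt calls for them.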

Seedance 2.0 lands tomorrow with multi-input — up to nine reference images, three reference clips, three audio tracks. We pre-tested with ByteDance’s beta keys and it’s a different kind of tool: less for “type a prompt, get a video” and more for “give me a director’s brief.” We’ll cover it separately on May 3.

Kling 3.0 Omni is the motion-fluidity specialist. Best multi-shot continuity in our tests, especially when you need a character to walk through several beats without their face morphing between them.

Runway Gen-4.5 stays in the kit for the filmmakers who already have muscle memory for it. The camera-move controls and the keyframe interface are still the best in the category if you actually want to direct a shot.

Grok Imagine is the social-native option: fast, draft-quality, real-feeling clips. We added it last week and have a fuller write-up coming.

Wan 2.7 is the budget tier with LoRA support. Open-source roots, so if you’ve trained your own character/style LoRAs already, this is where they plug in.

Hailuo 2.3 is the cheapest reliable option. Nothing flashy. Useful when you’re iterating thumbnails or stress-testing a concept and don’t want to burn premium credits doing it.

What didn’t make it

Three serious models almost made the launch and didn’t, for different reasons.

The first hasn’t shipped a non-watermarked tier yet, and we won’t surface a model that brands your output. The second has great visuals but no API for seed control, which makes iteration painful in a multi-take workflow. The third is simply too expensive to pencil out: even for Max-tier users, the per-clip cost would push us into raising the cap.

We’ll revisit all three when their tiering or API support changes.

One pool, eight models

The reason any of this works is that you’re not buying eight separate subscriptions. You spend the same shared credits whichever model you pick, and you can see the cost per generation before you commit. Pro is 2,000 credits a month, Max is 10,000. A typical text-to-video clip from HappyHorse runs 40–60 credits; a Veo 4K hero shot is more like 120; a Hailuo iteration is 6.
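To make the credits math concrete, here is a back-of-envelope sketch using only the figures quoted above. The dictionary keys and the helper function are ours for illustration, not part of any API, and HappyHorse is pinned at 50 credits as an assumed midpoint of its 40–60 range.

```python
# Monthly allowances and per-clip costs as quoted in this post.
# HappyHorse is quoted as 40-60 credits; 50 is an assumed midpoint.
PLAN_CREDITS = {"pro": 2_000, "max": 10_000}
CLIP_COST = {"happyhorse": 50, "veo_4k": 120, "hailuo": 6}


def clips_per_month(plan: str, model: str) -> int:
    """Whole clips a plan's monthly credits cover if spent on one model."""
    return PLAN_CREDITS[plan] // CLIP_COST[model]


for plan in PLAN_CREDITS:
    for model in CLIP_COST:
        print(f"{plan:>3} / {model:<10} -> {clips_per_month(plan, model)} clips")
```

On Pro, that works out to about 16 Veo 4K hero shots or 333 Hailuo iterations, so one hero shot costs the same as 20 cheap drafts.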

That ratio is the part we’re proudest of. The point of the studio is that you don’t have to pre-commit to a provider before you know what your prompt needs — and video, more than any other modality, punishes that kind of pre-commitment.

Video is live for Pro and Max today. The free tier doesn’t include it yet; we want to see how the credits math plays out for a month before we open the gate wider.

Try the product behind the writing. studio.
