
Stability AI Launches Stable Audio 3.0 as an Open-Weight Model Family

Stability AI is shipping four Stable Audio 3.0 models trained on licensed data — three with open weights for on-device and mid-tier use, and a Large variant gated to the Stability API and enterprise self-hosting.

By the Promptwire desk

Builders, integrators, prompt engineers

Stable Audio 3.0 lands as a four-model family with a split distribution strategy. Three tiers (Small SFX, Small, and Medium) ship as open weights on Hugging Face. Large is available only through the Stability API or enterprise self-hosting.

The headline architectural change is what Stability calls a "novel semantic-acoustic autoencoder" supporting variable-length generation at per-second granularity. Per Stability's own numbers:

  • 3.0 Small: up to two minutes (vs. 11 seconds for Stable Audio Open Small and 47 seconds for Stable Audio Open).
  • 3.0 Medium: up to 6:20.
  • 3.0 Large: more than six minutes, positioned for "low-latency generation at high volume."
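To put those limits on one scale, the quoted durations can be converted to seconds and compared. This is a minimal sketch: the helper names `to_seconds` and `ratio` and the dictionary keys are illustrative, the figures are Stability's own as quoted above rather than measurements, and Large is left out because only a lower bound ("more than six minutes") is given.

```python
def to_seconds(mmss: str) -> int:
    """Convert an 'M:SS' string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)

# Maximum lengths as quoted from Stability (not independently verified).
max_length_s = {
    "Stable Audio Open Small": 11,
    "Stable Audio Open": 47,
    "3.0 Small": to_seconds("2:00"),
    "3.0 Medium": to_seconds("6:20"),
}

def ratio(new: str, old: str) -> float:
    """How many times longer `new` can run than `old`, to one decimal."""
    return round(max_length_s[new] / max_length_s[old], 1)

print(ratio("3.0 Small", "Stable Audio Open Small"))  # 10.9
print(ratio("3.0 Small", "Stable Audio Open"))        # 2.6
print(ratio("3.0 Medium", "3.0 Small"))               # 3.2
```

On these numbers, 3.0 Small runs roughly eleven times longer than Stable Audio Open Small, and 3.0 Medium roughly three times longer than 3.0 Small.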

For builders, three things matter beyond raw length.

First, on-device music composition. Stability claims 3.0 Small is the only model that can run full music composition, not just short samples, on phones and consumer laptops. That changes what you can ship without a GPU backend.

Second, LoRA fine-tuning is now documented. Weights for 3.0 Small and 3.0 Medium ship alongside LoRA training docs, so customizing on a proprietary catalog is a first-class path rather than a community hack. Enterprise customers can get "white-glove" fine-tuning support.

Third, editing primitives. The model supports single-segment editing, multi-segment editing, and causal continuation (extending audio past its original endpoint). That's the toolkit you'd want for a DAW plugin or an agentic music workflow, not just a one-shot generator.
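For a sense of what an integration might have to track around these primitives, here is a hypothetical planning sketch. It is not Stability's API: `Segment`, `validate_edit_plan`, and every parameter are invented for illustration. It assumes segments are expressed in whole seconds (in line with the per-second granularity Stability describes) and uses the two-minute figure quoted for 3.0 Small as a default cap.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass(frozen=True)
class Segment:
    start_s: int  # inclusive, whole seconds
    end_s: int    # exclusive, whole seconds

def validate_edit_plan(clip_s: int,
                       segments: Sequence[Segment],
                       extend_to_s: Optional[int] = None,
                       max_s: int = 120) -> int:
    """Check a hypothetical edit plan; return the final clip length in seconds."""
    ordered = sorted(segments, key=lambda s: s.start_s)
    for seg in ordered:
        # Each edited segment must lie inside the existing clip.
        if not (0 <= seg.start_s < seg.end_s <= clip_s):
            raise ValueError(f"{seg} falls outside the {clip_s}s clip")
    for prev, nxt in zip(ordered, ordered[1:]):
        # Multi-segment edits must not overlap one another.
        if nxt.start_s < prev.end_s:
            raise ValueError(f"{prev} and {nxt} overlap")
    final_s = clip_s
    if extend_to_s is not None:
        # Continuation only makes sense past the original endpoint.
        if extend_to_s <= clip_s:
            raise ValueError("continuation must extend past the original endpoint")
        final_s = extend_to_s
    if final_s > max_s:
        raise ValueError(f"{final_s}s exceeds the assumed {max_s}s model limit")
    return final_s

# Two edited segments plus a continuation from 60s out to 90s.
print(validate_edit_plan(60, [Segment(10, 20), Segment(30, 40)], extend_to_s=90))
```

A single-segment edit is just the one-element case of the same plan, which is why the sketch treats all three primitives with one validator.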

On licensing: training data is described as "fully licensed." Outputs are owned by the user under the Stability AI Community License. Organizations above $1M in annual revenue need the Enterprise License, which also carries legal indemnification. No prices are disclosed in the post.
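The revenue rule reduces to a single threshold check. The sketch below is illustrative only: the function and constant names are made up, the treatment of exactly $1M as still under the Community License is an assumption based on the word "above", and the license text itself is the authority.

```python
REVENUE_THRESHOLD_USD = 1_000_000  # "$1M in annual revenue", as reported

def required_license(annual_revenue_usd: float) -> str:
    """Return the license tier the announcement implies for a given revenue."""
    # Assumption: "above $1M" is a strict comparison.
    if annual_revenue_usd > REVENUE_THRESHOLD_USD:
        return "Enterprise License"
    return "Community License"

print(required_license(250_000))    # Community License
print(required_license(5_000_000))  # Enterprise License
```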

Distribution will extend to ComfyUI and other partner platforms. The post includes no benchmarks, no API pricing, and no latency figures, so the "low-latency" claim on Large cannot yet be verified. It is worth watching once independent tests land.