Introducing the Uni-1.1 API: Intelligence You Can Direct. Aesthetic You Can Ship.

May 5, 2026

Today we're opening access to the Uni-1.1 API: a REST interface to our Unified Intelligence model for image generation and natural-language editing. The power of Uni-1 is now in the hands of developers, not only to transform creative workflows but to amplify creativity and expand production at a scale that wasn't possible before.

Built for builders shipping in production, from a lab ranked top 3 in the Image Arena across both Text-to-Image and Image Edit.

Launched last month, Uni-1 takes a different approach to generative AI: reasoning and image generation run in the same model rather than in separate systems stitched together at inference time. The result is tighter adherence to multi-constraint briefs, cleaner reference grounding, and editing that responds to intent rather than to prompt syntax. It currently ranks #1 on Human Preference Elo across overall, style & editing, and reference-based generation, and leads RISEBench on both overall reasoning and spatial logic.

The Uni-1.1 API is already in production or committed across Envato, Comfy, Runware, Flora, Krea, Magnific (part of Freepik), Fal, and LovArt.

This post covers what Unified Intelligence is, what it means under the hood, and what you can build with it.

What Unified Intelligence is

Luma’s Unified Intelligence model, Uni-1, was built on the idea that the accuracy of an AI-generated output isn't enough. Great creative work requires knowing not just what's correct, but what's beautiful.

The standard approach in generative AI is to train separate models for separate modalities: a language model here, an image model there, stitched together at inference time. The problem isn't that these models are individually weak; it's that stitching is not the same thing as integrating. You get plumbing, not reasoning.

Uni-1 takes a different approach. It's a decoder-only autoregressive transformer in which text and image tokens share a single sequence, as both input and output, treated equally in the same forward pass. The model doesn't translate between modalities; it reasons across them simultaneously. When you pass a brief, it resolves the structural intent before generating a single pixel. Visual thinking and language reasoning happen together, not in sequence.
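
A loose way to picture the idea, as pseudocode rather than anything from Luma's actual implementation (the tokenizers and model object below are hypothetical stand-ins):

```python
# Conceptual sketch only. The tokenizers and model are placeholders used to
# show the shape of one shared sequence, not real Uni-1 internals.

def unified_generate(brief, reference_images, text_tok, image_tok, model):
    # Text and images are encoded into tokens that live in one shared
    # vocabulary and are concatenated into a single sequence.
    sequence = text_tok.encode(brief)
    for ref in reference_images:
        sequence += image_tok.encode(ref)

    # A single decoder-only autoregressive pass over the combined sequence:
    # the model can emit text tokens (working through the brief) and image
    # tokens in the same stream, each conditioned on everything before it.
    output_tokens = model.generate(sequence)

    # The image portion of the output is decoded back to pixels at the end.
    return image_tok.decode(output_tokens)
```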

That's what lets Uni-1 hold multi-constraint instructions coherently and produce professional creative work that is not just accurate but resonant and authored. Uni-1 was trained in collaboration with Hollywood cinematographers, VFX artists, and world-class artists across cultural forms. The result is cinematic lighting, material accuracy, and cultural visual literacy that's production-grade by default.

API capabilities

The Uni-1.1 API is a REST interface with two primary endpoints that work from direction and creative intent, not just prompts; minimal request sketches for both follow the list below.

  • Generate Image handles text-to-image and reference-guided generation. Pass up to nine reference inputs per request to preserve identity, composition, style, or any combination of them. The model holds visual continuity across multi-reference inputs that tend to drift in pipelines built on stitched models.
  • Modify Image exposes natural-language editing — swap backgrounds, shift lighting, apply a reference aesthetic, make localized changes — described in plain language and executed without prompt scaffolding.

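As a rough illustration of the first call, here's a minimal sketch in Python using requests. The endpoint path, field names, and auth header are assumptions for illustration only, not the documented schema; check the API reference for the exact request shape.

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder

# Hypothetical endpoint path and request fields, shown for shape only.
resp = requests.post(
    "https://api.example.com/uni-1.1/generate-image",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "prompt": (
            "Product hero shot of a ceramic mug on a marble counter, "
            "soft morning window light, shallow depth of field"
        ),
        # Up to nine reference inputs per request: identity, composition,
        # style, or any combination of them.
        "reference_images": [
            "https://example.com/brand-style-ref.png",
            "https://example.com/product-ref.png",
        ],
        "aspect_ratio": "16:9",
        "format": "png",
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json())
```
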
Both endpoints are available through Python and JavaScript/TypeScript SDKs. Output controls include standard aspect ratios (1:1, 9:16, 16:9, and more) and PNG or JPEG format. Generation time is approximately 31 seconds per image. Both Uni-1.1 and Uni-1.1 Max are available as of today.
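
And a matching sketch for a natural-language edit; again, the path and field names are placeholders rather than the documented schema.

```python
import requests

API_KEY = "YOUR_API_KEY"  # placeholder

# Hypothetical endpoint path and request fields, shown for shape only.
resp = requests.post(
    "https://api.example.com/uni-1.1/modify-image",
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        # The image to edit, passed by URL in this sketch.
        "image": "https://example.com/first-pass.png",
        # The edit is plain language, not prompt scaffolding.
        "instruction": (
            "Swap the background for a rainy city street at dusk and "
            "warm up the key light slightly"
        ),
        "format": "jpeg",
    },
    timeout=120,
)
resp.raise_for_status()
print(resp.json())
```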

Use cases

Because Uni-1 reasons before it renders, the Uni-1.1 API unlocks a new class of creative workflows where visual intent, brand coherence, and aesthetic judgment are held at the model level, not engineered around it.

  • Agent-native workflows. Built-in prompt enhancement, research, and reference gathering at the API level. No middleware to build or maintain, and your end users don't need to be prompt engineers.
  • Brand workflows at scale. Reference images act as constraints at the model level, holding visual identity across channels and markets.
  • Reference-grounded generation. Generate consistent characters across scenes or transfer a client's aesthetic to a new subject by passing up to nine reference images per request.
  • Natural-language editing. Describe localized changes — backgrounds, lighting, color, composition — in plain language, without prompt scaffolding.
  • Iterative creative pipelines. Generate a first pass, then refine with follow-up instructions; the model holds visual continuity across turns (see the sketch after this list).
  • Global content at scale. Native multilingual rendering across non-Latin scripts including Chinese, Japanese, and Arabic, with regional aesthetic awareness.
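
To make the iterative-pipeline item concrete, here's a hedged sketch of a generate-then-refine loop built on the same hypothetical endpoints as above; the base URL, request fields, and the image_url response field are assumptions, not the documented schema.

```python
import requests

API_KEY = "YOUR_API_KEY"                   # placeholder
BASE = "https://api.example.com/uni-1.1"   # hypothetical base URL

def call(path, payload):
    # POST one request and return the resulting image URL.
    # "image_url" is an assumed response field, for illustration only.
    resp = requests.post(
        f"{BASE}/{path}",
        headers={"Authorization": f"Bearer {API_KEY}"},
        json=payload,
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["image_url"]

# First pass from the brief, grounded in a brand reference.
current = call("generate-image", {
    "prompt": "Launch key art for a travel app, alpine lake at golden hour",
    "reference_images": ["https://example.com/brand-style-ref.png"],
    "aspect_ratio": "16:9",
})

# Follow-up direction in plain language; each edit builds on the last
# result, and the model holds visual continuity across turns.
for note in [
    "Push the horizon lower and add a hiker in the foreground",
    "Cool the color grade slightly and add light mist over the water",
]:
    current = call("modify-image", {"image": current, "instruction": note})
```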

Availability

At less than half the price and latency of comparable models, the Uni-1.1 API is available today, for both Uni-1.1 and Uni-1.1 Max, in two tiers:

  • Build — full API access with usage-based billing. The starting point for integration and experimentation.
  • Scale — higher rate limits and dedicated support for production workloads.

Uni-1.1 reasons through creative intent before it generates a pixel. For the teams building professional creative work — and the enterprises that depend on them — that's the meaningful shift. The Uni-1.1 API is where it starts.