Can Ceylan · Vienna-based, globally curious.

Default-first fallback orchestration for AI generation pipelines

AI generation routes that call a single provider are brittle. Default-first fallback orchestration makes them resilient: try the configured primary, fall back automatically on failure, record what actually ran, and let users override for one run without changing the default.

2026-04-25·3 min read·intermediate

The single-provider problem

The simplest AI generation route picks one model and calls it. If the model is unavailable, returns an error, or produces unusable output, the route fails. The user sees an error, retries manually, or gives up.

This is fine for prototypes. In production workflows that run on a schedule or on user demand, single-provider routes become an operational risk. Any provider outage, quota exhaustion, or API change breaks every generation that depends on that provider.

The fallback chain pattern

Instead of one provider, define an ordered list. Try the primary. If it fails, try the first fallback. If that fails, try the next. Record which provider actually delivered.

interface ProviderStrategy {
  primary: string;
  fallbacks: string[];
}

async function generateWithFallback(
  strategy: ProviderStrategy,
  generate: (provider: string) => Promise<string>
): Promise<{ result: string; provider: string; fallbackUsed: boolean }> {
  const chain = [strategy.primary, ...strategy.fallbacks];

  for (let i = 0; i < chain.length; i++) {
    const provider = chain[i];
    try {
      const result = await generate(provider);
      return { result, provider, fallbackUsed: i > 0 };
    } catch (err) {
      if (i === chain.length - 1) throw err; // last in chain, re-raise
      console.warn(`Provider ${provider} failed, trying ${chain[i + 1]}`);
    }
  }

  // Not reached in practice (the chain always contains the primary); satisfies the return type.
  throw new Error("All providers failed");
}

The caller gets a result as long as one provider in the chain succeeds, and knows which provider delivered it. If every provider fails, the last error is re-raised. Fallback is invisible to the user unless they look at the provider label.

Store fallback health

Recording what happened makes the system observable and debuggable:

interface ProviderHealth {
  lastSuccessAt?: string;
  lastFailureAt?: string;
  lastError?: string;
  lastResolvedSource?: string; // which provider actually ran
}

After each generation run, write the health record. This lets the admin surface show: "Primary provider last failed 3 days ago. Last run used fallback." Without these records, every failure looks like the first.
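
Writing the record can be sketched as a small pure function. This is a minimal sketch, assuming one health record per generation route that is updated once per provider attempt; the `Attempt` type and the `recordAttempt` name are hypothetical, and where the record is persisted is left to the caller.

```typescript
// Shape copied from the ProviderHealth interface above.
interface ProviderHealth {
  lastSuccessAt?: string;
  lastFailureAt?: string;
  lastError?: string;
  lastResolvedSource?: string; // which provider actually ran
}

// Hypothetical: the outcome of one attempt against one provider.
type Attempt =
  | { provider: string; ok: true }
  | { provider: string; ok: false; error: string };

// Returns an updated copy of the record; the input is not mutated.
function recordAttempt(
  previous: ProviderHealth,
  attempt: Attempt,
  now: Date = new Date()
): ProviderHealth {
  const at = now.toISOString();
  if (attempt.ok) {
    // A success also records which provider delivered the result.
    return { ...previous, lastSuccessAt: at, lastResolvedSource: attempt.provider };
  }
  // A failure keeps the previous success fields so the history stays readable.
  return { ...previous, lastFailureAt: at, lastError: attempt.error };
}
```

Because a failure never erases the last success, a run where the primary failed and a fallback delivered leaves both timestamps in place, which is what the admin message needs.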

The one-run override

The default strategy should be automatic and require no user input. But sometimes you know the primary is going to fail (planned maintenance, quota exhaustion) and you want to skip straight to a specific provider for one run.

The override is a request-time hint, not a settings change:

// Default: use whatever the configured strategy says
POST /api/generate/hero-image
{ "slug": "my-article" }

// Override: use this provider for this run only
POST /api/generate/hero-image
{ "slug": "my-article", "providerOverride": "lummi" }

The override does not change the stored strategy. The next run goes back to the default. This is the distinction between an escape hatch and a settings change: the escape hatch is temporary by design.
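
On the server, the override can be applied when the chain for a single run is built. This is a minimal sketch, not the only reasonable behavior: `resolveChain` is a hypothetical helper, and it assumes the overridden provider goes first while the rest of the default chain stays available as fallbacks.

```typescript
// Shape copied from the ProviderStrategy interface above.
interface ProviderStrategy {
  primary: string;
  fallbacks: string[];
}

// Builds the provider chain for one run. The stored strategy is only read,
// never modified, so the next run without an override uses the defaults again.
function resolveChain(
  strategy: ProviderStrategy,
  providerOverride?: string
): string[] {
  const defaults = [strategy.primary, ...strategy.fallbacks];
  if (!providerOverride) return defaults;
  // Assumption: the override is tried first, and the remaining defaults
  // follow in their original order with the duplicate removed.
  return [providerOverride, ...defaults.filter((p) => p !== providerOverride)];
}
```

A stricter variant would return only `[providerOverride]`, so a failed override surfaces as an error instead of quietly falling back to a provider the user wanted to skip.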

Surfacing fallback to the user

When a fallback was used, tell the user, but briefly. They do not need a detailed failure report for a generation that succeeded.

// In the API response
{
  "url": "https://...",
  "provider": "lummi",
  "fallbackUsed": true
}

// In the UI
fallbackUsed
  ? "Hero image ready (via fallback: Lummi)"
  : "Hero image ready"

This is enough to explain why the image looks slightly different from usual without alarming anyone.
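
The label can be derived from the response rather than hard-coded per provider. This is a small sketch: `GenerationResponse` mirrors the response fields shown above, while `readyLabel` and its `displayName` lookup are hypothetical names.

```typescript
// Mirrors the API response fields shown above.
interface GenerationResponse {
  url: string;
  provider: string;
  fallbackUsed: boolean;
}

// Builds the short status line. displayName maps a provider id to a
// human-readable name and defaults to returning the id unchanged.
function readyLabel(
  response: GenerationResponse,
  displayName: (id: string) => string = (id) => id
): string {
  return response.fallbackUsed
    ? `Hero image ready (via fallback: ${displayName(response.provider)})`
    : "Hero image ready";
}
```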

What goes in the fallback chain

Good fallback targets:

  • A slower but more reliable version of the same provider
  • A different provider that produces compatible output
  • A manual sentinel that marks the asset as needing human input, rather than failing silently

// Example chains
const chains = {
  heroImage: ["gemini-imagen-3", "lummi", "manual"],
  socialText: ["claude-sonnet-4-6", "claude-haiku-4-5", "manual"],
  videoClip: ["veo-2", "manual"],
};

The manual sentinel is important: it means "generation failed, but the workflow continues and a human needs to provide this asset." This is better than an error that halts everything.
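
One way to handle the sentinel is to treat it as a marker in the chain rather than a provider to call. The sketch below assumes the sentinel is the literal string "manual", as in the example chains; `Outcome` and `generateOrDefer` are hypothetical names, and the `needs-human` status is an illustrative choice.

```typescript
const MANUAL = "manual"; // sentinel id taken from the example chains

// Hypothetical result type: either a generated asset or a request for human input.
type Outcome =
  | { status: "generated"; result: string; provider: string; fallbackUsed: boolean }
  | { status: "needs-human"; reason: string };

async function generateOrDefer(
  chain: string[],
  generate: (provider: string) => Promise<string>
): Promise<Outcome> {
  let lastError = "no provider attempted";
  let failures = 0;

  for (const provider of chain) {
    if (provider === MANUAL) {
      // Do not call a provider: hand the asset to a human and keep the workflow going.
      return { status: "needs-human", reason: lastError };
    }
    try {
      const result = await generate(provider);
      return { status: "generated", result, provider, fallbackUsed: failures > 0 };
    } catch (err) {
      lastError = err instanceof Error ? err.message : String(err);
      failures++;
    }
  }

  // Only reached when the chain has no manual sentinel and every provider failed.
  throw new Error(`All providers failed: ${lastError}`);
}
```

With this shape, a chain that ends in the sentinel never throws on provider failure; the caller branches on `status` and queues the asset for a person instead.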

The product rule: defaults must be one-click

The fallback chain is infrastructure. The user should never have to configure it for a normal run. The only user interaction is the optional one-run override when they have a specific reason to deviate.

If your fallback system requires the user to select a provider before every generation, it has drifted from infrastructure into ceremony. Keep defaults automatic. Keep overrides optional and temporary.
