Routing

Routing is the decision layer that maps each incoming request to an eligible provider execution path. In Sentinel, it determines which provider and model actually handle the request.

What routing decides

For each request, Sentinel resolves:

  • the lane being used
  • the requested model or provider target
  • which candidates are eligible
  • whether the selected provider supports the endpoint
  • whether retries or fallback are safe for that route shape

Operators define the intended routing behavior in the Sentinel console. Sentinel then enforces those decisions consistently at request time.
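
As a rough illustration, the outcome of that resolution can be pictured as a single record. This is a minimal sketch, not Sentinel's actual schema: the class name `RoutingDecision` and all of its fields are hypothetical and only mirror the items listed above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RoutingDecision:
    """Hypothetical record of what routing resolved for one request."""
    lane: str                    # the lane being used
    target: str                  # requested model or provider target
    eligible: Tuple[str, ...]    # candidates that survived filtering
    endpoint_supported: bool     # selected provider supports the endpoint
    retry_safe: bool             # retries are safe for this route shape
    fallback_safe: bool          # fallback is safe for this route shape

    @property
    def executable(self) -> bool:
        # A request can run only if the endpoint is supported and
        # at least one candidate is eligible.
        return self.endpoint_supported and len(self.eligible) > 0


decision = RoutingDecision(
    lane="default",              # placeholder values for illustration
    target="model-x",
    eligible=("provider-a",),
    endpoint_supported=True,
    retry_safe=False,
    fallback_safe=False,
)
```

Keeping the record immutable reflects the idea that the decision is made once, before execution, and then enforced rather than revised mid-request.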

Capability-aware routing

Sentinel does not discover provider support by trial and error. Routing is capability-aware, which means Sentinel filters out invalid or unsafe paths before execution.

That includes:

  • removing unsupported provider candidates before execution
  • failing unsupported lane paths early
  • using route safety to decide whether retry or fallback is appropriate

The goal is not just to find any provider path, but to choose an eligible one that is safe for that request shape.
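
The filtering step can be sketched as follows. This assumes each candidate declares the endpoints it supports. The names `Candidate`, `eligible_candidates`, and `NoEligibleCandidate`, along with the provider and endpoint labels, are hypothetical and not part of Sentinel's API.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Candidate:
    """Hypothetical provider candidate with its declared capabilities."""
    provider: str
    model: str
    endpoints: frozenset = field(default_factory=frozenset)


class NoEligibleCandidate(Exception):
    """Raised before execution when no candidate supports the endpoint."""


def eligible_candidates(candidates, endpoint):
    """Return only the candidates that declare support for the endpoint."""
    eligible = [c for c in candidates if endpoint in c.endpoints]
    if not eligible:
        # Fail early instead of learning about the gap from a provider error.
        raise NoEligibleCandidate(f"no candidate supports {endpoint!r}")
    return eligible


candidates = [
    Candidate("provider-a", "model-x", frozenset({"chat", "embeddings"})),
    Candidate("provider-b", "model-y", frozenset({"chat"})),
]
```

With these placeholder candidates, a request for `"embeddings"` keeps only `provider-a`, while a request for an endpoint nobody declares raises before any provider is called.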

Retry and fallback model

Not every request should retry.

Some request types are safe to retry. Others can create duplicate side effects, partial execution, or inconsistent provider state.

In general:

  • streaming routes should not retry mid-stream
  • uploads, deletes, create/generation calls, and batch-result retrieval should remain single-attempt
  • safe read-only metadata calls can be retried

This distinction matters because a duplicate generation, upload, or provider transition is often worse than a fast failure.
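
One way to express these rules is an allowlist of retry-safe route shapes, with everything else limited to a single attempt. The sketch below is an assumption-laden illustration: the `RouteShape` labels and the `max_attempts` helper are hypothetical, and streaming is treated conservatively as single-attempt rather than modelling the narrower "not mid-stream" rule.

```python
from enum import Enum


class RouteShape(Enum):
    """Hypothetical route-shape labels, not Sentinel identifiers."""
    STREAMING = "streaming"
    UPLOAD = "upload"
    DELETE = "delete"
    GENERATION = "generation"
    BATCH_RESULT = "batch_result"
    METADATA_READ = "metadata_read"


# Only shapes listed here may retry; anything else stays single-attempt.
RETRY_SAFE = frozenset({RouteShape.METADATA_READ})


def max_attempts(shape, configured_retries=2):
    """Return the total number of attempts allowed for a route shape."""
    if shape in RETRY_SAFE:
        return 1 + configured_retries
    # Side-effecting or stateful shapes fail fast instead of retrying.
    return 1
```

Using an allowlist rather than a denylist means a newly added route shape defaults to single-attempt until someone explicitly marks it safe.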

Caching

Caching must be route-aware as well.

In the current Sentinel model:

  • streaming routes are excluded
  • native passthrough lanes are excluded
  • file, audio, and generation workflows are excluded
  • only explicitly safe, configured routes should be cacheable

The principle is simple: cache only where request semantics are stable and the route is safe for reuse.
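
The exclusions above amount to a short predicate. The following is a minimal sketch under assumed names: the `Route` fields, the workflow labels, and `is_cacheable` are hypothetical and do not describe Sentinel's configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical description of a route for caching purposes."""
    workflow: str = "metadata"       # placeholder workflow label
    streaming: bool = False
    native_passthrough: bool = False
    cache_enabled: bool = False      # explicit operator opt-in


# Workflows that are never cached, regardless of configuration.
EXCLUDED_WORKFLOWS = frozenset({"file", "audio", "generation"})


def is_cacheable(route):
    """Cache only routes that pass every exclusion and are opted in."""
    if route.streaming or route.native_passthrough:
        return False
    if route.workflow in EXCLUDED_WORKFLOWS:
        return False
    # Caching is off by default; a route must be explicitly configured.
    return route.cache_enabled
```

Note that the exclusions are checked before the opt-in flag, so enabling caching on a streaming or passthrough route has no effect in this sketch.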

Operator guidance

Start simple.

Begin with:

  • one provider
  • one clear route plan
  • one known-good model path

Then add complexity deliberately:

  • fallback candidates
  • per-provider limits
  • policy and budget guardrails
  • route-specific safety tuning

The safest routing model is usually the simplest one that meets the workload's needs.