Routing

Routing is the decision layer that maps an incoming request to an eligible provider execution path. In Sentinel, it determines which provider and model path actually handles the request.

What routing decides

For each request, Sentinel resolves:

the lane being used
the requested model or provider target
which candidates are eligible
whether the selected provider supports the endpoint
whether retries or fallback are safe for that route shape

Operators define the intended routing behavior in the Sentinel console. Sentinel then enforces those decisions consistently at request time.
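As a rough illustration, the items resolved for each request can be pictured as one record. This is a minimal sketch, not Sentinel's actual data model: the class name `RoutingDecision` and all of its fields are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RoutingDecision:
    """Hypothetical record of what routing resolves for one request."""

    lane: str                    # the lane being used
    target: str                  # the requested model or provider target
    eligible: Tuple[str, ...]    # candidates that survived eligibility checks
    endpoint_supported: bool     # does the selected provider support the endpoint?
    retry_safe: bool             # are retries or fallback safe for this route shape?

    def executable(self) -> bool:
        # A request can proceed only if at least one candidate is eligible
        # and the selected provider supports the endpoint.
        return bool(self.eligible) and self.endpoint_supported
```

Keeping the decision in one immutable record makes it easy to log or audit why a request was routed, or rejected, the way it was.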

Capability-aware routing

Sentinel does not discover provider support by trial and error. Routing is capability-aware: Sentinel filters out invalid or unsafe paths before execution.

That includes:

removing unsupported provider candidates before execution
failing unsupported lane paths early
using route safety to decide whether retry or fallback is appropriate

The goal is not just to find any provider path, but to choose an eligible one that is safe for that request shape.
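The filtering step described above might look like the following sketch. It assumes each candidate advertises the endpoints it supports; `Candidate`, `NoEligibleRoute`, and `eligible_candidates` are hypothetical names for illustration, not part of Sentinel's API.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Candidate:
    provider: str
    model: str
    endpoints: FrozenSet[str]  # endpoints this provider path is known to support


class NoEligibleRoute(Exception):
    """Raised before any provider call when no candidate can serve the request."""


def eligible_candidates(
    candidates: List[Candidate], endpoint: str, model: str
) -> List[Candidate]:
    # Drop unsupported candidates up front instead of calling them and
    # waiting for a provider error.
    kept = [c for c in candidates if c.model == model and endpoint in c.endpoints]
    if not kept:
        # Fail early: the request never reaches a provider.
        raise NoEligibleRoute(f"no candidate supports {endpoint!r} for {model!r}")
    return kept
```

The point of the sketch is the ordering: eligibility is decided from declared capabilities, and an empty result is an immediate failure rather than a trigger for probing providers.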

Retry and fallback model

Not every request should retry.

Some request types are safe to retry. Others can create duplicate side effects, partial execution, or inconsistent provider state.

In general:

streaming routes should not retry mid-stream
uploads, deletes, create/generation calls, and batch-result retrieval should remain single-attempt
safe read-only metadata calls can be retried

This distinction matters because a duplicate generation, upload, or provider transition is often worse than a fast failure.
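One way to express this distinction is a small attempt policy keyed on route kind. The sketch below is an assumption-laden illustration: the `RouteKind` values, the `RETRY_SAFE` set, and the default of two retries are all hypothetical. It also simplifies streaming to a single attempt overall, which is stricter than "no retry mid-stream".

```python
from enum import Enum
from typing import Callable, TypeVar

T = TypeVar("T")


class RouteKind(Enum):
    METADATA_READ = "metadata_read"
    STREAMING = "streaming"
    UPLOAD = "upload"
    DELETE = "delete"
    GENERATION = "generation"
    BATCH_RESULT = "batch_result"


# Only read-only metadata calls are treated as retry-safe in this sketch.
RETRY_SAFE = frozenset({RouteKind.METADATA_READ})


def max_attempts(kind: RouteKind, configured_retries: int = 2) -> int:
    # Everything outside RETRY_SAFE stays single-attempt.
    return 1 + configured_retries if kind in RETRY_SAFE else 1


def call_with_policy(
    kind: RouteKind, send: Callable[[], T], configured_retries: int = 2
) -> T:
    last_error = None
    for _ in range(max_attempts(kind, configured_retries)):
        try:
            return send()
        except ConnectionError as exc:  # stand-in for a transient provider error
            last_error = exc
    raise last_error
```

With this shape, a generation call that hits a transient error fails fast after one attempt, while a metadata read is retried up to the configured limit.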

Caching

Caching must be route-aware as well.

In the current Sentinel model:

streaming routes are excluded
native passthrough lanes are excluded
file, audio, and generation workflows are excluded
only explicitly safe, configured routes should be cacheable

The principle is simple: cache only where request semantics are stable and the route is safe for reuse.
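The exclusion rules above can be read as a default-deny check. The following sketch assumes a hypothetical `Route` description with illustrative field names and workflow labels; it is not Sentinel's configuration schema.

```python
from dataclasses import dataclass

# Hypothetical workflow labels that are never cached in this sketch.
EXCLUDED_WORKFLOWS = frozenset({"file", "audio", "generation"})


@dataclass(frozen=True)
class Route:
    name: str
    workflow: str = "metadata"        # hypothetical label for the route's workload
    streaming: bool = False
    native_passthrough: bool = False
    cache_enabled: bool = False       # must be switched on explicitly


def is_cacheable(route: Route) -> bool:
    # Exclusions win over configuration: an excluded route is never cached,
    # even if caching was enabled for it by mistake.
    if route.streaming or route.native_passthrough:
        return False
    if route.workflow in EXCLUDED_WORKFLOWS:
        return False
    # Otherwise cache only when the route was explicitly configured for it.
    return route.cache_enabled
```

Because `cache_enabled` defaults to `False`, a new route is uncached until someone opts it in, which matches the principle of caching only explicitly safe routes.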

Operator guidance

Start simple.

Begin with:

one provider
one clear route plan
one known-good model path

Then add complexity deliberately:

fallback candidates
per-provider limits
policy and budget guardrails
route-specific safety tuning

The safest routing model is usually the simplest one that meets the workload's needs.
