Architecture

Sentinel is a governed access layer for model traffic, not just a pass-through proxy. Its architecture is easiest to understand as two cooperating parts: the hosted gateway that handles runtime requests, and the hosted control surface that defines how those requests should behave.

System shape

At a high level, applications and SDKs send requests to the Sentinel gateway. The gateway applies configuration, policy, limits, and routing decisions before forwarding traffic to the selected provider lane.

The control surface manages the configuration, credentials, and governance state that shape those request-time decisions.

Data plane

The gateway is the data plane. It handles request-time behavior, including:

  • key authentication
  • endpoint restriction checks
  • policy evaluation
  • rate limits and budget gates
  • routing and provider capability selection
  • request execution, telemetry, and request correlation

In practice, the data plane is where Sentinel decides whether a request can proceed, where it should go, and what should be recorded about it.

Control surface

The hosted Sentinel console is the control surface for the configuration that drives runtime behavior.

It manages:

  • tenants, projects, environments, and keys
  • provider configs and provider secrets
  • route plans and policy definitions
  • model sync and operational metadata
  • request review, blocks, and audit visibility

In practice, the control surface is where operators define the rules and operating context that the gateway enforces.

What the control surface manages

The control surface is the source of truth for the objects that shape request execution.

That includes:

  • tenants and projects
  • environments
  • keys and endpoint restrictions
  • provider configs and provider secrets
  • routing definitions
  • policy definitions
  • budgets and limits

These objects do not live in application clients. They stay in Sentinel so governance and routing can be managed centrally.
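
One way to picture these objects is as a small, centrally held hierarchy that the gateway consults at request time. The sketch below is illustrative only: the class names, field names, budget and limit units, and the `resolve_key` helper are assumptions made for this example and are not Sentinel's actual schema or API.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class ApiKey:
    # Hypothetical shape: a key belongs to one environment and may be
    # restricted to a set of endpoint classes.
    key_id: str
    environment_id: str
    allowed_endpoints: frozenset[str]


@dataclass(frozen=True)
class Environment:
    environment_id: str
    project_id: str
    monthly_budget_usd: float  # assumed budget unit
    requests_per_minute: int   # assumed limit unit


@dataclass
class ControlState:
    """Centrally held config; application clients never store this."""
    environments: dict[str, Environment] = field(default_factory=dict)
    keys: dict[str, ApiKey] = field(default_factory=dict)

    def resolve_key(self, key_id: str) -> Optional[tuple[ApiKey, Environment]]:
        """Return the key and its environment, or None if the key is unknown."""
        key = self.keys.get(key_id)
        if key is None:
            return None
        return key, self.environments[key.environment_id]


# Example data (all identifiers are made up).
state = ControlState()
state.environments["env-prod"] = Environment("env-prod", "proj-1", 500.0, 60)
state.keys["key-abc"] = ApiKey("key-abc", "env-prod", frozenset({"chat"}))
```

Because the client holds only a key identifier, changing a budget or an endpoint restriction in this model means editing the central state, not redeploying the application.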

Trust boundaries

Sentinel creates a clean separation between applications, provider credentials, and governance controls.

With Sentinel:

  • clients authenticate to Sentinel, not to providers
  • provider credentials stay in Sentinel-controlled secret storage
  • policy and budget decisions happen before provider execution
  • audit and telemetry become platform-level records, not app-local logs

This separation helps platform teams centralize model access without pushing control logic into every application.
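
The first two points can be illustrated with a credential swap at the gateway. This is a minimal sketch under stated assumptions: the `PROVIDER_SECRETS` store, the `build_upstream_headers` helper, the provider name, and the use of Bearer tokens in an `Authorization` header are all hypothetical and chosen only to show the boundary.

```python
# Hypothetical in-memory stand-in for Sentinel-controlled secret storage.
PROVIDER_SECRETS = {"provider-a": "provider-secret-value"}


def build_upstream_headers(client_headers: dict[str, str], provider: str) -> dict[str, str]:
    """Drop the client's Sentinel credential and attach the provider's."""
    upstream = {
        name: value
        for name, value in client_headers.items()
        if name.lower() != "authorization"  # the client credential never leaves the gateway
    }
    # Assumed auth scheme; real providers may expect a different header.
    upstream["Authorization"] = f"Bearer {PROVIDER_SECRETS[provider]}"
    return upstream


# The client only ever knows its Sentinel key.
client_headers = {
    "Authorization": "Bearer sentinel-client-key",
    "Content-Type": "application/json",
}
upstream_headers = build_upstream_headers(client_headers, "provider-a")
```

In this sketch the provider secret exists only on the gateway side, so rotating it requires no change in any client.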

Request lifecycle

A Sentinel request follows a predictable lifecycle:

  1. the client sends a request to the Sentinel gateway
  2. Sentinel loads the relevant config and runtime context
  3. authentication, endpoint restrictions, policy checks, and limits are evaluated
  4. Sentinel resolves the eligible provider target and execution path
  5. the request is forwarded to the provider
  6. the response is returned with Sentinel headers and request correlation
  7. telemetry and audit signals are recorded for operator visibility

This lifecycle is where Sentinel's value becomes concrete: governance happens before execution, and visibility remains attached to the request path.
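
The seven steps can be sketched as a single handler. This is a simplified illustration, not Sentinel's implementation: the `Context` fields, the `handle` function, the HTTP-style status codes, the `x-request-id` header name, and the in-memory `AUDIT_LOG` are assumptions, and policy evaluation is omitted for brevity.

```python
import uuid
from dataclasses import dataclass
from typing import Callable


@dataclass
class Context:
    # Hypothetical runtime context loaded in step 2.
    allowed_endpoints: set[str]
    remaining_requests: int
    provider: str


AUDIT_LOG: list[dict] = []  # stand-in for telemetry and audit storage


def handle(key: str, endpoint: str, body: str,
           configs: dict[str, Context],
           providers: dict[str, Callable[[str], str]]) -> dict:
    request_id = str(uuid.uuid4())  # correlation id for this request
    ctx = configs.get(key)          # step 2: load config and context
    # Step 3: authentication, endpoint restrictions, and limits.
    if ctx is None:
        outcome, status, payload = "unauthenticated", 401, ""
    elif endpoint not in ctx.allowed_endpoints:
        outcome, status, payload = "endpoint_blocked", 403, ""
    elif ctx.remaining_requests <= 0:
        outcome, status, payload = "limit_exceeded", 429, ""
    else:
        ctx.remaining_requests -= 1
        call_provider = providers[ctx.provider]  # step 4: resolve the target
        payload = call_provider(body)            # step 5: forward the request
        outcome, status = "forwarded", 200
    # Step 7: record the outcome, whether or not a provider was called.
    AUDIT_LOG.append({"request_id": request_id, "outcome": outcome})
    # Step 6: return the response with a correlation header.
    return {"status": status, "body": payload,
            "headers": {"x-request-id": request_id}}


# Example: a key allowed one "chat" request against a stub provider.
configs = {"key-abc": Context({"chat"}, remaining_requests=1, provider="provider-a")}
providers = {"provider-a": lambda body: f"echo: {body}"}
first = handle("key-abc", "chat", "hello", configs, providers)
second = handle("key-abc", "chat", "hello", configs, providers)
```

Here the first call reaches the stub provider and the second is stopped by the limit before any provider is called, yet both leave an audit entry that shares its id with the response header.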

How routing, policy, and limits fit together

These controls work together, not independently:

  • authentication establishes the caller and config context
  • endpoint restrictions decide whether the request class is allowed
  • policy evaluates supported content surfaces before provider execution
  • limits and budgets gate traffic and consumption
  • routing resolves the eligible provider target and execution strategy
  • telemetry and audit record what happened and why

This is what lets Sentinel act as a governed access layer rather than just a forwarding layer.
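
The interplay between these controls can be modeled as an ordered chain of gates that stops at the first denial and keeps a trace of each decision. The sketch assumes authentication has already succeeded and that routing runs only if every gate passes. The gate functions, the dict-based request shape, and the placeholder policy rule are hypothetical; this section does not describe how Sentinel evaluates policy internally.

```python
from typing import Callable, Optional

Request = dict  # hypothetical request shape: a plain dict
Gate = Callable[[Request], Optional[str]]  # returns a denial reason, or None


def endpoint_gate(req: Request) -> Optional[str]:
    return None if req["endpoint"] in req["allowed_endpoints"] else "endpoint not allowed"


def policy_gate(req: Request) -> Optional[str]:
    # Placeholder rule standing in for real policy evaluation.
    return "blocked term in prompt" if "forbidden" in req["prompt"] else None


def budget_gate(req: Request) -> Optional[str]:
    return None if req["spent_usd"] < req["budget_usd"] else "budget exhausted"


GATES: list[tuple[str, Gate]] = [
    ("endpoint", endpoint_gate),
    ("policy", policy_gate),
    ("budget", budget_gate),
]


def evaluate(req: Request) -> dict:
    """Run gates in order; stop at the first denial and keep a trace."""
    trace = []
    for name, gate in GATES:
        reason = gate(req)
        trace.append({"gate": name, "passed": reason is None, "reason": reason})
        if reason is not None:
            return {"allowed": False, "trace": trace}
    return {"allowed": True, "trace": trace}


# Example: the request passes endpoint and policy checks but has no budget left.
req = {"endpoint": "chat", "allowed_endpoints": {"chat"},
       "prompt": "hello", "spent_usd": 10.0, "budget_usd": 10.0}
result = evaluate(req)
```

The trace is what makes the "what happened and why" record possible: it names the gate that stopped the request and the reason it gave.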

What this architecture enables

This architecture is designed to place operational control at the model-access layer rather than inside each application.

It enables:

  • one stable integration point for application teams
  • centralized governance without rebuilding controls in every client
  • provider flexibility without forcing constant client rewrites
  • durable visibility into both request outcomes and platform decisions

For teams adopting Sentinel, this means integration stays simpler while control becomes stronger.