Budgets and Limits

Sentinel separates traffic controls (limits) from spend controls (budgets) because they solve different operational problems. Together, they let platform teams protect reliability and cost without pushing that logic into every client.

Limits

Limits are short-window controls used to protect systems and tenants from burst traffic, misuse, or unsafe request volume.

Typical examples include:

  • requests per minute
  • token throughput ceilings
  • concurrent stream limits
  • endpoint-specific restrictions

Limits are about protecting runtime behavior as traffic happens.
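As a rough illustration of a short-window control, the sketch below implements a fixed-window requests-per-minute check. The class name `FixedWindowLimit`, its fields, and the fixed-window algorithm itself are assumptions made for this example; this page does not describe how Sentinel implements limits internally.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FixedWindowLimit:
    """Hypothetical per-key request limit; not Sentinel's actual API."""

    max_requests: int
    window_seconds: float = 60.0
    _window_start: Optional[float] = field(default=None, init=False, repr=False)
    _count: int = field(default=0, init=False, repr=False)

    def allow(self, now: float) -> bool:
        """Return True if a request arriving at time `now` (seconds) fits the window."""
        # Open a new window on first use or once the current one has elapsed.
        if self._window_start is None or now - self._window_start >= self.window_seconds:
            self._window_start = now
            self._count = 0
        if self._count >= self.max_requests:
            return False  # ceiling reached for this window; reject the request
        self._count += 1
        return True


# Two requests per 60-second window: the third request in the window is
# rejected, and a request after the window has elapsed is accepted again.
limit = FixedWindowLimit(max_requests=2)
results = [limit.allow(now=t) for t in (0.0, 1.0, 2.0, 61.0)]
```

Passing the timestamp in explicitly keeps the check deterministic in tests; a real caller would likely supply `time.monotonic()`.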

Budgets

Budgets are accumulated consumption controls used to contain cost and enforce usage policy over time.

Typical examples include:

  • key-level usage ceilings
  • project or environment spend thresholds
  • alert and block thresholds tied to operational review

Budgets are about controlling usage over time, not just absorbing short-term burst traffic.
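The difference from a limit can be sketched as an accumulator with an alert threshold and a block threshold. The names `Budget`, `BudgetState`, `alert_at_usd`, and `block_at_usd`, and the use of US dollars as the unit, are hypothetical choices for this example rather than Sentinel configuration fields.

```python
from dataclasses import dataclass
from enum import Enum


class BudgetState(Enum):
    OK = "ok"
    ALERT = "alert"
    BLOCKED = "blocked"


@dataclass
class Budget:
    """Hypothetical accumulated-spend budget; thresholds and units are illustrative."""

    alert_at_usd: float
    block_at_usd: float
    spent_usd: float = 0.0

    def state(self) -> BudgetState:
        """Classify current spend against the two thresholds."""
        if self.spent_usd >= self.block_at_usd:
            return BudgetState.BLOCKED
        if self.spent_usd >= self.alert_at_usd:
            return BudgetState.ALERT
        return BudgetState.OK

    def record(self, cost_usd: float) -> BudgetState:
        """Add a charge and return the resulting state."""
        # Spend accumulates across the whole budget period and is never
        # reset by a short window, which is what separates it from a limit.
        self.spent_usd += cost_usd
        return self.state()


budget = Budget(alert_at_usd=80.0, block_at_usd=100.0)
states = [budget.record(cost) for cost in (50.0, 35.0, 20.0)]
```

In this run the spend moves from 50 to 85 to 105, so the state goes from `OK` to `ALERT` to `BLOCKED`.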

Enforcement order

In practice, Sentinel evaluates these controls before provider execution, not after. That keeps rejected traffic from turning into provider charges.

Operators should treat limits and budgets as part of the hosted product policy for a workspace, not as ad hoc application-side logic.
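A minimal sketch of that ordering, assuming each control can be expressed as a callable that returns whether the request may proceed. The function `handle_request`, the check callables, and the fake provider are all hypothetical names for illustration only.

```python
from typing import Callable, List


def handle_request(
    checks: List[Callable[[], bool]],
    call_provider: Callable[[], str],
) -> str:
    """Run every policy check first; call the provider only if all of them pass."""
    for check in checks:
        if not check():
            # Rejected before provider execution, so no provider charge is incurred.
            return "rejected"
    return call_provider()


provider_calls: List[str] = []


def fake_provider() -> str:
    """Stand-in for a billable provider call; records each invocation."""
    provider_calls.append("call")
    return "completed"


def within_limit() -> bool:
    return True


def budget_exhausted() -> bool:
    return False


outcome_ok = handle_request([within_limit], fake_provider)
outcome_blocked = handle_request([within_limit, budget_exhausted], fake_provider)
```

Only the first request reaches `fake_provider`; the second is rejected by the failing budget check before any provider work happens.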

Operational guidance

A good rollout usually means:

  • setting defaults that are safe but not overly restrictive for trusted internal workloads
  • using alerts before hard blocks while baseline usage patterns are still being learned
  • keeping ownership of limits and budgets explicit across platform and product teams
  • testing failure behavior for both burst rejection and budget exhaustion

The goal is to make protective controls predictable, visible, and operationally trustworthy before tightening them further.
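One way to picture the "alerts before hard blocks" step is an enforcement mode that can be switched once baseline usage is understood. The `Mode` enum and `decide` function below are hypothetical and only illustrate the idea; they are not a documented Sentinel setting.

```python
from enum import Enum
from typing import Tuple


class Mode(Enum):
    ALERT_ONLY = "alert_only"  # observe and notify, but let traffic through
    ENFORCE = "enforce"        # notify and reject once the threshold is crossed


def decide(usage: float, threshold: float, mode: Mode) -> Tuple[bool, bool]:
    """Return (allowed, alert_raised) for the current usage level."""
    if usage < threshold:
        return True, False
    # At or over the threshold: always alert, but only block when enforcing.
    return mode is Mode.ALERT_ONLY, True


learning_phase = decide(usage=120.0, threshold=100.0, mode=Mode.ALERT_ONLY)
enforced_phase = decide(usage=120.0, threshold=100.0, mode=Mode.ENFORCE)
under_threshold = decide(usage=40.0, threshold=100.0, mode=Mode.ENFORCE)
```

The same over-threshold usage produces an alert in both modes, but it is only rejected in `ENFORCE`, which makes both the alert path and the block path easy to exercise in tests before tightening.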