Budgets and Limits
Sentinel separates traffic controls (limits) from spend controls (budgets) because they solve different operational problems. Together they let platform teams protect reliability and cost without pushing that logic into every client.
Limits
Limits are short-window controls used to protect systems and tenants from burst traffic, misuse, or unsafe request volume.
Typical examples include:
- requests per minute
- token throughput ceilings
- concurrent stream limits
- endpoint-specific restrictions
Limits protect runtime behavior while traffic is in flight.
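To make the idea of a short-window control concrete, here is a minimal token-bucket sketch for a requests-per-minute style limit. It is an illustration only, not Sentinel's implementation: the `TokenBucket` class, its parameters, and the injectable `clock` are all hypothetical names chosen for this example.

```python
import time
from typing import Callable


class TokenBucket:
    """Hypothetical short-window limiter (illustrative, not Sentinel's API)."""

    def __init__(
        self,
        capacity: float,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity                    # largest burst allowed
        self.refill_per_second = refill_per_second  # sustained request rate
        self.clock = clock                          # injectable for testing
        self.tokens = capacity                      # start with a full bucket
        self.last_refill = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend tokens if the request fits in the window."""
        now = self.clock()
        elapsed = now - self.last_refill
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.last_refill = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A limit of 60 requests per minute with bursts of up to 10 could be expressed in this sketch as `TokenBucket(capacity=10, refill_per_second=1.0)`. The same shape works for token throughput ceilings by passing the token count as `cost`.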
Budgets
Budgets are accumulated consumption controls used to contain cost and enforce usage policy over time.
Typical examples include:
- key-level usage ceilings
- project or environment spend thresholds
- alert and block thresholds tied to operational review
Budgets are about controlling usage over time, not just absorbing short-term burst traffic.
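The difference from a limit is that a budget accumulates. The sketch below tracks consumption against an alert threshold and a block threshold. The `Budget` class, the `BudgetState` values, and the unit of `amount` are assumptions made for illustration; the source does not specify how Sentinel models them.

```python
from enum import Enum


class BudgetState(Enum):
    OK = "ok"
    ALERT = "alert"      # usage should be reviewed, traffic still flows
    BLOCKED = "blocked"  # ceiling reached, further usage is rejected


class Budget:
    """Hypothetical accumulated-consumption control (illustrative only)."""

    def __init__(self, alert_at: float, block_at: float) -> None:
        if not 0 <= alert_at <= block_at:
            raise ValueError("expected 0 <= alert_at <= block_at")
        self.alert_at = alert_at
        self.block_at = block_at
        self.spent = 0.0  # running total for the budget period

    def record(self, amount: float) -> None:
        """Add completed usage (for example, cost or tokens) to the total."""
        self.spent += amount

    def state(self) -> BudgetState:
        """Classify current consumption against the two thresholds."""
        if self.spent >= self.block_at:
            return BudgetState.BLOCKED
        if self.spent >= self.alert_at:
            return BudgetState.ALERT
        return BudgetState.OK
```

Unlike the short-window case, nothing here refills with time. The total only resets when whoever owns the budget starts a new period.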
Enforcement order
In practice, Sentinel evaluates these controls before provider execution, not after. That keeps rejected traffic from turning into provider charges.
Operators should treat limits and budgets as part of the hosted product policy for a workspace, not as ad hoc application-side logic.
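The ordering described above can be sketched as a small request handler in which both checks run before the provider is called. All names here (`handle_request`, `Rejected`, and the three callables) are hypothetical, and checking the limit before the budget is an assumption of this sketch. The source only states that both are evaluated before provider execution.

```python
from typing import Callable


class Rejected(Exception):
    """Raised when a control rejects a request before the provider runs."""


def handle_request(
    limit_allows: Callable[[], bool],
    budget_allows: Callable[[], bool],
    call_provider: Callable[[], str],
) -> str:
    # Both controls are evaluated first, so a rejected request
    # never reaches the provider and never incurs a charge.
    if not limit_allows():
        raise Rejected("rate limit exceeded")
    if not budget_allows():
        raise Rejected("budget exhausted")
    return call_provider()
```

Because the checks live in one place, clients do not need their own copies of the policy. They only need to handle the rejection.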
Operational guidance
A good rollout usually means:
- setting defaults that are safe but not overly restrictive for trusted internal workloads
- using alerts before hard blocks while baseline usage patterns are still being learned
- keeping ownership of limits and budgets explicit across platform and product teams
- testing failure behavior for both burst rejection and budget exhaustion
The goal is to make protective controls predictable, visible, and operationally trustworthy before tightening them further.
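One way to picture the "alerts before hard blocks" step is an enforcement mode that logs a breach without rejecting the request. This is a sketch under assumptions: `Mode`, `decide`, and the log message are hypothetical and do not describe a documented Sentinel setting.

```python
import logging
from enum import Enum


class Mode(Enum):
    ALERT_ONLY = "alert_only"  # observe and warn while baselines are learned
    ENFORCE = "enforce"        # reject once the threshold is trusted


def decide(over_threshold: bool, mode: Mode, logger: logging.Logger) -> bool:
    """Return True if the request may proceed under the given mode."""
    if not over_threshold:
        return True
    # A breach is always logged, so alert-only rollouts still leave a record.
    logger.warning("threshold exceeded (mode=%s)", mode.value)
    return mode is Mode.ALERT_ONLY
```

Switching a control from `Mode.ALERT_ONLY` to `Mode.ENFORCE` changes the outcome for over-threshold traffic but not what gets logged, which helps keep the tightening step visible and predictable.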