feat: expose configured rate limit as a gauge metric #1138

Open
gd03champ wants to merge 1 commit into envoyproxy:main from gd03champ:feat/rate-limit-gauge-metric

Problem

The existing near_limit counter fires at a fixed percentage controlled by the global NEAR_LIMIT_RATIO env var (default 80%). There is no way to:

  1. Set a different threshold per-service/descriptor (e.g. alert at 60% for a critical service).
  2. Build a utilization ratio in Prometheus because the configured requests_per_unit value is never exported as a metric — the denominator simply doesn't exist.

Solution

Add a rate_limit gauge to RateLimitStats that holds the configured requests_per_unit for each descriptor. The gauge is populated once at config load time inside NewRateLimit(), so it reflects the current config and is updated on hot-reload.
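
To illustrate the "set once at config load, updated on reload" behaviour, here is a minimal, self-contained sketch. It is not the code from this PR: `Gauge`, `Registry`, the `.limit` stat suffix, and the two-argument `NewRateLimit` are simplified, hypothetical stand-ins for the stats library and the real constructor. The sketch only assumes that stats for the same descriptor key resolve to the same gauge.

```go
package main

import "fmt"

// Gauge is a hypothetical stand-in for a stats-library gauge:
// it remembers the last value set.
type Gauge struct {
	value uint64
}

func (g *Gauge) Set(v uint64)  { g.value = v }
func (g *Gauge) Value() uint64 { return g.value }

// Registry hands out one gauge per stat name, so a config reload
// reuses the existing gauge instead of creating a second one.
type Registry struct {
	gauges map[string]*Gauge
}

func NewRegistry() *Registry {
	return &Registry{gauges: make(map[string]*Gauge)}
}

// Gauge returns the gauge registered under name, creating it on first use.
func (r *Registry) Gauge(name string) *Gauge {
	g, ok := r.gauges[name]
	if !ok {
		g = &Gauge{}
		r.gauges[name] = g
	}
	return g
}

// RateLimitStats is a simplified per-descriptor stats struct.
type RateLimitStats struct {
	Key       string
	RateLimit *Gauge // holds the configured requests_per_unit
}

// NewStats builds the stats for one descriptor key.
// The ".limit" suffix is an assumption made for this sketch.
func NewStats(r *Registry, key string) RateLimitStats {
	return RateLimitStats{Key: key, RateLimit: r.Gauge(key + ".limit")}
}

// RateLimit is a simplified configured limit.
type RateLimit struct {
	RequestsPerUnit uint32
	Stats           RateLimitStats
}

// NewRateLimit records the configured limit on the gauge at construction
// time, which is when the config is loaded (or reloaded).
func NewRateLimit(requestsPerUnit uint32, stats RateLimitStats) *RateLimit {
	stats.RateLimit.Set(uint64(requestsPerUnit))
	return &RateLimit{RequestsPerUnit: requestsPerUnit, Stats: stats}
}

func main() {
	reg := NewRegistry()

	// Initial config load.
	NewRateLimit(100, NewStats(reg, "payments.user"))
	fmt.Println(reg.Gauge("payments.user.limit").Value())

	// Hot reload with a new limit for the same descriptor.
	NewRateLimit(250, NewStats(reg, "payments.user"))
	fmt.Println(reg.Gauge("payments.user.limit").Value())
}
```

Because the value is written in the constructor rather than on the request path, the gauge costs nothing per request and changes only when the config does.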

With this gauge, operators can construct arbitrary utilization alerts in Prometheus without touching NEAR_LIMIT_RATIO:

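# Alert when any descriptor exceeds 60% utilization.
# total_hits is a cumulative counter, so take its increase over the limit's
# unit window (1m here, assuming a per-minute limit).
increase(ratelimit_service_rate_limit_total_hits[1m])
  / ratelimit_service_rate_limit_limit > 0.6

# Per-domain alert at a custom threshold
increase(ratelimit_service_rate_limit_total_hits{domain="payments"}[1m])
  / ratelimit_service_rate_limit_limit{domain="payments"} > 0.9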

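Unlimited descriptors (requests_per_unit: 0) emit 0. In a ratio query this yields NaN when the numerator is also 0, which does not trigger threshold alerts, but +Inf when the numerator is positive, which satisfies any "> threshold" comparison. Alert expressions should therefore exclude zero-limit series, for example by dividing by (ratelimit_service_rate_limit_limit > 0).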
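
The zero-denominator case is worth checking before relying on it in an alert. The sketch below reproduces the arithmetic in Go, whose `float64` division follows IEEE 754, as PromQL's float arithmetic does. The helper names `utilization` and `exceeds` are hypothetical and exist only for this illustration.

```go
package main

import "fmt"

// utilization divides observed hits by the configured limit using
// IEEE 754 float division: x/0 is +Inf for x > 0, and 0/0 is NaN.
func utilization(hits, limit float64) float64 {
	return hits / limit
}

// exceeds mirrors a "ratio > threshold" alert condition.
// Any ordered comparison involving NaN is false; +Inf is greater
// than every finite threshold.
func exceeds(hits, limit, threshold float64) bool {
	return utilization(hits, limit) > threshold
}

func main() {
	cases := []struct {
		name        string
		hits, limit float64
	}{
		{"limited, 70 of 100", 70, 100},
		{"limited, 50 of 100", 50, 100},
		{"zero limit, no hits", 0, 0},
		{"zero limit, 5 hits", 5, 0},
	}
	for _, c := range cases {
		fmt.Printf("%-22s ratio=%v fires(>0.6)=%v\n",
			c.name, utilization(c.hits, c.limit), exceeds(c.hits, c.limit, 0.6))
	}
}
```

Only the 0/0 case stays silent. A zero-limit descriptor that receives traffic compares as above the threshold, which is why filtering the denominator to positive values is the safer form for alert rules.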

Changes

| File | Change |
| --- | --- |
| src/stats/manager.go | Add RateLimit gostats.Gauge field to RateLimitStats |
| src/stats/manager_impl.go | Initialize gauge in NewStats() |
| test/mocks/stats/manager.go | Mirror gauge initialization in test mock |
| src/config/config_impl.go | Set gauge value in NewRateLimit() at config load time |
| src/stats/prom/default_mapper.yaml | Add glob (1-key) and regex (2-key) mapper entries for ratelimit_service_rate_limit_limit |
| test/stats/manager_impl_test.go | New test asserting the gauge emits the correct requests_per_unit value |

Test plan

  • go test ./test/stats/... — new TestNewStatsCreatesRateLimitGauge passes
  • go test ./test/config/... ./test/limiter/... ./test/redis/... ./test/memcached/... ./test/service/... — all existing tests pass
  • go build ./... — compiles cleanly

Add a `rate_limit` gauge to `RateLimitStats` that emits the configured
`requests_per_unit` value for each descriptor. The gauge is set once at
config load time via `NewRateLimit`, making it stable between reloads.

This allows operators to compute utilization ratios in Prometheus and
alert at arbitrary thresholds without relying on the global
`NEAR_LIMIT_RATIO` setting:

  ratelimit_service_rate_limit_total_hits
    / ratelimit_service_rate_limit_limit > 0.6

Unlimited descriptors emit 0, which produces NaN in ratio queries when
there are no hits (no alert) but +Inf when there are hits, so alert
expressions should exclude zero-limit series.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: GD <gsnish25255@gmail.com>
gd03champ force-pushed the feat/rate-limit-gauge-metric branch from 159967a to 11311e1 on May 15, 2026 05:16