# Throttle Middleware

Token-bucket rate limiter running on APP and BACKOFFICE nodes, after authentication. Uses Redis as backend and YAML-defined rules.

## Activation

The throttle is controlled by 3 settings, read once at server startup:

| Variable | Default (unset) | Source in deployed envs | Description |
|---|---|---|---|
| `THROTTLE_ENABLED` | `false` (disabled) | SSM `/be/throttle/enabled` | Set to `true` to mount the middleware. Otherwise it short-circuits to a no-op pass-through. |
| `THROTTLE_SHADOW_MODE` | `true` (log only) | SSM `/be/throttle/shadow-mode` | Set to `false` to actually reject requests with 429. The default keeps you safe in shadow mode. |
| `THROTTLE_RULES_PATH` | resolved relative to the middleware file (`config/throttle-rules.yaml` at the package root) | `env.default` (image-relative path) | Override only if you need to load a different rule file. Not in SSM since it's an image-relative path. |

In deployed environments, `THROTTLE_ENABLED` and `THROTTLE_SHADOW_MODE` are wired into the ECS task definition's `secrets:` block via Terraform. See [Operating the throttle](#operating-the-throttle) for how to change values. In local dev, set them via `.env` (no SSM involved).

### Recommended rollout

1. Land the SSM wiring (terraform apply per env). The SSM parameters are created with `enabled=true, shadow_mode=true`, so the middleware starts observing immediately.
2. Monitor `humandMainApi.throttle.*` metrics and logs for a few days. Tune rules in `config/throttle-rules.yaml` as needed (rule changes still require a redeploy).
3. Once confident, flip `/be/throttle/shadow-mode` to `false` per env to enforce.
4. To roll back: flip `/be/throttle/enabled` to `false` (faster than reverting shadow_mode, since it short-circuits the entire middleware including Redis access).

## Operating the throttle

`THROTTLE_ENABLED` and `THROTTLE_SHADOW_MODE` are stored in SSM Parameter Store.
ECS reads SSM values on container start, so changing a value requires a rolling deploy (~2-5 min): not instant, but no PR or code change is needed.

### SSM keys

| SSM key | Env var | Type | Default |
|---|---|---|---|
| `/be/throttle/enabled` | `THROTTLE_ENABLED` | String | `true` |
| `/be/throttle/shadow-mode` | `THROTTLE_SHADOW_MODE` | String | `true` |

Defined in `infrastructure/env//ssm.tf` per environment. The `lifecycle.ignore_changes = [value]` directive on each resource means Terraform creates the parameter once at the default value and then never overwrites operator changes, so you can flip the value in production without worrying that the next `terraform apply` will revert it.

The middleware is only wired into `routes/root.ts` and `routes/backofficeRoot.ts`, so `humand-public-api` carries the env vars but is not subject to throttling. Force-redeploy `humand-app` and `humand-backoffice` to pick up SSM changes; container logs show `Throttle middleware disabled` when the new task starts with `enabled=false`.

## Rules

Rules live in `humand-packages/monolith/config/throttle-rules.yaml` and are loaded at server startup (changing rules requires a redeploy).

### Format

```yaml
# Rules evaluated in order. First match wins.
# Put the most specific rules first, catch-all at the end.
#
# Two signatures are emitted per request:
#   1. instanceId:userId:path (matches 3-segment patterns)
#   2. instanceId:userId      (matches 2-segment patterns)

# Block a specific user
- pattern: "*:1234"
  burst: 0
  refill: 0

# Block an entire instance (e.g. during an incident)
- pattern: "5678:*"
  burst: 0
  refill: 0

# Tight limit on a heavy endpoint: 5 burst, 1/sec refill
- pattern: "*:*:/reports/*"
  burst: 5
  refill: 1
  bucketKey: "reports:{0}:{1}" # {0}=instanceId, {1}=userId

# Limit file uploads per user
- pattern: "*:*:/files"
  burst: 10
  refill: 2
  bucketKey: "files:{0}:{1}"

# Higher limit for a specific high-traffic instance
- pattern: "9999:*"
  burst: 500
  refill: 50
  bucketKey: "user:9999:{0}" # {0}=userId

# Default: rate limit per user
- pattern: "*:*"
  burst: 100
  refill: 10
  bucketKey: "user:{0}:{1}" # {0}=instanceId, {1}=userId
```

### Fields

Each rule configures a token bucket through two parameters (`burst` and `refill`), plus a matching pattern and an optional key template:

| Field | Type | Required | Description |
|---|---|---|---|
| `pattern` | string | yes | Glob pattern to match against the request signature |
| `burst` | int >= 0 | yes | Token bucket capacity: max requests allowed in a burst |
| `refill` | int >= 0 | yes | Tokens added per second |
| `bucketKey` | string | no | Redis key template. If omitted, the full signature is used as key |

Validation: `refill: 0` is only accepted when `burst: 0` (hard-block rule). Any other combination (e.g. `burst: 5, refill: 0`) is rejected at rule-loading time, because a non-refilling bucket with initial capacity would divide by zero in the Lua script.

### Request signatures

Two signatures are emitted per request, enabling rules at different granularity levels:

| Signature | Format | Use case |
|---|---|---|
| Full | `instanceId:userId:path` | Path-specific rules (`*:*:/reports/*`) |
| Short | `instanceId:userId` | User/instance-level rules (`*:1234`, `5678:*`) |

For each rule, the middleware tries the shorter signature before the longer one. That way a rule like `*:*` with `bucketKey: "user:{0}:{1}"` always matches the 2-segment form and produces one bucket per `(instanceId, userId)`, rather than greedy-capturing the 3-segment form and accidentally yielding one bucket per endpoint.
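
The matching order described above can be sketched roughly as follows. This is an illustration only, not the library's code: `ThrottleRule`, `globToRegExp`, and `matchRequest` are hypothetical names, and the sketch assumes each `*` is translated into a greedy regex capture group that feeds the `{n}` placeholders in `bucketKey`.

```typescript
// Hypothetical rule shape mirroring the YAML fields (not the library's real type).
interface ThrottleRule {
  pattern: string;
  burst: number;
  refill: number;
  bucketKey?: string;
}

// Assumption: each `*` becomes a greedy capture group; everything else is literal.
function globToRegExp(pattern: string): RegExp {
  const source = pattern
    .split("*")
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, "\\$&"))
    .join("(.*)");
  return new RegExp(`^${source}$`);
}

// Rules are walked in file order; for each rule the short signature is tried
// before the full one, and the first match wins.
function matchRequest(
  rules: ThrottleRule[],
  instanceId: string,
  userId: string,
  path: string,
): { rule: ThrottleRule; signature: string; bucketKey: string } | null {
  const shortSig = `${instanceId}:${userId}`;
  const fullSig = `${shortSig}:${path}`;
  for (const rule of rules) {
    const re = globToRegExp(rule.pattern);
    for (const signature of [shortSig, fullSig]) {
      const m = re.exec(signature);
      if (!m) continue;
      const groups = m.slice(1);
      // Without a bucketKey template, the full signature is the Redis key.
      const bucketKey = rule.bucketKey
        ? rule.bucketKey.replace(/\{(\d+)\}/g, (_match: string, i: string) => groups[Number(i)] ?? "")
        : fullSig;
      return { rule, signature, bucketKey };
    }
  }
  return null; // no rule matched: the request passes through
}
```

With the catch-all rule `*:*` and `bucketKey: "user:{0}:{1}"`, a request from instance `3214`, user `5678` matches on the short signature and lands in `user:3214:5678` regardless of the path.
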
The short form lets you write `*:1234` instead of `*:1234:*` for user-level rules.

### Glob patterns

`*` matches any character sequence (including `/`, `:`). Each `*` is a capture group referenced as `{0}`, `{1}`, etc. in `bucketKey`.

```
Pattern:   *:*:/users/*   (matches the full signature)
Signature: 3214:5678:/users/123
Groups:    {0}="3214", {1}="5678", {2}="123"

Pattern:   *:1234         (matches the short signature)
Signature: 3214:1234
Groups:    {0}="3214"
```

### Bucket key

Defines how requests are grouped in Redis. Controls rate limiting granularity:

```yaml
# Rate limit per user (one bucket per instanceId+userId)
bucketKey: "user:{0}:{1}" # {0}=instanceId, {1}=userId

# Rate limit per instance (all users in the instance share a bucket)
bucketKey: "instance:{0}" # {0}=instanceId

# No bucketKey -> the full signature is the key (one bucket per instanceId+userId+path)
```

### Blocking users (burst=0, refill=0)

To block a specific user or instance, add a rule with `burst: 0` and `refill: 0` at the top of the file. This is a fast-path that returns 429 without hitting Redis.

```yaml
- pattern: "*:1234" # block userId 1234 across all instances
  burst: 0
  refill: 0

- pattern: "5678:*" # block all users in instance 5678
  burst: 0
  refill: 0
```

To unblock, remove the rule and redeploy.

### Evaluation order

Rules are evaluated in file order. **First match wins.** Always place:

1. Specific blocks first
2. Instance/path exceptions next
3. Catch-all last

If no rule matches, the request passes through (with a warn in the logs).

## Algorithm: Token Bucket

Each bucket has two parameters:

- **burst**: maximum capacity (how many requests can be made in a burst)
- **refill**: tokens added per second

Example with `burst: 100, refill: 10`:

- The bucket starts full with 100 tokens.
- Each request consumes 1 token.
- 10 tokens are added per second (capped at 100).
- If 0 tokens remain, the request is rejected with 429 + `Retry-After` header.
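
To make the arithmetic concrete, here is a minimal in-memory sketch of the same token bucket. It is not the production logic, which lives in a Redis Lua script; `BucketState` and `consume` are hypothetical names, and the sketch assumes tokens accrue continuously (fractionally) between requests.

```typescript
// Illustrative state; the real bucket is a Redis hash with `tokens` and `last_refill`.
interface BucketState {
  tokens: number;
  lastRefillMs: number;
}

// Refill the bucket for the elapsed time, then try to take one token.
function consume(
  state: BucketState | undefined,
  burst: number,
  refill: number,
  nowMs: number,
): { state: BucketState; allowed: boolean } {
  // A bucket that has never been seen starts full.
  const prev = state ?? { tokens: burst, lastRefillMs: nowMs };
  const elapsedSec = Math.max(0, nowMs - prev.lastRefillMs) / 1000;
  // Add `refill` tokens per elapsed second, capped at `burst`.
  const tokens = Math.min(burst, prev.tokens + elapsedSec * refill);
  if (tokens >= 1) {
    // Each request consumes exactly 1 token.
    return { state: { tokens: tokens - 1, lastRefillMs: nowMs }, allowed: true };
  }
  // Less than one token left: reject (429 when enforcing).
  return { state: { tokens, lastRefillMs: nowMs }, allowed: false };
}
```

Under these assumptions, with `burst: 100, refill: 10`, the first 100 calls at the same instant are allowed, the 101st is rejected, and one second later 10 more calls succeed.
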

The logic runs atomically in Redis via a Lua script (no race conditions across servers).

## Failure behavior

| Scenario | Behavior |
|---|---|
| Redis down | Fail-open: lets request through + warn log + `throttle.redis_error` metric |
| YAML file not found at startup | Middleware disabled + error log |
| Individual invalid rule in YAML | That rule is discarded + warn log, others load normally |
| Completely invalid YAML | Middleware disabled + error log |
| Error generating request signature | Lets request through + warn log |

## Observability

### Metrics (Datadog)

All metric names use the `humandMainApi.throttle` prefix (configured via `metricPrefix` when the middleware is built). Every metric also carries a `nodeType` tag (e.g. `APP`, `BACKOFFICE`) wired in via `baseMetricTags`.

| Metric | Tags | When |
|---|---|---|
| `humandMainApi.throttle.allowed` | nodeType, rule | Request allowed |
| `humandMainApi.throttle.rejected` | nodeType, rule, shadow | Request rejected (or shadow-rejected) |
| `humandMainApi.throttle.no_match` | nodeType | No rule matched |
| `humandMainApi.throttle.redis_error` | nodeType | Redis failed |

The `shadow` tag is `"true"` or `"false"` and indicates whether the rejection was real or only logged.

### Logs

Each rejection logs (warn level):

- `signature`: full request signature
- `rule`: the pattern that matched
- `bucketKey`: the Redis key
- `shadowMode`: whether the rejection was real or only logged
- `retryAfter`: seconds until retry is possible

### Response headers

All requests matching a rule receive:

- `RateLimit-Limit`: rule burst
- `RateLimit-Remaining`: tokens remaining
- `RateLimit-Reset`: seconds until the bucket is full (omitted on hard-block fast-path)

Rejected requests (429) also receive:

- `Retry-After`: for normal rate-limited 429s, the number of seconds until at least 1 token refills. For hard-block rules (`burst: 0, refill: 0`), a fixed `86400`, since no retry succeeds until the rule itself is changed.
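
The header values above could be derived from the bucket state along these lines. This is a hedged sketch rather than the middleware's actual code: `buildRateLimitHeaders`, `RuleLimits`, and `tokensLeft` are hypothetical names, and rounding fractional seconds up with `Math.ceil` is an assumption.

```typescript
type HeaderMap = Record<string, string>;

interface RuleLimits {
  burst: number;
  refill: number;
}

// tokensLeft: tokens in the bucket after the allow/reject decision.
function buildRateLimitHeaders(rule: RuleLimits, tokensLeft: number, rejected: boolean): HeaderMap {
  const out: HeaderMap = {
    "RateLimit-Limit": String(rule.burst),
    "RateLimit-Remaining": String(Math.max(0, Math.floor(tokensLeft))),
  };
  if (rule.burst === 0 && rule.refill === 0) {
    // Hard-block fast path: always rejected, no RateLimit-Reset, fixed Retry-After.
    out["Retry-After"] = "86400";
    return out;
  }
  // Seconds until the bucket is full again (refill > 0 is guaranteed by rule validation).
  out["RateLimit-Reset"] = String(Math.ceil((rule.burst - tokensLeft) / rule.refill));
  if (rejected) {
    // Seconds until at least one whole token is available.
    out["Retry-After"] = String(Math.ceil((1 - tokensLeft) / rule.refill));
  }
  return out;
}
```

The division by `refill` in the normal path is the reason rule validation rejects `refill: 0` with a non-zero `burst`.
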

## Architecture

```
Request -> Auth middleware -> Throttle middleware -> Route handler
                                      |
                                      +- Generate signature: instanceId:userId:path
                                      +- Match against rules (file order)
                                      +- If burst=0,refill=0 -> 429 (no Redis)
                                      +- EVAL Lua script in Redis (atomic)
                                      +- allowed -> next() / rejected -> 429
```

The library lives in `@humand-packages/common` (`src/presentation/middlewares/throttle.ts`) and is reusable by other packages. The monolith-specific wiring lives in `src/api/middlewares/throttle.ts`.

### Redis

Uses a separate logical DB (`CacheDomains.THROTTLE`, db 3) on the same Redis server as the monolith. Each bucket is stored as a hash with `tokens` and `last_refill`, with automatic TTL.

The throttle-specific Redis client uses a tight `commandTimeout` (100ms) so a degraded Redis cannot hold up requests on the throttle path for long: every slow command trips the fail-open branch quickly. The other domains keep their global timeout.

## Limitations (v1)

- Configuration is read once at server startup. Changing `THROTTLE_ENABLED`, `THROTTLE_SHADOW_MODE`, or rules in `throttle-rules.yaml` requires a rolling deploy (no hot-reload).
- No per-instance override from DB (like `communityRateLimiter`).
- Does not apply to the Public API (still uses `communityRateLimiter`), gRPC, or workers.
- No local rejection cache (every request that matches a rate-limit rule hits Redis; only the hard-block fast-path skips it).
- Always consumes 1 token per request (not configurable).