# Throttle Middleware

A token-bucket rate limiter that runs on APP and BACKOFFICE nodes, after authentication. It uses Redis as its backend and rules defined in YAML.

## Activation

The throttle is controlled by 3 settings, read once at server startup:

| Variable | Default (unset) | Source in deployed envs | Description |
|---|---|---|---|
| `THROTTLE_ENABLED` | `false` (disabled) | SSM `/be/throttle/enabled` | Set to `true` to mount the middleware. Otherwise it short-circuits to a no-op pass-through. |
| `THROTTLE_SHADOW_MODE` | `true` (log only) | SSM `/be/throttle/shadow-mode` | Set to `false` to actually reject requests with 429. The default keeps the middleware in shadow mode, where rejections are logged but not enforced. |
| `THROTTLE_RULES_PATH` | resolved relative to the middleware file (`config/throttle-rules.yaml` at the package root) | `env.default` (image-relative path) | Override only if you need to load a different rule file. Not in SSM since it's an image-relative path. |

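In deployed environments, `THROTTLE_ENABLED` and `THROTTLE_SHADOW_MODE` are wired into the ECS task definition's `secrets:` block via Terraform. Both SSM parameters are created with the value `true`, so a deployed environment starts enabled in shadow mode even though the unset default for `THROTTLE_ENABLED` is `false`. See [Operating the throttle](#operating-the-throttle) for how to change values.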

In local dev, set them via `.env` (no SSM involved).
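
As a rough illustration of the defaults in the table above, the settings could be parsed along these lines. This is a minimal sketch, not the actual wiring: `readThrottleSettings` and `ThrottleSettings` are hypothetical names, and the exact string comparison (case-sensitive `"true"` / `"false"`) is an assumption.

```typescript
// Hypothetical helper; names and shape are illustrative only.
interface ThrottleSettings {
  enabled: boolean;
  shadowMode: boolean;
  rulesPath: string | undefined; // undefined -> fall back to the bundled rule file
}

function readThrottleSettings(env: Record<string, string | undefined>): ThrottleSettings {
  return {
    // Unset, or anything other than "true", leaves the middleware as a no-op.
    enabled: env["THROTTLE_ENABLED"] === "true",
    // Unset, or anything other than "false", keeps shadow (log-only) mode.
    shadowMode: env["THROTTLE_SHADOW_MODE"] !== "false",
    rulesPath: env["THROTTLE_RULES_PATH"],
  };
}
```

In a Node process this would typically be called once at startup with `process.env`, which matches the "read once at server startup" behavior described above.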

### Recommended rollout

1. Land the SSM wiring (`terraform apply` per env). The SSM parameters are created with `enabled=true, shadow_mode=true`, so the middleware starts observing in shadow mode as soon as the services are deployed with the new wiring.
2. Monitor `humandMainApi.throttle.*` metrics and logs for a few days. Tune rules in `config/throttle-rules.yaml` as needed (rule changes still require a redeploy).
3. Once confident, flip `/be/throttle/shadow-mode` to `false` per env to enforce.
4. To roll back: flip `/be/throttle/enabled` to `false` and redeploy. Prefer this over setting `shadow-mode` back to `true`, since it short-circuits the entire middleware, including Redis access.

## Operating the throttle

`THROTTLE_ENABLED` and `THROTTLE_SHADOW_MODE` are stored in SSM Parameter Store. ECS reads SSM values on container start, so changing a value requires a rolling deploy (~2-5 min) — not instant, but no PR or code change.

### SSM keys

| SSM key | Env var | Type | Default |
|---|---|---|---|
| `/be/throttle/enabled` | `THROTTLE_ENABLED` | String | `true` |
| `/be/throttle/shadow-mode` | `THROTTLE_SHADOW_MODE` | String | `true` |

Defined in `infrastructure/env/<env>/ssm.tf` per environment. The `lifecycle.ignore_changes = [value]` directive on each resource means Terraform creates the parameter once at the default value and never overwrites operator changes afterwards. You can flip the value in production without worrying that the next `terraform apply` will revert it.

The middleware is only wired into `routes/root.ts` and `routes/backofficeRoot.ts`, so `humand-public-api` carries the env vars but is not subject to throttling. Force-redeploy `humand-app` and `humand-backoffice` to pick up SSM changes; container logs show `Throttle middleware disabled` when the new task starts with `enabled=false`.

## Rules

Rules live in `humand-packages/monolith/config/throttle-rules.yaml` and are loaded at server startup (changing rules requires a redeploy).

### Format

```yaml
# Rules evaluated in order. First match wins.
# Put the most specific rules first, catch-all at the end.
#
# Two signatures are emitted per request:
#   1. instanceId:userId:path   (matches 3-segment patterns)
#   2. instanceId:userId         (matches 2-segment patterns)

# Block a specific user
- pattern: "*:1234"
  burst: 0
  refill: 0

# Block an entire instance (e.g. during an incident)
- pattern: "5678:*"
  burst: 0
  refill: 0

# Tight limit on a heavy endpoint — 5 burst, 1/sec refill
- pattern: "*:*:/reports/*"
  burst: 5
  refill: 1
  bucketKey: "reports:{0}:{1}"  # {0}=instanceId, {1}=userId

# Limit file uploads per user
- pattern: "*:*:/files"
  burst: 10
  refill: 2
  bucketKey: "files:{0}:{1}"

# Higher limit for a specific high-traffic instance
- pattern: "9999:*"
  burst: 500
  refill: 50
  bucketKey: "user:9999:{0}"  # {0}=userId

# Default — rate limit per user
- pattern: "*:*"
  burst: 100
  refill: 10
  bucketKey: "user:{0}:{1}"  # {0}=instanceId, {1}=userId
```

### Fields

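Each rule selects requests with `pattern` and configures a token bucket with `burst` and `refill`. The optional `bucketKey` controls how matching requests are grouped: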

| Field | Type | Required | Description |
|---|---|---|---|
| `pattern` | string | yes | Glob pattern to match against the request signature |
| `burst` | int >= 0 | yes | Token bucket capacity — max requests allowed in a burst |
| `refill` | int >= 0 | yes | Tokens added per second |
| `bucketKey` | string | no | Redis key template. If omitted, the full signature is used as key |

Validation: `refill: 0` is only accepted when `burst: 0` (hard-block rule). Any other combination (e.g. `burst: 5, refill: 0`) is rejected at rule-loading time — a non-refilling bucket with initial capacity would divide by zero in the Lua script.
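
The validation rule above can be sketched as a small check. This is an illustrative sketch only: `ThrottleRule` and `validateRule` are hypothetical names, and the actual loader may report errors differently. The source does not say how `burst: 0` with a positive `refill` is treated, so the sketch leaves that combination alone.

```typescript
// Hypothetical rule shape mirroring the Fields table.
interface ThrottleRule {
  pattern: string;
  burst: number;
  refill: number;
  bucketKey?: string;
}

// Returns a list of problems; an empty list means the rule is accepted.
function validateRule(rule: ThrottleRule): string[] {
  const errors: string[] = [];
  const isNonNegativeInt = (n: number): boolean =>
    isFinite(n) && n >= 0 && Math.floor(n) === n;

  if (!isNonNegativeInt(rule.burst)) errors.push("burst must be an int >= 0");
  if (!isNonNegativeInt(rule.refill)) errors.push("refill must be an int >= 0");

  // refill: 0 is only valid for the hard-block rule (burst: 0, refill: 0).
  if (rule.refill === 0 && rule.burst !== 0) {
    errors.push("refill: 0 requires burst: 0");
  }
  return errors;
}
```

For example, `validateRule({ pattern: "*:*", burst: 5, refill: 0 })` returns one error, while the hard-block rule `{ pattern: "*:1234", burst: 0, refill: 0 }` returns none.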

### Request signatures

Two signatures are emitted per request, enabling rules at different granularity levels:

| Signature | Format | Use case |
|---|---|---|
| Full | `instanceId:userId:path` | Path-specific rules (`*:*:/reports/*`) |
| Short | `instanceId:userId` | User/instance-level rules (`*:1234`, `5678:*`) |

For each rule, the middleware tries the shorter signature before the longer one. That way a rule like `*:*` with `bucketKey: "user:{0}:{1}"` always matches the 2-segment form and produces one bucket per `(instanceId, userId)` — rather than greedy-capturing the 3-segment form and accidentally yielding one bucket per endpoint. The short form lets you write `*:1234` instead of `*:1234:*` for user-level rules.

### Glob patterns

`*` matches any character sequence (including `/`, `:`). Each `*` is a capture group referenced as `{0}`, `{1}`, etc. in `bucketKey`.

```
Pattern:   *:*:/users/*       (matches the full signature)
Signature: 3214:5678:/users/123
Groups:    {0}="3214", {1}="5678", {2}="123"

Pattern:   *:1234             (matches the short signature)
Signature: 3214:1234
Groups:    {0}="3214"
```
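
One way to implement this kind of matching is to compile each pattern into an anchored regular expression in which every `*` becomes a capture group. The sketch below is an assumption about the approach, not the library's code: `escapeRegExp`, `compileGlob`, and `matchGlob` are hypothetical names, and greedy capture (`(.*)`) is assumed.

```typescript
// Escape regex metacharacters in the literal parts of a pattern.
function escapeRegExp(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Compile a glob into an anchored RegExp; each `*` becomes one greedy capture group.
function compileGlob(pattern: string): RegExp {
  const body = pattern.split("*").map(escapeRegExp).join("(.*)");
  return new RegExp("^" + body + "$");
}

// Returns the captured groups ({0}, {1}, ...) or null if the signature does not match.
function matchGlob(pattern: string, signature: string): string[] | null {
  const m = compileGlob(pattern).exec(signature);
  return m ? m.slice(1) : null;
}
```

With greedy capture, `matchGlob("*:*", "3214:5678:/users/123")` yields `["3214:5678", "/users/123"]`. That is the per-endpoint grouping that trying the short signature first is meant to avoid.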

### Bucket key

Defines how requests are grouped in Redis. Controls rate limiting granularity:

```yaml
# Rate limit per user (one bucket per instanceId+userId)
bucketKey: "user:{0}:{1}"    # {0}=instanceId, {1}=userId

# Rate limit per instance (all users in the instance share a bucket)
bucketKey: "instance:{0}"    # {0}=instanceId

# No bucketKey -> the full signature is the key (one bucket per instanceId+userId+path)
```
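
Template expansion could look roughly like the following. This is a sketch under assumptions: `renderBucketKey` is a hypothetical name, and leaving an out-of-range placeholder such as `{5}` untouched is a choice made here for illustration, not documented behavior.

```typescript
// Expand {0}, {1}, ... in a bucketKey template using the glob capture groups.
// When no template is given, the matched signature itself is the key.
function renderBucketKey(
  template: string | undefined,
  groups: string[],
  signature: string
): string {
  if (template === undefined) return signature;
  return template.replace(/\{(\d+)\}/g, (placeholder: string, index: string): string => {
    const value = groups[Number(index)];
    // Assumption: unknown placeholders are kept as-is rather than raising an error.
    return value !== undefined ? value : placeholder;
  });
}
```

For instance, `renderBucketKey("user:{0}:{1}", ["3214", "5678"], "3214:5678")` produces `user:3214:5678`, while `renderBucketKey("instance:{0}", ["3214", "5678"], "3214:5678")` produces `instance:3214`, so all users of instance 3214 share one bucket.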

### Blocking users (burst=0, refill=0)

To block a specific user or instance, add a rule with `burst: 0` and `refill: 0` at the top of the file. This is a fast-path that returns 429 without hitting Redis.

```yaml
- pattern: "*:1234"   # block userId 1234 across all instances
  burst: 0
  refill: 0

- pattern: "5678:*"   # block all users in instance 5678
  burst: 0
  refill: 0
```

To unblock, remove the rule and redeploy.

### Evaluation order

Rules are evaluated in file order. **First match wins.** Always place:
1. Specific blocks first
2. Instance/path exceptions next
3. Catch-all last

If no rule matches, the request passes through; a warning is logged and the `humandMainApi.throttle.no_match` metric is emitted.
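
The evaluation order can be sketched as two nested loops: rules in file order, and for each rule the short signature before the full one. This is a minimal sketch, assuming greedy `*` capture; `Rule`, `RuleMatch`, `globMatch`, and `findRule` are hypothetical names, not the library's API.

```typescript
interface Rule { pattern: string; burst: number; refill: number; bucketKey?: string; }
interface RuleMatch { rule: Rule; signature: string; groups: string[]; }

// Compact glob matcher: each `*` becomes a greedy capture group.
function globMatch(pattern: string, signature: string): string[] | null {
  const body = pattern
    .split("*")
    .map((part: string): string => part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"))
    .join("(.*)");
  const m = new RegExp("^" + body + "$").exec(signature);
  return m ? m.slice(1) : null;
}

function findRule(rules: Rule[], instanceId: string, userId: string, path: string): RuleMatch | null {
  const short = instanceId + ":" + userId;
  const full = short + ":" + path;
  for (const rule of rules) {                // file order: first match wins
    for (const signature of [short, full]) { // per rule: short signature first
      const groups = globMatch(rule.pattern, signature);
      if (groups !== null) return { rule, signature, groups };
    }
  }
  return null; // no rule matched: the request passes through
}
```

Given the rules from the Format example, a request from user 1234 stops at the `*:1234` block regardless of path, a `/reports/...` request from any other user reaches `*:*:/reports/*` through the full signature, and everything else falls through to `*:*` on the short signature.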

## Algorithm: Token Bucket

Each bucket has two parameters:
- **burst**: maximum capacity (how many requests can be made in a burst)
- **refill**: tokens added per second

Example with `burst: 100, refill: 10`:
- The bucket starts full with 100 tokens.
- Each request consumes 1 token.
- 10 tokens are added per second (capped at 100).
- If 0 tokens remain, the request is rejected with 429 + `Retry-After` header.

The logic runs atomically in Redis via a Lua script (no race conditions across servers).
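
To make the arithmetic concrete, here is an in-memory sketch of the same idea in TypeScript. It is not the Lua script: `BucketState`, `TakeResult`, and `takeToken` are hypothetical names, and details such as fractional tokens and rounding `retryAfterSec` up to whole seconds are assumptions. It also assumes `refill > 0`, since hard-block rules never reach the bucket.

```typescript
interface BucketState { tokens: number; lastRefillMs: number; }
interface TakeResult { allowed: boolean; state: BucketState; retryAfterSec: number; }

function takeToken(
  state: BucketState | undefined,
  burst: number,
  refill: number, // tokens per second, assumed > 0
  nowMs: number
): TakeResult {
  // A bucket that has never been seen starts full.
  const prev: BucketState = state !== undefined ? state : { tokens: burst, lastRefillMs: nowMs };

  // Refill for the elapsed time, capped at the bucket capacity.
  const elapsedSec = Math.max(0, nowMs - prev.lastRefillMs) / 1000;
  const tokens = Math.min(burst, prev.tokens + elapsedSec * refill);

  if (tokens >= 1) {
    // Each request consumes exactly one token.
    return { allowed: true, state: { tokens: tokens - 1, lastRefillMs: nowMs }, retryAfterSec: 0 };
  }
  // Not enough tokens: report how long until one full token is available.
  const retryAfterSec = Math.ceil((1 - tokens) / refill);
  return { allowed: false, state: { tokens, lastRefillMs: nowMs }, retryAfterSec };
}
```

With `burst: 100, refill: 10`, one hundred calls at the same instant are all allowed, the 101st is rejected with `retryAfterSec` of 1, and a call one second later sees 10 refilled tokens and is allowed. In this sketch the caller must persist the returned `state`; in the real middleware that read-modify-write is what the Lua script makes atomic.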

## Failure behavior

| Scenario | Behavior |
|---|---|
| Redis down | Fail-open: lets the request through + warn log + `humandMainApi.throttle.redis_error` metric |
| YAML file not found at startup | Middleware disabled + error log |
| Individual invalid rule in YAML | That rule is discarded + warn log, others load normally |
| Completely invalid YAML | Middleware disabled + error log |
| Error generating request signature | Lets request through + warn log |

## Observability

### Metrics (Datadog)

All metric names use the `humandMainApi.throttle` prefix (configured via `metricPrefix` when the middleware is built). Every metric also carries a `nodeType` tag (e.g. `APP`, `BACKOFFICE`) wired in via `baseMetricTags`.

| Metric | Tags | When |
|---|---|---|
| `humandMainApi.throttle.allowed` | nodeType, rule | Request allowed |
| `humandMainApi.throttle.rejected` | nodeType, rule, shadow | Request rejected (or shadow-rejected) |
| `humandMainApi.throttle.no_match` | nodeType | No rule matched |
| `humandMainApi.throttle.redis_error` | nodeType | Redis failed |

The `shadow` tag is `"true"` or `"false"` and indicates whether the rejection was real or only logged.

### Logs

Each rejection logs (warn level):
- `signature`: full request signature
- `rule`: the pattern that matched
- `bucketKey`: the Redis key
- `shadowMode`: whether the rejection was real or only logged
- `retryAfter`: seconds until retry is possible

### Response headers

All requests matching a rule receive:
- `RateLimit-Limit`: rule burst
- `RateLimit-Remaining`: tokens remaining
- `RateLimit-Reset`: seconds until the bucket is full (omitted on hard-block fast-path)

Rejected requests (429) also receive:
- `Retry-After`: for normal rate-limited 429s, the number of seconds until at least 1 token refills. For hard-block rules (`burst: 0, refill: 0`), a fixed `86400` — there is no retry that succeeds until the rule itself is changed.
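
The header values described above could be derived from a bucket result roughly as follows. This is a sketch under assumptions: `ThrottleOutcome` and `buildRateLimitHeaders` are hypothetical names, and rounding up to whole seconds is assumed rather than taken from the implementation.

```typescript
interface ThrottleOutcome {
  burst: number;
  refill: number;
  tokensRemaining: number;
  allowed: boolean;
}

function buildRateLimitHeaders(o: ThrottleOutcome): Record<string, string> {
  const hardBlock = o.burst === 0 && o.refill === 0;
  const headers: Record<string, string> = {
    "RateLimit-Limit": String(o.burst),
    "RateLimit-Remaining": String(Math.max(0, Math.floor(o.tokensRemaining))),
  };
  if (!hardBlock) {
    // Seconds until the bucket is full again; omitted on the hard-block fast path.
    headers["RateLimit-Reset"] = String(Math.ceil((o.burst - o.tokensRemaining) / o.refill));
  }
  if (!o.allowed) {
    // Hard blocks use a fixed value; otherwise wait until one token has refilled.
    headers["Retry-After"] = hardBlock
      ? "86400"
      : String(Math.ceil((1 - o.tokensRemaining) / o.refill));
  }
  return headers;
}
```

For a rejected request on a `burst: 100, refill: 10` rule with an empty bucket, this gives `RateLimit-Reset: 10` (100 tokens at 10 per second) and `Retry-After: 1`.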

## Architecture

```
Request -> Auth middleware -> Throttle middleware -> Route handler
                               |
                               +- Generate signature: instanceId:userId:path
                               +- Match against rules (file order)
                               +- If burst=0,refill=0 -> 429 (no Redis)
                               +- EVAL Lua script in Redis (atomic)
                               +- allowed -> next() / rejected -> 429
```

The library lives in `@humand-packages/common` (`src/presentation/middlewares/throttle.ts`) and is reusable by other packages. The monolith-specific wiring lives in `src/api/middlewares/throttle.ts`.

### Redis

Uses a separate logical DB (`CacheDomains.THROTTLE`, db 3) on the same Redis server as the monolith. Each bucket is stored as a hash with `tokens` and `last_refill`, with automatic TTL.

The throttle-specific Redis client uses a tight `commandTimeout` (100ms), so a degraded Redis cannot stall requests on the throttle path for long: a slow command times out and trips the fail-open branch. The other domains keep their global timeout.

## Limitations (v1)

- Configuration is read once at server startup. Changing `THROTTLE_ENABLED`, `THROTTLE_SHADOW_MODE`, or rules in `throttle-rules.yaml` requires a rolling deploy (no hot-reload).
- No per-instance override from DB (like `communityRateLimiter`).
- Does not apply to the Public API (still uses `communityRateLimiter`), gRPC, or workers.
- No local rejection cache (every request that matches a rule other than a hard block hits Redis).
- Always consumes 1 token per request (not configurable).