# hu-agent

## Agent Coding Principles

**1. Think Before Coding — Don't assume. Don't hide confusion. Surface tradeoffs.**
- State assumptions explicitly. If uncertain, ask.
- If multiple interpretations exist, present them — don't pick silently.
- If a simpler approach exists, say so. Push back when warranted.
- If something is unclear, stop. Name what's confusing. Ask.

**2. Simplicity First — Minimum code that solves the problem. Nothing speculative.**
- No features beyond what was asked.
- No abstractions for single-use code.
- No "flexibility" or "configurability" that wasn't requested.
- No error handling for impossible scenarios.
- Ask: "Would a senior engineer say this is overcomplicated?" If yes, simplify.

**3. Surgical Changes — Touch only what you must. Clean up only your own mess.**
- Don't "improve" adjacent code, comments, or formatting.
- Don't refactor things that aren't broken. Match existing style.
- If you notice unrelated dead code, mention it — don't delete it.
- Remove imports/variables/functions that *your* changes made unused, not pre-existing ones.
- Every changed line should trace directly to the user's request.

**4. Goal-Driven Execution — Define success criteria. Loop until verified.**
- Transform tasks into verifiable goals:
  - "Fix the bug" → "Write a test that reproduces it, then make it pass"
  - "Add validation" → "Write tests for invalid inputs, then make them pass"
- For multi-step tasks, state a brief plan with a verify step per item.
- Weak criteria ("make it work") require constant clarification — define what done looks like.

---

## Project-Specific Guidelines

Guidance for AI coding agents (Claude, Cursor, Codex, Copilot…) working in **`hu-agent`**.
For full product / API documentation see [`README.md`](./README.md). This file is the short, opinionated brief.

### Project Overview

`hu-agent` (a.k.a. `hu-developer-bot`) is an autonomous service that:

- **Polls Jira** for bug tickets assigned to the bot, runs the **Cursor CLI agent** with repo-specific rules to fix them, validates the build, and opens GitHub PRs.
- **Polls GitHub** for new comments on bot-authored PRs and replies / pushes follow-up commits.
- **Polls Slack** for `@hu-agent` mentions and answers in-thread.
- Sends a daily Slack report and exposes an HTTP debug/admin API under `/hu-agent/...`.

Single Bun process, single in-memory queue (concurrency = 1), Pino logging, optional Datadog log search.
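
The queue semantics above (one job at a time, bounded retries, a per-job timeout) can be sketched roughly as follows. This is a minimal illustration, not the code in `src/queue/memory-queue.ts`: `SerialQueue`, `withTimeout`, the `Job` shape, and the local `JobTimeoutError` stand-in are hypothetical, and reading the retry limit as "one initial attempt plus N retries" is an assumption.

```typescript
// Hypothetical job shape; the real Job type lives in src/types/index.ts.
type Job = { id: string; run: () => Promise<void> };

// Local stand-in for the domain error of the same name in src/core/errors.ts.
class JobTimeoutError extends Error {}

// Reject if `work` has not settled within `ms`; the timer is always cleared.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_resolve, reject) => {
    timer = setTimeout(() => reject(new JobTimeoutError(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

class SerialQueue {
  readonly failed: string[] = [];
  private readonly pending: Job[] = [];
  private readonly maxRetries: number;
  private readonly timeoutMs: number;

  constructor(maxRetries: number, timeoutMs: number) {
    this.maxRetries = maxRetries;
    this.timeoutMs = timeoutMs;
  }

  enqueue(job: Job): void {
    this.pending.push(job);
  }

  // Concurrency = 1: a job finishes (or exhausts its retries) before the next one starts.
  async drain(): Promise<void> {
    let job = this.pending.shift();
    while (job !== undefined) {
      await this.runWithRetries(job);
      job = this.pending.shift();
    }
  }

  private async runWithRetries(job: Job): Promise<void> {
    // Assumption: one initial attempt plus up to `maxRetries` retries.
    for (let attempt = 0; attempt <= this.maxRetries; attempt++) {
      try {
        await withTimeout(job.run(), this.timeoutMs);
        return;
      } catch {
        // Swallow and retry; a real queue would log the error here.
      }
    }
    this.failed.push(job.id);
  }
}
```

Note that a timed-out job is only abandoned by the queue in this sketch; the underlying work keeps running unless the job itself supports cancellation.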

The bot operates on **other repositories** declared in [`repos.json`](./repos.json). Anything you read about "rules", "PR templates" or "safety limits" applies to those target repos — **not** to `hu-agent` itself.

### Stack & Tooling

| Area | Choice |
|---|---|
| Runtime | **Bun** ≥ 1.1 (server, scripts, package manager) |
| Language | TypeScript (strict, ES2022, ES modules, `noUncheckedIndexedAccess`) |
| HTTP framework | Hono |
| Validation | Zod |
| Logging | Pino (+ `pino-pretty` in dev) |
| Scheduling | `croner` |
| Git | `simple-git` + system `git` binary |
| GitHub | `@octokit/rest` + `@octokit/auth-app` |
| AWS | `@aws-sdk/client-dynamodb`, `@aws-sdk/client-codeartifact` |
| Lint / format | ESLint 10 (flat config) + Prettier 3 |
| Metrics | DogStatsD via **hot-shots** → Datadog (us5); namespace `hu_agent.*` |
| Tests | Bun's built-in test runner (`bun:test`) — tests live under `test/` mirroring `src/` |
| Module resolution | `bundler`, path alias `@/* → src/*` |
| Container | Dockerfile (Bun base) + `docker-compose.yml` |
| Infra | Terraform under [`infrastructure/`](./infrastructure) |

The lockfile is `bun.lock`. **Do not add a new `package-lock.json` or `yarn.lock`** to this repo; the `package-lock.json` already in the tree is legacy.

### Key Commands

```bash
bun install                 # install deps (uses bun.lock)
bun run dev                 # watch mode (src/index.ts)
bun run start               # run once
bun run build               # bundle to dist/
bun run typecheck           # tsc --noEmit
bun run lint                # eslint src/ test/
bun run lint:fix
bun run format              # prettier --write 'src/**/*.ts' 'test/**/*.ts'
bun run format:check
bun run test                # bun:test runner (discovers test/**/*.test.ts)
bun run test:coverage       # coverage report; CI-enforced 80% threshold (bunfig.toml)
bun run check               # typecheck + lint + format:check  ← run before declaring done
```

CI ([`.github/workflows/ci.yml`](./.github/workflows/ci.yml)) runs `bun install --frozen-lockfile`, `typecheck`, `lint`, `format:check`, `test`, and `test:coverage` on every PR to `main` / `develop`. Match it locally with `bun run check && bun run test:coverage`. Coverage threshold and ignore list live in [`bunfig.toml`](./bunfig.toml); modules not yet tested (pipelines, HTTP clients, pollers, `server/routes/debug.ts`) are listed in `coveragePathIgnorePatterns` and will be progressively removed from that list in Phases 2–4 of the test-coverage roadmap.

**Tests live in `test/` at repo root, mirroring `src/`** (e.g. `src/core/constants.ts` → `test/core/constants.test.ts`). They import production code via the `@/*` alias (`import { X } from "@/core/constants.js"`). Bun's runner auto-discovers `**/*.test.ts`. Do **not** co-locate tests next to source.
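
The mirroring rule is mechanical enough to express as a function. The helper below is purely illustrative (`testPathFor` is a hypothetical name and does not exist in the repo); it only shows which test path a given source path implies.

```typescript
// Hypothetical helper: derive the mirrored test path for a source file.
function testPathFor(srcPath: string): string {
  if (!srcPath.startsWith("src/") || !srcPath.endsWith(".ts")) {
    throw new Error(`not a TypeScript file under src/: ${srcPath}`);
  }
  // Drop the leading "src/" and the trailing ".ts", then re-root under test/.
  const relative = srcPath.slice("src/".length, -".ts".length);
  return `test/${relative}.test.ts`;
}

// Matches the example in the paragraph above.
console.log(testPathFor("src/core/constants.ts")); // test/core/constants.test.ts
```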

For Terraform under `infrastructure/`, run `terraform fmt <dir>` after editing any `.tf` file (CI fails on unformatted files).

### Module Structure

```
src/
├── index.ts                       Composition root: loads config, wires DI, starts queue + pollers + Hono server, handles SIGINT/SIGTERM.
├── core/
│   ├── constants.ts               SAFETY, QUEUE, CURSOR_AGENT, POLLER, MEMORY, REPO_SELECTOR, SQUAD maps. Single source of truth for limits.
│   └── errors.ts                  Domain error classes (e.g. JobTimeoutError).
├── utils/
│   ├── config.ts                  Zod-validated env loader (the only place that reads process.env).
│   ├── logger.ts                  Pino factory; child loggers per module.
│   ├── codeartifact.ts            AWS CodeArtifact token refresh for npm/yarn/pnpm-based target repos.
│   ├── package-manager.ts         Detects npm/yarn/pnpm/bun per cloned repo.
│   ├── node-version.ts            Resolves .nvmrc / .node-version per cloned repo.
│   ├── ansi.ts, log-line.ts       Log line normalization (used by Datadog endpoint).
│   ├── git-remote.ts, slug.ts, timezone.ts
├── jira/                          REST v3 client + Zod schemas. ADF→Markdown, subtask creation.
├── github/                        Octokit wrapper + PAT/App auth provider with token cache.
├── agent/                         Provider-neutral agent clients (selected by AGENT_PROVIDER). cursor-cli-client.ts spawns `cursor-agent`; claude-cli-client.ts spawns `claude` headless; cursor-cloud-client.ts hits the Cursor HTTP API; prompts.ts builds prompts and parses PR_*/REPLY_*/VERDICT markers; client.ts is the barrel.
├── repo/
│   ├── manager.ts                 Git ops, build validation, configures git auth.
│   ├── registry.ts                Loads & validates repos.json (Zod).
│   ├── selector.ts                Keyword scoring → picks the right repo for a ticket.
│   └── rules-sync.ts              Clones remote rules repos, symlinks .mdc files into target repo's .cursor/rules/.
├── metrics/
│   ├── types.ts                   `Metrics` port + `MetricTags` type (DI contract).
│   ├── names.ts                   `METRIC` constant map — all `hu_agent.*` metric names.
│   ├── noop-recorder.ts           `NoopMetrics` — no-op implementation (used when metrics disabled).
│   ├── datadog-recorder.ts        `DatadogMetrics` — DogStatsD via hot-shots over UDP :8125.
│   └── index.ts                   `createMetrics` factory + barrel export.
├── health/
│   └── auth-checker.ts            `AuthChecker` — periodic per-integration auth probe; emits `integration.auth_ok` gauge and logs on transition.
├── queue/memory-queue.ts          In-memory FIFO with retry/timeout/job-kind awareness + onHighLoad hook; emits `jobs.*` metrics + periodic queue sampler.
├── poller/                        jira-poller, pr-comment-poller, slack-mention-poller (all 5–30 s intervals).
│   ├── poll-runner.ts             `PollRunner` — shared wrapper used by all pollers; emits `poll.*` metrics and throttled `poll_skipped` log.
│   └── comment-filter.ts          Pure helpers: shouldIgnoreComment + parseIgnoredCommentLogins (consumed by pr-comment-poller and config).
├── pipeline/
│   ├── fix-pipeline.ts            Jira ticket → triage → fix → push → PR.
│   ├── pr-comment-pipeline.ts     PR comment → reply (+ optional code change) + persist memory.
│   ├── slack-mention-pipeline.ts  @-mention → answer.
│   └── safety.ts                  Diff safety validator (uses SAFETY constants).
├── pr-memory/store.ts             File-backed per-PR memory store (memory/<repo>#<pr>.json).
├── slack/                         WebClient wrapper + message store.
├── datadog/                       Logs API client + line formatter.
├── scheduler/                     Daily report job (croner) + DynamoDB-backed dedupe store.
├── server/
│   ├── app.ts                     Hono app factory, mounts /hu-agent/*.
│   ├── middleware/
│   └── routes/                    health.ts, webhook.ts, debug.ts (the big admin/debug surface).
└── types/index.ts                 Shared types (Repo, Job, Outcome…).
```

Tests (mirror `src/` — one test file per source module, no co-location):
```
test/
├── _coverage-sentinel.test.ts      Side-effect imports to force coverage on all tracked modules.
├── _helpers/                       Shared test infrastructure (NOT mirroring src/).
│   ├── fake-fetch.ts               Fixture-driven fetch mock: createFakeFetch(), jsonResponse(), errorResponse().
│   ├── fake-metrics.ts             FakeMetrics recorder — captures calls for assertion in tests.
│   └── mock-logger.ts              Minimal pino Logger stub with bun:test mock() methods.
├── core/                           Tests for src/core/
├── agent/                          Tests for src/agent/ (cursor-cli-client, claude-cli-client, cursor-cloud-client, prompts)
├── datadog/                        Tests for src/datadog/
├── github/                         Tests for src/github/ (auth, client)
├── health/                         Tests for src/health/ (auth-checker)
├── integration/                    Cross-module integration tests (suffix: `.integration.test.ts`).
├── jira/                           Tests for src/jira/
├── metrics/                        Tests for src/metrics/ (noop-recorder, datadog-recorder)
├── pipeline/                       Tests for src/pipeline/
├── poller/                         Tests for src/poller/ (incl. poll-runner)
├── pr-memory/                      Tests for src/pr-memory/
├── queue/                          Tests for src/queue/
├── repo/                           Tests for src/repo/
├── scheduler/                      Tests for src/scheduler/
├── server/                         Tests for src/server/
├── slack/                          Tests for src/slack/
└── utils/                          Tests for src/utils/
```

`test/_helpers/` uses an underscore prefix (like `_coverage-sentinel`) to signal it is shared test infrastructure, not a mirror of any `src/` subdirectory. Add new shared mocks or fixtures there; do not put them in `test/utils/` (which mirrors `src/utils/`).
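
As a rough picture of what a shared helper in `test/_helpers/` looks like, here is a sketch of a recording metrics fake. The `Metrics` and `MetricTags` shapes are assumptions (the real contract is in `src/metrics/types.ts`), `reportAuthProbe` is a hypothetical function under test, and the `integration` tag key and 1/0 gauge values are guesses for illustration.

```typescript
// Assumed shapes; the real port and tag type live in src/metrics/types.ts.
type MetricTags = Record<string, string>;

interface Metrics {
  increment(name: string, tags?: MetricTags): void;
  gauge(name: string, value: number, tags?: MetricTags): void;
}

type RecordedCall = {
  kind: "increment" | "gauge";
  name: string;
  value: number;
  tags: MetricTags;
};

// Test double: stores every call so a test can assert on names, values, and tags.
class FakeMetrics implements Metrics {
  readonly calls: RecordedCall[] = [];

  increment(name: string, tags: MetricTags = {}): void {
    this.calls.push({ kind: "increment", name, value: 1, tags });
  }

  gauge(name: string, value: number, tags: MetricTags = {}): void {
    this.calls.push({ kind: "gauge", name, value, tags });
  }

  // Convenience lookup for assertions.
  callsFor(name: string): RecordedCall[] {
    return this.calls.filter((call) => call.name === name);
  }
}

// Hypothetical code under test: it depends on the port, not on a concrete recorder.
function reportAuthProbe(metrics: Metrics, integration: string, ok: boolean): void {
  metrics.gauge("integration.auth_ok", ok ? 1 : 0, { integration });
}
```

Because the function under test only sees the `Metrics` interface, the same code path runs against the fake in tests and against a real recorder in production.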

External configuration / data:
- [`repos.json`](./repos.json) — managed repositories (URL, keywords, default branch, rules, PR labels). Edit this when adding/removing a repo.
- [`rules/`](./rules) — bot-side fallback rules (`general.mdc` + per-repo `.mdc` files).
- `workdir/` — runtime clones of managed repos (gitignored).
- `memory/` — per-PR memory JSON (gitignored).
- `.worktrees/` — local git worktrees for parallel branch work (gitignored).
- `docs/superpowers/specs/` — design specs, **version-controlled** (shared with collaborators). `docs/superpowers/plans/` and the rest of `docs/superpowers/` are local scratchpad for `superpowers:*` skills (gitignored). See `.gitignore` (`docs/superpowers/*` + `!docs/superpowers/specs/`).
- `infrastructure/` — Terraform (app, env, modules).
- `docs/` — pipeline walkthroughs + Insomnia/OpenAPI export.
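
To make the role of the `keywords` field in `repos.json` concrete, here is one simple way keyword scoring could work. It is a sketch under assumptions, not the algorithm in `src/repo/selector.ts`: the `RepoEntry` subset, the substring matching, and the first-wins tie-break are all hypothetical.

```typescript
// Hypothetical subset of a repos.json entry; the real schema is validated in src/repo/registry.ts.
type RepoEntry = { name: string; keywords: string[] };

// Score = how many of the repo's keywords appear in the ticket text (case-insensitive).
function scoreRepo(repo: RepoEntry, ticketText: string): number {
  const haystack = ticketText.toLowerCase();
  return repo.keywords.filter((keyword) => haystack.includes(keyword.toLowerCase())).length;
}

// Pick the highest-scoring repo; return undefined when no keyword matches at all.
// On a tie, the repo listed first wins because only a strictly higher score replaces it.
function selectRepo(repos: RepoEntry[], ticketText: string): RepoEntry | undefined {
  let best: RepoEntry | undefined;
  let bestScore = 0;
  for (const repo of repos) {
    const score = scoreRepo(repo, ticketText);
    if (score > bestScore) {
      best = repo;
      bestScore = score;
    }
  }
  return best;
}
```

Returning `undefined` rather than a default repo keeps the "no confident match" case explicit for the caller.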

### Code Conventions (enforced by ESLint + Prettier)

- **Strict TS**, no `any` (warn). No unused locals/params (prefix with `_` to silence).
- **No `console.log`** — use the injected `pino` `logger`. Only `console.warn/error` are allowed.
- **Prettier**: `printWidth: 100`, `semi: true`, `singleQuote: false`, `trailingComma: "all"`, `tabWidth: 2`, `arrowParens: "always"`, `endOfLine: "lf"`.
- **Dependency injection**: classes receive their dependencies via constructor (see `src/index.ts`). Don't reach into `process.env` outside `utils/config.ts`.
- **Errors**: prefer typed errors from `core/errors.ts`. When logging with Pino, pass `{ err }` as the first argument and the message second.
- **Async**: the only top-level `await` is `await main()` in `index.ts`; everywhere else, put async work inside functions.
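
The dependency-injection and logging conventions above fit together roughly like this. The sketch is self-contained, so `Logger` is a minimal stand-in for the slice of the Pino API it uses, and `TicketSource`, `TicketSummarizer`, and `buildSummarizer` are hypothetical names invented for illustration.

```typescript
// Minimal stand-in for the part of the Pino logger API used below.
interface Logger {
  child(bindings: Record<string, unknown>): Logger;
  error(obj: { err: unknown }, msg: string): void;
}

// Hypothetical dependency, injected so tests can swap in a fake.
interface TicketSource {
  fetchSummary(key: string): string;
}

class TicketSummarizer {
  private readonly logger: Logger;
  private readonly source: TicketSource;

  // Dependencies arrive through the constructor; nothing is read from globals.
  constructor(logger: Logger, source: TicketSource) {
    this.logger = logger;
    this.source = source;
  }

  summarize(key: string): string | undefined {
    try {
      return this.source.fetchSummary(key).trim();
    } catch (err) {
      // Pino convention: the error goes under `err` in the first argument, the message second.
      this.logger.error({ err }, "failed to summarize ticket");
      return undefined;
    }
  }
}

// Composition-root style wiring: the caller creates the child logger and passes it in.
function buildSummarizer(rootLogger: Logger, source: TicketSource): TicketSummarizer {
  return new TicketSummarizer(rootLogger.child({ module: "ticket-summarizer" }), source);
}
```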

### Naming Conventions

- **Path alias**: import internal modules as `@/...` (e.g. `import { JiraClientImpl } from "@/jira/client.js"` or `import { MemoryQueue } from "@/queue/memory-queue.js"`). Keep the `.js` extension in TS specifiers; this is the repo's ESM import convention under `moduleResolution: "bundler"`.
- **`type` imports**: `import type { … }` (ESLint warns otherwise).
- **English-only** in code, comments, commit messages, PR titles/bodies. Even when handling Spanish Jira tickets, the bot's own source stays English.

### Feature Creation Workflow

1. **Read `README.md` and the relevant module before editing.** The README is authoritative for behavior; this file is just orientation.
2. **Keep `src/core/constants.ts` as the single source of truth** for limits, intervals, and forbidden paths. If you tune a limit, update it there — don't sprinkle magic numbers across modules.
3. **Don't reinvent the env loader.** Add new config in `src/utils/config.ts` (Zod schema) and inject through `index.ts`.
4. **Don't introduce a new HTTP client** — use Octokit (GitHub), the existing Jira client, or `fetch` for one-off calls.
5. **Use the existing test runner (Bun's `bun:test`).** Don't introduce vitest / jest / mocha or a new ORM / lockfile without an explicit ask. New tests go under `test/` mirroring `src/`, never co-located. When you change behavior in a module that already has a test, update the test in the same change.
6. **Match existing module shape** when adding a feature:
   - new external integration → folder under `src/<name>/` with a `client.ts` (and `schemas.ts` if using Zod).
   - new background job → `src/poller/` + `src/pipeline/` pair, wired in `index.ts`, with a new `jobKind` in the queue dispatcher.
   - new HTTP route → `src/server/routes/` and mount in `app.ts` under `/hu-agent`.
7. **Logger first, throw second.** Every module gets `logger.child({ module: "..." })` from `index.ts`. Keep that pattern.
8. **Never commit secrets.** `.env` is gitignored; use `.env.example` to document new variables.
9. **Don't bypass safety.** The values in `SAFETY` (max 500 lines, 10 files, 3 deletions, forbidden CI paths, lock files) are load-bearing — they protect against runaway agent PRs in the *target* repos. Don't loosen them casually.
10. **Keep changes minimal.** No drive-by reformatting of unrelated files. Run `bun run format` only on files you touched (Prettier handles this when invoked via `lint:fix` or your editor).
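
For orientation, the kind of check the `SAFETY` limits in item 9 describe can be sketched as below. Everything here is an assumption made for illustration: the field names, the specific forbidden paths, the `DiffStats` shape, and the reading of "3 deletions" as three deleted files. The authoritative values are in `src/core/constants.ts` and the validator in `src/pipeline/safety.ts`.

```typescript
// Illustrative values mirroring the limits quoted above; names and patterns are assumptions.
const SAFETY_SKETCH = {
  maxChangedLines: 500,
  maxChangedFiles: 10,
  maxDeletedFiles: 3,
  forbiddenPrefixes: [".github/workflows/"],
  forbiddenFileNames: ["package-lock.json", "yarn.lock", "pnpm-lock.yaml", "bun.lock"],
};

// Hypothetical diff summary: `changedFiles` lists every touched path, deleted ones included.
type DiffStats = { changedLines: number; changedFiles: string[]; deletedFiles: string[] };
type SafetyResult = { passed: boolean; violations: string[] };

function validateDiff(stats: DiffStats): SafetyResult {
  const violations: string[] = [];
  if (stats.changedLines > SAFETY_SKETCH.maxChangedLines) {
    violations.push(`too many changed lines: ${stats.changedLines}`);
  }
  if (stats.changedFiles.length > SAFETY_SKETCH.maxChangedFiles) {
    violations.push(`too many changed files: ${stats.changedFiles.length}`);
  }
  if (stats.deletedFiles.length > SAFETY_SKETCH.maxDeletedFiles) {
    violations.push(`too many deleted files: ${stats.deletedFiles.length}`);
  }
  for (const file of stats.changedFiles) {
    const fileName = file.split("/").pop() ?? file;
    const forbidden =
      SAFETY_SKETCH.forbiddenPrefixes.some((prefix) => file.startsWith(prefix)) ||
      SAFETY_SKETCH.forbiddenFileNames.includes(fileName);
    if (forbidden) violations.push(`forbidden path: ${file}`);
  }
  return { passed: violations.length === 0, violations };
}
```

Collecting every violation instead of stopping at the first one makes the result useful both for blocking and for metrics-only reporting.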

**Definition of done** (for any change in this repo):

- [ ] `bun run check` is green (typecheck + lint + format).
- [ ] `bun run test` is green. If you changed behavior covered by a test, the test was updated in the same change. If you added a new public function/module of meaningful complexity, it has at least one test under `test/`.
- [ ] `bun run test:coverage` meets the 80% threshold. If you added a new source file that should be covered, remove it from `coveragePathIgnorePatterns` in `bunfig.toml` as part of the same change and provide tests. `coveragePathIgnorePatterns` has exactly 5 permanent entries (`src/index.ts`, `src/types/index.ts`, `src/agent/client.ts`, `test/`, `_helpers`). Any additional entry beyond these five means a source file is still untested — add tests or explicitly document the permanent exclusion in `bunfig.toml` with a comment.
- [ ] If you touched `repos.json`, the file still parses against `RepoRegistry`'s Zod schema (try `bun run dev` and watch logs).
- [ ] If you touched env handling, `.env.example` is updated.
- [ ] If you touched anything under `infrastructure/`, you ran `terraform fmt` on it.
- [ ] No new `console.log`, no new `any`, no new lockfiles, no new secrets in tree.
- [ ] No commits during multi-step plans — ask the user before `git commit`.

### Operational Facts

- Bun `idleTimeout: 120 s`, because the paginated Datadog 24 h log search can hold a request open longer than the default idle timeout allows.
- Queue concurrency = 1, `JOB_TIMEOUT_MS = 50 min`, `MAX_RETRIES = 2`.
- Pollers: Jira every 5 s, PR comments every 30 s, Slack mentions every 30 s — each independently toggleable via env (`JIRA_POLL_ENABLED`, `ATTEND_FEEDBACK_IN_PR_ENABLED`, `SLACK_MENTIONS_ENABLED`).
- GitHub auth supports either `GITHUB_TOKEN` (PAT) or GitHub App (`GITHUB_APP_ID` + `GITHUB_APP_INSTALLATION_ID` + `GITHUB_APP_PRIVATE_KEY`). App tokens auto-refresh 5 min before expiry.
- Cursor CLI: the bot calls `~/.local/bin/cursor-agent` directly to skip the Cursor wrapper's hang-prone version check. Install via `curl -sS https://cursor.com/install | bash`.
- AWS CodeArtifact: `configureCodeArtifactAuth` runs on boot to authenticate npm-based managed repos. If it fails, the bot still starts but logs a warning.
- DynamoDB (optional, via `DYNAMODB_TABLE_NAME`) backs the daily-report dedupe store.
- The bot emits `hu_agent.*` custom DogStatsD metrics (jobs, queue depth, poll activity) to Datadog us5. Every metric carries global tags `env`, `service` and `provider` (the active `AGENT_PROVIDER`, cursor | claude). Dashboards and monitors are built manually in Datadog; there are no Terraform-managed Datadog resources.
- `AuthChecker` runs a periodic per-integration auth probe (jira/slack/github/agent) decoupled from the job queue, emitting `integration.auth_ok` gauge so credential failures surface even when the queue is wedged.
- Pipeline usage metrics are emitted for all four pipelines and enable an agent-usage dashboard in Datadog us5:
  - `repo.selected`, `pr.opened`, `fix.build_validation`, `fix.triage_verdict{verdict}`
  - `fix.outcome{outcome}` (terminal: pr_created | no_changes | needs_info | agent_failed | checks_failed | sync_failed | error)
  - `safety.violation` (observability-only)
  - `agent.invocations{phase}`, `agent.duration{phase}`
  - `pr_comment.handled{repo,action}`, `pr_mention.handled{repo,action}`, `slack_mention.handled`
  - `prewarm.warmed{total}` (gauge: repos warmed at boot out of total), `prewarm.duration{total}` (timing: total boot prewarm wall-clock)
- **Follow-up (agent.tokens):** add an `agent.tokens{provider,model}` metric. The Cursor CLI/cloud clients don't expose token usage in their responses; the Claude provider does (`usage.input_tokens`/`usage.output_tokens` in the `result` event), so wire it into the agent invocation sites (fix triage/fix/followup, pr-comment, pr-mention, slack) when picking this up.
- **Follow-up (safety enforcement):** `safety.violation` is observability-only — the fix pipeline computes diff violations via `getDiffStats()` but does NOT block on them today. Enforcing the `SAFETY` limits (skip/abort a repo when `!passed`) is a separate, deliberate change (it would stop currently-passing large PRs).
- **Cold-start prewarm:** on boot the bot full-clones and checks out the base of every managed repo via `prewarmRepos` **before** pollers start; `/health` returns **503** until prewarm resolves, then **200**. Prewarm is fail-open (bounded by `CLONE.PREWARM_TIMEOUT_MS`): on timeout/error the bot marks ready anyway and falls back to on-demand sync, emitting `prewarm.warmed{total}`. `syncAllRepos` installs only when `node_modules` is missing or install inputs changed. ECS `startPeriod` and `health_check_grace_period_seconds` are 600s to cover the longer-but-warm boot. (A shallow `--depth` clone was tried and reverted: it breaks the pr-comment-pipeline `checkoutBranch` path — `--single-branch` reduced refspec + missing three-dot merge-base; see `docs/superpowers/specs/2026-06-03-cold-start-repo-prewarm-design.md` §1.)
- `SANDBOX_MODE=true` (dev sandbox only) suppresses every Jira write (transitions, comments, subtasks, squad labels) and opens PRs as draft. The dev environment is endpoint-triggered (pollers off), runs a separate GitHub App (`hu-agent-dev[bot]`), and posts to the `hu-agent-test` Slack channel. **Any new pipeline write path must also be gated by `SANDBOX_MODE`.**
- The `Trigger Fix Pipeline` workflow ([`.github/workflows/trigger.yml`](./.github/workflows/trigger.yml), `workflow_dispatch`: `environment` dev/prd + `issue_key`) joins the VPN from the runner and POSTs to the private trigger endpoint — no local VPN needed. Fire-and-forget: green run = HTTP 202 (enqueued). ALB hostnames are hardcoded per env; see `docs/superpowers/specs/2026-06-03-manual-trigger-workflow-design.md`.
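
The "auto-refresh 5 min before expiry" behavior noted for GitHub App tokens can be pictured as a small cache with an injected clock. This is a sketch only: `TokenCache`, `needsRefresh`, and the `CachedToken` shape are hypothetical names, and the real provider in `src/github/` may differ (for example in how it handles concurrent callers).

```typescript
const REFRESH_MARGIN_MS = 5 * 60 * 1000; // refresh once 5 minutes or less remain

type CachedToken = { value: string; expiresAtMs: number };

// True when the token expires within the refresh margin (or has already expired).
function needsRefresh(token: CachedToken, nowMs: number): boolean {
  return token.expiresAtMs - nowMs <= REFRESH_MARGIN_MS;
}

class TokenCache {
  private cached: CachedToken | undefined;
  private readonly mint: () => Promise<CachedToken>;
  private readonly now: () => number;

  // `mint` fetches a fresh token; `now` is injectable so tests can control time.
  constructor(mint: () => Promise<CachedToken>, now: () => number = Date.now) {
    this.mint = mint;
    this.now = now;
  }

  async get(): Promise<string> {
    let token = this.cached;
    if (token === undefined || needsRefresh(token, this.now())) {
      token = await this.mint();
      this.cached = token;
    }
    return token.value;
  }
}
```

In this simplified version, two overlapping `get()` calls during a refresh would each mint a token; sharing the in-flight promise is one way to avoid that if it matters.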

### Ports

- Server listens on `PORT` (default `3000`) and mounts everything under **`/hu-agent`** (e.g. `GET /hu-agent/health`).

### Environment Variables

All env is loaded and validated in one place — `src/utils/config.ts` (Zod schema → `AppConfig`). Never read `process.env` outside that file; add new config there and inject it via `src/index.ts`. Every variable is documented in `.env.example`, which is kept in sync whenever env handling changes.
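
The "validate once, then inject" pattern looks roughly like the sketch below. To stay dependency-free it uses hand-rolled checks in place of the Zod schema the real loader uses, so treat it as an illustration of the shape only. The field names, the defaults other than `PORT`, and the set of allowed `AGENT_PROVIDER` values (taken from the metric tag description and possibly incomplete) are assumptions.

```typescript
type AgentProvider = "cursor" | "claude";

// Hypothetical slice of AppConfig; the real type is derived from the Zod schema.
type AppConfig = {
  port: number;
  agentProvider: AgentProvider;
  jiraPollEnabled: boolean;
  sandboxMode: boolean;
};

type Env = Record<string, string | undefined>;

function parsePort(raw: string | undefined): number {
  const port = Number(raw ?? "3000"); // 3000 is the documented default
  if (!Number.isInteger(port) || port <= 0) {
    throw new Error(`PORT must be a positive integer, got "${raw}"`);
  }
  return port;
}

function parseProvider(raw: string | undefined): AgentProvider {
  const value = raw ?? "cursor"; // assumed default, for illustration only
  if (value === "cursor" || value === "claude") return value;
  throw new Error(`AGENT_PROVIDER must be "cursor" or "claude", got "${value}"`);
}

// Validate everything once and return a typed object; callers never see raw env strings.
function loadConfig(env: Env): AppConfig {
  return {
    port: parsePort(env.PORT),
    agentProvider: parseProvider(env.AGENT_PROVIDER),
    jiraPollEnabled: env.JIRA_POLL_ENABLED === "true",
    sandboxMode: env.SANDBOX_MODE === "true",
  };
}

// In the composition root this would be called once, e.g. loadConfig(process.env),
// and the resulting object passed to constructors.
```

Taking `env` as a parameter keeps the loader testable with plain objects and makes the single `process.env` read site easy to audit.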

### Documentation Rules

When you add or change a feature, update this `AGENTS.md` **in the same commit/PR** if the change affects cross-cutting concerns: new commands, config/env vars, testing patterns, tech-stack additions, or Docker/infra changes. Keep it minimal — what a developer needs to work here, not a changelog. If a section no longer reflects the code, fix or remove it.

Whenever an `AGENTS.md` exists in a directory, verify a `CLAUDE.md` symlink points to it (`ln -s AGENTS.md CLAUDE.md`) — Claude Code reads `CLAUDE.md` (via the `.claude → .cursor` convention), so without the symlink the rules are ignored. When you add a new `AGENTS.md`, also add its row to the Agent Context Map below.

### Agent Context Map

| Path | Covers |
|------|--------|
| `AGENTS.md` (root) | The whole service: principles, stack, commands, source layout, conventions, safety limits, operational facts |
| `infrastructure/AGENTS.md` | *(not present yet)* Terraform IaC — add a focused file here if infra rules grow |

Single root file is the source of truth; the map exists so the convention is ready if a focused `infrastructure/AGENTS.md` is added later.

### Repo Skills

None yet. hu-agent has no `.cursor/skills/` directory. If repo-local skills are added under `.cursor/skills/<name>/SKILL.md`, list them here with a "when to use" note so agents invoke them instead of reinventing the procedure.

### Pointers

- [`README.md`](./README.md) — full feature, env, and API reference.
- [`docs/jira-fix-pipeline.md`](./docs/jira-fix-pipeline.md) — Jira → Fix → PR walkthrough.
- [`docs/github-pipeline.md`](./docs/github-pipeline.md) — PR comment pipeline walkthrough.
- [`docs/IDEA.md`](./docs/IDEA.md) — original product brief.
- [`docs/api-docs/collection.yaml`](./docs/api-docs/collection.yaml) — Insomnia / OpenAPI export of the debug API.
- [`rules/general.mdc`](./rules/general.mdc) — global agent rules injected into every prompt the bot sends to Cursor.
- [`infrastructure/`](./infrastructure) — Terraform (run `terraform fmt` after edits).