# hu-agent
## Agent Coding Principles
**1. Think Before Coding — Don't assume. Don't hide confusion. Surface tradeoffs.**
- State assumptions explicitly. If uncertain, ask.
- If multiple interpretations exist, present them — don't pick silently.
- If a simpler approach exists, say so. Push back when warranted.
- If something is unclear, stop. Name what's confusing. Ask.
**2. Simplicity First — Minimum code that solves the problem. Nothing speculative.**
- No features beyond what was asked.
- No abstractions for single-use code.
- No "flexibility" or "configurability" that wasn't requested.
- No error handling for impossible scenarios.
- Ask: "Would a senior engineer say this is overcomplicated?" If yes, simplify.
**3. Surgical Changes — Touch only what you must. Clean up only your own mess.**
- Don't "improve" adjacent code, comments, or formatting.
- Don't refactor things that aren't broken. Match existing style.
- If you notice unrelated dead code, mention it — don't delete it.
- Remove imports/variables/functions that *your* changes made unused, not pre-existing ones.
- Every changed line should trace directly to the user's request.
**4. Goal-Driven Execution — Define success criteria. Loop until verified.**
- Transform tasks into verifiable goals:
  - "Fix the bug" → "Write a test that reproduces it, then make it pass"
  - "Add validation" → "Write tests for invalid inputs, then make them pass"
- For multi-step tasks, state a brief plan with a verify step per item.
- Weak criteria ("make it work") require constant clarification — define what done looks like.
---
## Project-Specific Guidelines
Guidance for AI coding agents (Claude, Cursor, Codex, Copilot…) working in **`hu-agent`**.
For full product / API documentation see [`README.md`](./README.md). This file is the short, opinionated brief.
### Project Overview
`hu-agent` (a.k.a. `hu-developer-bot`) is an autonomous service that:
- **Polls Jira** for bug tickets assigned to the bot, runs the **Cursor CLI agent** with repo-specific rules to fix them, validates the build, and opens GitHub PRs.
- **Polls GitHub** for new comments on bot-authored PRs and replies / pushes follow-up commits.
- **Polls Slack** for `@hu-agent` mentions and answers in-thread.
- Sends a daily Slack report and exposes an HTTP debug/admin API under `/hu-agent/...`.
Single Bun process, single in-memory queue (concurrency = 1), Pino logging, optional Datadog log search.
The bot operates on **other repositories** declared in [`repos.json`](./repos.json). Anything you read about "rules", "PR templates" or "safety limits" applies to those target repos — **not** to `hu-agent` itself.
### Stack & Tooling
| Area | Choice |
|---|---|
| Runtime | **Bun** ≥ 1.1 (server, scripts, package manager) |
| Language | TypeScript (strict, ES2022, ES modules, `noUncheckedIndexedAccess`) |
| HTTP framework | Hono |
| Validation | Zod |
| Logging | Pino (+ `pino-pretty` in dev) |
| Scheduling | `croner` |
| Git | `simple-git` + system `git` binary |
| GitHub | `@octokit/rest` + `@octokit/auth-app` |
| AWS | `@aws-sdk/client-dynamodb`, `@aws-sdk/client-codeartifact` |
| Lint / format | ESLint 10 (flat config) + Prettier 3 |
| Metrics | DogStatsD via **hot-shots** → Datadog (us5); namespace `hu_agent.*` |
| Tests | Bun's built-in test runner (`bun:test`) — tests live under `test/` mirroring `src/` |
| Module resolution | `bundler`, path alias `@/* → src/*` |
| Container | Dockerfile (Bun base) + `docker-compose.yml` |
| Infra | Terraform under [`infrastructure/`](./infrastructure) |
The lockfile is `bun.lock`. **Do not add `package-lock.json` or `yarn.lock`** to this repo (the `package-lock.json` already in the tree is legacy).
### Key Commands
```bash
bun install # install deps (uses bun.lock)
bun run dev # watch mode (src/index.ts)
bun run start # run once
bun run build # bundle to dist/
bun run typecheck # tsc --noEmit
bun run lint # eslint src/ test/
bun run lint:fix
bun run format # prettier --write 'src/**/*.ts' 'test/**/*.ts'
bun run format:check
bun run test # bun:test runner (discovers test/**/*.test.ts)
bun run test:coverage # coverage report; CI-enforced 80% threshold (bunfig.toml)
bun run check # typecheck + lint + format:check ← run before declaring done
```
CI ([`.github/workflows/ci.yml`](./.github/workflows/ci.yml)) runs `bun install --frozen-lockfile`, `typecheck`, `lint`, `format:check`, `test`, and `test:coverage` on every PR to `main` / `develop`. Match it locally with `bun run check && bun run test:coverage`. Coverage threshold and ignore list live in [`bunfig.toml`](./bunfig.toml); modules not yet tested (pipelines, HTTP clients, pollers, `server/routes/debug.ts`) are listed in `coveragePathIgnorePatterns` and will be progressively removed from that list in Phases 2–4 of the test-coverage roadmap.
**Tests live in `test/` at repo root, mirroring `src/`** (e.g. `src/core/constants.ts` → `test/core/constants.test.ts`). They import production code via the `@/*` alias (`import { X } from "@/core/constants.js"`). Bun's runner auto-discovers `**/*.test.ts`. Do **not** co-locate tests next to source.
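The mirroring rule can be written as a small path mapping. This is a minimal sketch: `testPathFor` is a hypothetical helper for illustration, not something that exists in the repo.

```typescript
// Hypothetical helper, not part of the repo: maps a source file to the
// path its mirrored test file should have under test/.
function testPathFor(srcPath: string): string {
  if (!srcPath.startsWith("src/") || !srcPath.endsWith(".ts")) {
    throw new Error(`expected a src/**/*.ts path, got: ${srcPath}`);
  }
  // Drop the leading "src/" and the trailing ".ts", then re-root under test/.
  const relative = srcPath.slice("src/".length, -".ts".length);
  return `test/${relative}.test.ts`;
}

// Same pair as the example in the paragraph above.
const mirrored = testPathFor("src/core/constants.ts");
// mirrored === "test/core/constants.test.ts"
```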
For Terraform under `infrastructure/`, run `terraform fmt` after editing any `.tf` file (CI fails on unformatted files).
### Module Structure
```
src/
├── index.ts Composition root: loads config, wires DI, starts queue + pollers + Hono server, handles SIGINT/SIGTERM.
├── core/
│ ├── constants.ts SAFETY, QUEUE, CURSOR_AGENT, POLLER, MEMORY, REPO_SELECTOR, SQUAD maps. Single source of truth for limits.
│ └── errors.ts Domain error classes (e.g. JobTimeoutError).
├── utils/
│ ├── config.ts Zod-validated env loader (the only place that reads process.env).
│ ├── logger.ts Pino factory; child loggers per module.
│ ├── codeartifact.ts AWS CodeArtifact token refresh for npm/yarn/pnpm-based target repos.
│ ├── package-manager.ts Detects npm/yarn/pnpm/bun per cloned repo.
│ ├── node-version.ts Resolves .nvmrc / .node-version per cloned repo.
│ ├── ansi.ts, log-line.ts Log line normalization (used by Datadog endpoint).
│ ├── git-remote.ts, slug.ts, timezone.ts
├── jira/ REST v3 client + Zod schemas. ADF→Markdown, subtask creation.
├── github/ Octokit wrapper + PAT/App auth provider with token cache.
├── agent/ Provider-neutral agent clients (selected by AGENT_PROVIDER). cursor-cli-client.ts spawns `cursor-agent`; claude-cli-client.ts spawns `claude` headless; cursor-cloud-client.ts hits the Cursor HTTP API; prompts.ts builds prompts and parses PR_*/REPLY_*/VERDICT markers; client.ts is the barrel.
├── repo/
│ ├── manager.ts Git ops, build validation, configures git auth.
│ ├── registry.ts Loads & validates repos.json (Zod).
│ ├── selector.ts Keyword scoring → picks the right repo for a ticket.
│ └── rules-sync.ts Clones remote rules repos, symlinks .mdc files into target repo's .cursor/rules/.
├── metrics/
│ ├── types.ts `Metrics` port + `MetricTags` type (DI contract).
│ ├── names.ts `METRIC` constant map — all `hu_agent.*` metric names.
│ ├── noop-recorder.ts `NoopMetrics` — no-op implementation (used when metrics disabled).
│ ├── datadog-recorder.ts `DatadogMetrics` — DogStatsD via hot-shots over UDP :8125.
│ └── index.ts `createMetrics` factory + barrel export.
├── health/
│ └── auth-checker.ts `AuthChecker` — periodic per-integration auth probe; emits `integration.auth_ok` gauge and logs on transition.
├── queue/memory-queue.ts In-memory FIFO with retry/timeout/job-kind awareness + onHighLoad hook; emits `jobs.*` metrics + periodic queue sampler.
├── poller/ jira-poller, pr-comment-poller, slack-mention-poller (all 5–30 s intervals).
│ ├── poll-runner.ts `PollRunner` — shared wrapper used by all pollers; emits `poll.*` metrics and throttled `poll_skipped` log.
│ └── comment-filter.ts Pure helpers: shouldIgnoreComment + parseIgnoredCommentLogins (consumed by pr-comment-poller and config).
├── pipeline/
│ ├── fix-pipeline.ts Jira ticket → triage → fix → push → PR.
│ ├── pr-comment-pipeline.ts PR comment → reply (+ optional code change) + persist memory.
│ ├── slack-mention-pipeline.ts @-mention → answer.
│ └── safety.ts Diff safety validator (uses SAFETY constants).
├── pr-memory/store.ts File-backed per-PR memory store (memory/#.json).
├── slack/ WebClient wrapper + message store.
├── datadog/ Logs API client + line formatter.
├── scheduler/ Daily report job (croner) + DynamoDB-backed dedupe store.
├── server/
│ ├── app.ts Hono app factory, mounts /hu-agent/*.
│ ├── middleware/
│ └── routes/ health.ts, webhook.ts, debug.ts (the big admin/debug surface).
└── types/index.ts Shared types (Repo, Job, Outcome…).
```
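The tree above describes `repo/selector.ts` as keyword scoring that picks a repo for a ticket. As a rough illustration of that idea only, the sketch below counts how many of a repo's keywords appear in the ticket text and keeps the highest scorer. The `RepoEntry` shape, the scoring rule, the tie-breaking, and the sample repo names are all assumptions; the real selector and the `repos.json` schema may differ.

```typescript
// Assumed, simplified shape. The real repos.json schema has more fields.
interface RepoEntry {
  name: string;
  keywords: string[];
}

// Score = number of the repo's keywords found in the ticket text (case-insensitive).
function scoreRepo(repo: RepoEntry, ticketText: string): number {
  const text = ticketText.toLowerCase();
  return repo.keywords.filter((keyword) => text.includes(keyword.toLowerCase())).length;
}

// Returns the best-scoring repo, or undefined when no keyword matches at all.
// On a tie, the repo listed first wins (an arbitrary choice for this sketch).
function selectRepo(repos: RepoEntry[], ticketText: string): RepoEntry | undefined {
  let best: RepoEntry | undefined;
  let bestScore = 0;
  for (const repo of repos) {
    const score = scoreRepo(repo, ticketText);
    if (score > bestScore) {
      best = repo;
      bestScore = score;
    }
  }
  return best;
}

// Hypothetical repos, for illustration only.
const sampleRepos: RepoEntry[] = [
  { name: "billing-api", keywords: ["invoice", "payment"] },
  { name: "web-app", keywords: ["login", "checkout", "payment"] },
];
const picked = selectRepo(sampleRepos, "Checkout fails after login with a payment error");
// picked?.name === "web-app" (three keyword matches versus one)
```

Returning `undefined` when nothing matches keeps the "no confident match" case explicit for the caller instead of silently defaulting to the first repo.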
Tests (mirror `src/` — one test file per source module, no co-location):
```
test/
├── _coverage-sentinel.test.ts Side-effect imports to force coverage on all tracked modules.
├── _helpers/ Shared test infrastructure (NOT mirroring src/).
│ ├── fake-fetch.ts Fixture-driven fetch mock: createFakeFetch(), jsonResponse(), errorResponse().
│ ├── fake-metrics.ts FakeMetrics recorder — captures calls for assertion in tests.
│ └── mock-logger.ts Minimal pino Logger stub with bun:test mock() methods.
├── core/ Tests for src/core/
├── agent/ Tests for src/agent/ (cursor-cli-client, claude-cli-client, cursor-cloud-client, prompts)
├── datadog/ Tests for src/datadog/
├── github/ Tests for src/github/ (auth, client)
├── health/ Tests for src/health/ (auth-checker)
├── integration/ Cross-module integration tests (suffix: `.integration.test.ts`).
├── jira/ Tests for src/jira/
├── metrics/ Tests for src/metrics/ (noop-recorder, datadog-recorder)
├── pipeline/ Tests for src/pipeline/
├── poller/ Tests for src/poller/ (incl. poll-runner)
├── pr-memory/ Tests for src/pr-memory/
├── queue/ Tests for src/queue/
├── repo/ Tests for src/repo/
├── scheduler/ Tests for src/scheduler/
├── server/ Tests for src/server/
├── slack/ Tests for src/slack/
└── utils/ Tests for src/utils/
```
`test/_helpers/` uses an underscore prefix (like `_coverage-sentinel`) to signal it is shared test infrastructure, not a mirror of any `src/` subdirectory. Add new shared mocks or fixtures there; do not put them in `test/utils/` (which mirrors `src/utils/`).
External configuration / data:
- [`repos.json`](./repos.json) — managed repositories (URL, keywords, default branch, rules, PR labels). Edit this when adding/removing a repo.
- [`rules/`](./rules) — bot-side fallback rules (`general.mdc` + per-repo `.mdc` files).
- `workdir/` — runtime clones of managed repos (gitignored).
- `memory/` — per-PR memory JSON (gitignored).
- `.worktrees/` — local git worktrees for parallel branch work (gitignored).
- `docs/superpowers/specs/` — design specs, **version-controlled** (shared with collaborators). `docs/superpowers/plans/` and the rest of `docs/superpowers/` are local scratchpad for `superpowers:*` skills (gitignored). See `.gitignore` (`docs/superpowers/*` + `!docs/superpowers/specs/`).
- `infrastructure/` — Terraform (app, env, modules).
- `docs/` — pipeline walkthroughs + Insomnia/OpenAPI export.
### Code Conventions (enforced by ESLint + Prettier)
- **Strict TS**, no `any` (warn). No unused locals/params (prefix with `_` to silence).
- **No `console.log`** — use the injected `pino` `logger`. Only `console.warn/error` are allowed.
- **Prettier**: `printWidth: 100`, `semi: true`, `singleQuote: false`, `trailingComma: "all"`, `tabWidth: 2`, `arrowParens: "always"`, `endOfLine: "lf"`.
- **Dependency injection**: classes receive their dependencies via constructor (see `src/index.ts`). Don't reach into `process.env` outside `utils/config.ts`.
- **Errors**: prefer typed errors from `core/errors.ts`. When logging with Pino, pass `{ err }` as the first argument and the message second.
- **Async**: `index.ts` uses a top-level `await main()`; everywhere else, keep `await` inside async functions.
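The dependency-injection and logging conventions above can be sketched together. This is an illustration under stated assumptions: `TicketCounter` and `TicketSource` are hypothetical, and the local `Logger` interface is a stand-in for the small part of Pino's logger used here (so the sketch has no dependencies), not the real type.

```typescript
// Minimal stand-in for the parts of Pino's Logger used below
// (object first, message second).
interface Logger {
  child(bindings: Record<string, unknown>): Logger;
  info(obj: Record<string, unknown>, msg: string): void;
  error(obj: Record<string, unknown>, msg: string): void;
}

// Hypothetical dependency, for illustration only.
interface TicketSource {
  fetchOpenKeys(): string[];
}

// Hypothetical module: everything it needs arrives through the constructor.
class TicketCounter {
  private readonly source: TicketSource;
  private readonly logger: Logger;

  constructor(source: TicketSource, logger: Logger) {
    this.source = source;
    this.logger = logger;
  }

  count(): number {
    try {
      const keys = this.source.fetchOpenKeys();
      this.logger.info({ count: keys.length }, "fetched open tickets");
      return keys.length;
    } catch (err) {
      // `{ err }` goes first, the message second; then rethrow.
      this.logger.error({ err }, "failed to fetch open tickets");
      throw err;
    }
  }
}

// In-memory logger used only to make the sketch runnable.
function createRecordingLogger(
  lines: string[],
  bindings: Record<string, unknown> = {},
): Logger {
  const write = (level: string, obj: Record<string, unknown>, msg: string): void => {
    lines.push(`${level} ${JSON.stringify({ ...bindings, ...obj })} ${msg}`);
  };
  return {
    child: (extra) => createRecordingLogger(lines, { ...bindings, ...extra }),
    info: (obj, msg) => write("info", obj, msg),
    error: (obj, msg) => write("error", obj, msg),
  };
}

// Composition-root style wiring: build dependencies once, pass them in.
const logLines: string[] = [];
const rootLogger = createRecordingLogger(logLines);
const counter = new TicketCounter(
  { fetchOpenKeys: () => ["BUG-1", "BUG-2"] },
  rootLogger.child({ module: "ticket-counter" }),
);
const openCount = counter.count();
```

Because the class never reaches for globals, a test can hand it a fake source and a recording logger, which is the same idea the `test/_helpers/` mocks serve.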
### Naming Conventions
- **Path alias**: import internal modules as `@/...` (e.g. `import { JiraClientImpl } from "@/jira/client.js"`). The `.js` extension is required in TS specifiers because of `moduleResolution: "bundler"` + ESM. Example: `import { MemoryQueue } from "@/queue/memory-queue.js"`.
- **`type` imports**: `import type { … }` (ESLint warns otherwise).
- **English-only** in code, comments, commit messages, PR titles/bodies. Even when handling Spanish Jira tickets, the bot's own source stays English.
### Feature Creation Workflow
1. **Read `README.md` and the relevant module before editing.** The README is authoritative for behavior; this file is just orientation.
2. **Keep `src/core/constants.ts` as the single source of truth** for limits, intervals, and forbidden paths. If you tune a limit, update it there — don't sprinkle magic numbers across modules.
3. **Don't reinvent the env loader.** Add new config in `src/utils/config.ts` (Zod schema) and inject through `index.ts`.
4. **Don't introduce a new HTTP client** — use Octokit (GitHub), the existing Jira client, or `fetch` for one-off calls.
5. **Use the existing test runner (Bun's `bun:test`).** Don't introduce vitest / jest / mocha or a new ORM / lockfile without an explicit ask. New tests go under `test/` mirroring `src/`, never co-located. When you change behavior in a module that already has a test, update the test in the same change.
6. **Match existing module shape** when adding a feature:
   - new external integration → folder under `src//` with a `client.ts` (and `schemas.ts` if using Zod).
   - new background job → `src/poller/` + `src/pipeline/` pair, wired in `index.ts`, with a new `jobKind` in the queue dispatcher.
   - new HTTP route → `src/server/routes/` and mount in `app.ts` under `/hu-agent`.
7. **Logger first, throw second.** Every module gets `logger.child({ module: "..." })` from `index.ts`. Keep that pattern.
8. **Never commit secrets.** `.env` is gitignored; use `.env.example` to document new variables.
9. **Don't bypass safety.** The values in `SAFETY` (max 500 lines, 10 files, 3 deletions, forbidden CI paths, lock files) are load-bearing — they protect against runaway agent PRs in the *target* repos. Don't loosen them casually.
10. **Keep changes minimal.** No drive-by reformatting of unrelated files. Format only the files you touched (for example via `lint:fix` or your editor); `bun run format` rewrites everything under `src/` and `test/`.
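To make the `SAFETY` limits in item 9 concrete, here is a minimal sketch of a diff check using the numbers quoted there. It is not the code in `src/pipeline/safety.ts`: the field names, the reading of "3 deletions" as deleted files, the forbidden prefix, and the lock-file list are assumptions made for illustration.

```typescript
// Limits as quoted in this document. Field names, the forbidden prefix, and
// the lock-file list are illustrative assumptions, not the real SAFETY map.
interface SafetyLimits {
  maxChangedLines: number;
  maxChangedFiles: number;
  maxDeletedFiles: number;
  forbiddenPathPrefixes: string[];
  lockFileNames: string[];
}

const SAFETY_SKETCH: SafetyLimits = {
  maxChangedLines: 500,
  maxChangedFiles: 10,
  maxDeletedFiles: 3,
  forbiddenPathPrefixes: [".github/workflows/"],
  lockFileNames: ["bun.lock", "package-lock.json", "yarn.lock", "pnpm-lock.yaml"],
};

// Assumed per-file diff summary.
interface DiffFile {
  path: string;
  added: number;
  removed: number;
  deleted: boolean;
}

interface SafetyResult {
  passed: boolean;
  violations: string[];
}

// Collects every violation instead of stopping at the first one, so a log
// line or metric can report the full picture.
function checkDiff(files: DiffFile[], limits: SafetyLimits = SAFETY_SKETCH): SafetyResult {
  const violations: string[] = [];
  const changedLines = files.reduce((sum, f) => sum + f.added + f.removed, 0);
  const deletedFiles = files.filter((f) => f.deleted).length;

  if (changedLines > limits.maxChangedLines) {
    violations.push(`changed lines ${changedLines} > ${limits.maxChangedLines}`);
  }
  if (files.length > limits.maxChangedFiles) {
    violations.push(`changed files ${files.length} > ${limits.maxChangedFiles}`);
  }
  if (deletedFiles > limits.maxDeletedFiles) {
    violations.push(`deleted files ${deletedFiles} > ${limits.maxDeletedFiles}`);
  }
  for (const f of files) {
    const baseName = f.path.split("/").pop() ?? f.path;
    if (limits.forbiddenPathPrefixes.some((prefix) => f.path.startsWith(prefix))) {
      violations.push(`forbidden path: ${f.path}`);
    }
    if (limits.lockFileNames.includes(baseName)) {
      violations.push(`lock file: ${f.path}`);
    }
  }
  return { passed: violations.length === 0, violations };
}
```

Keeping the limits in one object and passing them as a parameter mirrors the rule that `src/core/constants.ts` stays the single source of truth: the check itself holds no magic numbers.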
**Definition of done** (for any change in this repo):
- [ ] `bun run check` is green (typecheck + lint + format).
- [ ] `bun run test` is green. If you changed behavior covered by a test, the test was updated in the same change. If you added a new public function/module of meaningful complexity, it has at least one test under `test/`.
- [ ] `bun run test:coverage` meets the 80% threshold. If you added a new source file that should be covered, remove it from `coveragePathIgnorePatterns` in `bunfig.toml` as part of the same change and provide tests. `coveragePathIgnorePatterns` has exactly 5 permanent entries (`src/index.ts`, `src/types/index.ts`, `src/agent/client.ts`, `test/`, `_helpers`). Any additional entry beyond these five means a source file is still untested — add tests or explicitly document the permanent exclusion in `bunfig.toml` with a comment.
- [ ] If you touched `repos.json`, the file still parses against `RepoRegistry`'s Zod schema (try `bun run dev` and watch logs).
- [ ] If you touched env handling, `.env.example` is updated.
- [ ] If you touched anything under `infrastructure/`, you ran `terraform fmt` on it.
- [ ] No new `console.log`, no new `any`, no new lockfiles, no new secrets in tree.
- [ ] No commits during multi-step plans — ask the user before `git commit`.
### Operational Facts
- Bun `idleTimeout: 120 s` because the Datadog 24 h log search paginates.
- Queue concurrency = 1, `JOB_TIMEOUT_MS = 50 min`, `MAX_RETRIES = 2`.
- Pollers: Jira every 5 s, PR comments every 30 s, Slack mentions every 30 s — each independently toggleable via env (`JIRA_POLL_ENABLED`, `ATTEND_FEEDBACK_IN_PR_ENABLED`, `SLACK_MENTIONS_ENABLED`).
- GitHub auth supports either `GITHUB_TOKEN` (PAT) or GitHub App (`GITHUB_APP_ID` + `GITHUB_APP_INSTALLATION_ID` + `GITHUB_APP_PRIVATE_KEY`). App tokens auto-refresh 5 min before expiry.
- Cursor CLI: the bot calls `~/.local/bin/cursor-agent` directly to skip the Cursor wrapper's hang-prone version check. Install via `curl -sS https://cursor.com/install | bash`.
- AWS CodeArtifact: `configureCodeArtifactAuth` runs on boot to authenticate npm-based managed repos. If it fails the bot still starts but warns.
- DynamoDB (optional, via `DYNAMODB_TABLE_NAME`) backs the daily-report dedupe store.
- The bot emits `hu_agent.*` custom DogStatsD metrics (jobs, queue depth, poll activity) to Datadog us5. Every metric carries global tags `env`, `service` and `provider` (the active `AGENT_PROVIDER`, cursor | claude). Dashboards and monitors are built manually in Datadog; there are no Terraform-managed Datadog resources.
- `AuthChecker` runs a periodic per-integration auth probe (jira/slack/github/agent) decoupled from the job queue, emitting `integration.auth_ok` gauge so credential failures surface even when the queue is wedged.
- Pipeline usage metrics are emitted for all four pipelines: `repo.selected`, `pr.opened`, `fix.build_validation`, `fix.triage_verdict{verdict}`, `fix.outcome{outcome}` (terminal: pr_created | no_changes | needs_info | agent_failed | checks_failed | sync_failed | error), `safety.violation` (observability-only), `agent.invocations{phase}`, `agent.duration{phase}`, `pr_comment.handled{repo,action}`, `pr_mention.handled{repo,action}`, `slack_mention.handled`, `prewarm.warmed{total}` (gauge: repos warmed at boot out of total), `prewarm.duration{total}` (timing: total boot prewarm wall-clock). These enable an agent-usage dashboard in Datadog us5.
- **Follow-up (agent.tokens):** add an `agent.tokens{provider,model}` metric. The Cursor CLI/cloud clients don't expose token usage in their responses; the Claude provider does (`usage.input_tokens`/`usage.output_tokens` in the `result` event), so wire it into the agent invocation sites (fix triage/fix/followup, pr-comment, pr-mention, slack) when picking this up.
- **Follow-up (safety enforcement):** `safety.violation` is observability-only — the fix pipeline computes diff violations via `getDiffStats()` but does NOT block on them today. Enforcing the `SAFETY` limits (skip/abort a repo when `!passed`) is a separate, deliberate change (it would stop currently-passing large PRs).
- **Cold-start prewarm:** on boot the bot full-clones and checks out the base of every managed repo via `prewarmRepos` **before** pollers start; `/health` returns **503** until prewarm resolves, then **200**. Prewarm is fail-open (bounded by `CLONE.PREWARM_TIMEOUT_MS`): on timeout/error the bot marks ready anyway and falls back to on-demand sync, emitting `prewarm.warmed{total}`. `syncAllRepos` installs only when `node_modules` is missing or install inputs changed. ECS `startPeriod` and `health_check_grace_period_seconds` are 600s to cover the longer-but-warm boot. (A shallow `--depth` clone was tried and reverted: it breaks the pr-comment-pipeline `checkoutBranch` path — `--single-branch` reduced refspec + missing three-dot merge-base; see `docs/superpowers/specs/2026-06-03-cold-start-repo-prewarm-design.md` §1.)
- `SANDBOX_MODE=true` (dev sandbox only) suppresses every Jira write (transitions, comments, subtasks, squad labels) and opens PRs as draft. The dev environment is endpoint-triggered (pollers off), runs a separate GitHub App (`hu-agent-dev[bot]`), and posts to the `hu-agent-test` Slack channel. **Any new pipeline write path must also be gated by `SANDBOX_MODE`.**
- The `Trigger Fix Pipeline` workflow ([`.github/workflows/trigger.yml`](./.github/workflows/trigger.yml), `workflow_dispatch`: `environment` dev/prd + `issue_key`) joins the VPN from the runner and POSTs to the private trigger endpoint — no local VPN needed. Fire-and-forget: green run = HTTP 202 (enqueued). ALB hostnames are hardcoded per env; see `docs/superpowers/specs/2026-06-03-manual-trigger-workflow-design.md`.
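The GitHub App note above says tokens refresh 5 minutes before expiry. A minimal sketch of that rule follows, assuming a cache with an injected `mint` function and an injected clock value. `TokenCache`, `CachedToken`, and `mint` are hypothetical names, and `mint` is synchronous here only to keep the sketch short; the real provider in `src/github/` would call GitHub asynchronously.

```typescript
// "5 min before expiry", expressed in milliseconds.
const REFRESH_MARGIN_MS = 5 * 60 * 1000;

// Assumed cache entry shape.
interface CachedToken {
  value: string;
  expiresAtMs: number;
}

// True once the current time has entered the refresh margin.
function needsRefresh(token: CachedToken, nowMs: number): boolean {
  return nowMs >= token.expiresAtMs - REFRESH_MARGIN_MS;
}

// Hypothetical cache: `mint` stands in for whatever obtains a fresh token.
class TokenCache {
  private token: CachedToken | undefined;
  private readonly mint: (nowMs: number) => CachedToken;

  constructor(mint: (nowMs: number) => CachedToken) {
    this.mint = mint;
  }

  // Returns the cached token, minting a new one when none exists yet or
  // when the cached one is inside the refresh margin.
  get(nowMs: number): string {
    let current = this.token;
    if (current === undefined || needsRefresh(current, nowMs)) {
      current = this.mint(nowMs);
      this.token = current;
    }
    return current.value;
  }
}
```

Passing the current time in as an argument, rather than reading the clock inside the class, makes the margin easy to test without waiting for real time to pass.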
### Ports
- Server listens on `PORT` (default `3000`) and mounts everything under **`/hu-agent`** (e.g. `GET /hu-agent/health`).
### Environment Variables
All env is loaded and validated in one place — `src/utils/config.ts` (Zod schema → `AppConfig`). Never read `process.env` outside that file; add new config there and inject it via `src/index.ts`. Every variable is documented in `.env.example`, which is kept in sync whenever env handling changes.
### Documentation Rules
When you add or change a feature, update this `AGENTS.md` **in the same commit/PR** if the change affects cross-cutting concerns: new commands, config/env vars, testing patterns, tech-stack additions, or Docker/infra changes. Keep it minimal — what a developer needs to work here, not a changelog. If a section no longer reflects the code, fix or remove it.
Whenever an `AGENTS.md` exists in a directory, verify a `CLAUDE.md` symlink points to it (`ln -s AGENTS.md CLAUDE.md`) — Claude Code reads `CLAUDE.md` (via the `.claude → .cursor` convention), so without the symlink the rules are ignored. When you add a new `AGENTS.md`, also add its row to the Agent Context Map below.
### Agent Context Map
| Path | Covers |
|------|--------|
| `AGENTS.md` (root) | The whole service: principles, stack, commands, source layout, conventions, safety limits, operational facts |
| `infrastructure/AGENTS.md` | *(not present yet)* Terraform IaC — add a focused file here if infra rules grow |
Single root file is the source of truth; the map exists so the convention is ready if a focused `infrastructure/AGENTS.md` is added later.
### Repo Skills
None yet. hu-agent has no `.cursor/skills/` directory. If repo-local skills are added under `.cursor/skills//SKILL.md`, list them here with a "when to use" note so agents invoke them instead of reinventing the procedure.
### Pointers
- [`README.md`](./README.md) — full feature, env, and API reference.
- [`docs/jira-fix-pipeline.md`](./docs/jira-fix-pipeline.md) — Jira → Fix → PR walkthrough.
- [`docs/github-pipeline.md`](./docs/github-pipeline.md) — PR comment pipeline walkthrough.
- [`docs/IDEA.md`](./docs/IDEA.md) — original product brief.
- [`docs/api-docs/collection.yaml`](./docs/api-docs/collection.yaml) — Insomnia / OpenAPI export of the debug API.
- [`rules/general.mdc`](./rules/general.mdc) — global agent rules injected into every prompt the bot sends to Cursor.
- [`infrastructure/`](./infrastructure) — Terraform (run `terraform fmt` after edits).