# hu-agent

## Agent Coding Principles

**1. Think Before Coding — Don't assume. Don't hide confusion. Surface tradeoffs.**

- State assumptions explicitly. If uncertain, ask.
- If multiple interpretations exist, present them — don't pick silently.
- If a simpler approach exists, say so. Push back when warranted.
- If something is unclear, stop. Name what's confusing. Ask.

**2. Simplicity First — Minimum code that solves the problem. Nothing speculative.**

- No features beyond what was asked.
- No abstractions for single-use code.
- No "flexibility" or "configurability" that wasn't requested.
- No error handling for impossible scenarios.
- Ask: "Would a senior engineer say this is overcomplicated?" If yes, simplify.

**3. Surgical Changes — Touch only what you must. Clean up only your own mess.**

- Don't "improve" adjacent code, comments, or formatting.
- Don't refactor things that aren't broken. Match existing style.
- If you notice unrelated dead code, mention it — don't delete it.
- Remove imports/variables/functions that *your* changes made unused, not pre-existing ones.
- Every changed line should trace directly to the user's request.

**4. Goal-Driven Execution — Define success criteria. Loop until verified.**

- Transform tasks into verifiable goals:
  - "Fix the bug" → "Write a test that reproduces it, then make it pass"
  - "Add validation" → "Write tests for invalid inputs, then make them pass"
- For multi-step tasks, state a brief plan with a verify step per item.
- Weak criteria ("make it work") require constant clarification — define what done looks like.

---

## Project-Specific Guidelines

Guidance for AI coding agents (Claude, Cursor, Codex, Copilot…) working in **`hu-agent`**. For full product / API documentation see [`README.md`](./README.md). This file is the short, opinionated brief.

### Project Overview

`hu-agent` (a.k.a. `hu-developer-bot`) is an autonomous service that:

- **Polls Jira** for bug tickets assigned to the bot, runs the **Cursor CLI agent** with repo-specific rules to fix them, validates the build, and opens GitHub PRs.
- **Polls GitHub** for new comments on bot-authored PRs and replies / pushes follow-up commits.
- **Polls Slack** for `@hu-agent` mentions and answers in-thread.
- Sends a daily Slack report and exposes an HTTP debug/admin API under `/hu-agent/...`.

Single Bun process, single in-memory queue (concurrency = 1), Pino logging, optional Datadog log search.

The bot operates on **other repositories** declared in [`repos.json`](./repos.json). Anything you read about "rules", "PR templates" or "safety limits" applies to those target repos — **not** to `hu-agent` itself.

### Stack & Tooling

| Area | Choice |
|---|---|
| Runtime | **Bun** ≥ 1.1 (server, scripts, package manager) |
| Language | TypeScript (strict, ES2022, ES modules, `noUncheckedIndexedAccess`) |
| HTTP framework | Hono |
| Validation | Zod |
| Logging | Pino (+ `pino-pretty` in dev) |
| Scheduling | `croner` |
| Git | `simple-git` + system `git` binary |
| GitHub | `@octokit/rest` + `@octokit/auth-app` |
| AWS | `@aws-sdk/client-dynamodb`, `@aws-sdk/client-codeartifact` |
| Lint / format | ESLint 10 (flat config) + Prettier 3 |
| Metrics | DogStatsD via **hot-shots** → Datadog (us5); namespace `hu_agent.*` |
| Tests | Bun's built-in test runner (`bun:test`) — tests live under `test/` mirroring `src/` |
| Module resolution | `bundler`, path alias `@/* → src/*` |
| Container | Dockerfile (Bun base) + `docker-compose.yml` |
| Infra | Terraform under [`infrastructure/`](./infrastructure) |

The lockfile is `bun.lock`. **Do not add `package-lock.json` or `yarn.lock`** to this repo (the npm one already in tree is legacy).

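The "single in-memory queue (concurrency = 1)" from the Project Overview can be pictured as a small promise chain. This is only a sketch under assumptions: `SerialQueue`, `enqueue`, and `size` are hypothetical names, not the API of the real `src/queue/memory-queue.ts`, and retries, timeouts, and job kinds are deliberately left out.

```typescript
// Hypothetical sketch of a concurrency-1 FIFO queue; not the real MemoryQueue.
type Job<T> = () => Promise<T>;

class SerialQueue {
  // Tail of the chain. It never rejects, so one failed job cannot block the next.
  private tail: Promise<void> = Promise.resolve();
  private pending = 0;

  /** Number of jobs that are queued or currently running. */
  get size(): number {
    return this.pending;
  }

  /** Runs `job` only after every previously enqueued job has settled. */
  enqueue<T>(job: Job<T>): Promise<T> {
    this.pending += 1;
    const result = this.tail.then(() => job());
    const settle = (): void => {
      this.pending -= 1;
    };
    // The internal chain swallows the outcome; callers still observe it via `result`.
    this.tail = result.then(settle, settle);
    return result;
  }
}
```

A caller would write something like `await queue.enqueue(() => handleTicket(key))`, where `handleTicket` is again a placeholder. Each job starts only once the previous one has resolved or rejected, which is the ordering guarantee a concurrency of 1 implies.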
### Key Commands

```bash
bun install            # install deps (uses bun.lock)
bun run dev            # watch mode (src/index.ts)
bun run start          # run once
bun run build          # bundle to dist/
bun run typecheck      # tsc --noEmit
bun run lint           # eslint src/ test/
bun run lint:fix
bun run format         # prettier --write 'src/**/*.ts' 'test/**/*.ts'
bun run format:check
bun run test           # bun:test runner (discovers test/**/*.test.ts)
bun run test:coverage  # coverage report; CI-enforced 80% threshold (bunfig.toml)
bun run check          # typecheck + lint + format:check  ← run before declaring done
```

CI ([`.github/workflows/ci.yml`](./.github/workflows/ci.yml)) runs `bun install --frozen-lockfile`, `typecheck`, `lint`, `format:check`, `test`, and `test:coverage` on every PR to `main` / `develop`. Match it locally with `bun run check && bun run test:coverage`. Coverage threshold and ignore list live in [`bunfig.toml`](./bunfig.toml); modules not yet tested (pipelines, HTTP clients, pollers, `server/routes/debug.ts`) are in `coveragePathIgnorePatterns` and will be progressively removed in Phases 2–4 of the test-coverage roadmap.

**Tests live in `test/` at repo root, mirroring `src/`** (e.g. `src/core/constants.ts` → `test/core/constants.test.ts`). They import production code via the `@/*` alias (`import { X } from "@/core/constants.js"`). Bun's runner auto-discovers `**/*.test.ts`. Do **not** co-locate tests next to source.

For Terraform under `infrastructure/`, run `terraform fmt` after editing any `.tf` file (CI fails on unformatted files).

### Module Structure

```
src/
├── index.ts                       Composition root: loads config, wires DI, starts queue + pollers + Hono server, handles SIGINT/SIGTERM.
├── core/
│   ├── constants.ts               SAFETY, QUEUE, CURSOR_AGENT, POLLER, MEMORY, REPO_SELECTOR, SQUAD maps. Single source of truth for limits.
│   └── errors.ts                  Domain error classes (e.g. JobTimeoutError).
├── utils/
│   ├── config.ts                  Zod-validated env loader (the only place that reads process.env).
│   ├── logger.ts                  Pino factory; child loggers per module.
│   ├── codeartifact.ts            AWS CodeArtifact token refresh for npm/yarn/pnpm-based target repos.
│   ├── package-manager.ts         Detects npm/yarn/pnpm/bun per cloned repo.
│   ├── node-version.ts            Resolves .nvmrc / .node-version per cloned repo.
│   ├── ansi.ts, log-line.ts       Log line normalization (used by Datadog endpoint).
│   ├── git-remote.ts, slug.ts, timezone.ts
├── jira/                          REST v3 client + Zod schemas. ADF→Markdown, subtask creation.
├── github/                        Octokit wrapper + PAT/App auth provider with token cache.
├── agent/                         Provider-neutral agent clients (selected by AGENT_PROVIDER). cursor-cli-client.ts spawns `cursor-agent`; claude-cli-client.ts spawns `claude` headless; cursor-cloud-client.ts hits the Cursor HTTP API; prompts.ts builds prompts and parses PR_*/REPLY_*/VERDICT markers; client.ts is the barrel.
├── repo/
│   ├── manager.ts                 Git ops, build validation, configures git auth.
│   ├── registry.ts                Loads & validates repos.json (Zod).
│   ├── selector.ts                Keyword scoring → picks the right repo for a ticket.
│   └── rules-sync.ts              Clones remote rules repos, symlinks .mdc files into target repo's .cursor/rules/.
├── metrics/
│   ├── types.ts                   `Metrics` port + `MetricTags` type (DI contract).
│   ├── names.ts                   `METRIC` constant map — all `hu_agent.*` metric names.
│   ├── noop-recorder.ts           `NoopMetrics` — no-op implementation (used when metrics disabled).
│   ├── datadog-recorder.ts        `DatadogMetrics` — DogStatsD via hot-shots over UDP :8125.
│   └── index.ts                   `createMetrics` factory + barrel export.
├── health/
│   └── auth-checker.ts            `AuthChecker` — periodic per-integration auth probe; emits `integration.auth_ok` gauge and logs on transition.
├── queue/memory-queue.ts          In-memory FIFO with retry/timeout/job-kind awareness + onHighLoad hook; emits `jobs.*` metrics + periodic queue sampler.
├── poller/                        jira-poller, pr-comment-poller, slack-mention-poller (all 5–30 s intervals).
│   ├── poll-runner.ts             `PollRunner` — shared wrapper used by all pollers; emits `poll.*` metrics and throttled `poll_skipped` log.
│   └── comment-filter.ts          Pure helpers: shouldIgnoreComment + parseIgnoredCommentLogins (consumed by pr-comment-poller and config).
├── pipeline/
│   ├── fix-pipeline.ts            Jira ticket → triage → fix → push → PR.
│   ├── pr-comment-pipeline.ts     PR comment → reply (+ optional code change) + persist memory.
│   ├── slack-mention-pipeline.ts  @-mention → answer.
│   └── safety.ts                  Diff safety validator (uses SAFETY constants).
├── pr-memory/store.ts             File-backed per-PR memory store (memory/#.json).
├── slack/                         WebClient wrapper + message store.
├── datadog/                       Logs API client + line formatter.
├── scheduler/                     Daily report job (croner) + DynamoDB-backed dedupe store.
├── server/
│   ├── app.ts                     Hono app factory, mounts /hu-agent/*.
│   ├── middleware/
│   └── routes/                    health.ts, webhook.ts, debug.ts (the big admin/debug surface).
└── types/index.ts                 Shared types (Repo, Job, Outcome…).
```

Tests (mirror `src/` — one test file per source module, no co-location):

```
test/
├── _coverage-sentinel.test.ts    Side-effect imports to force coverage on all tracked modules.
├── _helpers/                     Shared test infrastructure (NOT mirroring src/).
│   ├── fake-fetch.ts             Fixture-driven fetch mock: createFakeFetch(), jsonResponse(), errorResponse().
│   ├── fake-metrics.ts           FakeMetrics recorder — captures calls for assertion in tests.
│   └── mock-logger.ts            Minimal pino Logger stub with bun:test mock() methods.
├── core/                         Tests for src/core/
├── agent/                        Tests for src/agent/ (cursor-cli-client, claude-cli-client, cursor-cloud-client, prompts)
├── datadog/                      Tests for src/datadog/
├── github/                       Tests for src/github/ (auth, client)
├── health/                       Tests for src/health/ (auth-checker)
├── integration/                  Cross-module integration tests (suffix: `.integration.test.ts`).
├── jira/                         Tests for src/jira/
├── metrics/                      Tests for src/metrics/ (noop-recorder, datadog-recorder)
├── pipeline/                     Tests for src/pipeline/
├── poller/                       Tests for src/poller/ (incl. poll-runner)
├── pr-memory/                    Tests for src/pr-memory/
├── queue/                        Tests for src/queue/
├── repo/                         Tests for src/repo/
├── scheduler/                    Tests for src/scheduler/
├── server/                       Tests for src/server/
├── slack/                        Tests for src/slack/
└── utils/                        Tests for src/utils/
```

`test/_helpers/` uses an underscore prefix (like `_coverage-sentinel`) to signal it is shared test infrastructure, not a mirror of any `src/` subdirectory. Add new shared mocks or fixtures there; do not put them in `test/utils/` (which mirrors `src/utils/`).

External configuration / data:

- [`repos.json`](./repos.json) — managed repositories (URL, keywords, default branch, rules, PR labels). Edit this when adding/removing a repo.
- [`rules/`](./rules) — bot-side fallback rules (`general.mdc` + per-repo `.mdc` files).
- `workdir/` — runtime clones of managed repos (gitignored).
- `memory/` — per-PR memory JSON (gitignored).
- `.worktrees/` — local git worktrees for parallel branch work (gitignored).
- `docs/superpowers/specs/` — design specs, **version-controlled** (shared with collaborators). `docs/superpowers/plans/` and the rest of `docs/superpowers/` are local scratchpad for `superpowers:*` skills (gitignored). See `.gitignore` (`docs/superpowers/*` + `!docs/superpowers/specs/`).
- `infrastructure/` — Terraform (app, env, modules).
- `docs/` — pipeline walkthroughs + Insomnia/OpenAPI export.

### Code Conventions (enforced by ESLint + Prettier)

- **Strict TS**, no `any` (warn). No unused locals/params (prefix with `_` to silence).
- **No `console.log`** — use the injected `pino` `logger`. Only `console.warn/error` are allowed.
- **Prettier**: `printWidth: 100`, `semi: true`, `singleQuote: false`, `trailingComma: "all"`, `tabWidth: 2`, `arrowParens: "always"`, `endOfLine: "lf"`.
- **Dependency injection**: classes receive their dependencies via constructor (see `src/index.ts`). Don't reach into `process.env` outside `utils/config.ts`.
- **Errors**: prefer typed errors from `core/errors.ts`. Pino logs `{ err }` first arg, then message.
- **Async**: top-level `await main()` in `index.ts`; otherwise wrap in functions.

### Naming Conventions

- **Path alias**: import internal modules as `@/...` (e.g. `import { JiraClientImpl } from "@/jira/client.js"`). The `.js` extension is required in TS specifiers because of `moduleResolution: "bundler"` + ESM. Example: `import { MemoryQueue } from "@/queue/memory-queue.js"`.
- **`type` imports**: `import type { … }` (ESLint warns otherwise).
- **English-only** in code, comments, commit messages, PR titles/bodies. Even when handling Spanish Jira tickets, the bot's own source stays English.

### Feature Creation Workflow

1. **Read `README.md` and the relevant module before editing.** The README is authoritative for behavior; this file is just orientation.
2. **Keep `src/core/constants.ts` as the single source of truth** for limits, intervals, and forbidden paths. If you tune a limit, update it there — don't sprinkle magic numbers across modules.
3. **Don't reinvent the env loader.** Add new config in `src/utils/config.ts` (Zod schema) and inject through `index.ts`.
4. **Don't introduce a new HTTP client** — use Octokit (GitHub), the existing Jira client, or `fetch` for one-off calls.
5. **Use the existing test runner (Bun's `bun:test`).** Don't introduce vitest / jest / mocha or a new ORM / lockfile without an explicit ask. New tests go under `test/` mirroring `src/`, never co-located. When you change behavior in a module that already has a test, update the test in the same change.
6. **Match existing module shape** when adding a feature:
   - new external integration → folder under `src//` with a `client.ts` (and `schemas.ts` if using Zod).
   - new background job → `src/poller/` + `src/pipeline/` pair, wired in `index.ts`, with a new `jobKind` in the queue dispatcher.
   - new HTTP route → `src/server/routes/` and mount in `app.ts` under `/hu-agent`.
7. **Logger first, throw second.** Every module gets `logger.child({ module: "..." })` from `index.ts`. Keep that pattern.
8. **Never commit secrets.** `.env` is gitignored; use `.env.example` to document new variables.
9. **Don't bypass safety.** The values in `SAFETY` (max 500 lines, 10 files, 3 deletions, forbidden CI paths, lock files) are load-bearing — they protect against runaway agent PRs in the *target* repos. Don't loosen them casually.
10. **Keep changes minimal.** No drive-by reformatting of unrelated files. Run `bun run format` only on files you touched (Prettier handles this when invoked via `lint:fix` or your editor).

**Definition of done** (for any change in this repo):

- [ ] `bun run check` is green (typecheck + lint + format).
- [ ] `bun run test` is green. If you changed behavior covered by a test, the test was updated in the same change. If you added a new public function/module of meaningful complexity, it has at least one test under `test/`.
- [ ] `bun run test:coverage` meets the 80% threshold. If you added a new source file that should be covered, remove it from `coveragePathIgnorePatterns` in `bunfig.toml` as part of the same change and provide tests. `coveragePathIgnorePatterns` has exactly 5 permanent entries (`src/index.ts`, `src/types/index.ts`, `src/agent/client.ts`, `test/`, `_helpers`). Any additional entry beyond these five means a source file is still untested — add tests or explicitly document the permanent exclusion in `bunfig.toml` with a comment.
- [ ] If you touched `repos.json`, the file still parses against `RepoRegistry`'s Zod schema (try `bun run dev` and watch logs).
- [ ] If you touched env handling, `.env.example` is updated.
- [ ] If you touched anything under `infrastructure/`, you ran `terraform fmt` on it.
- [ ] No new `console.log`, no new `any`, no new lockfiles, no new secrets in tree.
- [ ] No commits during multi-step plans — ask the user before `git commit`.

### Operational Facts

- Bun `idleTimeout: 120 s` because the Datadog 24 h log search paginates.
- Queue concurrency = 1, `JOB_TIMEOUT_MS = 50 min`, `MAX_RETRIES = 2`.
- Pollers: Jira every 5 s, PR comments every 30 s, Slack mentions every 30 s — each independently toggleable via env (`JIRA_POLL_ENABLED`, `ATTEND_FEEDBACK_IN_PR_ENABLED`, `SLACK_MENTIONS_ENABLED`).
- GitHub auth supports either `GITHUB_TOKEN` (PAT) or GitHub App (`GITHUB_APP_ID` + `GITHUB_APP_INSTALLATION_ID` + `GITHUB_APP_PRIVATE_KEY`). App tokens auto-refresh 5 min before expiry.
- Cursor CLI: the bot calls `~/.local/bin/cursor-agent` directly to skip the Cursor wrapper's hang-prone version check. Install via `curl -sS https://cursor.com/install | bash`.
- AWS CodeArtifact: `configureCodeArtifactAuth` runs on boot to authenticate npm-based managed repos. If it fails the bot still starts but warns.
- DynamoDB (optional, via `DYNAMODB_TABLE_NAME`) backs the daily-report dedupe store.
- The bot emits `hu_agent.*` custom DogStatsD metrics (jobs, queue depth, poll activity) to Datadog us5. Every metric carries global tags `env`, `service` and `provider` (the active `AGENT_PROVIDER`, cursor | claude). Dashboards and monitors are built manually in Datadog; there are no Terraform-managed Datadog resources.
- `AuthChecker` runs a periodic per-integration auth probe (jira/slack/github/agent) decoupled from the job queue, emitting `integration.auth_ok` gauge so credential failures surface even when the queue is wedged.
- Pipeline usage metrics are emitted for all four pipelines: `repo.selected`, `pr.opened`, `fix.build_validation`, `fix.triage_verdict{verdict}`, `fix.outcome{outcome}` (terminal: pr_created | no_changes | needs_info | agent_failed | checks_failed | sync_failed | error), `safety.violation` (observability-only), `agent.invocations{phase}`, `agent.duration{phase}`, `pr_comment.handled{repo,action}`, `pr_mention.handled{repo,action}`, `slack_mention.handled`, `prewarm.warmed{total}` (gauge: repos warmed at boot out of total), `prewarm.duration{total}` (timing: total boot prewarm wall-clock). These enable an agent-usage dashboard in Datadog us5.
- **Follow-up (agent.tokens):** add an `agent.tokens{provider,model}` metric. The Cursor CLI/cloud clients don't expose token usage in their responses; the Claude provider does (`usage.input_tokens`/`usage.output_tokens` in the `result` event), so wire it into the agent invocation sites (fix triage/fix/followup, pr-comment, pr-mention, slack) when picking this up.
- **Follow-up (safety enforcement):** `safety.violation` is observability-only — the fix pipeline computes diff violations via `getDiffStats()` but does NOT block on them today. Enforcing the `SAFETY` limits (skip/abort a repo when `!passed`) is a separate, deliberate change (it would stop currently-passing large PRs).
- **Cold-start prewarm:** on boot the bot full-clones and checks out the base of every managed repo via `prewarmRepos` **before** pollers start; `/health` returns **503** until prewarm resolves, then **200**. Prewarm is fail-open (bounded by `CLONE.PREWARM_TIMEOUT_MS`): on timeout/error the bot marks ready anyway and falls back to on-demand sync, emitting `prewarm.warmed{total}`. `syncAllRepos` installs only when `node_modules` is missing or install inputs changed. ECS `startPeriod` and `health_check_grace_period_seconds` are 600s to cover the longer-but-warm boot. (A shallow `--depth` clone was tried and reverted: it breaks the pr-comment-pipeline `checkoutBranch` path — `--single-branch` reduced refspec + missing three-dot merge-base; see `docs/superpowers/specs/2026-06-03-cold-start-repo-prewarm-design.md` §1.)
- `SANDBOX_MODE=true` (dev sandbox only) suppresses every Jira write (transitions, comments, subtasks, squad labels) and opens PRs as draft. The dev environment is endpoint-triggered (pollers off), runs a separate GitHub App (`hu-agent-dev[bot]`), and posts to the `hu-agent-test` Slack channel. **Any new pipeline write path must also be gated by `SANDBOX_MODE`.**
- The `Trigger Fix Pipeline` workflow ([`.github/workflows/trigger.yml`](./.github/workflows/trigger.yml), `workflow_dispatch`: `environment` dev/prd + `issue_key`) joins the VPN from the runner and POSTs to the private trigger endpoint — no local VPN needed. Fire-and-forget: green run = HTTP 202 (enqueued). ALB hostnames are hardcoded per env; see `docs/superpowers/specs/2026-06-03-manual-trigger-workflow-design.md`.

### Ports

- Server listens on `PORT` (default `3000`) and mounts everything under **`/hu-agent`** (e.g. `GET /hu-agent/health`).

### Environment Variables

All env is loaded and validated in one place — `src/utils/config.ts` (Zod schema → `AppConfig`). Never read `process.env` outside that file; add new config there and inject it via `src/index.ts`. Every variable is documented in `.env.example`, which is kept in sync whenever env handling changes.

### Documentation Rules

When you add or change a feature, update this `AGENTS.md` **in the same commit/PR** if the change affects cross-cutting concerns: new commands, config/env vars, testing patterns, tech-stack additions, or Docker/infra changes. Keep it minimal — what a developer needs to work here, not a changelog. If a section no longer reflects the code, fix or remove it.

Whenever an `AGENTS.md` exists in a directory, verify a `CLAUDE.md` symlink points to it (`ln -s AGENTS.md CLAUDE.md`) — Claude Code reads `CLAUDE.md` (via the `.claude → .cursor` convention), so without the symlink the rules are ignored. When you add a new `AGENTS.md`, also add its row to the Agent Context Map below.

### Agent Context Map

| Path | Covers |
|------|--------|
| `AGENTS.md` (root) | The whole service: principles, stack, commands, source layout, conventions, safety limits, operational facts |
| `infrastructure/AGENTS.md` | *(not present yet)* Terraform IaC — add a focused file here if infra rules grow |

Single root file is the source of truth; the map exists so the convention is ready if a focused `infrastructure/AGENTS.md` is added later.

### Repo Skills

None yet. hu-agent has no `.cursor/skills/` directory. If repo-local skills are added under `.cursor/skills//SKILL.md`, list them here with a "when to use" note so agents invoke them instead of reinventing the procedure.

### Pointers

- [`README.md`](./README.md) — full feature, env, and API reference.
- [`docs/jira-fix-pipeline.md`](./docs/jira-fix-pipeline.md) — Jira → Fix → PR walkthrough.
- [`docs/github-pipeline.md`](./docs/github-pipeline.md) — PR comment pipeline walkthrough.
- [`docs/IDEA.md`](./docs/IDEA.md) — original product brief.
- [`docs/api-docs/collection.yaml`](./docs/api-docs/collection.yaml) — Insomnia / OpenAPI export of the debug API.
- [`rules/general.mdc`](./rules/general.mdc) — global agent rules injected into every prompt the bot sends to Cursor.
- [`infrastructure/`](./infrastructure) — Terraform (run `terraform fmt` after edits).
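
For orientation on the `SAFETY` limits quoted under Feature Creation Workflow and Operational Facts, here is a rough sketch of a diff check that reports violations without blocking. `DiffStats`, `SafetyResult`, `checkDiff`, and both path lists are assumptions made for illustration; only the numeric limits (500 lines, 10 files, 3 deletions) and the `passed` flag come from this file, and "3 deletions" is read here as deleted files. The real constants live in `src/core/constants.ts`.

```typescript
// Hypothetical shapes for illustration; not the types used by src/pipeline/safety.ts.
interface DiffStats {
  linesChanged: number;
  changedFiles: string[];
  deletedFiles: string[];
}

interface SafetyResult {
  passed: boolean;
  violations: string[];
}

// Numeric limits are the ones quoted in this file; the path lists are assumed examples.
const LIMITS = {
  maxLines: 500,
  maxFiles: 10,
  maxDeletions: 3,
  forbiddenPrefixes: [".github/workflows/"],
  lockFiles: ["bun.lock", "package-lock.json", "yarn.lock"],
};

function checkDiff(stats: DiffStats): SafetyResult {
  const violations: string[] = [];

  if (stats.linesChanged > LIMITS.maxLines) {
    violations.push(`too many changed lines: ${stats.linesChanged}`);
  }
  if (stats.changedFiles.length > LIMITS.maxFiles) {
    violations.push(`too many changed files: ${stats.changedFiles.length}`);
  }
  if (stats.deletedFiles.length > LIMITS.maxDeletions) {
    violations.push(`too many deleted files: ${stats.deletedFiles.length}`);
  }

  for (const file of stats.changedFiles) {
    // Compare lock files by base name so nested packages are caught too.
    const base = file.split("/").pop() ?? file;
    if (LIMITS.forbiddenPrefixes.some((prefix) => file.startsWith(prefix))) {
      violations.push(`forbidden path: ${file}`);
    }
    if (LIMITS.lockFiles.includes(base)) {
      violations.push(`lock file touched: ${file}`);
    }
  }

  return { passed: violations.length === 0, violations };
}
```

In line with the observability-only behavior described in Operational Facts, a caller of such a function would log or count `violations` rather than abort when `passed` is false.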