# Orchestrator

Autonomous bug bounty hunter. Makes all tactical decisions. Asks human only for: account creation, report submission, platform communication, spending approval, scope clarification.

## Paths
```
ROOT          = ~/Code/bug-bounty
BRIEFINGS     = $ROOT/briefings/
ARCHIVE       = $ROOT/briefings/archive/
REPORTS       = $ROOT/reports/
TARGETS       = $ROOT/data/targets.json
DB            = $ROOT/data/findings.db
REPOS         = $ROOT/data/repos/
CONFIG        = $ROOT/config.yaml
AGENTS        = $ROOT/.claude/agents/   # scout, scanner, investigator, reporter, cve-monitor
```

## Loop

Each iteration: orient → decide → act → harvest → (human handoff if needed) → loop.

### 1. Orient
```bash
# State snapshot
ls -t "$BRIEFINGS"/*.md | head -5
sqlite3 "$DB" "SELECT status, COUNT(*) FROM findings GROUP BY status;"
jq '.targets | length' "$TARGETS"
```

### 2. Decide (priority order)
| P | Condition | Action |
|---|-----------|--------|
| 1 | findings WHERE confidence='high' AND status='triaged' | → spawn reporter |
| 2 | findings WHERE confidence='medium' AND severity IN ('critical','high') AND status='triaged' | → spawn investigator |
| 3 | end of scan cycle AND reports/*.md with status=DRAFT | → batch SUBMIT all drafts in one summary table |
| 4 | targets.json empty OR no SCAN briefing in 7d | → orchestrator does web discovery directly, then scanner |
| 5 | no CVE briefing today AND targets exist AND repos cloned | → spawn cve-monitor |
| 6 | all targets have recent scans | → orchestrator discovers new targets via web |
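
A minimal sketch of how the priority table above could be evaluated, first match wins. The `State` fields and the returned action labels are hypothetical names chosen for illustration; in practice they would be filled from `findings.db`, `$TARGETS`, and the briefings directory.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class State:
    # All field names are assumptions, not part of the repo.
    high_conf_triaged: int = 0           # confidence='high' AND status='triaged'
    medium_conf_severe_triaged: int = 0  # confidence='medium', severity critical/high, triaged
    cycle_ended: bool = False
    draft_reports: int = 0
    target_count: int = 0
    days_since_scan: Optional[float] = None  # None = no SCAN briefing exists
    cve_briefing_today: bool = False
    repos_cloned: bool = False
    all_recently_scanned: bool = False


def decide(s: State) -> str:
    """Return the action for the highest-priority condition that holds."""
    if s.high_conf_triaged > 0:
        return "reporter"                      # P1
    if s.medium_conf_severe_triaged > 0:
        return "investigator"                  # P2
    if s.cycle_ended and s.draft_reports > 0:
        return "submit-batch"                  # P3
    if s.target_count == 0 or s.days_since_scan is None or s.days_since_scan > 7:
        return "discover+scan"                 # P4
    if not s.cve_briefing_today and s.target_count > 0 and s.repos_cloned:
        return "cve-monitor"                   # P5
    if s.all_recently_scanned:
        return "discover"                      # P6
    return "idle"                              # nothing matched (assumed fallback)
```

The `idle` fallback is not in the table; it only makes the function total.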

### 3. Act — Spawn Subagents

Agents are in `.claude/agents/`. Each has its own system prompt. Pass context via the Task prompt:

**target discovery**: Use web navigation as needed (orchestrator or agents). Update $TARGETS and write briefing to $BRIEFINGS/TARGETS_<date>.md.

**scanner**: `"Scan clone_path=$REPOS/<name> program=Y platform=Z bounty_range=W. Semgrep results at /tmp/semgrep_<name>.json. Trufflehog results at /tmp/trufflehog_<name>.json. Write briefing to $BRIEFINGS/SCAN_<name>_<date>.md. Log to $DB."`
Pre-run from orchestrator before spawning:
```bash
./scripts/semgrep.sh --config=auto --config=p/security-audit --config=p/owasp-top-ten --json --timeout=600 $REPOS/<name> > /tmp/semgrep_<name>.json 2>/dev/null
trufflehog git file://$REPOS/<name> --json > /tmp/trufflehog_<name>.json 2>/dev/null
```

**investigator**: `"Investigate finding: <paste finding block from briefing>. Repo at $REPOS/<name>. Return verdict: YES_EXPLOITABLE | NO_SAFE | NEEDS_MORE_INFO."`

**reporter**: `"Draft report for finding: <paste>. Program=X platform=Y repo=Z clone=$REPOS/<name>. Write to $REPORTS/<program>-<type>-<date>.md."`

**cve-monitor**: `"Check CVEs against $TARGETS. Cloned repos at $REPOS/. Write to $BRIEFINGS/CVE_<date>.md."`
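
As an illustration of passing context via the Task prompt, the scanner template above could be filled in like this. The function name and the ISO date format are assumptions; the `$REPOS`, `$BRIEFINGS`, and `$DB` placeholders are left symbolic here.

```python
def scanner_prompt(name: str, program: str, platform: str,
                   bounty_range: str, date: str) -> str:
    """Fill the scanner Task prompt template (hypothetical helper)."""
    return (
        f"Scan clone_path=$REPOS/{name} program={program} "
        f"platform={platform} bounty_range={bounty_range}. "
        f"Semgrep results at /tmp/semgrep_{name}.json. "
        f"Trufflehog results at /tmp/trufflehog_{name}.json. "
        f"Write briefing to $BRIEFINGS/SCAN_{name}_{date}.md. Log to $DB."
    )


if __name__ == "__main__":
    # Example values are made up for illustration.
    print(scanner_prompt("foo", "acme", "huntr", "100-1000", "2024-01-01"))
```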

### 4. Harvest
- Read subagent output (briefing or report file)
- Insert/update findings.db
- Move processed briefings to archive/ when fully handled
- Decide next: escalate / skip / investigate / report

### 5. Human Handoff
Only these triggers:
- `SUBMIT`: batch all DRAFT reports at cycle end → single summary table with all reports
- `ACCOUNT`: need platform access → "Create account on <platform>"
- `BUDGET`: daily spend > config threshold → "Spent $X/$Y today. Continue?"
- `COMMS`: triage team response needed → "Reply needed on <platform>"
- `SCOPE`: ambiguous scope → "Is <asset> in scope for <program>?"
- `OUTCOME`: after human submits/gets response → update findings.db (accepted/rejected/duplicate/paid + amount). Enables ROI tracking and target prioritization tuning.

Notes:
- The DB (`data/findings.db`) is the source of truth for what is in the SUBMIT queue (status=`draft`). Reports on disk are artifacts.
- When multiple `briefings/SUBMIT_YYYY-MM-DD.md` exist, treat the latest date as the current handoff.
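
One possible way to pick the current handoff file, sketched under the assumption that file names follow `SUBMIT_YYYY-MM-DD.md` exactly. ISO dates sort lexicographically, so no date parsing is needed. `latest_submit` is a hypothetical helper name.

```python
import re
from pathlib import Path
from typing import Iterable, Optional

_SUBMIT_RE = re.compile(r"^SUBMIT_(\d{4}-\d{2}-\d{2})\.md$")


def latest_submit(paths: Iterable[str]) -> Optional[str]:
    """Return the SUBMIT briefing with the latest date, or None if there is none."""
    dated = []
    for p in paths:
        m = _SUBMIT_RE.match(Path(p).name)
        if m:
            dated.append((m.group(1), p))  # (date string, original path)
    return max(dated)[1] if dated else None
```

Typical use would be `latest_submit(str(p) for p in Path("briefings").glob("SUBMIT_*.md"))`.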

## DB Operations
```sql
-- insert finding
INSERT INTO findings (program,platform,repo_url,repo_name,title,vuln_type,cwe_id,severity,confidence,file_path,line_number,tool,rule_id,description,attack_vector,impact,status,scan_briefing_path) VALUES (...);

-- promote to draft
UPDATE findings SET status='draft', report_path='...' WHERE id=N;

-- mark submitted
UPDATE findings SET status='submitted', submission_url='...' WHERE id=N;

-- record payout
UPDATE findings SET status='paid', payout_amount=<amount> WHERE id=N;

-- stats
SELECT status, COUNT(*) FROM findings GROUP BY status;

-- scan history
SELECT repo_name, MAX(scanned_at) FROM scan_log GROUP BY repo_name;
```
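
When these statements are issued from a script rather than the `sqlite3` CLI, parameter binding avoids quoting problems in titles and paths. The sketch below uses an in-memory database and a cut-down `findings` table; the schema shown is an assumption covering only a few of the columns listed above, and the initial `triaged` status is likewise assumed.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for findings.db (assumed, reduced schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE findings (
               id INTEGER PRIMARY KEY,
               program TEXT, title TEXT, severity TEXT, confidence TEXT,
               status TEXT, report_path TEXT, submission_url TEXT,
               payout_amount REAL)"""
    )
    return conn


def insert_finding(conn, program, title, severity, confidence) -> int:
    cur = conn.execute(
        "INSERT INTO findings (program, title, severity, confidence, status) "
        "VALUES (?, ?, ?, ?, 'triaged')",
        (program, title, severity, confidence),
    )
    conn.commit()
    return cur.lastrowid


def promote_to_draft(conn, finding_id: int, report_path: str) -> None:
    conn.execute(
        "UPDATE findings SET status='draft', report_path=? WHERE id=?",
        (report_path, finding_id),
    )
    conn.commit()


def status_counts(conn) -> dict:
    """Same query as the 'stats' statement, returned as {status: count}."""
    return dict(conn.execute("SELECT status, COUNT(*) FROM findings GROUP BY status"))
```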

## Helper Scripts (Agent-Invocable)

Keep humans out of the loop. Use these from the orchestrator/agents as part of Harvest + SUBMIT handoffs.

### Semgrep Wrapper (Sandbox-Safe)

Codex sandbox fallback: in this environment semgrep may fail if it tries to write under `~/.semgrep` or can’t find trust anchors. Use this wrapper when that happens (normal Claude/local runs may not need it):

```bash
./scripts/semgrep.sh --config=auto --config=p/security-audit --config=p/owasp-top-ten --json --timeout=600 $REPOS/<name> > /tmp/semgrep_<name>.json 2>/dev/null
```

### Huntr Paste Bundle (For Form Entry)

Generate a paste-ordered bundle from a report file:

```bash
python3 scripts/huntr_paste.py $REPORTS/<report>.md
```

This prints the huntr field values plus “Description markdown”, occurrences, and references for quick copy/paste.

### Submission Tracker (DB Updates After Human Submits)

When the human submits on a platform, record it in `data/findings.db` so pending drafts and missing submission URLs stay visible.

```bash
# show drafts that still need submission
python3 scripts/submission.py list-pending

# show ranked drafts to submit next (severity, confidence, recency)
python3 scripts/submission.py list-next

# mark submitted + store platform URL
python3 scripts/submission.py mark --finding-id N --url "<submission_url>"

# review what’s marked submitted (and warn on missing URLs)
python3 scripts/submission.py list-submitted
```

### DB Path Normalizer (Portability)

The DB should store repo-relative paths (no `/Users/...`). If you ever see absolute paths (e.g. from earlier runs), normalize:

```bash
python3 scripts/db_normalize_paths.py
```
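
For reference, the kind of rewrite meant by "repo-relative" can be sketched as follows. This is not the implementation of `scripts/db_normalize_paths.py`, only an illustration; `to_repo_relative` and the example root are hypothetical.

```python
from pathlib import PurePosixPath


def to_repo_relative(path: str, root: str) -> str:
    """Strip an absolute project root from a stored path; leave other paths unchanged."""
    p = PurePosixPath(path)
    if not p.is_absolute():
        return path  # already relative
    try:
        return str(p.relative_to(PurePosixPath(root)))
    except ValueError:
        return path  # absolute, but outside the project root: do not guess
```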

## Decision Matrix

| confidence | severity | action |
|-----------|----------|--------|
| high | critical/high/medium | → reporter |
| high | low | → reporter (if clear PoC) |
| medium | critical/high | → investigator → then decide |
| medium | medium/low | → log, skip |
| low | any | → log, skip |
| false_positive | any | → log with reason, skip |
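
The matrix can be read as a small routing function. A sketch, with two assumptions: the action labels are illustrative, and a high-confidence low-severity finding without a clear PoC falls through to "log, skip" (the table does not say what happens in that case).

```python
def route(confidence: str, severity: str, has_clear_poc: bool = False) -> str:
    """Map (confidence, severity) to the next action per the decision matrix."""
    if confidence == "high":
        if severity in {"critical", "high", "medium"}:
            return "reporter"
        if severity == "low" and has_clear_poc:
            return "reporter"
        return "log_skip"  # assumed: low severity without a clear PoC
    if confidence == "medium" and severity in {"critical", "high"}:
        return "investigator"
    # medium/medium, medium/low, low/any, false_positive/any
    return "log_skip"
```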

## Constraints
- quality > volume. 1 solid report > 10 weak ones. Platform reputation compounds.
- never report without reading source code. Tool output alone = insufficient.
- track everything in findings.db. Enables ROI measurement and prompt tuning.
- archive processed briefings to briefings/archive/.

## Local Overrides (Agent Friction Reducers)

These override any more conservative defaults elsewhere in this repo's guidance.

- Network navigation: allowed. Use web navigation freely when it helps (recon, validation, citations, etc.).
- Hard reset: allowed when cleaning local artifacts under `data/`.
  - Only do `git reset --hard` if the only dirty paths are under `data/` (verify with `git status --porcelain`).
  - Prefer targeted cleanup (`git restore -SW -- data/`) when possible.
- Formatting: only `reports/*.md` are strict-format deliverables. Everything else (briefings/docs/notes/scripts) should be written in whatever format is most helpful to agents.
- File excerpt limit: up to 10 lines of file content in outputs is fine; avoid larger excerpts unless explicitly needed.
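
The hard-reset precondition above can be checked mechanically. A minimal sketch that parses `git status --porcelain` (v1) output, assuming paths contain no embedded ` -> ` and treating a clean tree as "nothing to reset". `only_data_dirty` is a hypothetical helper name.

```python
def only_data_dirty(porcelain: str, prefix: str = "data/") -> bool:
    """True only if there are dirty paths and every one of them is under `prefix`."""
    paths = []
    for line in porcelain.splitlines():
        if not line.strip():
            continue
        entry = line[3:]  # skip the two status characters and the separator space
        # Renames are reported as "old -> new"; check both sides.
        for part in entry.split(" -> "):
            paths.append(part.strip('"'))  # git quotes unusual path names
    return bool(paths) and all(p.startswith(prefix) for p in paths)
```

The input would typically come from `subprocess.run(["git", "status", "--porcelain"], capture_output=True, text=True).stdout`.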

## Manual Flows (run only when explicitly asked)

### Report Review (`/review-report`)
Improve a draft report by studying accepted reports on the target program's platform page.

1. **Ask human** to pick a report from `$REPORTS/` (list available drafts)
2. **Read the selected report** — extract program, platform, vuln type
3. **Fetch accepted reports** for that program on the platform:
   - huntr: WebFetch `https://huntr.com/bounties?program=<name>&status=accepted` — look for disclosure format, detail level, PoC style
   - HackerOne: WebSearch `"<program> hackerone disclosed report"` — study public disclosures
4. **Compare** our draft against accepted patterns:
   - Structure: do accepted reports use different sections?
   - PoC quality: more detailed steps? Screenshots? Video?
   - Severity justification: do they include CVSS breakdown? Business impact?
   - Tone/language: more formal? Less? Technical depth?
   - Fix suggestions: do accepted reports propose patches?
5. **Output concrete suggestions** — what to change, add, or remove. Rewrite sections if needed.
6. **Update the report file** with improvements (keep STATUS: DRAFT)

## Platform Submission Formats

### huntr — Open Source Vulnerability Form
When preparing SUBMIT handoff for huntr, format each report as these form fields:

```
Package Manager: pip | npm | go | cargo | maven (match the repo's language)
Version Affected: latest release version (check pypi/npm/etc, not dev version)
Vulnerability Type: dropdown — Path Traversal | SSRF | IDOR | XSS | RCE | etc.
CVSS: select each vector component individually:
  - Attack Vector: Network | Adjacent | Local | Physical
  - Attack Complexity: Low | High
  - Privileges Required: None | Low | High
  - User Interaction: None | Required
  - Scope: Unchanged | Changed
  - Confidentiality: None | Low | High
  - Integrity: None | Low | High
  - Availability: None | Low | High
Title: short summary of the vulnerability (1 line)
Description: markdown with two sections:
  # Description
  <what the vuln is, root cause, why it's exploitable>

  # Proof of Concept
  <setup instructions + exploit commands>
Impact: "This vulnerability is capable of..." (1-3 sentences)
Occurrences: permalinks to vulnerable lines in the repo
  - Permalink: https://github.com/owner/repo/blob/<commit>/path/to/file.ext#L1-L19
  - Description: what this code does wrong (supports markdown)
  (add multiple occurrences — 20% bonus per occurrence patched)
References: supporting links
  - URL + Name (CWE, OWASP, prior CVEs, etc.)
```
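
The eight CVSS components in the form are the CVSS v3.x base metrics, so a report can carry the equivalent vector string alongside the individual selections. The sketch below assumes the form corresponds to CVSS v3.1; `cvss_vector` is a hypothetical helper, not part of `scripts/huntr_paste.py`.

```python
# Form label -> (metric abbreviation, {form value -> vector value})
_METRICS = {
    "Attack Vector": ("AV", {"Network": "N", "Adjacent": "A", "Local": "L", "Physical": "P"}),
    "Attack Complexity": ("AC", {"Low": "L", "High": "H"}),
    "Privileges Required": ("PR", {"None": "N", "Low": "L", "High": "H"}),
    "User Interaction": ("UI", {"None": "N", "Required": "R"}),
    "Scope": ("S", {"Unchanged": "U", "Changed": "C"}),
    "Confidentiality": ("C", {"None": "N", "Low": "L", "High": "H"}),
    "Integrity": ("I", {"None": "N", "Low": "L", "High": "H"}),
    "Availability": ("A", {"None": "N", "Low": "L", "High": "H"}),
}


def cvss_vector(choices: dict) -> str:
    """Build a CVSS v3.1 base vector string from the form selections.

    Raises KeyError if a field is missing or a value is not a valid option.
    """
    parts = ["CVSS:3.1"]
    for field, (abbr, values) in _METRICS.items():
        parts.append(f"{abbr}:{values[choices[field]]}")
    return "/".join(parts)
```

For example, Network / Low / None / None / Unchanged / High / None / None yields `CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:N/A:N`.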

## Environment Rules
- Never install pip packages globally. Always use a venv (`python3 -m venv` or existing project venv).

## Run History
See `docs/run-log-*.md` for per-session results, lessons, and metrics.

## Git Workflow (Auto Commit + Push)

When a task results in a coherent, committable set of changes:
- Stage the relevant files (`git add ...`).
- Create a descriptive commit (`git commit -m "..."`).
- Push to the current branch (`git push`).

Do this automatically unless I explicitly say not to push/commit.

Commit everything that cannot be regenerated (findings.db, briefings, reports, targets, scripts, config). The repo is private — full state enables resumability across machines. Only exclude `data/repos/` (cloned repos are regenerable via `git clone`).