# Memory Consolidation + Opportunistic Scheduler RFC

Status: draft
Owner: Sebas
Date: 2026-03-26

## Intent

Add a periodic memory consolidation stage to `agents-database` that is explicit, auditable, and catch-up friendly on a laptop. If the machine was offline at the planned time, the scheduler must run the job at the next available moment.

## Requirements (selected)

- Keep the existing capture→extract→consolidate→promote→index flow.
- Explicit provenance and contradiction tracking; no silent overwrites.
- Preserve originals; no hard deletes during consolidation.
- Hybrid retrieval remains functional without embeddings.
- LLM steps use **MiniMax M2.5 Free** (only when required).
- Scheduler is **opportunistic** (runs when available, not only at exact times).
- Ship a complete, v1-safe solution; iterate later.

## Non-Goals

- External cron-only scheduling.
- Always-on cloud infra or multi-device coordination.
- Autonomous truth rewriting without explicit promotion rules.

## Proposed Architecture

### 1) Opportunistic Scheduler (Next Available)

Instead of relying on cron/launchd timing alone, the schedule lives in the DB, and the engine runs jobs whenever it is awake and detects a missed or due run.

Trigger points:

- Engine startup.
- Periodic engine tick (e.g., every 15 minutes while running).
- Any write operation that touches the DB (optional fast check).

Semantics:

- Each job has a `next_due_at` and an optional window (`window_start`/`window_end`).
- If `now >= next_due_at`, run immediately.
- If the machine was asleep or offline at the due time, the job runs on the next available tick.
- After a run, compute the next due time from the cadence and the configured window.

This avoids missed runs without requiring the machine to be awake at the exact scheduled time.

### 2) Consolidation Pipeline

Stages per job run:

1. **Select candidates** from `memories` with status `inbox` or `episode` older than a minimum age.
2. **Dedupe** via similarity signals:
   - exact match (content + scope + type)
   - lexical overlap
   - embedding similarity (if present)
   - metadata / temporal proximity
3. **Conflict detection**: if two memories make contradictory claims, create a `memory_conflicts` row and a `memory_links` entry marked `contradicts`.
4. **Promotion**: promote to `active` only when confidence meets the promotion threshold.
5. **Archival**: archive duplicates; never delete originals.

LLM usage:

- Only for synthesis/abstraction when similarity is mid-range (not near-exact).
- Use the MiniMax M2.5 Free model for summaries/synthesis.
- Hard budget caps per run; fall back to a non-LLM merge if the budget is exceeded.

### 3) Cost + Safety Guardrails

- Max LLM calls per batch.
- Budget cap per day/week.
- "Preserve originals" toggle on by default.
- Manual review queue for low-confidence or conflict-heavy items.

## Data Model Additions

### maintenance_jobs

Stores periodic job definitions and state.

```sql
CREATE TABLE IF NOT EXISTS maintenance_jobs (
  id               TEXT PRIMARY KEY,
  job_type         TEXT NOT NULL,      -- e.g. consolidation
  cadence          TEXT NOT NULL,      -- daily | weekly | interval
  interval_minutes INTEGER,            -- for interval cadence
  window_start     TEXT,               -- optional window start, HH:MM (v1 treats as UTC; see Gotchas)
  window_end       TEXT,               -- optional window end, HH:MM (v1 treats as UTC; see Gotchas)
  next_due_at      TEXT NOT NULL,
  last_run_at      TEXT,
  last_status      TEXT,
  last_summary     TEXT,
  enabled          INTEGER NOT NULL DEFAULT 1,
  metadata_json    TEXT NOT NULL,
  created_at       TEXT NOT NULL,
  updated_at       TEXT NOT NULL
);
```

### maintenance_runs

Audit trail per execution.

```sql
CREATE TABLE IF NOT EXISTS maintenance_runs (
  id            TEXT PRIMARY KEY,
  job_id        TEXT NOT NULL,
  status        TEXT NOT NULL,         -- running | completed | failed
  started_at    TEXT NOT NULL,
  completed_at  TEXT,
  summary       TEXT,
  stats_json    TEXT NOT NULL,
  error_message TEXT,
  FOREIGN KEY(job_id) REFERENCES maintenance_jobs(id)
);
```

### memory_conflicts

Explicit conflict registry.

```sql
CREATE TABLE IF NOT EXISTS memory_conflicts (
  id                    TEXT PRIMARY KEY,
  memory_id             TEXT NOT NULL,
  conflicting_memory_id TEXT NOT NULL,
  reason                TEXT NOT NULL,
  created_at            TEXT NOT NULL,
  metadata_json         TEXT NOT NULL,
  FOREIGN KEY(memory_id) REFERENCES memories(id),
  FOREIGN KEY(conflicting_memory_id) REFERENCES memories(id)
);
```

Indexes:

```sql
CREATE INDEX IF NOT EXISTS idx_maintenance_jobs_due ON maintenance_jobs(next_due_at);
CREATE INDEX IF NOT EXISTS idx_maintenance_runs_job ON maintenance_runs(job_id);
CREATE INDEX IF NOT EXISTS idx_memory_conflicts_mem ON memory_conflicts(memory_id);
```

## Consolidation Heuristics (v1)

- Candidate age: >= 7 days for `episode` and `inbox`.
- Dedup threshold:
  - exact match: immediate archive + link
  - high similarity (>= 0.90): merge
  - medium similarity (>= 0.75 and < 0.90): LLM synthesize (MiniMax M2.5 Free)
  - low similarity (< 0.75): no merge
- Promotion threshold: confidence >= 0.70
- Conflict threshold: similarity is below the merge threshold, but the memories share entities/keywords and have contradictory polarity.

## LLM Configuration

- Provider/model: **MiniMax M2.5 Free**
- Use only when `strategy` is `synthesize` or `abstract`.
- Token budget caps per run; hard stop with non-LLM fallback.

Execution (v1):

- Calls the configured CLI runner and passes the prompt via stdin.
- Job metadata can override the command via `llm_command`.

## Scheduler Details (v1)

### Job definition

- Daily light runs at 10:00 and 15:00: dedupe + promotion checks (no LLM by default).
- Weekly deep run: include LLM synthesis and abstraction.

### Next-available algorithm (pseudo)

```python
def tick(jobs, now, run_job, compute_next_due):
    """Run every enabled job whose due time has passed, then reschedule it."""
    for job in jobs:  # rows from maintenance_jobs
        if job.enabled and now >= job.next_due_at:
            run_job(job)
            job.last_run_at = now
            job.next_due_at = compute_next_due(job)
```

`compute_next_due(job)` respects `window_start`/`window_end`: if the computed time falls outside the window, it schedules the next window start instead. If a run was missed, the job runs immediately on the next tick and is then rescheduled for the next window.

## Gotchas

- Job windows are interpreted as UTC (based on stored ISO timestamps), so `--window-start 10:00` means 10:00 UTC, not local time.
- If the configured runner is unavailable, LLM synthesis is skipped without failing the job.
- `maintenance-daemon` is cooperative; run it via your preferred login/session manager for continuous checks.
- Launchd templates need path edits if your repo lives elsewhere.

CLI usage:

```bash
.venv/bin/python -m shared_agent_memory.cli --db data/shared-agent-memory.sqlite3 maintenance-status
.venv/bin/python -m shared_agent_memory.cli --db data/shared-agent-memory.sqlite3 maintenance-tick
.venv/bin/python -m shared_agent_memory.cli --db data/shared-agent-memory.sqlite3 maintenance-config job_consolidation_daily_10 --window-start 10:00 --recompute-next
.venv/bin/python -m shared_agent_memory.cli --db data/shared-agent-memory.sqlite3 maintenance-daemon --interval-seconds 900
```

Launchd template:

- `docs/launchd/agents-database-maintenance.plist`

Linux systemd user service:

- `systemd/user/agents-database-maintenance.service`
- install with `ln -sf /mnt/rpi/agents-database/systemd/user/agents-database-maintenance.service /home/sebas/.config/systemd/user/agents-database-maintenance.service`
- enable with `systemctl --user daemon-reload && systemctl --user enable --now agents-database-maintenance.service`
- requires `loginctl enable-linger sebas` if the machine should restart the user service after reboot without waiting for login

## Progress Log (living)

- 2026-03-26: RFC drafted with opportunistic scheduler + consolidation pipeline.
- 2026-03-26: Implemented maintenance tables, scheduler hooks, and consolidation runner (v1).
- 2026-03-27: Added maintenance config + daemon CLI commands.
- 2026-03-30: Fixed the maintenance daemon wrapper to run from a repo checkout by exporting `PYTHONPATH` and preferring `./.venv/bin/python`.
- 2026-03-30: Added and enabled a Linux `systemd --user` service with `Restart=always`; verified auto-restart after kill and confirmed `Linger=yes` for reboot persistence.

## Open Questions

- Whether to run the engine as a long-lived background process or on-demand via CLI.
- Which storage to use for LLM usage counters (likely `maintenance_runs.stats_json`).

## Proposed Next Steps

1. Add scheduler hook coverage beyond the daemon path if we want startup/write-triggered opportunistic runs inside normal service usage.
2. Add tests for daemon/bootstrap operation and job catch-up across missed windows.
3. Add LLM budget counters per run/day to `maintenance_runs.stats_json` or a dedicated table.
4. Decide whether to keep Linux `systemd --user` as the primary runtime and treat launchd as secondary documentation.
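
As a possible starting point for the catch-up tests in step 2, the next-due computation described under Scheduler Details could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the implemented code: the `Job` dataclass, the `_parse_hhmm` and `_clamp_to_window` helpers, and the extra `now` parameter on `compute_next_due` are hypothetical and exist only to make the example testable. It also assumes windows are `HH:MM` strings in UTC and that `window_start <= window_end` (no overnight windows).

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta, timezone
from typing import Optional


@dataclass
class Job:
    # Hypothetical in-memory view of a maintenance_jobs row.
    cadence: str                         # "daily" | "weekly" | "interval"
    next_due_at: datetime
    interval_minutes: Optional[int] = None
    window_start: Optional[str] = None   # "HH:MM", assumed UTC
    window_end: Optional[str] = None     # "HH:MM", assumed UTC


def _parse_hhmm(value: str) -> time:
    hour, minute = value.split(":")
    return time(int(hour), int(minute))


def _clamp_to_window(job: Job, candidate: datetime) -> datetime:
    """Move `candidate` to the next window start if it falls outside the window."""
    if not job.window_start or not job.window_end:
        return candidate
    start = _parse_hhmm(job.window_start)
    end = _parse_hhmm(job.window_end)
    if start <= candidate.time() <= end:
        return candidate
    next_start = candidate.replace(
        hour=start.hour, minute=start.minute, second=0, microsecond=0
    )
    if next_start <= candidate:
        # Today's window has already closed; use tomorrow's window start.
        next_start += timedelta(days=1)
    return next_start


def compute_next_due(job: Job, now: datetime) -> datetime:
    """Return the first due time strictly after `now`."""
    if job.cadence == "interval":
        if not job.interval_minutes:
            raise ValueError("interval cadence needs interval_minutes")
        candidate = now + timedelta(minutes=job.interval_minutes)
    else:
        step = timedelta(days=7 if job.cadence == "weekly" else 1)
        candidate = job.next_due_at
        # Skip every slot missed while the machine was offline, so a single
        # catch-up run stands in for the whole backlog.
        while candidate <= now:
            candidate += step
    return _clamp_to_window(job, candidate)
```

Under these assumptions, a daily job that was due at 2026-03-26 10:00 UTC and is first ticked at 2026-03-29 14:30 UTC runs once as a catch-up, and the sketch reschedules it for 2026-03-30 10:00 UTC rather than queuing one run per missed day. Whether a backlog should collapse into one run or replay each missed slot is a design choice the RFC does not settle; the collapse shown here is only one option.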