--- description: Database migration safety rules for backend services paths: - "**/*migrations/**" - "**/db/changelog/**" - "**/db/migrate/**" - "**/*.sql" - "**/*changelog*.xml" - "**/*changelog*.yaml" - "**/*migration*" --- ## Database Migration Safety Apply these rules whenever the diff includes migration files: SQL files, Liquibase changesets, Sequelize migration files (`.js`, `.ts`, `.cjs` under a `migrations/` or `*-migrations/` directory), jOOQ-related schema scripts, or any file under a `*migrations/` directory, `db/changelog/`, `db/migrate/`, or matching the pattern `*migration*` / `*changelog*.xml` / `*changelog*.yaml`. ### Blocking patterns to flag **[Important] CONCURRENTLY inside an EXPLICIT transaction** `CREATE INDEX CONCURRENTLY` and `DROP INDEX CONCURRENTLY` cannot run inside a transaction — PostgreSQL raises a runtime ERROR and the migration aborts. **Do NOT assume the migration runner wraps the migration in a transaction.** Sequelize/Umzug migrations run in **autocommit by default** (they are NOT wrapped in an implicit transaction), so a bare `CREATE INDEX CONCURRENTLY` / `DROP INDEX CONCURRENTLY` inside a Sequelize/Umzug migration is correct — do **not** flag it on the theory that the runner will wrap it, and do **not** claim it will fail on a fresh/clean database. Only flag a CONCURRENTLY statement when a wrapping transaction is **visible in the migration itself**: an explicit `queryInterface.sequelize.transaction(...)` / `await sequelize.transaction(...)` envelope around the statement, a raw `BEGIN` … `COMMIT`, or (Liquibase) a changeSet missing `runInTransaction="false"`. The fix is to drop that explicit wrapping (Liquibase: add `runInTransaction="false"`). If the Sequelize/Umzug runner is ever reconfigured to wrap each migration in a transaction, update this rule. **[Critical] CREATE INDEX / DROP INDEX without CONCURRENTLY on a large table** `CREATE INDEX` or `DROP INDEX` (without `CONCURRENTLY`) holds an ACCESS EXCLUSIVE lock (DROP) or SHARE lock (CREATE) on the table for the entire operation, blocking all reads and/or writes. This does NOT fail — it runs silently to completion while production traffic is blocked. On a table with tens of millions of rows the lock can last minutes or hours, causing user-visible downtime. It passes unnoticed in dev because local tables are small. Prefer `CREATE INDEX CONCURRENTLY` / `DROP INDEX CONCURRENTLY` — but only when the migration runs entirely outside a transaction (see the CONCURRENTLY rule above). In Liquibase, pair it with `runInTransaction="false"`. In Sequelize, confirm the migration does not use `queryInterface.sequelize.transaction()`. Flag any plain `CREATE INDEX` or `DROP INDEX` on a table that could be large. **[Critical] ADD COLUMN NOT NULL with DEFAULT on a potentially large table** `ALTER TABLE t ADD COLUMN c INT NOT NULL DEFAULT 0` causes a full table rewrite on PostgreSQL < 11 (ACCESS EXCLUSIVE lock held for the entire rewrite). On PG 11+ it is safe for non-volatile constant defaults, but still dangerous on hot tables with hundreds of millions of rows (lock blocks all concurrent reads/writes during the catalog update, and the backfill spikes replication lag). Use the 3-step pattern instead (see below). **[Important] ALTER COLUMN … SET NOT NULL on a large existing table** `ALTER TABLE t ALTER COLUMN c SET NOT NULL` takes an ACCESS EXCLUSIVE lock and performs a **full table scan** to verify no existing NULLs — blocking all reads and writes for the scan duration. There is no rewrite, but on a large table the scan alone causes user-visible stalls. On PostgreSQL 12+, avoid the blocking scan: (1) `ALTER TABLE t ADD CONSTRAINT c_not_null CHECK (c IS NOT NULL) NOT VALID;` (instant — `NOT VALID` skips the scan), (2) `ALTER TABLE t VALIDATE CONSTRAINT c_not_null;` (scans under SHARE UPDATE EXCLUSIVE — does NOT block reads/writes), (3) `ALTER TABLE t ALTER COLUMN c SET NOT NULL;` (now instant — Postgres trusts the validated constraint), then optionally drop the redundant CHECK. Flag a bare `SET NOT NULL` on a table that could be large. **[Important] Large UPDATE or DELETE without batching** A single `UPDATE t SET …` or `DELETE FROM t WHERE …` that touches millions of rows holds a write lock for the entire duration and causes replication lag spikes. Break into batches with one commit per batch — not one giant transaction wrapping all batches. Do not speculate about the migration runner's transaction behavior. Only flag a batched backfill as "one giant transaction" when the wrapping transaction is visible IN THE DIFF (an explicit `await sequelize.transaction(async (t) => …)` envelope around the loop, a Liquibase changeSet without `runInTransaction="false"`, etc.). The common Sequelize/Umzug pattern `while (hasMore) { await queryInterface.sequelize.query(\`UPDATE … LIMIT n\`); }` runs each batch as autocommit by default — each query commits and releases its row locks before the next iteration. Treat it as correctly batched unless the migration itself shows explicit transactional wrapping. **[Important] ALTER TABLE … SET DATA TYPE** Changing a column's type rewrites the entire table under ACCESS EXCLUSIVE lock. Prefer: add a new column, batch-backfill, swap at the application layer, then drop the old column. **[Important] ADD FOREIGN KEY without NOT VALID** `ALTER TABLE t ADD CONSTRAINT fk FOREIGN KEY … REFERENCES` performs a full table scan under SHARE ROW EXCLUSIVE lock. On large tables, use `NOT VALID` + a separate `VALIDATE CONSTRAINT` (weaker lock, interruptible): ```sql ALTER TABLE t ADD CONSTRAINT fk FOREIGN KEY (col) REFERENCES other(id) NOT VALID; ALTER TABLE t VALIDATE CONSTRAINT fk; ``` **[Important] DROP COLUMN, DROP TABLE, RENAME TABLE, or RENAME COLUMN before application code is redeployed** Dropping or renaming a column or table before the code that references it is out of production causes immediate runtime errors (`column not found`, `table not found`, `relation not found`). RENAME is just as dangerous as DROP — the old name disappears instantly. These migrations must run AFTER the application deploy that removes or updates all references to the old name. ### Recommended 3-step pattern for NOT NULL + DEFAULT Whenever the diff adds a column that will eventually be NOT NULL, recommend this pattern: 1. **Add nullable column** — `ALTER TABLE t ADD COLUMN c TYPE;` — fast, no lock, no rewrite. 2. **Batched backfill** — update rows in small batches (e.g. 1 000–10 000 rows), committing after each batch so the write lock is released between iterations: ```sql DO $$ DECLARE lo BIGINT; hi BIGINT; BEGIN FOR lo IN SELECT generate_series(0, (SELECT MAX(id) FROM t), 5000) LOOP hi := lo + 4999; UPDATE t SET c = WHERE id BETWEEN lo AND hi; COMMIT; END LOOP; END $$; ``` 3. **Constrain** — once all rows are populated, set the default and enforce NOT NULL. `ALTER COLUMN c SET DEFAULT ` is instant, but `ALTER COLUMN c SET NOT NULL` triggers a full table scan under ACCESS EXCLUSIVE (see the SET NOT NULL rule above) — on a large table, use the `CHECK (c IS NOT NULL) NOT VALID` → `VALIDATE CONSTRAINT` → `SET NOT NULL` sequence to avoid the blocking scan.