Preflight Checklist
What's Wrong?
Claude Code wrote and executed a SQL script to clean job titles in a PostgreSQL
table (manager_roles, ~24,475 rows representing 3 days of web scraping).
The script used PostgreSQL's regexp_match() with capture groups in the regex
pattern. Because capture groups were present, regexp_match() returned the
captured subgroup content instead of the full match — causing it to return NULL
for nearly every row. The script then deleted all rows where the extracted title
was NULL, wiping 24,472 of 24,475 rows.
The transaction committed before I noticed. I immediately disabled autovacuum
on the table and attempted recovery with pg_dirtyread, but autovacuum had
already run (timestamp: 2026-05-06 15:06:12 UTC) and cleaned the dead tuples
before the extension could be installed.
Impact: 24,472 rows permanently deleted. ~3 days of scraping work lost. I feel like I should be compensated on credits for this MAJOR loss.
What Should Happen?
- Tested the regex on a small SELECT before writing a DELETE
- Shown a row count and asked for confirmation before committing
- Wrapped the script in a transaction with an explicit ROLLBACK safety check
Error Messages/Logs
Steps to Reproduce
-
Have a PostgreSQL table with a numeric(3,2) column and ~24,000 rows
-
Ask Claude Code to clean a text column using regex extraction
-
Claude writes and executes the following SQL script without testing first:
ALTER TABLE manager_roles ADD COLUMN IF NOT EXISTS _clean_title TEXT;
UPDATE manager_roles
SET _clean_title = (regexp_match(job_title,
'Executive Vice President and (Chief Executive Officer|CEO)'
'|Chief Executive Officer'
'|CEO|CFO|...',
'i'
))[1];
DELETE FROM manager_roles WHERE _clean_title IS NULL;
-- (followed by dedup DELETE, UPDATE, DROP COLUMN, COMMIT)
-
Because the regex pattern contains capture groups e.g. (Chief Executive
Officer|CEO), PostgreSQL's regexp_match() returns the captured subgroup
content rather than the full match — returning NULL for nearly every row
-
DELETE FROM manager_roles WHERE _clean_title IS NULL wipes 24,472 of
24,475 rows
-
COMMIT runs. Transaction is permanent.
-
Claude attempts pg_dirtyread recovery but autovacuum had already run
and cleaned dead tuples before the extension could be installed.
-
Data is permanently lost.
Claude Model
Sonnet (default)
Is this a regression?
I don't know
Last Working Version
No response
Claude Code Version
Claude Sonnet 4.6
Platform
Anthropic API
Operating System
macOS
Terminal/Shell
Terminal.app (macOS)
Additional Information
No response
Preflight Checklist
What's Wrong?
Claude Code wrote and executed a SQL script to clean job titles in a PostgreSQL
table (manager_roles, ~24,475 rows representing 3 days of web scraping).
The script used PostgreSQL's regexp_match() with capture groups in the regex
pattern. Because capture groups were present, regexp_match() returned the
captured subgroup content instead of the full match — causing it to return NULL
for nearly every row. The script then deleted all rows where the extracted title
was NULL, wiping 24,472 of 24,475 rows.
The transaction committed before I noticed. I immediately disabled autovacuum
on the table and attempted recovery with pg_dirtyread, but autovacuum had
already run (timestamp: 2026-05-06 15:06:12 UTC) and cleaned the dead tuples
before the extension could be installed.
Impact: 24,472 rows permanently deleted. ~3 days of scraping work lost. I feel like I should be compensated on credits for this MAJOR loss.
What Should Happen?
Error Messages/Logs
Steps to Reproduce
Have a PostgreSQL table with a numeric(3,2) column and ~24,000 rows
Ask Claude Code to clean a text column using regex extraction
Claude writes and executes the following SQL script without testing first:
ALTER TABLE manager_roles ADD COLUMN IF NOT EXISTS _clean_title TEXT;
UPDATE manager_roles
SET _clean_title = (regexp_match(job_title,
'Executive Vice President and (Chief Executive Officer|CEO)'
'|Chief Executive Officer'
'|CEO|CFO|...',
'i'
))[1];
DELETE FROM manager_roles WHERE _clean_title IS NULL;
-- (followed by dedup DELETE, UPDATE, DROP COLUMN, COMMIT)
Because the regex pattern contains capture groups e.g. (Chief Executive
Officer|CEO), PostgreSQL's regexp_match() returns the captured subgroup
content rather than the full match — returning NULL for nearly every row
DELETE FROM manager_roles WHERE _clean_title IS NULL wipes 24,472 of
24,475 rows
COMMIT runs. Transaction is permanent.
Claude attempts pg_dirtyread recovery but autovacuum had already run
and cleaned dead tuples before the extension could be installed.
Data is permanently lost.
Claude Model
Sonnet (default)
Is this a regression?
I don't know
Last Working Version
No response
Claude Code Version
Claude Sonnet 4.6
Platform
Anthropic API
Operating System
macOS
Terminal/Shell
Terminal.app (macOS)
Additional Information
No response