
[BUG] Prompt cache regression in --print --resume since v2.1.69(?): cache_read never grows, ~20x cost increase #34629

@cinniezra

Description


Preflight Checklist

  • I have searched existing issues and this hasn't been reported yet
  • This is a single bug report (please file separate reports for different bugs)
  • I am using the latest version of Claude Code

What's Wrong?

Summary

Starting around v2.1.69, --print --resume sessions no longer cache conversation turns between API calls. Only Claude Code's internal system prompt (~14.5k tokens) is cached; all conversation history is cache_created from scratch on every message. This causes a ~20x cost increase per message compared to v2.1.68.

Environment

  • Platform: Ubuntu (Hetzner VPS)
  • Use case: Discord bot using claude --print --model <model> --resume <session-id> --output-format stream-json --verbose with prompts piped via stdin
  • Tested models: claude-opus-4-6[1m], opus, claude-opus-4-5-20251101

The regression is version-dependent, not model-dependent.

Suspect

A change introduced after v2.1.68 appears to have inadvertently broken cache breakpoint placement for --print --resume sessions.

Workaround

Pinned to v2.1.68 (npm install -g @anthropic-ai/claude-code@2.1.68).

What Should Happen?

Expected behavior (v2.1.68)

cache_read grows as the conversation accumulates, and cache_create drops to a small per-turn delta (~800 tokens):

Message 1: cache_read=13,997  cache_create=22,946  cost=$0.15  (cold start)
Message 2: cache_read=32,849  cache_create=4,636   cost=$0.05
Message 3: cache_read=36,846  cache_create=879     cost=$0.03
Message 4: cache_read=37,295  cache_create=802     cost=$0.02

Actual behavior (v2.1.76; likely all versions after v2.1.68)

cache_read is stuck at ~14.5k (Claude Code's system prompt only), while cache_create covers the full conversation and grows with every message:

Message 1: cache_read=14,569  cache_create=54,437  cost=$0.35
Message 2: cache_read=14,569  cache_create=55,084  cost=$0.35
Message 3: cache_read=14,569  cache_create=55,512  cost=$0.35
Message 4: cache_read=14,569  cache_create=55,733  cost=$0.36
Message 5: cache_read=14,569  cache_create=55,954  cost=$0.36

The conversation turns are never reused from cache between calls. Only Claude Code's internal system prompt (~14.5k tokens) caches successfully.
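The difference between the two behaviors can be checked mechanically. The sketch below (an illustration of the symptom, not Claude Code internals) flags a session as unhealthy when cache_read stops growing across messages, using the token counts from the logs above:

```python
def caching_healthy(per_message_usage):
    """per_message_usage: list of (cache_read, cache_create) tuples, in order.

    When prompt caching works, cache_read grows every message as prior
    turns are reused; when it is broken, cache_read stays flat while
    cache_create keeps growing with the full conversation size.
    """
    reads = [read for read, _ in per_message_usage]
    return all(later > earlier for earlier, later in zip(reads, reads[1:]))

# Token counts taken from the v2.1.68 and v2.1.76 logs above:
good = [(13997, 22946), (32849, 4636), (36846, 879), (37295, 802)]   # v2.1.68
bad  = [(14569, 54437), (14569, 55084), (14569, 55512)]              # v2.1.76

print(caching_healthy(good))  # True
print(caching_healthy(bad))   # False
```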

Error Messages/Logs

Testing matrix

All tests used fresh session UUIDs and back-to-back messages (well within the 5-minute cache TTL):

| Version | Model | Context | cache_read grows? | Steady-state cost/msg |
|---------|-------|---------|-------------------|----------------------|
| 2.1.68 | `opus` | 200k | **Yes** | ~$0.02 |
| 2.1.68 | `claude-opus-4-6[1m]` | 1M | **Yes** | ~$0.02 |
| 2.1.76 | `opus` | 200k | **No (stuck at 14.5k)** | ~$0.04-0.40 (grows) |
| 2.1.76 | `claude-opus-4-6[1m]` | 1M | **No (stuck at 14.5k)** | ~$0.35-0.40 |
| 2.1.76 | `claude-opus-4-5-20251101` | 200k | **No (stuck at 14.5k)** | ~$0.04-0.40 (grows) |

Steps to Reproduce

Reproduction

  1. Run claude --print --resume <session-id> --output-format stream-json --verbose with a prompt via stdin
  2. Send 3+ messages to the same session
  3. Observe cache_read_input_tokens and cache_creation_input_tokens in the stream-json result output
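To automate step 3, the usage counters can be pulled out of the stream-json output. A minimal sketch, assuming the final result event is a JSON line with a "usage" object containing the two fields named above (the exact event shape may differ between versions):

```python
import json

def cache_usage(stream_json_lines):
    """Return (cache_read, cache_create) from Claude Code stream-json output.

    Scans JSON lines for a "result" event and reads its "usage" object;
    field names follow the --verbose --output-format stream-json logs above.
    """
    for line in stream_json_lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if event.get("type") == "result":
            usage = event.get("usage", {})
            return (usage.get("cache_read_input_tokens", 0),
                    usage.get("cache_creation_input_tokens", 0))
    return None

# Fabricated result event shaped like the logged v2.1.76 numbers:
sample = ['{"type":"result","usage":'
          '{"cache_read_input_tokens":14569,"cache_creation_input_tokens":54437}}']
print(cache_usage(sample))  # → (14569, 54437)
```

Running this per message and comparing cache_read across calls reproduces the table in the Error Messages/Logs section.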

Claude Model

Opus

Is this a regression?

Yes, this worked in a previous version

Last Working Version

2.1.68

Claude Code Version

2.1.76

Platform

Other

Operating System

Ubuntu/Debian Linux

Terminal/Shell

Other

Additional Information

This report (including the testing matrix) was written by Claude Code during a debugging session.
