Summary
Tokscale 2.1.0 still appears to double-count Codex native subagent usage when a forked child JSONL preserves/copies parent context before the child's own turn_context.
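This is related to, but distinct from, #490 / #496. Those cover token_count events replayed before turn_context in resumed sessions. This report covers forked subagent sessions, where session_meta.payload.forked_from_id is present and the child log contains inherited parent records before the child's first turn_context.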
Environment
- tokscale 2.1.0
- Codex CLI 0.128.0
- macOS
- Codex sessions under ~/.codex/sessions/...
Observed impact
Four Codex native subagent sessions in the same hour showed a tokscale hourly spike of roughly:
cache read: ~463M
final child JSONL totals summed: ~471.6M tokens
Drilling into raw JSONL showed most of that was inherited parent context copied into each child log, not child-local work.
For the four affected child JSONLs:
sum(child final total_token_usage.total_tokens): 471,647,952
sum(last inherited parent total before turn_context): 466,596,351
rough post-fork cumulative delta: 5,051,601
A parser that skips inherited pre-child records and sums deduped last_token_usage after the child turn_context reports the four children at roughly:
non-cached input: 547,572
output: 22,276
cache read: 4,779,520
billed-style total excluding reasoning/cache write: 5,349,368
So the current rollup can overstate these child sessions by almost two orders of magnitude (roughly 88x: ~471.6M reported versus ~5.35M child-local).
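The arithmetic behind these figures can be checked with a short script. This is only a restatement of the numbers quoted above; the variable names are illustrative and nothing here is measured independently.

```python
# Figures copied from the report above.
child_final_total = 471_647_952       # sum of final total_token_usage.total_tokens
inherited_parent_total = 466_596_351  # sum of last inherited parent totals before turn_context

# Cumulative usage added after the fork point.
post_fork_delta = child_final_total - inherited_parent_total

# Child-local usage reported by a parser that skips inherited records.
non_cached_input = 547_572
output = 22_276
cache_read = 4_779_520
child_local_total = non_cached_input + output + cache_read

# How much the naive child total overstates the child-local total.
overstatement = child_final_total / child_local_total

print(post_fork_delta)          # 5051601
print(child_local_total)        # 5349368
print(round(overstatement, 1))  # 88.2
```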
Diagnostic shape
The affected child file begins like this structurally:
{"type":"session_meta","payload":{"id":"child-session","forked_from_id":"parent-session","source":{"subagent":{"thread_spawn":{"parent_thread_id":"parent-session","depth":1}}},"model_provider":"openai", ...}}
... inherited parent event_msg/user_message/agent_message/token_count records ...
{"type":"session_meta","payload":{"id":"parent-session", ...}}
... more inherited parent records, including token_count ...
{"type":"turn_context","payload":{"model":"gpt-5.5", ...}}
{"type":"event_msg","payload":{"type":"user_message","message":"child-local prompt"}}
{"type":"event_msg","payload":{"type":"token_count","info":{"total_token_usage":{...},"last_token_usage":{...}}}}
Important detail: in the real trace, hundreds of inherited parent token_count records appear before the child's first turn_context. They are not merely model-less rows to buffer and later attribute. They are inherited parent history and should be excluded from the child-local usage stream.
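To check whether a given child log has this shape, one option is to count the token_count records that appear before the first turn_context. The helper below is a minimal sketch, not part of tokscale; the function name is hypothetical and it assumes one JSON object per line, as in the structure shown above.

```python
import json
from typing import Iterable


def count_pre_turn_context_token_counts(lines: Iterable[str]) -> int:
    """Count token_count events that precede the first turn_context line."""
    count = 0
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip truncated or partially written lines
        if not isinstance(entry, dict):
            continue
        if entry.get("type") == "turn_context":
            break  # everything after this point is treated as child-local
        payload = entry.get("payload")
        if (
            entry.get("type") == "event_msg"
            and isinstance(payload, dict)
            and payload.get("type") == "token_count"
        ):
            count += 1
    return count


# Tiny stand-in for a forked child log (contents are illustrative).
sample = [
    '{"type":"session_meta","payload":{"id":"child-session","forked_from_id":"parent-session"}}',
    '{"type":"event_msg","payload":{"type":"token_count","info":{}}}',
    '{"type":"session_meta","payload":{"id":"parent-session"}}',
    '{"type":"event_msg","payload":{"type":"token_count","info":{}}}',
    '{"type":"turn_context","payload":{"model":"gpt-5.5"}}',
    '{"type":"event_msg","payload":{"type":"token_count","info":{}}}',
]
print(count_pre_turn_context_token_counts(sample))  # 2
```

For a real session file, an open file handle can be passed in place of the list. A non-zero count in a file whose first session_meta carries forked_from_id suggests the inherited-history shape described here.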
Why #496 is not sufficient
PR #496 buffers model-less Codex token_count rows until a later turn_context, then flushes them with the model. That is right for resumed-session attribution.
For forked subagent logs, buffering the pre-turn_context inherited rows and assigning them to the child model still counts parent history inside the child.
Current main has a forked-history test, but the fixture has child turn_context before the repeated parent-like token row. The real failure shape is stronger: copied parent records occur before the child turn_context, immediately after a forked session_meta.
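The effect can be illustrated with a deliberately simplified model of buffer-and-flush. This is not the code from #496; the function name and the event tuples are assumptions made only to show why buffering alone attributes inherited rows to the child model.

```python
def buffer_and_flush(events):
    """Simplified model: hold model-less token rows, flush them at turn_context."""
    pending, emitted, model = [], [], None
    for kind, value in events:
        if kind == "turn_context":
            model = value
            # Buffered rows are attributed to the model that just became known.
            emitted.extend((model, tokens) for tokens in pending)
            pending.clear()
        elif kind == "token_count":
            if model is None:
                pending.append(value)
            else:
                emitted.append((model, value))
    return emitted


events = [
    ("token_count", 73_500),      # inherited parent row copied before the child's turn_context
    ("turn_context", "gpt-5.5"),
    ("token_count", 1_700),       # child-local row
]
rows = buffer_and_flush(events)
print(rows)  # [('gpt-5.5', 73500), ('gpt-5.5', 1700)]
```

In this model the inherited 73,500 tokens end up counted against the child's model, which is the behavior this report argues is wrong for forked logs.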
Suggested fix
When parsing a Codex JSONL file:
- Detect forked child session metadata: entry.type == "session_meta" and payload.forked_from_id is present.
- Preserve child session metadata: id, provider, cwd/workspace, agent nickname, forked_from_id.
- Suppress inherited records until the child's first turn_context:
  if forked_child_session && !seen_child_turn_context:
    ignore event_msg / response_item token and conversation records
    also ignore copied parent session_meta without forked_from_id
- Start normal token parsing at the first child turn_context and count subsequent token_count rows using the existing last_token_usage / dedup logic.
This is what fixed the same trace in a local Gaal parser patch. After reindexing the four affected child rows, Gaal changed from ~471.6M apparent child tokens to ~5.35M child-local tokens.
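The suppression step in the suggested fix can be sketched as follows. This is a minimal Python illustration, not tokscale's parser and not the Gaal patch: the function name is hypothetical, dedup and model-less buffering are left out, and it assumes the first turn_context after a forked session_meta belongs to the child, as in the trace shape above.

```python
import json
from typing import Iterable, Iterator


def child_local_usage(lines: Iterable[str]) -> Iterator[dict]:
    """Yield last_token_usage records that belong to the session itself."""
    forked_child = False
    seen_turn_context = False
    model = None
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        entry = json.loads(raw)
        kind = entry.get("type")
        payload = entry.get("payload") or {}
        if kind == "session_meta":
            # Only a session_meta carrying forked_from_id marks a forked child;
            # a copied parent session_meta (no forked_from_id) is ignored.
            if payload.get("forked_from_id"):
                forked_child = True
            continue
        if kind == "turn_context":
            seen_turn_context = True
            model = payload.get("model", model)
            continue
        if forked_child and not seen_turn_context:
            continue  # inherited parent history: not child-local usage
        if kind == "event_msg" and payload.get("type") == "token_count":
            usage = (payload.get("info") or {}).get("last_token_usage")
            if usage:
                yield {"model": model, **usage}


demo = [
    json.dumps({"type": "session_meta", "payload": {"id": "child-session", "forked_from_id": "parent-session"}}),
    json.dumps({"type": "event_msg", "payload": {"type": "token_count", "info": {"last_token_usage": {"total_tokens": 73500}}}}),
    json.dumps({"type": "turn_context", "payload": {"model": "gpt-5.5"}}),
    json.dumps({"type": "event_msg", "payload": {"type": "token_count", "info": {"last_token_usage": {"total_tokens": 1700}}}}),
]
print(list(child_local_usage(demo)))  # [{'model': 'gpt-5.5', 'total_tokens': 1700}]
```

For sessions without forked_from_id the sketch leaves pre-turn_context rows alone (they are yielded with model set to None), so the resumed-session handling from #496 would still be needed alongside it.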
Minimal regression fixture
A compact test should ensure the inherited usage is not emitted:
{"timestamp":"2026-05-05T21:51:57.991Z","type":"session_meta","payload":{"id":"child-session","forked_from_id":"parent-session","cwd":"/repo","cli_version":"0.128.0"}}
{"timestamp":"2026-05-05T21:51:57.992Z","type":"session_meta","payload":{"id":"parent-session","cwd":"/repo","cli_version":"0.128.0"}}
{"timestamp":"2026-05-05T21:51:57.993Z","type":"event_msg","payload":{"type":"user_message","message":"parent prompt copied into child log"}}
{"timestamp":"2026-05-05T21:51:57.994Z","type":"event_msg","payload":{"type":"token_count","info":{"total_token_usage":{"input_tokens":116000,"cached_input_tokens":114000,"output_tokens":1000,"total_tokens":117000},"last_token_usage":{"input_tokens":73000,"cached_input_tokens":72000,"output_tokens":500,"total_tokens":73500}}}}
{"timestamp":"2026-05-05T21:51:58.947Z","type":"turn_context","payload":{"model":"gpt-5.5","cwd":"/repo"}}
{"timestamp":"2026-05-05T21:51:58.948Z","type":"event_msg","payload":{"type":"user_message","message":"child-local prompt"}}
{"timestamp":"2026-05-05T21:51:59.253Z","type":"event_msg","payload":{"type":"token_count","info":{"total_token_usage":{"input_tokens":117500,"cached_input_tokens":115000,"output_tokens":1200,"total_tokens":118700},"last_token_usage":{"input_tokens":1500,"cached_input_tokens":1000,"output_tokens":200,"reasoning_output_tokens":50,"total_tokens":1700}}}}
Expected parsed token row after cache split:
input: 500
cacheRead: 1000
output: 200
reasoning: 50
model: gpt-5.5
The inherited parent last_token_usage of 73000/72000/500 should not be counted in the child session at all.
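The expected row follows from splitting the child's last_token_usage into non-cached input and cache reads. The sketch below shows that split for the fixture's final token_count record. The function name is hypothetical, and it assumes cached_input_tokens is a subset of input_tokens, which is consistent with the expected values above.

```python
def split_usage(last_token_usage: dict, model: str) -> dict:
    """Split a last_token_usage record into non-cached input and cache reads."""
    total_input = last_token_usage.get("input_tokens", 0)
    cached = last_token_usage.get("cached_input_tokens", 0)
    return {
        "input": max(total_input - cached, 0),  # non-cached portion only
        "cacheRead": cached,
        "output": last_token_usage.get("output_tokens", 0),
        "reasoning": last_token_usage.get("reasoning_output_tokens", 0),
        "model": model,
    }


# last_token_usage from the child-local token_count record in the fixture.
child_row = split_usage(
    {
        "input_tokens": 1500,
        "cached_input_tokens": 1000,
        "output_tokens": 200,
        "reasoning_output_tokens": 50,
        "total_tokens": 1700,
    },
    "gpt-5.5",
)
print(child_row)
# {'input': 500, 'cacheRead': 1000, 'output': 200, 'reasoning': 50, 'model': 'gpt-5.5'}
```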