
feat(proxy): add 200+error body detection and extra inputs rectifier#2433

Open
NKU100 wants to merge 9 commits into farion1231:main from NKU100:main

Conversation


@NKU100 NKU100 commented Apr 28, 2026

Summary

  • Add an extra inputs rectifier module that auto-strips unsupported API fields, with per-provider caching (1-hour TTL). It detects "extra inputs are not permitted" errors and, when error messages don't specify which fields are unsupported, falls back to checking the request body for known Anthropic-only fields (context_management, anthropic_beta, output_config)
  • Detect non-streaming 200 responses that contain an error key and reclassify them as UpstreamError, enabling rectifiers and failover to trigger
  • Decompress gzip response body before validity check in hyper path (handles proxies that strip gzip wrapper but leave content-encoding header)
  • Skip gzip decompression when magic bytes are absent (prevents ZlibError from misleading content-encoding headers)

Test plan

  • cargo clippy -- -D warnings passes
  • cargo fmt --check passes
  • All unit tests pass (1053+ tests)
  • Verified live with Xiaomi MiFE API returning gzip-compressed 200 responses
  • Verified live with a third-party provider returning 200 + {"error":...} — error-key detection → extra inputs rectifier chain triggers correctly
  • CI passes on GitHub Actions

NKU100 added 3 commits April 28, 2026 23:10
… fields

When a provider returns "Extra inputs are not permitted" errors, the
rectifier extracts the unsupported field names from the error message,
caches them per-provider (1-hour TTL), strips the fields, retries the
request, and pre-filters subsequent requests from the same provider.

Includes settings panel toggle (en/zh/ja) and retry guard to prevent
infinite loops.
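A rough sketch of that extraction step. The exact upstream message format isn't shown in this PR, so the pydantic-style "body.<field>: Extra inputs are not permitted" shape and the function name below are assumptions, not the project's actual code:

```rust
// Hedged sketch: pull offending field names out of pydantic-style
// "Extra inputs are not permitted" error text. The message shape assumed
// here is "body.<field>: Extra inputs are not permitted"; the rectifier
// in this PR may parse a different format.
fn extract_unsupported_fields(error_message: &str) -> Vec<String> {
    error_message
        .lines()
        .filter(|line| line.contains("Extra inputs are not permitted"))
        .filter_map(|line| {
            // The part before the colon is the field path,
            // e.g. "body.context_management".
            let path = line.split(':').next()?.trim();
            // Keep only the final path segment; the char-array pattern is
            // the form clippy::manual_pattern_char_comparison asks for.
            path.rsplit(['.', '/']).next().map(str::to_string)
        })
        .collect()
}

fn main() {
    let msg = "body.context_management: Extra inputs are not permitted\n\
               body.output_config: Extra inputs are not permitted";
    println!("{:?}", extract_unsupported_fields(msg));
}
```

The extracted names would then be cached under the provider key and stripped from the retried request body.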
Remove unused `extracted_fields` field and `insert` method, use char
array pattern in `rsplit` to satisfy clippy::manual_pattern_char_comparison.
…elds

Non-streaming 200 responses lacking expected keys (content/choices/output/
candidates) are now reclassified as UpstreamError, triggering rectifiers and
failover. When the extra-inputs rectifier fires but cannot extract field
names from the error message, it falls back to stripping known Anthropic-only
fields (context_management, anthropic_beta, output_config) present in the
request body.
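The fallback path can be sketched like this, with a plain HashMap standing in for the parsed serde_json request body (the function name and shape are illustrative, not the project's actual code):

```rust
use std::collections::HashMap;

// Known Anthropic-only keys, per the commit message above.
const ANTHROPIC_ONLY_FIELDS: [&str; 3] =
    ["context_management", "anthropic_beta", "output_config"];

// Hedged sketch of the fallback: when the error message names no fields,
// strip the fixed list of Anthropic-only keys from the (parsed) request
// body. A HashMap stands in for the real serde_json object.
fn strip_anthropic_only_fields(body: &mut HashMap<String, String>) -> Vec<String> {
    ANTHROPIC_ONLY_FIELDS
        .iter()
        .filter(|field| body.remove(**field).is_some()) // only keep fields actually present
        .map(|field| field.to_string())
        .collect()
}

fn main() {
    let mut body = HashMap::from([
        ("model".to_string(), "claude".to_string()),
        ("context_management".to_string(), "{}".to_string()),
    ]);
    let removed = strip_anthropic_only_fields(&mut body);
    println!("removed: {removed:?}");
}
```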
@farion1231
Owner

@codex review


@chatgpt-codex-connector (Bot) left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 5b68a192c0

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread: src-tauri/src/proxy/forwarder.rs (Outdated), lines +1982 to +1985
obj.contains_key("content")
|| obj.contains_key("choices")
|| obj.contains_key("output")
|| obj.contains_key("candidates")


P1: Accept prompt-blocked Gemini responses as valid

is_valid_response_body only treats objects containing content/choices/output/candidates as success, so a valid Gemini generateContent 200 response that only carries promptFeedback (blocked prompt, no candidates) is reclassified as UpstreamError. In that case this code triggers unnecessary retry/failover instead of returning the provider’s safety-block result to callers, which changes behavior for normal filtered prompts.


@NKU100 (Author)

This one isn't easy to fix; I'm not sure whether adding generateContent would cover every scenario.

Comment on lines +82 to +87
let entries = self.entries.read().await;
entries
.iter()
.filter(|(k, v)| k.starts_with(&prefix) && !v.is_expired())
.map(|(k, _)| k[prefix.len()..].to_string())
.collect()

P2: Evict expired cache entries when reading blocked fields

get_blocked_fields filters out expired entries but never removes them from entries, so the cache grows monotonically in long-running processes as new provider/field pairs are learned. This violates the intended 1-hour TTL behavior in practice (expiration affects matching, not retention) and can cause avoidable memory growth over time.


@NKU100 (Author)

This is fixed in the latest commit.

NKU100 added 2 commits April 29, 2026 11:24
…ache entries

- Add "promptFeedback" to is_valid_response_body so Gemini safety-
  blocked responses are forwarded instead of misclassified as errors
- Lazily evict expired entries in ExtraInputsCache::get_blocked_fields
  to prevent unbounded memory growth in long-running processes
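A minimal synchronous sketch of that lazy-eviction behavior. The real ExtraInputsCache sits behind an async RwLock and its key scheme isn't shown here, so the `"<provider>/<field>"` keys and struct shape below are assumptions:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Hedged sketch of the per-provider TTL cache with lazy eviction on read.
// A plain HashMap stands in for the project's async-locked cache.
struct TtlCache {
    entries: HashMap<String, Instant>, // key: "<provider>/<field>", value: expiry
    ttl: Duration,
}

impl TtlCache {
    fn new(ttl: Duration) -> Self {
        Self { entries: HashMap::new(), ttl }
    }

    fn insert(&mut self, provider: &str, field: &str) {
        let key = format!("{provider}/{field}");
        self.entries.insert(key, Instant::now() + self.ttl);
    }

    /// Returns non-expired fields for a provider, evicting expired entries
    /// as a side effect of the read, so the map can't grow without bound.
    fn get_blocked_fields(&mut self, provider: &str) -> Vec<String> {
        let now = Instant::now();
        // Lazy eviction: drop anything past its expiry before matching.
        self.entries.retain(|_, expiry| *expiry > now);
        let prefix = format!("{provider}/");
        self.entries
            .keys()
            .filter(|k| k.starts_with(&prefix))
            .map(|k| k[prefix.len()..].to_string())
            .collect()
    }
}

fn main() {
    let mut cache = TtlCache::new(Duration::from_secs(3600));
    cache.insert("provider-a", "context_management");
    println!("{:?}", cache.get_blocked_fields("provider-a"));
}
```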
Replace the key-presence allowlist in is_valid_response_body with
discriminator-field matching per wire format:
- Anthropic Messages: type == "message"
- OpenAI Chat Completions: object == "chat.completion"
- OpenAI Responses API: object == "response"
- Gemini: candidates / promptFeedback (no discriminator in protocol)

This avoids misclassifying 200 bodies that happen to contain strings
like "content" or "choices" as errors, and correctly rejects error
envelopes (type == "error") and SSE chunk types.
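A hedged sketch of the discriminator matching this commit describes, with a string map standing in for the parsed serde_json object (the function shape is illustrative, not the project's actual signature):

```rust
use std::collections::HashMap;

// Hedged sketch of discriminator-field matching per wire format.
// A string map stands in for the parsed serde_json response body.
fn is_valid_response_body(obj: &HashMap<String, String>) -> bool {
    // Anthropic Messages carry a `type` discriminator.
    match obj.get("type").map(String::as_str) {
        Some("message") => return true,
        // Error envelopes (type == "error") and SSE chunk types are rejected.
        Some(_) => return false,
        None => {}
    }
    // OpenAI Chat Completions / Responses API carry an `object` discriminator.
    match obj.get("object").map(String::as_str) {
        Some("chat.completion") | Some("response") => return true,
        Some(_) => return false,
        None => {}
    }
    // Gemini has no discriminator field; fall back to key presence.
    obj.contains_key("candidates") || obj.contains_key("promptFeedback")
}

fn main() {
    let anthropic = HashMap::from([("type".to_string(), "message".to_string())]);
    println!("{}", is_valid_response_body(&anthropic));
}
```

Note that later commits in this thread replace this whitelist approach with plain error-key detection.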
@farion1231
Owner

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Already looking forward to the next diff.


@farion1231
Owner

@claude review

@claude

claude Bot commented May 2, 2026

Claude Code is working…

I'll analyze this and get back to you.

View job run

@farion1231
Owner

@claude review

(Retrying — first attempt hit a git fetch race in claude-code-action.)


@farion1231
Owner

@claude review


@farion1231
Owner

@claude review

@claude

claude Bot commented May 2, 2026

Claude finished @farion1231's task in 4m 50s


Claude PR Review

  • Read PR diff
  • Review extra_inputs_rectifier.rs
  • Review forwarder.rs (200+error body detection)
  • Verify Codex review concerns are addressed
  • Post review

Findings

Important — is_valid_response_body will reject valid Gemini :countTokens / :embedContent 200 responses (confidence: 85)

forwarder.rs:1687-1712 runs is_valid_response_body on every non-SSE 200 response. The discriminator-based check accepts only Anthropic (type=="message"), OpenAI (object=="chat.completion"|"response"), or Gemini bodies that contain candidates/promptFeedback.

But the Gemini route in server.rs:328 is a wildcard:

.route("/v1beta/*path", post(handlers::handle_gemini))
.route("/gemini/v1beta/*path", post(handlers::handle_gemini))

So it also forwards non-generateContent Gemini endpoints — and their successful responses do not contain any of the keys above:

  • :countTokens → {"totalTokens": 1234, "totalBillableCharacters": 567}
  • :embedContent → {"embedding": {"values": [...]}}
  • :batchEmbedContents → {"embeddings": [...]}

Concrete failure path: Gemini CLI (and similar clients) call :countTokens before each generateContent. cc-switch receives the 200 from upstream, is_valid_response_body returns false, forward() raises ProxyError::UpstreamError { status: 200, ... }, the request is retried against every configured provider — all of which return the same shape — and the client eventually sees a hard failure on what was a perfectly valid response.

Two minimal-surface fixes:

  • Skip the discriminator check when the endpoint is not a generation endpoint (e.g. endpoint.contains("generateContent") || endpoint.contains("/messages") || endpoint.contains("/chat/completions") || endpoint.contains("/responses")), or
  • Extend the Gemini branch in is_valid_response_body to also accept totalTokens, embedding, or embeddings as keys.



Notes on prior Codex feedback

Both items are already resolved on this branch, looks good:

  • Gemini promptFeedback-only 200 is accepted (commit f7057aa, then refactored in 3a20666) — covered by valid_gemini_safety_blocked_response test.
  • ExtraInputsCache now evicts expired entries on read (extra_inputs_rectifier.rs:94-100).

Otherwise the rectifier module is well-scoped, the per-provider keying is correct, the pre-filter ordering sits cleanly before the existing body filter, and the discriminator-based body classifier is a nice tightening over the original key-set check.

NKU100 added 2 commits May 3, 2026 00:36
…per path

The hyper client does not auto-decompress responses, so gzip-compressed
200 bodies (e.g. from Xiaomi mimo API) were misidentified as invalid
by is_valid_response_body, triggering unnecessary retries. Add manual
decompression using response_processor::decompress_body before the
validity check, matching what the reqwest path already does.
…ection

Instead of whitelisting valid response shapes per protocol (Anthropic
type=="message", OpenAI object=="chat.completion", Gemini candidates/
promptFeedback, ...), detect the `error` key in JSON body. Any 200
response containing `error` is reclassified as UpstreamError; everything
else is trusted. This eliminates the need for per-protocol patches when
new endpoints or providers are added.
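The final rule is small enough to sketch in a few lines; a string map again stands in for the parsed serde_json object, and the names here are illustrative:

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum BodyClass {
    Success,
    UpstreamError,
}

// Hedged sketch of the final classification rule: trust any 200 body
// unless it carries a top-level `error` key, in which case it is
// reclassified so rectifiers and failover can run.
fn classify_200_body(obj: &HashMap<String, String>) -> BodyClass {
    if obj.contains_key("error") {
        BodyClass::UpstreamError
    } else {
        BodyClass::Success
    }
}

fn main() {
    let body = HashMap::from([("error".to_string(), "overloaded".to_string())]);
    println!("{:?}", classify_200_body(&body));
}
```

Because the rule keys only on `error`, a :countTokens or :embedContent body passes through untouched, which is what removes the need for per-protocol patches.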
NKU100 added 2 commits May 3, 2026 01:02
MiFE proxy may strip gzip wrapping but leave content-encoding: gzip
header intact. decompress_body then fails with "invalid gzip header",
and the misleading header is forwarded to clients causing ZlibError.

Check for gzip magic bytes (1f 8b) before attempting decompression.
If absent, return the body as-is with Ok, which triggers content-encoding
header stripping in read_decoded_body.
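The guard itself reduces to a check on the first two bytes (the gzip signature defined in RFC 1952); the helper name below is illustrative:

```rust
// Hedged sketch of the magic-byte guard: only attempt gzip decompression
// when the body starts with the gzip signature bytes 0x1f 0x8b. When a
// proxy has already stripped the gzip wrapper (while leaving
// content-encoding: gzip intact), the body is passed through unchanged and
// the caller can drop the misleading header.
fn looks_like_gzip(body: &[u8]) -> bool {
    body.len() >= 2 && body[0] == 0x1f && body[1] == 0x8b
}

fn main() {
    println!("{}", looks_like_gzip(&[0x1f, 0x8b, 0x08, 0x00]));
    println!("{}", looks_like_gzip(br#"{"ok":true}"#));
}
```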
…back strategy

The old description only mentioned matching "Extra inputs are not permitted"
errors. The actual implementation also falls back to checking Anthropic-specific
fields (context_management, anthropic_beta, output_config) in the request body.
Update zh/en/ja descriptions to reflect this broader detection behavior.

NKU100 commented May 2, 2026

Updated PR description to reflect the latest implementation changes since the original submission:

  1. 200+error detection: Simplified from protocol-specific whitelist (content/choices/output/candidates) to error key detection — no more per-protocol patches needed
  2. Gzip decompression: Added magic bytes guard (1f 8b) to handle proxies (e.g. Xiaomi MiFE) that strip gzip wrapper but leave content-encoding: gzip header, preventing ZlibError on downstream clients
  3. Extra Inputs rectifier description: Updated i18n text (zh/en/ja) to reflect the Anthropic-only fields fallback strategy

