Background
Cherry Studio's Agent mode currently only works with models exposed through an Anthropic-format endpoint. Frontier non-Anthropic models — for example GPT-5.5, even when accessed through aggregator providers like CherryIN that already proxy them — cannot be used in Agent mode at all. Users who want to drive Agent with a non-Claude model end up running an external proxy (cc-switch / cc-nim) on the side just to translate request formats, which is fragile and hard to discover. The same friction shows up in Chat mode for aggregator providers that expose multiple endpoint types per model: users have to guess which endpoint to pick, and frequently hit "endpoint mismatch" errors.
Goal
Any supported model — including non-Anthropic models like GPT-5.5 — can be used in both Agent and Chat modes, with format conversion handled transparently inside Cherry Studio so users no longer need external proxies like cc-switch.
Spec
- Agent mode accepts any model the user has configured, regardless of the model's native API format.
- When the user's selected model exposes an Anthropic-format endpoint (e.g. via an aggregator), Agent uses that endpoint directly without conversion.
- When the model has no Anthropic-format endpoint available, Cherry Studio translates between the model's native format (OpenAI, Gemini) and the Anthropic format expected by Agent, transparently in both request and response paths.
- Format translation preserves streaming, tool-use, and structured-output behaviors — the user does not see degraded capabilities when running a non-Anthropic model in Agent.
- The same translation layer also applies in Chat mode for aggregator providers, so users do not have to manually pick an endpoint that matches the requesting mode.
- When a request genuinely cannot be honored (the model does not support the requested feature), the UI surfaces a clear, actionable message instead of leaking a raw upstream API error.
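A minimal sketch of what the request-path translation could look like. All type and function names here are hypothetical, not Cherry Studio's actual internals: it maps an OpenAI-style Chat Completions body onto the Anthropic Messages shape, lifting the system prompt to its top-level field.

```typescript
// Hypothetical sketch: OpenAI-style Chat Completions request ->
// Anthropic-style Messages request. Shapes are illustrative only.

interface OpenAIMessage { role: "system" | "user" | "assistant"; content: string }
interface OpenAIRequest {
  model: string;
  messages: OpenAIMessage[];
  max_tokens?: number;
  stream?: boolean;
}

interface AnthropicRequest {
  model: string;
  system?: string;
  messages: { role: "user" | "assistant"; content: string }[];
  max_tokens: number; // required by the Anthropic format
  stream?: boolean;
}

function toAnthropicRequest(req: OpenAIRequest): AnthropicRequest {
  // Anthropic carries the system prompt as a top-level field,
  // not as a message inside the conversation array.
  const systemParts = req.messages
    .filter((m) => m.role === "system")
    .map((m) => m.content);
  return {
    model: req.model,
    system: systemParts.length ? systemParts.join("\n") : undefined,
    messages: req.messages
      .filter(
        (m): m is OpenAIMessage & { role: "user" | "assistant" } =>
          m.role !== "system"
      )
      .map((m) => ({ role: m.role, content: m.content })),
    max_tokens: req.max_tokens ?? 4096, // fallback since OpenAI makes it optional
    stream: req.stream,
  };
}
```

The response path is the inverse mapping (Anthropic content blocks back into an OpenAI-style `choices` array), and tool definitions plus `tool_use`/`tool_result` blocks need the same bidirectional treatment.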
Verification
- Configure CherryIN provider, select GPT-5.5, switch to Agent mode → Agent runs end-to-end with tool calls and streaming working, no external cc-switch / cc-nim required.
- Configure CherryIN provider, select Claude Sonnet through the OpenAI-compatible endpoint, use in Chat mode → conversation works end-to-end with no "endpoint mismatch" error.
- Streaming a long Agent response from GPT-5.5 produces incremental chunks at the same cadence as a native Claude model in Agent (no perceived full-response buffering).
- Use Agent's tool-use with GPT-5.5 → tool calls and tool results both round-trip correctly; assistant final answer references the tool output.
- Add a Gemini provider and use a Gemini model in Agent mode → works end-to-end through translation.
- Disconnect upstream while a translated request is in-flight → user sees a clear "request failed, please retry" message, not an unhandled exception or stuck UI.
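The streaming-cadence check above can be illustrated with a hedged sketch of per-chunk translation (hypothetical names, and omitting `message_start`/`content_block_start` bookkeeping for brevity): each OpenAI-style delta is re-emitted immediately as an Anthropic-style event, so chunks flow through at the same cadence instead of being buffered into one full response.

```typescript
// Hypothetical sketch: translate one OpenAI-style streaming chunk into
// zero or more Anthropic-style stream events, without buffering.

interface OpenAIDeltaChunk {
  choices: { delta: { content?: string }; finish_reason: string | null }[];
}

type AnthropicEvent =
  | { type: "content_block_delta"; index: number; delta: { type: "text_delta"; text: string } }
  | { type: "message_stop" };

function translateChunk(chunk: OpenAIDeltaChunk): AnthropicEvent[] {
  const events: AnthropicEvent[] = [];
  const choice = chunk.choices[0];
  // Text delta -> content_block_delta, emitted as soon as it arrives.
  if (choice?.delta.content) {
    events.push({
      type: "content_block_delta",
      index: 0,
      delta: { type: "text_delta", text: choice.delta.content },
    });
  }
  // finish_reason -> message_stop terminates the translated stream.
  if (choice?.finish_reason) {
    events.push({ type: "message_stop" });
  }
  return events;
}
```

Because translation is chunk-by-chunk rather than whole-response, a GPT-5.5 stream should render incrementally just like a native Claude stream.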
Related
#13917 #13880 #13853 #13625 #12921 #12618 #12380 #14480