refactor(openai): Split token counting by API for easier deprecation #5930
Draft
ericapisani wants to merge 5 commits into master
Conversation
…Responses API functions

Replace the shared `_calculate_token_usage()` and `_get_usage()` with two API-specific functions: `_calculate_completions_token_usage()` and `_calculate_responses_token_usage()`. This makes it clear which token fields belong to which API and enables clean removal of Chat Completions support when it is deprecated.

- Completions function extracts `prompt_tokens`, `completion_tokens`, and `total_tokens`, and supports `streaming_message_token_usage` for `stream_options={"include_usage": True}`
- Responses function extracts `input_tokens`, `output_tokens`, and `total_tokens`, plus `cached_tokens` and `reasoning_tokens` details
- Add API section comments in `_set_common_output_data`
- Update all call sites to use the appropriate API-specific function
- Convert Completions call sites to use keyword arguments
- Update and rename unit tests; add Responses API token usage tests
- Add sync and async streaming tests for usage-in-stream

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
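For orientation, here is a minimal sketch of the split described above. The function names and token fields come from the commit message; the bodies and span-attribute keys are assumptions, not the integration's actual code:

```python
# Illustrative sketch only; span keys and exact logic are assumptions.
def _calculate_completions_token_usage(response, span, streaming_message_token_usage=None):
    # Chat Completions / Embeddings fields: prompt_tokens, completion_tokens, total_tokens.
    # Usage collected from the final streamed chunk takes precedence.
    usage = streaming_message_token_usage or getattr(response, "usage", None)
    if usage is None:
        return
    span.set_data("gen_ai.usage.input_tokens", usage.prompt_tokens)
    span.set_data("gen_ai.usage.output_tokens", usage.completion_tokens)
    span.set_data("gen_ai.usage.total_tokens", usage.total_tokens)


def _calculate_responses_token_usage(response, span):
    # Responses API fields: input_tokens, output_tokens, total_tokens.
    usage = getattr(response, "usage", None)
    if usage is None:
        return
    span.set_data("gen_ai.usage.input_tokens", usage.input_tokens)
    span.set_data("gen_ai.usage.output_tokens", usage.output_tokens)
    span.set_data("gen_ai.usage.total_tokens", usage.total_tokens)
```

Keeping the two functions fully separate means each one can be read, tested, and eventually deleted without reasoning about the other API's fields.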
…n_usage call sites

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Contributor
Semver Impact of This PR: 🟢 Patch (bug fixes)

📋 Changelog Preview
This is how your changes will appear in the changelog.

New Features ✨

Internal Changes 🔧

🤖 This preview updates automatically when you update the PR.
Member
Author
bugbot run
Contributor
Codecov Results 📊

✅ 13 passed | Total: 13 | Pass Rate: 100% | Execution Time: 6.57s
All tests are passing successfully.

❌ Patch coverage is 0.00%. Project has 14826 uncovered lines.

Files with missing lines (1)

Generated by Codecov Action
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Response usage silently overrides streaming token usage
- Changed the second `if` statement to `elif` to ensure `streaming_message_token_usage` takes precedence over `response.usage`, preventing silent data loss.
Or push these changes by commenting:
@cursor push 6e65058f68
Preview (6e65058f68)
```diff
diff --git a/sentry_sdk/integrations/openai.py b/sentry_sdk/integrations/openai.py
--- a/sentry_sdk/integrations/openai.py
+++ b/sentry_sdk/integrations/openai.py
@@ -164,8 +164,7 @@
     if streaming_message_token_usage:
         usage = streaming_message_token_usage
-
-    if hasattr(response, "usage"):
+    elif hasattr(response, "usage"):
         usage = response.usage

     if usage is not None:
```

This Bugbot Autofix run was free. To enable autofix for future PRs, go to the Cursor dashboard.
… usage

The refactor that split `_calculate_token_usage` into separate Completions and Responses functions dropped extraction of `prompt_tokens_details.cached_tokens` and `completion_tokens_details.reasoning_tokens` from the Completions path. This restores those fields so spans for cached prompts and reasoning models (e.g. o1/o3) report complete token usage metrics.

Also fixes streaming usage priority: `streaming_message_token_usage` now correctly takes precedence over `response.usage` via `elif`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
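Illustratively, the restored detail extraction on the Completions path might look like the following. The usage attributes match the OpenAI SDK's usage objects, but the span keys are assumptions:

```python
# Sketch of the restored Completions detail fields (span keys assumed).
prompt_details = getattr(usage, "prompt_tokens_details", None)
if prompt_details is not None and getattr(prompt_details, "cached_tokens", None) is not None:
    span.set_data("gen_ai.usage.input_tokens.cached", prompt_details.cached_tokens)

completion_details = getattr(usage, "completion_tokens_details", None)
if completion_details is not None and getattr(completion_details, "reasoning_tokens", None) is not None:
    span.set_data("gen_ai.usage.output_tokens.reasoning", completion_details.reasoning_tokens)
```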
The test used `MagicMock(message="one")`, where `message` was a plain string, but the real OpenAI API returns `Choice` objects with `message.content`. The counting code checks `hasattr(choice.message, "content")`, which failed on strings, so manual token counting was never exercised. Use real `Choice` and `ChatCompletionMessage` objects and fix the expected `output_tokens`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
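A minimal sketch of the fixed test setup, assuming the `openai>=1.x` type imports; the surrounding test harness is omitted:

```python
from openai.types.chat import ChatCompletionMessage
from openai.types.chat.chat_completion import Choice

# Unlike MagicMock(message="one"), a real Choice exposes message.content,
# so hasattr(choice.message, "content") is True and manual counting runs.
choice = Choice(
    index=0,
    finish_reason="stop",
    message=ChatCompletionMessage(role="assistant", content="one"),
)
```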
Member
Author
bugbot run
The `stream_options` parameter was not available in early versions of the OpenAI Python SDK, causing `TypeError` on v1.0.1 CI runs.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
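One way a test can sidestep that, sketched under the assumption that `stream_options` landed around openai 1.26 (the exact cutoff is not verified here):

```python
import openai

kwargs = {"stream": True}
major, minor = (int(part) for part in openai.__version__.split(".")[:2])
if (major, minor) >= (1, 26):
    # Only newer SDKs accept stream_options; older ones raise TypeError.
    kwargs["stream_options"] = {"include_usage": True}
```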

- Replace the shared `_calculate_token_usage()` and `_get_usage()` with two API-specific functions: `_calculate_completions_token_usage()` (Chat Completions / Embeddings) and `_calculate_responses_token_usage()` (Responses API)
- Each function reads only its own API's usage fields instead of probing a shared list such as `["input_tokens", "prompt_tokens"]`
- Pass `streaming_message_token_usage` from `stream_options={"include_usage": True}` in streaming Completions
- Add comments in `_set_common_output_data` to clarify which branch handles which API

Motivation
When Chat Completions is deprecated, removing it should be a simple delete operation without auditing shared code. Before this change, `_calculate_token_usage` handled both APIs with interleaved logic, making it unclear what was safe to remove.

We also needed to ensure that, when handling streamed responses from the Chat Completions API, the `usage` in the final chunk of the message correctly made its way to the `_calculate_*_token_usage` method, as sketched below.
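As a rough illustration of that flow (the variable names are hypothetical, not the integration's actual code):

```python
# With stream_options={"include_usage": True}, only the final chunk carries usage.
streaming_message_token_usage = None
for chunk in stream:
    if getattr(chunk, "usage", None) is not None:
        streaming_message_token_usage = chunk.usage

# Once the stream is exhausted, hand it to the Completions token-usage function:
# _calculate_completions_token_usage(
#     response, span, streaming_message_token_usage=streaming_message_token_usage
# )
```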
Other design decisions to note

There is some deliberate duplication between the Completions API and Responses API token-counting code. This is to allow for easier removal of the Completions logic once it is fully deprecated and removed from the OpenAI API.