Skip to content

Commit 7e5e69c

Browse files
authored
fix(tests): repair test_streaming_model so all 28 tests run and pass (#334)
Four pre-existing bugs left this entire test file unrunnable on main (4 failures + 24 errors); fixing them here so the suite actually exercises TemporalStreamingModel and protects against regressions. Bug 1 (24 errors): `conftest.py` defines fixture `mock_adk_streaming` (no underscore) but every test in TestStreamingModelSettings and TestStreamingModelTools requested it as `_mock_adk_streaming`, so pytest failed to resolve the fixture before the body ever ran. The fixture is ``autouse=True`` and the param value was never used in any test body, so the parameter was vestigial — replaced with `_streaming_context_vars`, which provides the ContextVar setup these tests now actually need. Bug 2 (4 failures): `TemporalStreamingModel.get_response()` reads `task_id`, `trace_id`, and `parent_span_id` from ContextVars populated by `ContextInterceptor` from request headers in real Temporal flows. Tests had been passing `task_id=...` as a kwarg, which is silently swallowed by `**kwargs` and ignored, so all three ContextVars stayed at their defaults and the validation at the top of `get_response` raised before any work happened. New `_streaming_context_vars` fixture in conftest sets all three vars (and resets them on teardown), simulating what `ContextInterceptor` does in production. Bug 3 (test_computer_tool): A recent commit narrowed `ComputerTool` serialization to require an actual `Computer`/`AsyncComputer` instance, but `sample_computer_tool` still built a bare `MagicMock`. Switched to `MagicMock(spec=Computer)` so the production isinstance check passes. Bug 4 (3 streaming-context tests): The 3 tests in TestStreamingModelBasics that assert on `streaming_task_message_context` calls built event sequences with raw `MagicMock(type="...")`. Production dispatches via `isinstance(event, ResponseOutputItemAddedEvent)` etc., which `MagicMock` without `spec` never satisfies, so dispatch was silently skipped and the assertions failed. Switched to `MagicMock(spec=...)` for each event type — passes isinstance without triggering pydantic validation on the event's required fields. Also fixed `test_task_id_threading` which had been asserting against a hardcoded `task_id="test_task_12345"` that was never actually threaded anywhere (the kwarg was ignored, just like in Bug 2); it now asserts against the value yielded by the fixture, which is the value production reads from the ContextVar. After all four fixes: 28/28 pass, ruff clean, pyright clean.
1 parent e1effb5 commit 7e5e69c

2 files changed

Lines changed: 115 additions & 55 deletions

File tree

src/agentex/lib/core/temporal/plugins/openai_agents/tests/conftest.py

Lines changed: 36 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -21,6 +21,7 @@
2121
CodeInterpreterTool,
2222
ImageGenerationTool,
2323
)
24+
from agents.computer import Computer
2425
from agents.model_settings import Reasoning # type: ignore[attr-defined]
2526
from openai.types.responses import (
2627
ResponseCompletedEvent,
@@ -47,6 +48,34 @@ def sample_task_id():
4748
return f"task_{uuid.uuid4().hex[:8]}"
4849

4950

51+
@pytest.fixture
52+
def _streaming_context_vars(sample_task_id):
53+
"""Populate the streaming ContextVars that ContextInterceptor sets from
54+
request headers in real Temporal flows. TemporalStreamingModel.get_response()
55+
validates that all three are set before doing any work, so any test that
56+
calls get_response() must request this fixture.
57+
58+
Named with a leading underscore so tests can request it purely for its
59+
setup/teardown side effects without ruff flagging it as an unused argument
60+
(ARG002). The yielded value is the task_id set on the ContextVar, available
61+
for tests that need to assert against it.
62+
"""
63+
from agentex.lib.core.temporal.plugins.openai_agents.interceptors.context_interceptor import (
64+
streaming_task_id,
65+
streaming_trace_id,
66+
streaming_parent_span_id,
67+
)
68+
task_token = streaming_task_id.set(sample_task_id)
69+
trace_token = streaming_trace_id.set("test-trace-id")
70+
span_token = streaming_parent_span_id.set("test-parent-span-id")
71+
try:
72+
yield sample_task_id
73+
finally:
74+
streaming_task_id.reset(task_token)
75+
streaming_trace_id.reset(trace_token)
76+
streaming_parent_span_id.reset(span_token)
77+
78+
5079
@pytest.fixture
5180
def mock_streaming_context():
5281
"""Mock streaming context for testing"""
@@ -115,8 +144,13 @@ def sample_file_search_tool():
115144

116145
@pytest.fixture
117146
def sample_computer_tool():
118-
"""Sample ComputerTool for testing"""
119-
computer = MagicMock()
147+
"""Sample ComputerTool for testing.
148+
149+
Production validates ``isinstance(computer, (Computer, AsyncComputer))`` for
150+
Responses API serialization, so the mock must be ``spec``-bound to
151+
``Computer`` for the isinstance check to pass.
152+
"""
153+
computer = MagicMock(spec=Computer)
120154
computer.environment = "desktop"
121155
computer.dimensions = [1920, 1080]
122156
return ComputerTool(computer=computer)

0 commit comments

Comments
 (0)