
test(etl): Wave 4E — real-data e2e + per-route latency regression#82

Merged
0bserver07 merged 1 commit into main from feat/etl-real-data-tests on May 6, 2026

Conversation

@0bserver07
Owner

Summary

Adds two slow-marker test files under a new tests/stackunderflow/integration/ package and registers a slow pytest marker (opt-in via pytest -m slow; default pytest tests/ -q keeps the existing 1474-test collection untouched).

  • test_etl_pipeline_e2e.py — builds a 10K-message synthetic store across 5 providers (claude/codex/cursor/gemini/cline) over 30 days × 20 projects, runs every registered Normalizer end-to-end, refreshes every mart, asserts cost conservation across daily/session/project/provider_day/model_day, then sweeps every dashboard route via FastAPI's TestClient asserting 200 + non-empty + <500ms.
  • test_route_perf_regression.py — parametrises every dashboard route against a pre-populated marts fixture (100K daily / 50K session / 1K project / 2K provider_day / 5K model_day rows) plus a small 1K-message set so aggregator-driven routes stay tight. Each route runs 1 warmup + 5 cold + 5 warm requests; max(warm) must beat the per-route budget.
  • pyproject.toml — new [tool.pytest.ini_options] section registers the slow marker and adds addopts = "-m 'not slow'" so the default suite skips the slow tests automatically. Run the new suite with pytest -m slow.
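The marker registration described in the last bullet could look roughly like this in pyproject.toml (a sketch reconstructed from the description above; the marker help text is an assumption, not copied from the diff):

```toml
[tool.pytest.ini_options]
markers = [
    "slow: long-running integration tests (opt in with `pytest -m slow`)",
]
# Deselect slow tests by default; an explicit `-m slow` on the
# command line overrides this addopts expression.
addopts = "-m 'not slow'"
```

With this in place, `pytest tests/ -q` deselects the slow files while `pytest -m slow` runs them, since a command-line `-m` takes precedence over the one injected via `addopts`.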

/api/etl/status is listed for forward compatibility — the route isn't implemented yet on main, so the test accepts a 404 (e2e) / pytest.skip (regression) until the endpoint lands.
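A 404-tolerant sweep of that kind might be implemented as follows (a minimal sketch; `FORWARD_COMPAT`, `check_route`, and the client wiring are hypothetical names, not the actual test code):

```python
import pytest

# Routes listed for forward compatibility: they may not exist on main yet.
FORWARD_COMPAT = {"/api/etl/status"}


def check_route(client, path, expect_nonempty=True):
    """Assert 200 + non-empty body, but skip on 404 for not-yet-implemented routes."""
    resp = client.get(path)
    if path in FORWARD_COMPAT and resp.status_code == 404:
        pytest.skip(f"{path} not implemented yet; accepting 404 until it lands")
    assert resp.status_code == 200
    if expect_nonempty:
        assert resp.content  # payload must be non-empty
```

The e2e variant would tolerate the 404 inline; the regression variant skips, so the placeholder shows up as `1 skipped` rather than a silent pass.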

Latency table (dev box, M-series, Python 3.12)

route                                cold(p50)  warm(p50)  warm(max)  budget
/api/projects?include_stats=true        5.8ms      5.8ms      6.2ms    100ms
/api/dashboard-data                     8.6ms      7.1ms      7.7ms    100ms
/api/cost-data?period=month            12.1ms     11.9ms     16.0ms    100ms
/api/cost-data/by-provider              1.4ms      1.1ms      2.6ms     50ms
/api/compare?period=month               1.7ms      1.7ms      1.8ms    100ms
/api/yield?period=week                  1.3ms      1.2ms      1.2ms    200ms
/api/optimize?period=month             81.9ms    100.7ms    153.2ms    200ms
/api/messages/summary                   1.8ms      1.6ms      1.6ms     50ms
/api/etl/status                                              (404 — route not yet implemented)
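The 1 warmup + 5 cold + 5 warm loop behind this table could be sketched as below (stdlib-only; `measure_route` and the `reset_fn` cache-clearing hook are assumptions standing in for the real fixtures):

```python
import statistics
import time


def measure_route(request_fn, reset_fn=None, warmup=1, cold_runs=5, warm_runs=5):
    """Time one route: cold runs reset state first, warm runs reuse it.

    request_fn: callable issuing one request, e.g. lambda: client.get(path)
    reset_fn:   optional callable clearing caches between cold runs
    Returns (cold_p50_ms, warm_p50_ms, warm_max_ms).
    """
    def timed():
        t0 = time.perf_counter()
        request_fn()
        return (time.perf_counter() - t0) * 1000.0

    for _ in range(warmup):
        request_fn()

    cold = []
    for _ in range(cold_runs):
        if reset_fn is not None:
            reset_fn()
        cold.append(timed())
    warm = [timed() for _ in range(warm_runs)]

    return statistics.median(cold), statistics.median(warm), max(warm)


# Per-route budget check: max(warm) must beat the budget, e.g.
#   _, _, warm_max = measure_route(lambda: client.get("/api/compare?period=month"))
#   assert warm_max < 100.0
```

Using max(warm) rather than a percentile makes the gate strict: a single slow warm request fails the budget, which is what you want from a regression fence on a quiet CI box.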

Test counts

  • Default pytest tests/ -q: 1472 passed, 2 skipped, 11 deselected (slow tests). Total collection 1474, matching baseline.
  • pytest -m slow tests/stackunderflow/integration -q: 10 passed, 1 skipped (the /api/etl/status placeholder).

Test plan

  • pytest -m slow tests/stackunderflow/integration -q — 10 passed + 1 skipped (etl_status placeholder).
  • pytest tests/ -q — 1472 passed + 2 skipped + 11 deselected (unchanged from baseline).
  • ruff check tests/stackunderflow/integration/ — all checks passed.
  • No source code touched (Wave 4E scope discipline) — only test files + pyproject.toml marker config + CHANGELOG.

🤖 Generated with Claude Code

Adds two slow-marker test files under a new
``tests/stackunderflow/integration/`` package:

* ``test_etl_pipeline_e2e.py`` — builds a 10K-message synthetic store
  across 5 providers (claude, codex, cursor, gemini, cline) over 30
  days × 20 projects, runs every registered Normalizer end-to-end,
  refreshes every mart, and asserts cost-conservation across all five
  marts. Then mounts the production routers behind a TestClient and
  hits every dashboard route asserting 200 + non-empty + <500ms.

* ``test_route_perf_regression.py`` — parametrises every dashboard
  route against a pre-populated synthetic marts fixture (100K daily,
  50K session, 1K project, 2K provider_day, 5K model_day rows) plus
  a small 1K-message set so aggregator-driven routes stay quick.
  Each route gets 1 warmup + 5 cold + 5 warm runs; max(warm) must
  beat the per-route budget. Prints a cold/warm/budget table to the
  log so future regressions can be calibrated from CI output alone.

Both files are gated on the new ``slow`` pytest marker registered in
``pyproject.toml``. Default ``pytest tests/ -q`` keeps its 1474-test
collection unchanged (11 slow tests deselected by ``addopts =
"-m 'not slow'"``); run the integration suite explicitly with
``pytest -m slow tests/stackunderflow/integration -q``.

``/api/etl/status`` is listed for forward compatibility — the route
isn't yet implemented in the current main, so the test accepts a 404
in lieu of a 200 (e2e) / pytest.skip (regression) until the route
lands. Latency table from a recent dev-box run:

  projects_with_stats               cold 5.8ms   warm 5.8ms   budget 100
  dashboard_data                    cold 8.6ms   warm 7.1ms   budget 100
  cost_data                         cold 12.1ms  warm 11.9ms  budget 100
  cost_data_by_provider             cold 1.4ms   warm 1.1ms   budget 50
  compare                           cold 1.7ms   warm 1.7ms   budget 100
  yield                             cold 1.3ms   warm 1.2ms   budget 200
  optimize                          cold 81.9ms  warm 100.7ms budget 200
  messages_summary                  cold 1.8ms   warm 1.6ms   budget 50

Synthetic stores live in ``tmp_path`` — the user's real
``~/.stackunderflow/store.db`` is never touched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@0bserver07 force-pushed the feat/etl-real-data-tests branch from 0b36218 to c34a329 on May 6, 2026 at 21:18
@0bserver07 merged commit 1e61554 into main on May 6, 2026
9 checks passed
@0bserver07 deleted the feat/etl-real-data-tests branch on May 6, 2026 at 21:20
0bserver07 added a commit that referenced this pull request on May 6, 2026:
Bumps to 0.7.0. Consolidates the [Unreleased] CHANGELOG entries from
the 11 ETL PRs (#72, #73, #74, #75, #76, #79, #81, #80, #78, #77, #82)
into a single [0.7.0] section.

New: docs/HANDOFF.md — state-of-the-codebase walkthrough for incoming
agents. Architecture map, recent history, key gotchas, what's left,
files-to-read-first.

End-state on the maintainer's real store:
  150,337 usage_events
  Marts populated and watermarks in sync
  Dashboard cold-load 2.5s → <50ms warm
  Watcher 155ms end-to-end source-file-write → dashboard-data-fresh

1598 backend tests passing, 2 skipped, 11 deselected (slow suite).
Frontend typecheck + build clean.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>