fix: swap betterproto for grpc-requests on snapshot fetches to reduce CPU by braddf · Pull Request #250 · openclimatefix/quartz-api

braddf · 2026-03-27T23:13:04Z

Pull Request

Description

Reduces CPU and serialisation overhead on the /forecast/all/ endpoint, which fetches data for all 331 GSPs, and tightens which request patterns the endpoint allows.

gRPC client swap (original PoC — h/t @peterdudfield)

betterproto is pure Python, so deserialising 331 GSPs per snapshot was hammering the CPU. grpc-requests uses the protobuf C extension, does the same work in native C, and releases the GIL so the event loop stays free during the cache warm. CPU during cache warming dropped noticeably (~1% locally).

Serialisation

Swapped jsonable_encoder + json.dumps for Pydantic's TypeAdapter.dump_json, which uses the Rust-based serialiser.

Sync client extended (from #255)

The sync grpc-requests path now also covers GetForecastAsTimeseries, used by /{gsp_id}/forecast. Tests added with mocking.

Cache/routing logic

Default (no gsp_ids, default timestamps): served from warm cache only. Cache miss triggers a background warm and returns 503 + Retry-After: 60.
Custom gsp_ids or non-default timestamps: bypasses cache and hits the backend live.
Timestamp validation added when gsp_ids is not set.

Response ordering

compact=false response is now sorted by gsp_id ascending (sort happens at build time, before caching).
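
The cache/routing rules and the build-time sort described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `RouteDecision`, `decide_route` and `sort_rows` are hypothetical names, the cache state is reduced to a boolean, and the timestamp validation step is left out because its exact rule is not spelled out in the description.

```python
from dataclasses import dataclass, field


@dataclass
class RouteDecision:
    """Hypothetical summary of how a /forecast/all/ request would be served."""
    source: str                      # "cache", "live" or "unavailable"
    status_code: int
    headers: dict[str, str] = field(default_factory=dict)
    warm_in_background: bool = False


def decide_route(gsp_ids, is_default_window: bool, cache_is_warm: bool) -> RouteDecision:
    # Custom gsp_ids or non-default timestamps bypass the cache entirely.
    if gsp_ids is not None or not is_default_window:
        return RouteDecision(source="live", status_code=200)
    # Default request: only ever served from the warm cache.
    if cache_is_warm:
        return RouteDecision(source="cache", status_code=200)
    # Cache miss: kick off a background warm and ask the client to retry.
    return RouteDecision(
        source="unavailable",
        status_code=503,
        headers={"Retry-After": "60"},
        warm_in_background=True,
    )


def sort_rows(rows: list[dict]) -> list[dict]:
    # Sort once when the response is built, so the cached copy is already ordered.
    return sorted(rows, key=lambda row: row["gsp_id"])


miss = decide_route(gsp_ids=None, is_default_window=True, cache_is_warm=False)
print(miss.status_code, miss.headers)
```

Sorting before caching means the cost is paid once per warm rather than once per request.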

Default factories

default_now_window_start() and default_window_end() extracted as named, exportable functions so they can be called directly in route handler logic for default comparisons.

Helps with https://github.com/openclimatefix/client-private/issues/294

How Has This Been Tested?

Please describe the tests that you ran to verify your changes. Provide instructions so we can reproduce.
Please also list any relevant details for your test configuration

Local API against dev DP

If your changes affect data processing, have you plotted any changes? i.e. have you done a quick sanity check?

Yes

Checklist:

My code follows OCF's coding style guidelines
I have performed a self-review of my own code
I have made corresponding changes to the documentation
I have added tests that prove my fix is effective or that my feature works
I have checked my code and corrected any misspellings

devsjc

Nice Brad! You've beaten me to it.

devsjc · 2026-03-29T08:01:21Z

+                created_timestamp=dt.datetime.fromisoformat(v["created_timestamp_utc"].rstrip("Z")).replace(tzinfo=dt.UTC),
+                init_timestamp=dt.datetime.fromisoformat(v["initialization_timestamp_utc"].rstrip("Z")).replace(tzinfo=dt.UTC),


Surely there's no need for the rstrip -> replace here? Does fromisoformat not handle timezones??

Yeah I know, it was having trouble with this for some reason! Weird error, but please fiddle if you did fancy, otherwise I can have a quick look tomo

dt.datetime.fromisoformat(resp["timestamp_utc"]) seems to work for me

Nice, in that case much nicer without these hacky string functions! 👍

I've put a fix in #253

devsjc · 2026-03-29T08:02:34Z

+        if forecaster_version is None:
+            resp = svc.ListForecasters({
+                "forecaster_names_filter": [forecaster_name],
+                "latest_versions_only": True,
+            })
+            forecaster_name = resp["forecasters"][0]["forecaster_name"]
+            forecaster_version = resp["forecasters"][0]["forecaster_version"]


Could save yourself some lines here and make the version non-optional. We always know it now - especially in the forecast/all call which is where this is being used.

Ah okay good to know, just trying to stay quite defensive atm but if safe to remove then even better

peterdudfield · 2026-03-29T13:14:50Z

FYI I had to remake uv.lock, this might help CI

peterdudfield · 2026-03-29T14:59:37Z

    ) -> list[models.PredictedGenerationValue]:
+        if self._sync_client is not None:
+            loop = asyncio.get_running_loop()
+            return await loop.run_in_executor(


I think you can just run return self._sync_snapshot(....), unless I've missed something

Ah I was leaving in the default branch as a fallback, but maybe this is overkill – I assume the gRPC server should always have reflection if it does atm

Sorry, I meant you don't need loop = asyncio.get_running_loop(), you can just run self._sync_snapshot(....)
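
One thing to weigh when choosing between the two forms: calling a blocking function directly inside an `async def` holds the event loop for the duration of the call, whereas `run_in_executor` hands it to a thread. The sketch below illustrates the difference with a stand-in for the sync call; `sync_snapshot` and `count_ticks` are hypothetical names, and `time.sleep` merely simulates a blocking request.

```python
import asyncio
import time


def sync_snapshot() -> list[int]:
    """Stand-in for a blocking sync client call (hypothetical)."""
    time.sleep(0.2)
    return [1, 2, 3]


async def count_ticks(mode: str) -> tuple[list[int], int]:
    """Run sync_snapshot and count how often another task got to run meanwhile."""
    ticks = 0

    async def ticker() -> None:
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(ticker())
    await asyncio.sleep(0)  # let the ticker start before the call
    if mode == "direct":
        result = sync_snapshot()  # blocks the event loop until it returns
    else:
        loop = asyncio.get_running_loop()
        result = await loop.run_in_executor(None, sync_snapshot)
    observed = ticks
    task.cancel()
    return result, observed


direct_result, direct_ticks = asyncio.run(count_ticks("direct"))
executor_result, executor_ticks = asyncio.run(count_ticks("executor"))
print(direct_ticks, executor_ticks)
```

On Python 3.9+, `await asyncio.to_thread(sync_snapshot)` is a shorter way to get the executor behaviour without fetching the loop explicitly.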

* remake uv.lock, remove strip and timezone
* lint
* add jinja2
* TDD: add tests
* dont use cache if start and end is the same
* add params back in
* add jinja2
* lint
* also use sync for `GetForecastAsTimeseries`
* remove code and tidy
* dont use real host or port
* lint
* add mocking

devsjc

The sad thing about all this is that we lose the nice type hinting / IDE awareness around changes to the RPC schema. But if that's the cost of lower CPU, so be it!

devsjc · 2026-03-30T13:47:04Z

            )
            client = dp.DataPlatformDataServiceStub(channel=grpc_channel)
            storage = DataPlatformStorage.from_dp(dp_client=client)
+            storage.set_sync_client(


I'd imagine the type hinter might complain about this since it isn't a function on the generic interface, but I don't think it's worth quibbling over now.

devsjc · 2026-03-30T13:47:22Z

  host_url = "host-url-not-set"
  host_url = ${?HOST_URL} 
-  workers = 2
+  workers = 1


This feels a shame but I understand the cache thing is a separate issue

I'll make an issue

Is there any meaningful overhead to leaving gunicorn in without actually using (multiple) workers do you know @devsjc?

peterdudfield · 2026-03-30T14:45:01Z

+
+    if gsp_ids is None and start_datetime_utc != end_datetime_utc:
+            if start_datetime_utc_set or end_datetime_utc_set:
+                raise HTTPException(


braddf added 4 commits March 27, 2026 17:18

fix(forecast-all): swap json encoding to use Rust-based adapter

758b694

fix(forecast-all): PoC grpc-requests sync client

c6913fe

chore(forecast-all): ruff ignores

767929e

chore(forecast-all): lint

0875988

braddf requested review from devsjc and peterdudfield March 27, 2026 23:13

Merge branch 'main' into fix/forecast-all-reduce-cpu

2acb729

devsjc reviewed Mar 29, 2026

This was referenced Mar 29, 2026

Fix/forecast all reduce cpu pd #253

Closed

SYNC on gsp/Forecast/all and gsp/{gsp_id}/forecast - DRAFT #255

Merged

peterdudfield reviewed Mar 29, 2026

chore(uv): fix uv.lock merge issue

7366432

peterdudfield reviewed Mar 30, 2026

Comment thread src/quartz_api/cmd/server.conf

peterdudfield mentioned this pull request Mar 30, 2026

Add forecast all one timestamp back in #254

Closed

7 tasks

devsjc approved these changes Mar 30, 2026

peterdudfield mentioned this pull request Mar 30, 2026

Upgrade Cache #259

Open

braddf added 2 commits March 30, 2026 15:23

fix(forecast-all): add timestamp validation when gsp_ids not set

dc685c2

chore(forecast-all): lint fixes

f1d32ff

peterdudfield reviewed Mar 30, 2026

Comment thread src/quartz_api/internal/service/uk_national/gsp_router.py

peterdudfield reviewed Mar 30, 2026

braddf added 2 commits March 30, 2026 15:49

fix(forecast-all): use default factories for start/end

ec8e2f3

fix(forecast-all): sort gsp compact=false response

4c545c7

peterdudfield approved these changes Mar 30, 2026

braddf merged commit d137a3d into main Mar 30, 2026
8 checks passed

braddf deleted the fix/forecast-all-reduce-cpu branch March 30, 2026 15:07

3 participants
