Implement streaming response optimization for non-Next.js publisher proxy #563

@aram356

Context

The publisher proxy currently buffers the entire response body in memory before sending any bytes to the client. For a 222 KB HTML page, peak memory is roughly 4x the response size, and no bytes reach the client until all processing completes.

Performance results (staging vs production, median over 5 runs, Chrome 1440x900)

| Metric | Production (v135, buffered) | Staging (v136, streaming) | Delta |
| --- | --- | --- | --- |
| TTFB | 54 ms | 35 ms | -19 ms (-35%) |
| First Paint | 186 ms | 160 ms | -26 ms (-14%) |
| First Contentful Paint | 186 ms | 160 ms | -26 ms (-14%) |
| DOM Content Loaded | 286 ms | 282 ms | -4 ms (~same) |
| DOM Complete | 1060 ms | 663 ms | -397 ms (-37%) |

Measured on getpurpose.ai. Production (v135) buffers the entire response before sending. Staging (v136) streams processed chunks incrementally via StreamingBody.

Spec

See streaming response design spec (PR #562).

Plan

See implementation plan (PR #562).

Phase 1: Make streaming pipeline chunk-emitting (PR #583)

Ships independently with immediate memory savings.
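
To illustrate what "chunk-emitting" could mean here, the following is a minimal sketch, not the code from PR #583. It assumes the pipeline can be modeled as a reader, a per-chunk transform, and a writer. The names `process_chunked` and `CHUNK_SIZE`, and the 8 KiB value, are hypothetical and do not come from the issue. Each processed chunk is written out immediately, so memory stays bounded by the chunk buffer rather than the body size.

```rust
use std::io::{self, Read, Write};

/// Hypothetical chunk size; the issue does not state the real value.
const CHUNK_SIZE: usize = 8 * 1024;

/// Reads `input` in fixed-size chunks, transforms each chunk, and writes the
/// result to `output` right away instead of accumulating the whole body.
/// Returns the total number of bytes written.
pub fn process_chunked<R, W, F>(mut input: R, mut output: W, mut transform: F) -> io::Result<u64>
where
    R: Read,
    W: Write,
    F: FnMut(&[u8]) -> Vec<u8>,
{
    let mut buf = [0u8; CHUNK_SIZE];
    let mut written = 0u64;
    loop {
        let n = match input.read(&mut buf) {
            Ok(0) => break, // end of the upstream body
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        // Only this chunk and its transformed copy are alive at any time.
        let out = transform(&buf[..n]);
        output.write_all(&out)?;
        written += out.len() as u64;
    }
    output.flush()?;
    Ok(written)
}
```

A per-chunk transform like this is only safe when no pattern of interest can straddle a chunk boundary, which is the concern Phase 3 addresses.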

Phase 2: Stream responses to client via StreamingBody (PR #585)

Depends on Phase 1. Improves TTFB and TTLB.

Phase 3: Make script rewriters fragment-safe (PR #591)

Depends on Phase 2. Removes the buffered fallback, enabling full streaming even with GTM/NextJS script rewriters active.
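
One common way to make a rewriter fragment-safe is to hold back a short tail of each chunk in case it is the start of a match that completes in the next chunk. The sketch below shows that general technique with a plain byte-string replacement. It is an assumption about the approach, not the rewriters from PR #591, and `FragmentSafeReplacer` and its methods are hypothetical names.

```rust
/// Streaming find-and-replace that tolerates a match split across chunks.
/// Hypothetical type for illustration only.
pub struct FragmentSafeReplacer {
    needle: Vec<u8>,
    replacement: Vec<u8>,
    /// Bytes held back because they may be the start of a split match.
    carry: Vec<u8>,
}

impl FragmentSafeReplacer {
    pub fn new(needle: &str, replacement: &str) -> Self {
        assert!(!needle.is_empty(), "needle must not be empty");
        Self {
            needle: needle.as_bytes().to_vec(),
            replacement: replacement.as_bytes().to_vec(),
            carry: Vec::new(),
        }
    }

    /// Feeds one chunk and returns the bytes that are safe to emit now.
    pub fn push(&mut self, chunk: &[u8]) -> Vec<u8> {
        // Prepend whatever was held back from the previous chunk.
        self.carry.extend_from_slice(chunk);
        let data = std::mem::take(&mut self.carry);
        let mut out = Vec::with_capacity(data.len());
        let mut i = 0;
        // Scan every position where a complete match could still fit.
        while i + self.needle.len() <= data.len() {
            if data[i..].starts_with(&self.needle) {
                out.extend_from_slice(&self.replacement);
                i += self.needle.len();
            } else {
                out.push(data[i]);
                i += 1;
            }
        }
        // The tail is shorter than the needle, so it might be a partial match.
        self.carry.extend_from_slice(&data[i..]);
        out
    }

    /// Flushes the held-back tail once the upstream body has ended.
    pub fn finish(&mut self) -> Vec<u8> {
        std::mem::take(&mut self.carry)
    }
}
```

The held-back tail is at most one byte shorter than the needle, so the extra state per request stays constant regardless of body size.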

Phase 4: Stream binary pass-through responses

Depends on Phase 2. Non-processable content (images, fonts, video) is currently buffered in memory unnecessarily. Phase 4 streams it directly via io::copy into StreamingBody.
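
As a rough sketch of the pass-through path, `std::io::copy` moves bytes from a reader to a writer through a fixed-size internal buffer. The example below uses generic `Read` and `Write` parameters as stand-ins. It assumes the upstream body implements `Read` and that `StreamingBody` implements `Write`, which `io::copy` requires. The function name `pass_through` is hypothetical.

```rust
use std::io::{self, Read, Write};

/// Copies an upstream body to the client without inspecting or rewriting it.
/// `upstream` and `client` are generic stand-ins for the real body types.
pub fn pass_through<R: Read, W: Write>(mut upstream: R, mut client: W) -> io::Result<u64> {
    // io::copy loops over a fixed-size buffer, so memory use does not grow
    // with the size of the image, font, or video being relayed.
    let copied = io::copy(&mut upstream, &mut client)?;
    client.flush()?;
    Ok(copied)
}
```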

Acceptance Criteria

  • Streaming activates for all 2xx responses (text and binary)
  • Peak memory per request reduced from ~4x to constant (chunk buffer + parser state)
  • Client receives first body bytes after first processed chunk, not after full buffering
  • No regressions on static, auction, or discovery endpoints
  • Buffered fallback for HTML with post-processors and non-2xx error pages
  • Script rewriters (GTM, NextJS) work correctly under streaming fragmentation
  • Binary responses (images, fonts) stream via pass-through without processing overhead
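
The criteria above imply a per-response choice between streaming with processing, streaming pass-through, and the buffered fallback. One possible reading is sketched below. The enum, the function name, and the two boolean inputs are hypothetical, and how the proxy actually classifies a response as processable is not specified in this issue.

```rust
/// How a response body is delivered to the client (hypothetical names).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum DeliveryMode {
    /// Processed chunk by chunk and streamed.
    StreamProcessed,
    /// Streamed untouched (images, fonts, video).
    StreamPassThrough,
    /// Fully buffered before sending.
    Buffered,
}

/// Picks a delivery mode from the response status and two assumed flags.
pub fn choose_mode(status: u16, processable: bool, has_post_processors: bool) -> DeliveryMode {
    // Non-2xx error pages keep the buffered fallback.
    if !(200..300).contains(&status) {
        return DeliveryMode::Buffered;
    }
    // Binary content is relayed without processing overhead.
    if !processable {
        return DeliveryMode::StreamPassThrough;
    }
    // HTML with post-processors still needs the full document.
    if has_post_processors {
        DeliveryMode::Buffered
    } else {
        DeliveryMode::StreamProcessed
    }
}
```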
