Commit d1837ad

ci: make build cache restore resilient to eviction (#16275)
# Overview

Downstream CI jobs fail when re-running individual jobs after the build cache has been evicted from GitHub's 10 GB repo-wide LRU cache. For example, [this run](https://github.com/payloadcms/payload/actions/runs/24259697364/job/71108809903?pr=15268#step:5:22) had 4 jobs fail at "Restore build" because the cache was evicted ~43 hours after the original build.

## Time Savings

This replaces the fixed 120s `sleep` propagation delay and hard-failing `fail-on-cache-miss: true` with a polling + fallback approach that self-heals on cache miss. **This change will save 2 mins per run.**

## Key Changes

- **New `restore-build` input on the setup action** - When `true`, the action polls for the build cache using `gh cache list` (10s intervals, 120s timeout), restores it if found, or falls back to `pnpm install && pnpm run build:all` if not. This replaces `pnpm-run-install: false`, `pnpm-restore-cache: false`, `cache-propagation-delay: 120`, and the separate `Restore build` step that every downstream job previously had.
- **Simplified downstream jobs** - All 8 downstream jobs (`tests-unit`, `tests-types`, `tests-int`, `e2e-prep`, `tests-e2e`, `build-and-test-templates`, `tests-type-generation`, `analyze`) go from ~10 lines of setup + cache restore boilerplate to `restore-build: true`.
- **Removed `cache-propagation-delay` input** - The fixed 120s sleep is no longer needed. Polling finds the cache as soon as it's available (typically instantly), and fallback handles the miss case.

## Design Decisions

**Polling over fixed sleep:** The old 120s delay was wasted time on the happy path (propagation is effectively instant) and didn't help when the cache was evicted entirely. Polling with `gh cache list` finds the cache as soon as it propagates, and the timeout triggers a fallback build instead of a hard failure.

**`gh cache list` over `actions/cache/restore` with `lookup-only`:** `lookup-only` is an input to a step-level action, which can't be called in a bash loop. `gh cache list` is a CLI command available on all runners, works in a loop, and requires no extra extensions. It uses an exact key match via a jq filter to avoid prefix-match false positives.

**Fallback builds with warm pnpm store:** The pnpm store cache is restored regardless of `restore-build` mode, so if the fallback build triggers, `pnpm install` has a warm store rather than starting cold.

**Error handling:** If `gh cache list` fails (rate limit, auth, network), the loop breaks immediately and falls back to a full build rather than silently polling for 120s.

## Overall Flow

```mermaid
sequenceDiagram
    participant B as Build Job
    participant C as GitHub Cache
    participant D as Downstream Job
    participant S as Setup Action
    B->>C: Save build cache (key: SHA)
    D->>S: restore-build: true
    S->>S: Restore pnpm store cache
    loop Every 10s (up to 120s)
        S->>C: gh cache list (exact key match)
        C-->>S: Found / Not found
    end
    alt Cache found
        S->>C: actions/cache/restore
        C-->>S: Build artifacts
    else Cache not found (evicted or timeout)
        S->>S: pnpm install && pnpm run build:all
    end
```

## References / Links

- [actions/cache#1710](https://github.com/actions/cache/issues/1710): original cache propagation delay issue
- [gh cache list docs](https://cli.github.com/manual/gh_cache_list)
- [actions/cache/restore README](https://github.com/actions/cache/blob/main/restore/README.md)
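The poll-then-fallback control flow described under Design Decisions can be sketched in isolation as a small bash function. This is a minimal sketch rather than the commit's implementation: `poll_for_cache`, `decide`, and the three `probe_*` stubs are hypothetical names, and the probe stands in for the real `gh cache list` lookup so the logic can run without network access. Unlike the step in the diff, it reports a lookup error (status 2) separately from a timeout (status 1).

```shell
#!/usr/bin/env bash
# Sketch: poll a probe command until it reports a hit, the timeout elapses,
# or the probe itself fails. All names here are hypothetical.
# Returns: 0 = cache found, 1 = timed out, 2 = probe failed.
poll_for_cache() {
  local probe="$1" interval="$2" timeout="$3"
  local elapsed=0 count
  while [ "$elapsed" -lt "$timeout" ]; do
    # The probe prints a match count and exits non-zero on lookup errors.
    if ! count=$("$probe"); then
      return 2   # stop polling immediately instead of waiting out the timeout
    fi
    if [ "$count" -gt 0 ]; then
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  return 1
}

# Stub probes standing in for the real cache lookup.
probe_hit()  { echo 1; }
probe_miss() { echo 0; }
probe_err()  { return 1; }

# Map the poll result onto the two paths: restore the cache or rebuild.
decide() {
  if poll_for_cache "$@"; then
    echo "restore"
  else
    echo "fallback (status $?)"
  fi
}

decide probe_hit 1 2    # prints: restore
decide probe_miss 1 2   # prints: fallback (status 1)
decide probe_err 1 2    # prints: fallback (status 2)
```

The interval must be at least 1 here, because the loop advances `elapsed` by the interval and would never terminate on a miss with an interval of 0.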
1 parent 1e1c591 commit d1837ad

2 files changed

Lines changed: 62 additions & 89 deletions

.github/actions/setup/action.yml

Lines changed: 54 additions & 9 deletions
@@ -15,9 +15,9 @@ inputs:
     default: 'true'
   pnpm-install-cache-key:
     description: The cache key override for the pnpm install cache
-  cache-propagation-delay:
-    description: Seconds to wait after setup for cache propagation (workaround for https://github.com/actions/cache/issues/1710)
-    default: '0'
+  restore-build:
+    description: Whether to restore the build cache (polls for availability, falls back to full build on miss)
+    default: 'false'

 outputs:
   pnpm-install-cache-key:
@@ -92,7 +92,7 @@ runs:
         echo "PNPM_INSTALL_CACHE_KEY=$PNPM_INSTALL_CACHE_KEY" >> $GITHUB_ENV

     - name: Restore pnpm install cache
-      if: ${{ inputs.pnpm-restore-cache == 'true' }}
+      if: ${{ inputs.pnpm-restore-cache == 'true' || inputs.restore-build == 'true' }}
       uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
       with:
         path: ${{ env.STORE_PATH }}
@@ -102,7 +102,7 @@ runs:
           pnpm-store-

     - name: Run pnpm install
-      if: ${{ inputs.pnpm-run-install == 'true' }}
+      if: ${{ inputs.pnpm-run-install == 'true' && inputs.restore-build != 'true' }}
       shell: bash
       run: pnpm install

@@ -112,9 +112,54 @@ runs:
       shell: bash
       id: compute-output

-    - name: Wait for cache propagation
-      if: ${{ inputs.cache-propagation-delay != '0' }}
+    - name: Poll for build cache
+      if: ${{ inputs.restore-build == 'true' }}
       shell: bash
+      env:
+        GH_TOKEN: ${{ github.token }}
       run: |
-        echo "Waiting ${{ inputs.cache-propagation-delay }}s for cache propagation..."
-        sleep ${{ inputs.cache-propagation-delay }}
+        CACHE_KEY="${{ github.sha }}"
+        POLL_INTERVAL=10
+        TIMEOUT=120
+        ELAPSED=0
+        CACHE_FOUND=false
+
+        echo "Polling for build cache key: $CACHE_KEY"
+
+        while [ $ELAPSED -lt $TIMEOUT ]; do
+          if ! RESULT=$(gh cache list --repo "$GITHUB_REPOSITORY" --key "$CACHE_KEY" --json key --jq "[.[] | select(.key == \"$CACHE_KEY\")] | length"); then
+            echo "::warning::gh cache list failed, falling back to full build"
+            break
+          fi
+          if [ "$RESULT" -gt 0 ]; then
+            echo "Cache found after ${ELAPSED}s"
+            CACHE_FOUND=true
+            break
+          fi
+          echo "Cache not found yet (${ELAPSED}s elapsed), retrying in ${POLL_INTERVAL}s..."
+          sleep $POLL_INTERVAL
+          ELAPSED=$((ELAPSED + POLL_INTERVAL))
+        done
+
+        if [ "$CACHE_FOUND" = "false" ]; then
+          echo "::warning::Build cache not found after ${TIMEOUT}s — will run full build as fallback"
+        fi
+
+        echo "CACHE_FOUND=$CACHE_FOUND" >> $GITHUB_ENV
+
+    - name: Restore build cache
+      if: ${{ inputs.restore-build == 'true' && env.CACHE_FOUND == 'true' }}
+      uses: actions/cache/restore@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
+      with:
+        path: ./*
+        key: ${{ github.sha }}
+
+    - name: Fallback — install and build
+      if: ${{ inputs.restore-build == 'true' && env.CACHE_FOUND != 'true' }}
+      shell: bash
+      run: |
+        echo "::warning::Build cache miss — running full build as fallback"
+        pnpm install
+        pnpm run build:all
+      env:
+        DO_NOT_TRACK: 1
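The poll step filters the `gh cache list` output for an exact key match because, as far as the gh manual describes it, `--key` filters by key prefix, so a key that merely starts with the commit SHA would also be listed. The same idea can be sketched without jq, assuming the keys arrive one per line (for example via `--json key --jq '.[].key'`). The function name `count_exact_key` and the sample keys are hypothetical and are not taken from the repository.

```shell
#!/usr/bin/env bash
# Sketch: count cache keys that equal the wanted key exactly.
# Assumption: keys are read from stdin, one per line.
count_exact_key() {
  local wanted="$1"
  # -F: treat the key as a fixed string, -x: the whole line must match,
  # -c: print the number of matching lines.
  # grep exits 1 when the count is 0, so that status is masked here.
  grep -Fxc -- "$wanted" || true
}

# Made-up listings: both would pass a prefix filter on "abc123".
with_build=$'abc123\nabc123-e2e-prep'
without_build=$'abc123-e2e-prep\nabc123-templates'

printf '%s\n' "$with_build" | count_exact_key abc123      # prints 1
printf '%s\n' "$without_build" | count_exact_key abc123   # prints 0
```

In the second listing a prefix-only check would report a hit even though the build cache itself is missing, which is the false positive the exact match avoids.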

.github/workflows/main.yml

Lines changed: 8 additions & 80 deletions
@@ -103,16 +103,7 @@ jobs:
       - name: Node setup
         uses: ./.github/actions/setup
         with:
-          pnpm-run-install: false
-          pnpm-restore-cache: false # Full build is restored below
-          cache-propagation-delay: 120 # https://github.com/actions/cache/issues/1710
-
-      - name: Restore build
-        uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
-        with:
-          path: ./*
-          key: ${{ github.sha }}
-          fail-on-cache-miss: true
+          restore-build: true

       - name: Unit Tests
         run: pnpm test:unit
@@ -129,16 +120,7 @@ jobs:
       - name: Node setup
         uses: ./.github/actions/setup
         with:
-          pnpm-run-install: false
-          pnpm-restore-cache: false # Full build is restored below
-          cache-propagation-delay: 120 # https://github.com/actions/cache/issues/1710
-
-      - name: Restore build
-        uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
-        with:
-          path: ./*
-          key: ${{ github.sha }}
-          fail-on-cache-miss: true
+          restore-build: true

       - name: Types Tests
         run: pnpm test:types --target '>=5.7'
@@ -198,16 +180,7 @@ jobs:
       - name: Node setup
         uses: ./.github/actions/setup
         with:
-          pnpm-run-install: false
-          pnpm-restore-cache: false # Full build is restored below
-          cache-propagation-delay: 120 # https://github.com/actions/cache/issues/1710
-
-      - name: Restore build
-        uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
-        with:
-          path: ./*
-          key: ${{ github.sha }}
-          fail-on-cache-miss: true
+          restore-build: true

       - name: Start LocalStack
         run: pnpm docker:start
@@ -266,16 +239,7 @@ jobs:
       - name: Node setup
         uses: ./.github/actions/setup
         with:
-          pnpm-run-install: false
-          pnpm-restore-cache: false
-          cache-propagation-delay: 120 # https://github.com/actions/cache/issues/1710
-
-      - name: Restore build
-        uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
-        with:
-          path: ./*
-          key: ${{ github.sha }}
-          fail-on-cache-miss: true
+          restore-build: true

       - name: Prepare prod test environment
         run: pnpm prepare-run-test-against-prod:ci
@@ -307,16 +271,7 @@ jobs:
       - name: Node setup
         uses: ./.github/actions/setup
         with:
-          pnpm-run-install: false
-          pnpm-restore-cache: false # Full build is restored below
-          cache-propagation-delay: 120 # https://github.com/actions/cache/issues/1710
-
-      - name: Restore build
-        uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
-        with:
-          path: ./*
-          key: ${{ github.sha }}
-          fail-on-cache-miss: true
+          restore-build: true

       - name: Restore prepared test environment
         uses: actions/cache/restore@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
@@ -447,16 +402,7 @@ jobs:
       - name: Node setup
         uses: ./.github/actions/setup
         with:
-          pnpm-run-install: false
-          pnpm-restore-cache: false # Full build is restored below
-          cache-propagation-delay: 120 # https://github.com/actions/cache/issues/1710
-
-      - name: Restore build
-        uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
-        with:
-          path: ./*
-          key: ${{ github.sha }}
-          fail-on-cache-miss: true
+          restore-build: true

       - name: Start database
         id: db
@@ -522,16 +468,7 @@ jobs:
       - name: Node setup
         uses: ./.github/actions/setup
         with:
-          pnpm-run-install: false
-          pnpm-restore-cache: false # Full build is restored below
-          cache-propagation-delay: 120 # https://github.com/actions/cache/issues/1710
-
-      - name: Restore build
-        uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
-        with:
-          path: ./*
-          key: ${{ github.sha }}
-          fail-on-cache-miss: true
+          restore-build: true

       - name: Generate Payload Types
         run: pnpm dev:generate-types fields
@@ -585,16 +522,7 @@ jobs:
       - name: Node setup
         uses: ./.github/actions/setup
         with:
-          pnpm-run-install: false
-          pnpm-restore-cache: false # Full build is restored below
-          cache-propagation-delay: 120 # https://github.com/actions/cache/issues/1710
-
-      - name: Restore build
-        uses: actions/cache@27d5ce7f107fe9357f9df03efb73ab90386fccae # v5.0.5
-        with:
-          path: ./*
-          key: ${{ github.sha }}
-          fail-on-cache-miss: true
+          restore-build: true

       - run: pnpm run build:bundle-for-analysis # Esbuild packages that haven't already been built in the build step for the purpose of analyzing bundle size
         env:
