GSD 2 uses a three-stage promotion pipeline that automatically moves merged PRs through Dev → Test → Prod environments using npm dist-tags.
PR merged to main
│
▼
┌─────────┐ ci.yml passes (build, test, typecheck)
│ DEV │ → publishes gsd-pi@<version>-dev.<sha> with @dev tag
└────┬────┘
▼ (automatic if green)
┌─────────┐ CLI smoke tests + LLM fixture replay
│ TEST │ → promotes to @next tag
└────┬────┘ → pushes Docker image as :next
▼ (manual approval required)
┌─────────┐ optional real-LLM integration tests
│ PROD │ → promotes to @latest tag
└─────────┘ → creates GitHub Release
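The stage-to-tag mapping in the diagram can be written down as a small lookup. The following is a minimal TypeScript sketch for illustration, not the contents of `pipeline.yml`; the names `Stage`, `STAGES`, `nextStage`, and `devVersion` are hypothetical.

```typescript
// Hypothetical model of the three stages; all names are illustrative only.
type Stage = "dev" | "test" | "prod";

interface StageInfo {
  distTag: string; // npm dist-tag the stage publishes or promotes to
  manualApproval: boolean; // whether a reviewer must approve entry to this stage
}

const STAGES: Record<Stage, StageInfo> = {
  dev: { distTag: "dev", manualApproval: false },
  test: { distTag: "next", manualApproval: false },
  prod: { distTag: "latest", manualApproval: true },
};

// Order in which a build moves through the pipeline.
const ORDER: Stage[] = ["dev", "test", "prod"];

// Returns the stage that follows `stage`, or null when `stage` is the last one.
function nextStage(stage: Stage): Stage | null {
  const next = ORDER[ORDER.indexOf(stage) + 1];
  return next ?? null;
}

// Builds a dev version string in the <version>-dev.<sha> format from the diagram,
// assuming the short SHA is the first 7 characters of the commit SHA.
function devVersion(baseVersion: string, commitSha: string): string {
  return `${baseVersion}-dev.${commitSha.slice(0, 7)}`;
}
```

For example, `devVersion("2.27.0", "a3f2c1b9e0d4")` yields `2.27.0-dev.a3f2c1b`, and `nextStage("test")` yields `"prod"`, the only stage in this sketch with `manualApproval` set.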
Every merged PR is immediately installable:
# Latest dev build (bleeding edge, every merged PR)
npx gsd-pi@dev
# Test candidate (passed smoke + fixture tests)
npx gsd-pi@next
# Stable production release
npx gsd-pi@latest # or just: npx gsd-pi

# Test candidate
docker run --rm -v $(pwd):/workspace ghcr.io/gsd-build/gsd-pi:next --version
# Stable
docker run --rm -v $(pwd):/workspace ghcr.io/gsd-build/gsd-pi:latest --version

To check which stage a merged PR has reached:
- Find the PR's merge commit SHA (first 7 chars)
- Check if it's in `@dev`: `npm view gsd-pi@dev version`
  - If the version ends in `-dev.<your-sha>`, your PR is in dev
- Check if it promoted to `@next`: `npm view gsd-pi@next version`
- Check if it's in production: `npm view gsd-pi@latest version`
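The suffix check in the steps above can be scripted. This is a minimal sketch under the assumption that the published version ends in `-dev.` followed by the first 7 characters of the merge commit SHA; the function name `isCommitInDevVersion` is hypothetical.

```typescript
// Returns true when `publishedVersion` (for example, the output of
// `npm view gsd-pi@dev version`) ends in -dev.<first 7 chars of mergeSha>.
function isCommitInDevVersion(publishedVersion: string, mergeSha: string): boolean {
  const shortSha = mergeSha.trim().slice(0, 7).toLowerCase();
  if (shortSha.length < 7) return false; // too short to compare reliably
  return publishedVersion.trim().toLowerCase().endsWith(`-dev.${shortSha}`);
}
```

If promoted tags keep the same version string as the dev build, the same check should also work on the output of `npm view gsd-pi@next version`.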
| Workflow | File | Trigger | Purpose |
|---|---|---|---|
| CI | `ci.yml` | PR + push to main | Build, test, typecheck; gate for all promotions |
| Release Pipeline | `pipeline.yml` | After CI succeeds on main | Three-stage promotion |
| Native Binaries | `build-native.yml` | `v*` tags | Cross-compile platform binaries |
| Dev Cleanup | `cleanup-dev-versions.yml` | Weekly (Monday 06:00 UTC) | Unpublish `-dev.` versions older than 30 days |
The pipeline only triggers after ci.yml passes. Key gating tests include:
- Unit tests (`npm run test:unit`): includes `auto-session-encapsulation.test.ts`, which enforces that all auto-mode state is encapsulated in `AutoSession`, plus dispatch loop regression tests that exercise the full `deriveState → resolveDispatch → idempotency` chain without an LLM. Any PR adding module-level mutable state to `auto.ts` will fail CI and block the pipeline.
- Integration tests (`npm run test:integration`)
- Extension typecheck (`npm run typecheck:extensions`)
- Package validation (`npm run validate-pack`)
- Smoke tests (`npm run test:smoke`): run post-build in the pipeline against the local binary and again against the globally installed `@dev` package
- Fixture tests (`npm run test:fixtures`): replay recorded LLM conversations without hitting real APIs
- Live regression tests (`npm run test:live-regression`): run against the installed binary in the Test stage to catch runtime regressions before promotion to `@next`
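To make the encapsulation rule concrete, here is a minimal sketch of the difference between module-level mutable state and state held on a session instance. It is not the project's actual `AutoSession`; the class `AutoSessionSketch` and its fields and methods are hypothetical and exist only to illustrate the pattern.

```typescript
// The kind of declaration the encapsulation test is described as rejecting
// (kept as a comment so it is not live code):
//   let activeUnit: string | null = null; // module-level mutable state
//
// Hypothetical encapsulated alternative: all state lives on an instance.
class AutoSessionSketch {
  private activeUnit: string | null = null;
  private dispatched = new Set<string>();

  // Records a dispatch. Returns false if this unit was already dispatched
  // in this session, which acts as a simple idempotency guard.
  dispatch(unit: string): boolean {
    if (this.dispatched.has(unit)) return false;
    this.dispatched.add(unit);
    this.activeUnit = unit;
    return true;
  }

  // Returns the most recently dispatched unit, or null if none.
  current(): string | null {
    return this.activeUnit;
  }
}
```

Because each instance owns its state, two sessions created in the same process do not see each other's dispatches, and a test can build a fresh session without resetting globals.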
Promotion to Prod requires manual approval:
- A version reaches the Test stage automatically
- In GitHub Actions, the `prod-release` job will show "Waiting for review"
- Click Review deployments → select `prod` → Approve
- The version is promoted to `@latest` and a GitHub Release is created
To enable live LLM tests during Prod promotion:
- Set the `RUN_LIVE_TESTS` environment variable to `true` on the `prod` environment
If a broken version reaches production:
# Roll back npm
npm dist-tag add gsd-pi@<previous-good-version> latest
# Roll back Docker
docker pull ghcr.io/gsd-build/gsd-pi:<previous-good-version>
docker tag ghcr.io/gsd-build/gsd-pi:<previous-good-version> ghcr.io/gsd-build/gsd-pi:latest
docker push ghcr.io/gsd-build/gsd-pi:latest

For `@dev` or `@next` rollbacks, the next successful merge will overwrite the tag automatically.
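The production rollback commands above can be generated from a single known-good version, which helps avoid typos under pressure. This is a sketch only: the helper `rollbackCommands` is hypothetical, it returns command strings without running them, and the optional `imageTag` parameter is an assumption for cases where the Docker tag differs from the npm version.

```typescript
// Builds the rollback commands for a given known-good version.
// This only produces strings; it does not execute anything.
function rollbackCommands(previousGoodVersion: string, imageTag?: string): string[] {
  const version = previousGoodVersion.trim();
  if (version.length === 0) {
    throw new Error("previousGoodVersion must not be empty");
  }
  const tag = (imageTag ?? version).trim();
  const image = "ghcr.io/gsd-build/gsd-pi";
  return [
    `npm dist-tag add gsd-pi@${version} latest`,
    `docker pull ${image}:${tag}`,
    `docker tag ${image}:${tag} ${image}:latest`,
    `docker push ${image}:latest`,
  ];
}
```

If the runtime image is tagged as `:v<version>`, pass the prefixed tag explicitly, for example `rollbackCommands("2.26.0", "v2.26.0")`; confirm the tag exists in the registry before running the output.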
| Setting | Value |
|---|---|
| Environment: `dev` | No protection rules |
| Environment: `test` | No protection rules |
| Environment: `prod` | Required reviewers: maintainers |
| Secret: `NPM_TOKEN` | All environments |
| Secret: `ANTHROPIC_API_KEY` | Prod environment only |
| Secret: `OPENAI_API_KEY` | Prod environment only |
| Variable: `RUN_LIVE_TESTS` | `false` (set to `true` to enable live LLM tests) |
| GHCR | Enabled for the gsd-build org |
| Image | Base | Purpose | Tags |
|---|---|---|---|
| `ghcr.io/gsd-build/gsd-ci-builder` | `node:24-bookworm` | CI build environment with Rust toolchain | `:latest`, `:<date>` |
| `ghcr.io/gsd-build/gsd-pi` | `node:24-slim` | User-facing runtime | `:latest`, `:next`, `:v<version>` |
The CI builder image is rebuilt automatically when the Dockerfile changes. It eliminates ~3-5 min of toolchain setup per CI run.
The fixture system records and replays LLM conversations without hitting real APIs (zero cost).
npm run test:fixtures

# Set your API key, then record
GSD_FIXTURE_MODE=record GSD_FIXTURE_DIR=./tests/fixtures/recordings \
  node --experimental-strip-types tests/fixtures/record.ts

Fixtures are JSON files in `tests/fixtures/recordings/`. Each one captures a conversation's request/response pairs and replays them by turn index.
Re-record fixtures when:
- Provider wire format changes (e.g., new field in Anthropic response)
- Tool definitions change (affects request shape)
- System prompt changes (may cause turn count mismatch)
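Replay by turn index, and the turn count mismatch mentioned above, can be illustrated with a small sketch. The real fixture schema is not shown in this document, so the `Fixture` and `FixtureTurn` shapes and the `FixtureReplayer` class below are assumptions made for illustration.

```typescript
// Hypothetical fixture shape: one recorded request/response pair per turn.
interface FixtureTurn {
  request: unknown;
  response: unknown;
}

interface Fixture {
  name: string;
  turns: FixtureTurn[];
}

// Serves recorded responses in order, one per call, keyed by turn index.
class FixtureReplayer {
  private readonly fixture: Fixture;
  private turnIndex = 0;

  constructor(fixture: Fixture) {
    this.fixture = fixture;
  }

  // Returns the recorded response for the current turn and advances.
  // Throws when more turns are requested than were recorded, which is
  // the turn count mismatch case that calls for re-recording.
  next(): unknown {
    const turn = this.fixture.turns[this.turnIndex];
    if (turn === undefined) {
      throw new Error(
        `Fixture "${this.fixture.name}" has ${this.fixture.turns.length} turns; ` +
          `turn ${this.turnIndex} was requested. Re-record the fixture.`,
      );
    }
    this.turnIndex += 1;
    return turn.response;
  }

  // True when every recorded turn has been consumed.
  isExhausted(): boolean {
    return this.turnIndex === this.fixture.turns.length;
  }
}
```

A test built on this sketch would call `next()` wherever the code under test would call the provider, then assert `isExhausted()` at the end to catch conversations that stopped early.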
| Tag | Published | Format | Who uses it |
|---|---|---|---|
| `@dev` | Every merged PR | `2.27.0-dev.a3f2c1b` | Developers verifying fixes |
| `@next` | Auto-promoted from dev | Same version | Early adopters, beta testers |
| `@latest` | Manually approved | Same version | Production users |
Old -dev. versions are cleaned up weekly (30-day retention).
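The 30-day retention rule amounts to a filter over published versions. The sketch below shows one way to select candidates for cleanup; it is not the contents of `cleanup-dev-versions.yml`, and the `PublishedVersion` shape and `selectExpiredDevVersions` function are hypothetical.

```typescript
// Hypothetical record of a published version and its publish time.
interface PublishedVersion {
  version: string;
  publishedAt: Date;
}

const RETENTION_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns the -dev. versions published more than RETENTION_DAYS before `now`.
// Versions without a "-dev." segment are never selected.
function selectExpiredDevVersions(versions: PublishedVersion[], now: Date): string[] {
  const cutoff = now.getTime() - RETENTION_DAYS * MS_PER_DAY;
  return versions
    .filter((v) => v.version.includes("-dev.") && v.publishedAt.getTime() < cutoff)
    .map((v) => v.version);
}
```

Passing `now` as a parameter instead of reading the clock inside the function keeps the selection deterministic and easy to test.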