Compose Workflow

Reusable GitHub Actions workflows for Docker Compose deployments across multiple repositories. They provide centralized lint and deploy automation, with deployment performed by self-hosted runners on the hosts where the compose stacks live.

Architecture

As of 2026-05-02, deployment uses self-hosted GitHub Actions runners that live on the deployment hosts themselves; there is no SSH-from-CI path. The runner runs as a deploy user that belongs to the docker group and the admin's group, and it pulls jobs from GitHub. This eliminates the SSH-key/Tailscale/sudo-rule class of issues that plagued the prior design.

Three caller repos use this workflow:

docker-piwine — runner label piwine
docker-piwine-office — runner label piwine-office
docker-zendc — runner label zendc

Key Features

🔒 Security first — input validation, GitGuardian secret scanning, 1Password integration
🏠 Self-hosted deploy — no inbound network access required; runner pulls jobs from GitHub
🔄 Automatic rollback — git reset --hard <previous_sha> + redeploy on deploy or health failure
🔍 Failure diagnostics — on stack failure, dumps docker compose ps -a, healthcheck history (.State.Health.Log), and scoped service logs
📊 Discord notifications — pipeline-status icon line with deploy/health/rollback states, commit link, user mention on failure
🚦 Critical stack detection — auto-detects from com.compose.tier: infrastructure labels
🔐 Multi-registry auth — single 1P round trip + docker/login-action per registry (ghcr.io, docker.io, registry.gitlab.com, custom GitLab)

Available Workflows

`compose-lint.yml` — validation

Parallel GitGuardian + yamllint + docker compose config validation. Runs on GitHub-hosted runners (no host access needed).

jobs:
  lint:
    uses: owine/compose-workflow/.github/workflows/compose-lint.yml@main
    secrets: inherit
    with:
      stacks: '["stack1", "stack2", "stack3"]'
      webhook-url: "op://Docker/discord-github-notifications/<env>_webhook_url"
      repo-name: "my-docker-repo"
      target-repository: ${{ github.repository }}
      target-ref: ${{ github.sha }}
      discord-user-id: "op://Docker/discord-github-notifications/user_id"
      # plus event-context inputs (see compose-lint.yml for the full list)

`deploy.yml` — self-hosted deploy

5-job pipeline: prepare → deploy → health-check → rollback → notify. Runs on [self-hosted, <runner-label>].

on:
  workflow_run:
    workflows: ["Lint Docker Compose"]
    types: [completed]
    branches: [main]
  workflow_dispatch:
    inputs:
      force-deploy:
        type: boolean
        default: false

concurrency:
  group: deploy-<repo-label>
  cancel-in-progress: false

jobs:
  deploy:
    if: ${{ github.event.workflow_run.conclusion == 'success' || github.event_name == 'workflow_dispatch' }}
    uses: owine/compose-workflow/.github/workflows/deploy.yml@<sha>
    secrets: inherit
    with:
      runner-label: piwine                  # or piwine-office, zendc
      live-repo-path: /opt/compose
      repo-name: "docker-piwine"
      webhook-url: "op://Docker/discord-github-notifications/piwine_webhook_url"
      discord-user-id: "op://Docker/discord-github-notifications/user_id"
      target-ref: ${{ github.event.workflow_run.head_sha || github.sha }}
      has-dockge: true                      # false for zendc
      force-deploy: ${{ inputs.force-deploy || false }}

Optional inputs: live-dockge-path (when has-dockge: true), auto-detect-critical (default true), critical-services (manual override), image-pull-timeout, service-startup-timeout, failed-container-log-lines.

Required Configuration

Repository structure (caller repo)

├── .yamllint                     # yamllint configuration
├── compose.env                   # env file with op:// references
├── .github/
│   ├── actionlint.yaml           # declares the runner-label
│   └── workflows/
│       ├── lint.yml              # calls compose-lint.yml
│       └── deploy.yml            # calls compose-workflow's deploy.yml
├── stack1/compose.yaml
├── stack2/compose.yaml
└── ...

actionlint.yaml must declare the runner label or PRs fail with "label X is unknown":

self-hosted-runner:
  labels:
    - piwine     # whichever label this repo's deploy.yml passes as runner-label

Required secrets

Calling repos need exactly one secret:

OP_SERVICE_ACCOUNT_TOKEN — 1Password service account token. Used by both lint (GitGuardian API key) and deploy (env-file resolution + multi-registry credentials + Discord webhook).

The previously-required SSH_USER / SSH_HOST secrets were deleted from all three caller repos on 2026-05-03; only OP_SERVICE_ACCOUNT_TOKEN remains.

1Password references

op://Docker/discord-github-notifications/<env>_webhook_url
op://Docker/discord-github-notifications/user_id
op://Docker/ghcr-pat/{username,pat}
op://Docker/docker-hub/{username,token}
op://Docker/gitlab-registry/{username,token}
op://Docker/gitlab-container-zenterprise/{username,token}
op://Docker/gitguardian/api_key

Self-hosted runner host requirements

The runner host needs docker, jq, timeout (coreutils), gh, and op (1Password CLI) on the deploy user's PATH, plus a registered runner systemd service running as deploy. The full host-prep playbook is in docs/superpowers/runbooks/self-hosted-runner-migration.md. It covers the path-ownership pattern (admin owns tree, deploy in admin's group via safe.directory), setgid + group-write, and the mandatory umask 002 for both admin and runner users.

Healthcheck requirements for `--wait`

deploy.yml invokes docker compose up --wait, which only verifies services that have healthchecks defined. Services without healthchecks start but don't gate the deploy. See CLAUDE.md for healthcheck patterns.

One-shot containers (e.g. migration sidecars gated via service_completed_successfully) end up exited with code 0 — the health-check job recognizes this as success.
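As a rough illustration of the success rule described here, the sketch below classifies one container from three values that docker inspect exposes (.State.Status, .State.Health.Status, .State.ExitCode). The function name container_ok is hypothetical, and treating a running container with no healthcheck as acceptable is an assumption made for the example; the real health-check job may apply different rules.

```shell
# Hypothetical helper, not taken from deploy.yml.
# Usage: container_ok STATE HEALTH EXIT_CODE
#   STATE     - container status, e.g. running, exited, restarting
#   HEALTH    - health status, empty when no healthcheck is defined
#   EXIT_CODE - exit code, only meaningful when STATE is exited
container_ok() {
  state=$1
  health=$2
  exit_code=$3
  case "$state" in
    running)
      # Assumption: no healthcheck (empty HEALTH) does not count as failure.
      [ -z "$health" ] || [ "$health" = healthy ]
      ;;
    exited)
      # One-shot containers that finished cleanly count as success.
      [ "$exit_code" = 0 ]
      ;;
    *)
      return 1
      ;;
  esac
}
```

A caller would loop over the stack's containers and fail the job on the first one for which container_ok returns non-zero.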

Disabling stacks and services

Two mechanisms cover different scopes. Both rely on docker compose up --wait --remove-orphans (which deploy.yml already uses on every stack invocation) to tear down whatever is no longer present.

Disabling a whole stack: `.disabled` marker file

Touch <stack>/.disabled alongside the stack's compose.yaml to disable it:

touch silo/.disabled
git add silo/.disabled
git commit -m "disable silo"
git push

On the next deploy:

The Discover stacks step excludes the directory from active stacks and emits it on the disabled_stacks output.
The change-detection script reclassifies the stack as removed (effectively-present-in-CURRENT, not effectively-present-in-TARGET).
The Teardown removed stacks step runs docker compose down against the still-present compose.yaml in the live tree.
Discord embed and PR comment surface a 🛑 **Disabled stacks:** line alongside the existing 🗑️ **Removed stacks:** line.

The stack directory and its compose.yaml stay in the repo — only the running containers go away. Re-enable with git rm <stack>/.disabled; the next deploy classifies it as a new stack and runs up --wait.
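The discovery behaviour described above can be pictured with a small sketch. It assumes each stack is a direct child directory of the repo root containing a compose.yaml; the function name list_stacks and its output format are hypothetical and do not reflect the actual Discover stacks step.

```shell
# Hypothetical helper, not taken from deploy.yml.
# Usage: list_stacks ROOT
# Prints one line per stack directory: "active <name>" or "disabled <name>".
# Directories without a compose.yaml are not stacks and are skipped.
list_stacks() {
  root=$1
  for dir in "$root"/*/; do
    # Also covers the no-match case, where the glob stays literal.
    [ -f "${dir}compose.yaml" ] || continue
    name=$(basename "$dir")
    if [ -e "${dir}.disabled" ]; then
      printf 'disabled %s\n' "$name"
    else
      printf 'active %s\n' "$name"
    fi
  done
}
```

Running list_stacks against a checkout where silo/.disabled exists would report silo as disabled while leaving every other stack active.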

Lint coverage is unaffected: caller-repo lint workflows do not filter .disabled, so YAML and compose config validation continue to run on disabled stacks. This keeps the file from rotting silently between disable and re-enable.

Detection rules in detail. A stack is "effectively present" at a given SHA iff compose.yaml exists and .disabled does not. The change-detection script applies an effectively-present-in-CURRENT guard on removed-stack detectors and an effectively-present-in-TARGET guard on new-stack detectors. This correctly handles edge cases:

| Transition | Action |
| --- | --- |
| Enabled → disabled | `docker compose down` (teardown path) |
| Disabled → enabled | `docker compose up --wait` (new-stack path) |
| Stay disabled | no-op |
| Born disabled (added with `.disabled` already present) | no-op |
| Delete while disabled | no-op (already torn down at disable time) |

Design spec: docs/superpowers/specs/2026-05-26-disabled-stacks-design.md.
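The "effectively present" rule and the transitions listed above can be expressed as a short sketch. The function names and the 0/1 argument convention are hypothetical; the real change-detection script works against git SHAs and is not reproduced here.

```shell
# Hypothetical helpers, not taken from the change-detection script.
# effectively_present HAS_COMPOSE HAS_DISABLED  (each argument is 0 or 1)
# True only when compose.yaml exists and .disabled does not.
effectively_present() {
  [ "$1" = 1 ] && [ "$2" = 0 ]
}

# classify_transition CUR_COMPOSE CUR_DISABLED TGT_COMPOSE TGT_DISABLED
# Prints: removed (teardown), new (up --wait), existing, or noop.
classify_transition() {
  cur=0
  tgt=0
  if effectively_present "$1" "$2"; then cur=1; fi
  if effectively_present "$3" "$4"; then tgt=1; fi
  case "$cur$tgt" in
    10) echo removed ;;   # present in CURRENT only: docker compose down
    01) echo new ;;       # present in TARGET only: docker compose up --wait
    11) echo existing ;;  # present on both sides: normal update path
    *)  echo noop ;;      # present on neither side: nothing to do
  esac
}
```

For example, classify_transition 1 0 1 1 models a stack that gains a .disabled marker and prints removed, while classify_transition 0 0 1 1 models a stack born disabled and prints noop.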

Disabling individual services: comment them out

Compose itself has no first-class "disable one service" mechanism, but up --wait --remove-orphans treats any service no longer present in the compose model as an orphan and removes it on the next deploy. The simplest way to remove a service: comment out its block in compose.yaml.

services:
  primary:
    image: ghcr.io/example/app:1.2.3
    # ...

  # postgres:
  #   image: postgres:16
  #   restart: unless-stopped
  #   # ... rest of service definition

On the next deploy:

up --wait --remove-orphans brings down any containers belonging to the commented-out service.
The stack itself stays "existing" (its directory and compose.yaml are still present), so it isn't reclassified.

Re-enable by uncommenting and pushing; up --wait recreates the service.

Why not Compose profiles? profiles: keeps the service definition active in YAML, so Renovate continues opening image-bump PRs for a service nobody is running. Commenting out freezes the service at its current image and stops the bump churn. Use profiles only when you expect a short-term disable and want the image to stay current.

Caveats of commenting:

Larger diffs (the entire service block goes from active to commented). This is the accepted tradeoff for simplicity.
The commented YAML is no longer parsed, so if the surrounding compose structure drifts (new networks, renamed volumes), re-enabling may require a touch-up.
Service-specific named volumes/networks declared at the top level are not removed by --remove-orphans (it only removes containers). Clean them up manually with docker volume rm / docker network rm if you want the storage gone.

Testing and Development

# Lint workflow files
actionlint .github/workflows/compose-lint.yml \
           .github/workflows/deploy.yml \
           .github/workflows/workflow-lint.yml
yamllint --strict .github/workflows/*.yml

# Lint deployment scripts
shellcheck scripts/deployment/*.sh scripts/linting/*.sh

# Local testing utilities
./scripts/testing/test-workflow.sh
./scripts/testing/validate-compose.sh

Security

Input validation

Stack names validated against ^[a-zA-Z0-9._-]+$ before any docker compose invocation
Target refs validated as 40-char hex SHAs
Webhook URLs validated as 1Password references
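The three checks above might look something like the following sketch. The function names are hypothetical, and two details are assumptions rather than documented behaviour: SHAs are accepted in lowercase hex only, and a 1Password reference is taken to mean the op://vault/item/field shape.

```shell
# Hypothetical helpers, not taken from the workflow scripts.

# Stack names: non-empty and limited to the characters in ^[a-zA-Z0-9._-]+$
is_valid_stack_name() {
  case "$1" in
    '' | *[!a-zA-Z0-9._-]*) return 1 ;;
    *) return 0 ;;
  esac
}

# Target refs: exactly 40 hex characters (assumption: lowercase only).
is_valid_sha() {
  case "$1" in
    *[!0123456789abcdef]*) return 1 ;;
  esac
  [ "${#1}" -eq 40 ]
}

# Webhook URLs: must be a 1Password reference, never a literal URL.
# Assumption: the reference has the op://vault/item/field shape.
is_op_reference() {
  case "$1" in
    op://?*/?*/?*) return 0 ;;
    *) return 1 ;;
  esac
}
```

Rejecting anything outside the allowed character set before a value reaches a docker compose command line keeps path separators and shell metacharacters out of stack names.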

Secret management

All secrets stored in 1Password (no plaintext in repos or workflow files)
op run --env-file=… resolves references at deploy time
1password/load-secrets-action for individual values (registry creds, Discord webhook)
Multi-registry creds cached in the runner host's ~/.docker/config.json via docker/login-action with logout: false

Network model

No inbound network access to deployment hosts is required for CI — runners on the host pull jobs from GitHub
Outbound: runner → GitHub Actions API, image registries, 1Password, Discord webhook
No Tailscale dependency (the prior SSH-based design needed it; the self-hosted approach makes it unnecessary)

Troubleshooting

| Symptom | First thing to check |
| --- | --- |
| Runner shows `offline` in GitHub | `sudo systemctl status actions.runner.<...>.service` on the host |
| `git reset --hard` permission denied | `umask 002` missing in admin user's `.zshrc`/`.bashrc`; see runbook |
| Stack fails with no log lines | Check the failure diagnostic dump in the run: `compose ps -a` + `inspect Health.Log` + scoped `compose logs` should be there |
| Discord embed wrong color | Verify the `case` statement in notify job's status step still maps `healthy → success` and `failed → failure` |
| GitGuardian failure | Verify `OP_SERVICE_ACCOUNT_TOKEN` and that the 1P service account has access to the GitGuardian API key |

For self-hosted runner setup or migration of a new host, see the runbook.

Version management

Latest: @main for newest features
Pinned: full 40-char SHA on the uses: line — Renovate auto-bumps this on the caller side

Contributing

Run actionlint, yamllint --strict, and shellcheck on changed files
Update CLAUDE.md and README.md when behavior changes
For breaking changes (new required input on a reusable workflow), bump every caller's SHA pin in the same push session — Renovate eventually does this but with a window of broken deploys in between

License

Private repository, internal use only.
