Cachew

Cachew (pronounced "cashew") is a tiered, protocol-aware, caching HTTP proxy for software engineering infrastructure. It understands higher-level protocols (Git, Docker, Go modules, etc.) and makes smarter caching decisions than a naive HTTP proxy.

Strategies

Git

Caches Git repositories with two complementary techniques:

  1. Snapshots — periodic .tar.zst archives that restore 4–5x faster than git clone.
  2. Pack caching — passthrough caching of packs from git-upload-pack for incremental pulls.

Redirect Git traffic through cachew:

[url "https://cachew.example.com/git/github.com/"]
  insteadOf = https://github.com/
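
To make the effect of this rule concrete, here is a minimal Python sketch of the prefix rewrite that Git applies to remote URLs. The function name `rewrite_url` and the `RULES` table are hypothetical; the sketch only models the longest-prefix substitution, not Git's full URL handling.

```python
def rewrite_url(url: str, rules: dict[str, str]) -> str:
    """Rewrite url using insteadOf-style rules.

    rules maps an original prefix (the insteadOf value) to the
    replacement base (the [url "..."] section name).
    """
    best = ""
    for prefix in rules:
        # When several prefixes match, Git uses the longest one.
        if url.startswith(prefix) and len(prefix) > len(best):
            best = prefix
    if not best:
        return url  # no rule applies, URL is left untouched
    return rules[best] + url[len(best):]


RULES = {"https://github.com/": "https://cachew.example.com/git/github.com/"}
print(rewrite_url("https://github.com/org/repo", RULES))
```

URLs that do not start with a configured prefix are returned unchanged, so other hosts keep going direct.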

Restore a repository from a snapshot (with an automatic delta bundle to reach HEAD):

cachew git restore https://github.com/org/repo ./repo

Server configuration:

git {
  snapshot-interval = "1h"
  repack-interval   = "1h"
}

GitHub Releases

Caches public and private GitHub release assets. Private orgs use a token or GitHub App for authentication.

URL pattern: /github-releases/{owner}/{repo}/{tag}/{asset}

github-releases {
  token        = "${GITHUB_TOKEN}"
  private-orgs = ["myorg"]
}
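
As an illustration of the URL pattern above, the following Python sketch assembles a download URL for a release asset. The helper name `release_asset_url`, the example asset name, and the percent-encoding of each path segment are assumptions made for this sketch, not documented Cachew behaviour.

```python
from urllib.parse import quote


def release_asset_url(base: str, owner: str, repo: str, tag: str, asset: str) -> str:
    """Build /github-releases/{owner}/{repo}/{tag}/{asset} under base."""
    # Assumption: each segment is percent-encoded on its own.
    segments = [quote(s, safe="") for s in (owner, repo, tag, asset)]
    return f"{base.rstrip('/')}/github-releases/" + "/".join(segments)


print(release_asset_url(
    "https://cachew.example.com", "org", "repo", "v1.0.0", "tool-linux-amd64.tar.gz"
))
```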

Go Modules

Go module proxy (GOPROXY-compatible). Private modules are fetched via git clone.

URL pattern: /gomod/...

export GOPROXY=http://cachew.example.com/gomod,direct

Server configuration:

gomod {
  proxy         = "https://proxy.golang.org"
  private-paths = ["github.com/myorg/*"]
}
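
For readers unfamiliar with what a GOPROXY-compatible endpoint serves, this Python sketch builds the request URLs the go command issues under the GOPROXY protocol, using the /gomod prefix from above as the base. The helper names are hypothetical. The escaping rule (an uppercase letter becomes `!` followed by its lowercase form) comes from the Go module proxy protocol, not from Cachew.

```python
def escape_path(value: str) -> str:
    """Escape a module path or version the way the GOPROXY protocol requires."""
    return "".join("!" + c.lower() if c.isupper() else c for c in value)


def gomod_url(base: str, module: str, version: str, ext: str = "info") -> str:
    """Return the proxy URL for a module version file (.info, .mod or .zip)."""
    return f"{base.rstrip('/')}/{escape_path(module)}/@v/{escape_path(version)}.{ext}"


print(gomod_url("http://cachew.example.com/gomod", "github.com/BurntSushi/toml", "v1.3.2", "mod"))
```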

Hermit

Caches Hermit package downloads. GitHub release URLs are automatically routed through the github-releases strategy.

URL pattern: /hermit/{host}/{path...}

hermit {}

Artifactory

Caches artifacts from JFrog Artifactory with host-based or path-based routing.

artifactory "example.jfrog.io" {
  target = "https://example.jfrog.io"
}

Host

Generic reverse-proxy caching for arbitrary HTTP hosts, with optional custom headers.

host "https://ghcr.io" {
  headers = {
    "Authorization": "Bearer QQ=="
  }
}

host "https://w3.org" {}

HTTP Proxy

Caching proxy for clients that use absolute-form HTTP requests (e.g. Android sdkmanager --proxy_host).

proxy {}
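
For context, "absolute-form" refers to the HTTP/1.1 request line a client sends when it talks to a forward proxy: the full URL appears as the request target instead of only the path. The Python sketch below contrasts the two forms. The function name `request_line` and the example URL are made up for illustration.

```python
from urllib.parse import urlsplit


def request_line(method: str, url: str, via_proxy: bool) -> str:
    """Return the HTTP/1.1 request line for url."""
    if via_proxy:
        # absolute-form: the proxy receives the complete URL
        target = url
    else:
        # origin-form: only path and query are sent to the origin server
        parts = urlsplit(url)
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
    return f"{method} {target} HTTP/1.1"


print(request_line("GET", "http://example.com/sdk/index.xml", via_proxy=True))
print(request_line("GET", "http://example.com/sdk/index.xml", via_proxy=False))
```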

Cache Backends

Multiple backends can be configured simultaneously — they are automatically combined into a tiered cache. Reads check each tier in order and backfill lower tiers on a hit. Writes go to all tiers in parallel.
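
The read and write behaviour described above can be sketched in a few lines of Python. This is an illustrative model, not Cachew's implementation: tiers are plain dicts, the class name `TieredCache` is hypothetical, writes are sequential rather than parallel, and "backfill" is assumed to mean copying a hit into the tiers that were checked earlier and missed.

```python
class TieredCache:
    def __init__(self, tiers):
        # tiers are dict-like stores, listed in the order reads check them
        self.tiers = tiers

    def get(self, key):
        for i, tier in enumerate(self.tiers):
            if key in tier:
                value = tier[key]
                # Assumption: backfill the tiers that were checked first and missed.
                for earlier in self.tiers[:i]:
                    earlier[key] = value
                return value
        return None  # miss in every tier

    def put(self, key, value):
        # The real server writes in parallel; the sketch writes one tier at a time.
        for tier in self.tiers:
            tier[key] = value


memory, disk = {}, {"k": b"v"}
cache = TieredCache([memory, disk])
cache.get("k")        # hit on disk, copied into memory
print("k" in memory)
```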

Memory

In-memory LRU cache.

memory {
  limit-mb = 1024   # default
  max-ttl  = "1h"   # default
}

Disk

On-disk LRU cache with TTL-based eviction.

disk {
  limit-mb = 250000
  max-ttl  = "8h"
}
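
To show how a size limit and a maximum TTL interact in an LRU cache like the Memory and Disk backends, here is a small Python sketch. It is a toy model under stated assumptions: the class name `LRUCache`, the byte-based limit, and the lazy expiry on read are choices made for this example and may differ from what Cachew does internally.

```python
import time
from collections import OrderedDict


class LRUCache:
    def __init__(self, limit_bytes, max_ttl, clock=time.monotonic):
        self.limit_bytes = limit_bytes
        self.max_ttl = max_ttl          # seconds
        self.clock = clock              # injectable for testing
        self.size = 0
        self.entries = OrderedDict()    # key -> (value, expires_at), oldest first

    def put(self, key, value):
        if key in self.entries:
            self.size -= len(self.entries.pop(key)[0])
        self.entries[key] = (value, self.clock() + self.max_ttl)
        self.size += len(value)
        # Evict least recently used entries until the cache fits the limit.
        while self.size > self.limit_bytes and self.entries:
            _, (old_value, _) = self.entries.popitem(last=False)
            self.size -= len(old_value)

    def get(self, key):
        item = self.entries.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self.clock() >= expires_at:
            # Expired: drop the entry lazily on read.
            del self.entries[key]
            self.size -= len(value)
            return None
        self.entries.move_to_end(key)   # mark as most recently used
        return value
```

With a 4-byte limit, inserting three 2-byte values evicts whichever of the first two was used least recently, and every entry disappears once its TTL has elapsed regardless of use.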

S3

S3-compatible object storage (AWS S3, MinIO, etc.).

s3 {
  bucket   = "my-cache-bucket"
  endpoint = "s3.amazonaws.com"
  region   = "us-east-1"
}

Authorization (OPA)

Cachew uses Open Policy Agent for request authorization. The default policy allows all methods from 127.0.0.1 and GET/HEAD from elsewhere.

Policies must be in package cachew.authz and define a deny rule set. If the set is empty, the request is allowed; otherwise the reasons are returned to the client.

opa {
  policy = <<EOF
    package cachew.authz
    deny contains "unauthenticated" if not input.headers["authorization"]
    deny contains "writes not allowed" if input.method == "PUT"
  EOF
}

Or reference an external file with optional data:

opa {
  policy-file = "./policy.rego"
  data-file   = "./opa-data.json"
}

Input fields: input.method, input.path (string array), input.headers, input.remote_addr (includes port — use startswith to match by IP).
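
The deny-set semantics and the remote_addr caveat can be mirrored in plain Python for illustration. The sketch below re-expresses the example Rego rules from this README as a function named `evaluate` (a hypothetical name); the input document shape follows the fields listed above, and lowercase header keys are an assumption based on the `input.headers["authorization"]` example.

```python
def evaluate(input_doc: dict) -> set:
    """Return the set of deny reasons; an empty set means the request is allowed."""
    deny = set()
    if "authorization" not in input_doc["headers"]:
        deny.add("unauthenticated")
    if input_doc["method"] == "PUT":
        deny.add("writes not allowed")
    # remote_addr is "ip:port", so match the IP together with the trailing colon.
    if not input_doc["remote_addr"].startswith("127.0.0.1:"):
        deny.add("not localhost")
    return deny


request = {
    "method": "GET",
    "path": ["git", "github.com", "org", "repo"],
    "headers": {"authorization": "Bearer example"},
    "remote_addr": "127.0.0.1:54321",
}
print(evaluate(request))
```

Including the colon in the prefix matters: a bare "127.0.0.1" prefix would also match an address such as 127.0.0.10:5000.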

GitHub App Authentication

For private Git repositories and GitHub release assets, configure a GitHub App:

github-app {
  app-id           = "12345"
  private-key-path = "./github-app.pem"
  installations    = { "myorg": "67890" }
}

Installations can also be discovered dynamically via the GitHub API.

CLI

Server (cachewd)

cachewd --config cachew.hcl
cachewd --schema  # print config schema

Client (cachew)

# Object operations
cachew get <namespace> <key> [-o file]
cachew put <namespace> <key> [file] [--ttl 1h]
cachew stat <namespace> <key>
cachew delete <namespace> <key>
cachew namespaces

# Directory snapshots
cachew snapshot <namespace> <key> <directory> [--ttl 1h] [--exclude pattern]
cachew restore <namespace> <key> <directory>

# Git
cachew git restore <repo-url> <directory> [--no-bundle]

Global flags: --url (CACHEW_URL), --authorization (CACHEW_AUTHORIZATION), --platform (prefix keys with os-arch), --daily/--hourly (prefix keys with date).

Observability

log {
  level = "info"  # debug, info, warn, error
}

metrics {
  service-name = "cachew"
}

Admin endpoints: /_liveness, /_readiness, PUT /admin/log/level, /admin/pprof/.

Full Configuration Example

state = "./state"
bind  = "0.0.0.0:8080"
url   = "http://cachew.example.com:8080/"

log {
  level = "info"
}

opa {
  policy = <<EOF
    package cachew.authz
    deny contains "not localhost" if not startswith(input.remote_addr, "127.0.0.1:")
  EOF
}

metrics {}

github-app {
  app-id           = "12345"
  private-key-path = "./github-app.pem"
}

git-clone {}

git {
  snapshot-interval = "1h"
  repack-interval   = "1h"
}

github-releases {
  token        = "${GITHUB_TOKEN}"
  private-orgs = ["myorg"]
}

gomod {
  proxy         = "https://proxy.golang.org"
  private-paths = ["github.com/myorg/*"]
}

hermit {}

host "https://ghcr.io" {
  headers = {
    "Authorization": "Bearer ${GHCR_TOKEN}"
  }
}

disk {
  limit-mb = 250000
  max-ttl  = "8h"
}

proxy {}
