controller: support namespace-scoped cache via WATCH_NAMESPACE by zjh7890 · Pull Request #766 · alibaba/OpenSandbox

zjh7890 · 2026-04-21T12:42:08Z

Motivation

Both BatchSandboxReconciler and PoolReconciler declare Owns(&corev1.Pod{}) in SetupWithManager. In cmd/controller/main.go, the manager is created without Cache.DefaultNamespaces, which means the underlying controller-runtime shared informer for Pods has cluster-wide scope.

This is controller-runtime's documented default and isn't wrong on its own. However, on a large shared cluster that hosts many unrelated workloads alongside OpenSandbox:

The shared Pod informer performs a full list+watch of every Pod in the cluster on startup.
The Pod.ownerRefUID field index (internal/utils/fieldindex) is built over that entire set.
Cold-start memory usage grows with cluster-wide Pod count, not with the number of OpenSandbox-owned Pods, which can significantly increase startup memory pressure on multi-tenant clusters.

Observed symptom on a shared multi-tenant cluster where only one namespace contained OpenSandbox CRs:

opensandbox-controller-manager crash-looped with exitCode=137 / reason=OOMKilled within ~5s of startup. Logs only got as far as register field index / starting manager / starting server name=health probe before the kernel killed the process on cgroup memory limit.
Raising the Pod memory limit helped but is a band-aid — headroom still has to cover unrelated cluster growth.

Why this is safe to narrow

The current reconcile code already treats BatchSandbox/Pool/Pod as a same-namespace graph, so restricting the cache to the namespaces that actually contain OpenSandbox CRs does not remove any existing capability:

BatchSandbox.Spec.PoolRef is just a name, with no namespace field
(apis/sandbox/v1alpha1/batchsandbox_types.go L34).
PoolReconciler lists BatchSandbox only within pool.Namespace
(internal/controller/pool_controller.go L146-L148).
PoolReconciler maps back from BatchSandbox to Pool using batchSandbox.Namespace
(internal/controller/pool_controller.go L295).
BatchSandboxReconciler fetches owned Pods by batchSbx.Namespace
(internal/controller/batchsandbox_controller.go L342).

So the cluster-wide cache is paying a cost that none of the reconcile paths actually consume.

Change

Add an opt-in flag to narrow the manager cache to a specific set of namespaces.

New flag --watch-namespaces (comma-separated list).
Defaults to the WATCH_NAMESPACE environment variable (common operator convention).
When non-empty, wires into cache.Options.DefaultNamespaces, producing a multi-namespace cache.
When empty (or unset), behavior is unchanged — the manager stays cluster-scoped, preserving backward compatibility.

Parser is fail-closed, not fail-open

The flag's purpose is least-privilege / smaller cache, so malformed input must not silently widen scope back to cluster-wide. buildWatchNamespaces rejects any malformed input with a clear error and the controller exits non-zero at startup:

Input	Behavior
`""` (empty)	cluster-wide (explicit opt-out)
`"ns-a"` / `"ns-a,ns-b"`	namespace-scoped
`" ns-a , ns-b "`	trimmed, namespace-scoped
`"ns-a,ns-a"`	de-duplicated to `{ns-a}`
`","`, `" , "`, `"ns-a,"`, `",ns-a"`, `"ns-a,,ns-b"`	error, exit 1 (empty segment)
`"Default"`, `"bad_ns"`, 64-char name	error, exit 1 (fails DNS-1123 label validation via `k8s.io/apimachinery/pkg/util/validation.IsDNS1123Label`)

The startup log also sorts the namespace list before printing so the line is deterministic across restarts.

Example:

# Deployment snippet
env:
  - name: WATCH_NAMESPACE
    value: "opensandbox"

or:

sandbox-controller --watch-namespaces=opensandbox,another-ns

Log line on startup:

restricting manager cache to specific namespaces  namespaces=[opensandbox]

vs. unset:

manager cache is cluster-scoped (watching all namespaces)

Compatibility & risk

Flag defaults to empty → no behavior change for existing users.
No CRD / RBAC / Helm chart changes in this PR. Operators who opt in can tighten RBAC from ClusterRole to Role/RoleBinding in a follow-up; that is not required for the cache change itself to take effect.
Owns(&corev1.Pod{}) is unchanged. Pool / BatchSandbox reconcile logic is unchanged.
cache.Options.DefaultNamespaces is compatible with the existing Metrics / WebhookServer manager options and with fieldindex.RegisterFieldIndexes — multiNamespaceCache supports IndexField.
Note for future contributors: if webhook handlers are later added that need to read across namespaces via the manager cache, those webhooks would need to be scoped consistently with the configured watch-namespaces.
Cost model caveat: DefaultNamespaces creates one cache/informer per namespace, so this flag is intended for "a small number of namespaces replaces one cluster-wide cache", not for arbitrarily long lists.

Testing

go build ./cmd/controller/... passes.
go vet ./cmd/controller/... passes.
New table-driven unit test cmd/controller/main_test.go covers 15 cases: empty, whitespace-only, single, multi, trim, duplicates, mid-empty / leading-empty / trailing-empty / single-comma / whitespace-between-commas (all rejected), uppercase (rejected), underscore (rejected), 64-char overlong (rejected), 63-char max-length (accepted). All pass under go test -count=1 -v ./cmd/controller/....

--help shows the new flag:

-watch-namespaces string
    Comma-separated list of namespaces the manager should watch.
    Empty (default) watches all namespaces; providing a list restricts
    the shared informer cache to only those namespaces. Also reads the
    WATCH_NAMESPACE environment variable.

Follow-ups (not in this PR, happy to do separately if accepted)

Document the WATCH_NAMESPACE env / --watch-namespaces flag in config/manager/manager.yaml and the Helm chart so the feature is discoverable.
Optional: make RBAC generation respect the watched-namespace set (emit Role/RoleBinding instead of ClusterRole when a scope is requested).

Happy to iterate on naming (--watch-namespaces vs --namespaces vs adopting the prometheus-operator style multi-flag split) or on defaults if maintainers prefer a different convention.

Both BatchSandboxReconciler and PoolReconciler declare Owns(&corev1.Pod{}). Because cmd/controller/main.go does not configure Cache.DefaultNamespaces, controller-runtime builds a shared Pod informer whose list/watch scope is the full cluster. On clusters with many unrelated workloads, this causes the manager to hold the entire cluster's Pod list in its cache even when only one namespace contains OpenSandbox CRs, which can lead to OOMKills during cold-start list+decode+index. Add a new --watch-namespaces flag (comma-separated) that defaults to the WATCH_NAMESPACE environment variable. When set, the value is wired into cache.Options.DefaultNamespaces so the shared informer cache only serves the listed namespaces. When unset, behavior is unchanged (cluster-scoped cache, watches all namespaces) for backward compatibility. - Pool and BatchSandbox CRDs are already namespaced. - Reconcile-time list/get calls already restrict to the CR's own namespace. - BatchSandbox.Spec.PoolRef has no namespace field, so cross-namespace allocation is not an existing feature and is not regressed.

CLAassistant · 2026-04-21T12:42:14Z

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

zjh7890 seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account.
_{You have signed the CLA already but the status is still pending? Let us recheck it.}

Address review feedback on the initial commit: - buildWatchNamespaces now returns (map[string]cache.Config, error). - Empty / whitespace-only input -> (nil, nil) preserves cluster-wide default. - Non-empty input that yields no valid tokens returns an error instead of silently falling back to cluster-wide. This matters because the flag's purpose is to narrow scope; silent fallback would broaden privileges. - Empty segments inside a non-empty list (",", "ns-a,,ns-b", "ns-a,") are now rejected rather than skipped, so operator/templating mistakes surface instead of being masked. - Each token is validated with validation.IsDNS1123Label so typos like "Bad_NS" or overlong names fail fast at startup. - Duplicates in a valid list continue to be collapsed (set semantics). - main() exits non-zero with a clear setupLog.Error if parsing fails. - Startup log sorts the watched namespace list so log output is stable across restarts (Go map iteration is intentionally unordered). - Add table-driven test cmd/controller/main_test.go covering empty, whitespace, single, multiple, duplicate, mid-empty, leading/trailing empty, DNS-1123 violations (case, underscore), max-length (63) and overlong (64) cases.

Spground · 2026-04-23T13:54:00Z

 	flag.BoolVar(&logCompress, "log-compress", true, "Compress determines if the rotated log files should be compressed using gzip")
 	flag.Float64Var(&kubeClientQPS, "kube-client-qps", 100, "QPS for Kubernetes client rate limiter.")
 	flag.IntVar(&kubeClientBurst, "kube-client-burst", 200, "Burst for Kubernetes client rate limiter.")
+	flag.StringVar(&watchNamespaces, "watch-namespaces", os.Getenv("WATCH_NAMESPACE"),


The watch-namespace flag name is weird, How about namespace, moreover, In my opinion, there is no need to support value from env.

zjh7890 requested review from Generalwin, Pangjiping, Spground, fengcone, hittyt, kevinlynx and ninan-nn as code owners April 21, 2026 12:42

Spground reviewed Apr 23, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

controller: support namespace-scoped cache via WATCH_NAMESPACE#766

controller: support namespace-scoped cache via WATCH_NAMESPACE#766
zjh7890 wants to merge 2 commits intoalibaba:mainfrom
zjh7890:feat/namespace-scoped-cache

zjh7890 commented Apr 21, 2026 •

edited

Loading

Uh oh!

CLAassistant commented Apr 21, 2026

Uh oh!

Spground Apr 23, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

zjh7890 commented Apr 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Motivation

Why this is safe to narrow

Change

Parser is fail-closed, not fail-open

Compatibility & risk

Testing

Follow-ups (not in this PR, happy to do separately if accepted)

Uh oh!

CLAassistant commented Apr 21, 2026

Uh oh!

Spground Apr 23, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

zjh7890 commented Apr 21, 2026 •

edited

Loading