cmd/prometheus: add --auto-gomemlimit.refresh-interval flag #18843

Open
sanidhyasin wants to merge 1 commit into prometheus:main from sanidhyasin:auto-gomemlimit-refresh-interval

@sanidhyasin sanidhyasin commented Jun 3, 2026

Which issue(s) does the PR fix:

Fixes #18712

Release notes for end users (ALL commits must be considered).

[FEATURE] Prometheus: Add `--auto-gomemlimit.refresh-interval` flag to periodically re-detect the container or system memory limit and update `GOMEMLIMIT` at runtime.

What this does

The auto-GOMEMLIMIT feature (--auto-gomemlimit) detects the cgroup or system memory limit only once at startup, via a single call to memlimit.SetGoMemLimitWithOpts. If the limit changes while Prometheus is running — for example, a Vertical Pod Autoscaler (VPA) adjusting a container's memory — GOMEMLIMIT keeps the value computed at startup until the process restarts.

This PR adds an optional --auto-gomemlimit.refresh-interval flag that wires the automemlimit library's existing WithRefreshInterval option. When set to a non-zero duration, the limit is periodically re-detected and GOMEMLIMIT is reapplied if it changed.

  • Default is 0s, which preserves the current behaviour exactly (detect once at startup, no background refresh — WithRefreshInterval(0) is a no-op in the library).
  • The flag is only used when --auto-gomemlimit is enabled, consistent with --auto-gomemlimit.ratio.
  • Negative durations are rejected at flag-parse time by model.Duration, so the library's ticker can never receive an invalid interval.
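
For readers unfamiliar with the mechanism, the refresh behaviour described above can be sketched with the standard library alone. This is not the automemlimit implementation: `detectFunc`, `applyLimit`, and `startRefresh` are hypothetical names, and detection is stubbed out. Only `runtime/debug.SetMemoryLimit` is the real runtime call that sets the Go memory limit.

```go
package main

import (
	"fmt"
	"runtime/debug"
	"time"
)

// detectFunc is a hypothetical stand-in for cgroup/system limit detection.
type detectFunc func() (int64, error)

// applyLimit detects the memory limit, scales it by ratio, and calls
// debug.SetMemoryLimit only when the result differs from current.
// It returns the limit that is in effect afterwards.
func applyLimit(detect detectFunc, ratio float64, current int64) (int64, error) {
	detected, err := detect()
	if err != nil {
		return current, err // keep the previous limit on detection failure
	}
	next := int64(float64(detected) * ratio)
	if next != current {
		debug.SetMemoryLimit(next)
	}
	return next, nil
}

// startRefresh re-runs applyLimit every interval until stop is closed.
// A zero (or negative) interval disables the background refresh, which
// mirrors the "detect once at startup" default. It reports whether a
// refresh goroutine was started.
func startRefresh(detect detectFunc, ratio float64, interval time.Duration, initial int64, stop <-chan struct{}) bool {
	if interval <= 0 {
		return false
	}
	go func() {
		ticker := time.NewTicker(interval)
		defer ticker.Stop()
		current := initial
		for {
			select {
			case <-ticker.C:
				if next, err := applyLimit(detect, ratio, current); err == nil {
					current = next
				}
			case <-stop:
				return
			}
		}
	}()
	return true
}

func main() {
	// Stubbed detection: pretend the container limit is 1 GiB.
	detect := func() (int64, error) { return 1 << 30, nil }

	// Startup: detect once and apply.
	current, err := applyLimit(detect, 0.5, 0)
	if err != nil {
		fmt.Println("detection failed:", err)
		return
	}
	fmt.Println("limit applied at startup:", current)

	// Opt-in refresh: only runs when the interval is non-zero.
	stop := make(chan struct{})
	started := startRefresh(detect, 0.5, 30*time.Second, current, stop)
	fmt.Println("background refresh started:", started)
	close(stop)
}
```

The sketch keeps the previous limit when detection fails. That is a choice made for illustration, not a statement about how the library handles errors.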

Notes

  • As discussed in the issue, the polling cost is negligible at typical intervals (≥30–60s): each refresh reads a single value from cgroup memory. The flag defaults to disabled, so there is no behaviour change unless an operator opts in.
  • The help text calls out that a downward change in the limit (e.g. a VPA scale-down) can cause a temporary increase in GC activity, so operators aren't surprised.

Changes

  • cmd/prometheus/main.go: new flag + config field, passed through as memlimit.WithRefreshInterval.
  • docs/command-line/prometheus.md: regenerated via make cli-documentation.
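
To illustrate the flag semantics (non-negative duration, default 0s), here is a minimal sketch using only the standard `flag` package. It is not the code in this PR: Prometheus registers its flags with kingpin and uses `model.Duration`, whose duration syntax differs from `time.ParseDuration`. The `nonNegativeDuration` type, the `parseRefreshInterval` helper, and the help string are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
	"io"
	"time"
)

// nonNegativeDuration is a hypothetical flag.Value that rejects negative
// durations at parse time, standing in for model.Duration.
type nonNegativeDuration time.Duration

func (d *nonNegativeDuration) String() string { return time.Duration(*d).String() }

func (d *nonNegativeDuration) Set(s string) error {
	v, err := time.ParseDuration(s)
	if err != nil {
		return err
	}
	if v < 0 {
		return fmt.Errorf("duration must not be negative: %q", s)
	}
	*d = nonNegativeDuration(v)
	return nil
}

// parseRefreshInterval parses args and returns the refresh interval.
// The zero value (flag not given) means "no background refresh".
func parseRefreshInterval(args []string) (time.Duration, error) {
	fs := flag.NewFlagSet("prometheus-sketch", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage output out of the way in this sketch
	var interval nonNegativeDuration
	fs.Var(&interval, "auto-gomemlimit.refresh-interval",
		"Illustrative help text: how often to re-detect the memory limit; 0s disables refresh.")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return time.Duration(interval), nil
}

func main() {
	for _, args := range [][]string{
		nil, // flag omitted: default of 0s
		{"--auto-gomemlimit.refresh-interval=30s"},
		{"--auto-gomemlimit.refresh-interval=-5s"}, // rejected
	} {
		d, err := parseRefreshInterval(args)
		fmt.Printf("args=%v interval=%v err=%v\n", args, d, err)
	}
}
```

Rejecting the value in `Set` means the error surfaces during flag parsing, before any ticker is created.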

I have signed the DCO.

The auto-GOMEMLIMIT feature detects the cgroup or system memory limit
only once at startup. When the limit changes at runtime (for example, a
Vertical Pod Autoscaler adjusting a container's resources), GOMEMLIMIT
keeps the value computed at startup until the process restarts.

Add an optional --auto-gomemlimit.refresh-interval flag that wires
automemlimit's WithRefreshInterval option, so the limit is periodically
re-detected and reapplied. The default of 0s preserves the previous
behaviour of detecting the limit only once at startup.

Fixes prometheus#18712

Signed-off-by: Sanidhya Singh <singhsanidhya741@gmail.com>
@sanidhyasin sanidhyasin requested a review from a team as a code owner June 3, 2026 04:44
@sanidhyasin sanidhyasin requested a review from simonpasquier June 3, 2026 04:44

Development

Successfully merging this pull request may close these issues.

Provide refresh interval config for GOMEMLIMIT to support dynamic memory limits
