TREE-679: Added support for health probes#12

Open
sujaya-sys wants to merge 11 commits into main from add-healthprobes

Conversation

Contributor

@sujaya-sys sujaya-sys commented Mar 4, 2026

Context

As of Checkly agent v6.3.1, support for health probe endpoints has been added:

The agent now exposes HTTP health probe endpoints, allowing users to configure liveness and readiness probes (/-/liveness, /-/readiness, /health) for their agent deployments.

More info in our Dev docs: health probe endpoints

Request

Add liveness and readiness probes to the Helm chart.
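
For orientation, here is a minimal sketch of what such probes could look like on the agent container. It is not taken from this PR's diff. The paths come from the description above; the named port `health` is assumed to be declared under the container's `ports:`, and all timing values are illustrative placeholders rather than recommended settings.

```yaml
# Fragment of a container spec (sketch, not the chart's actual template).
livenessProbe:
  httpGet:
    path: /-/liveness      # endpoint named in the PR description
    port: health           # assumed named container port
  initialDelaySeconds: 10  # illustrative values only
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /-/readiness     # endpoint named in the PR description
    port: health
  periodSeconds: 10
  failureThreshold: 3
```

Referring to the port by name rather than by number keeps the probes valid if the port value is later changed through chart values.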

Comment thread charts/agent/values.yaml Outdated
sujaya-sys and others added 7 commits March 4, 2026 08:42
Co-authored-by: pelzerim <immanuel.pelzer@googlemail.com>
@sujaya-sys
Contributor Author

Thanks for the quick review @pelzerim! Added your changes.

Comment thread charts/agent/templates/deployment.yaml Outdated
Comment thread charts/agent/values.yaml
@pelzerim
Contributor

pelzerim commented Mar 4, 2026

After we release this, let's update the chart here https://github.com/checkly/infra/blob/a0aa06c510ad8725ebbd09b5802bf82bd6735df6/backend/agent.tf#L69 to see if it actually works.

Comment thread charts/agent/templates/deployment.yaml Outdated
ports:
  - name: metrics
    containerPort: {{ .Values.metrics.port }}
  - name: health
Contributor

OK, so I just realized that these are two ports. However, it's the same server in the agent, so you can technically pass two ports right now, but only one is used.

Let's get rid of healthPort and use metrics.port only. In the broader sense, health checks are kind of metrics anyway.

Contributor Author

Gotcha, makes sense, I missed this. Added in my latest commit: cb4417d

@sujaya-sys sujaya-sys changed the title Added support for health probes TREE-679: Added support for health probes Mar 4, 2026
@sujaya-sys sujaya-sys requested a review from pelzerim March 4, 2026 08:36
@sujaya-sys
Contributor Author

@pelzerim, is this PR good to go from your side? If so, I'd kick off testing as you suggested here: #12 (comment)

Thank you!

Comment thread charts/agent/templates/deployment.yaml Outdated
{{- end }}
- name: "AGENT_SERVER_PORT"
  value: "{{ .Values.metrics.port }}"
- name: "AGENT_HEALTH_PORT"
Contributor

I am sorry for the confusion here, but I had another look and saw this comment in the code:

	// HealthPort is the port for health probe endpoints (/-/readiness, /-/liveness, /health).
	// Default 8081 avoids conflicts with Node agent (which uses 8080).
	// Configure via AGENT_HEALTH_PORT if needed.
	HealthPort int `conf:"default:8081,env:AGENT_HEALTH_PORT"`

This makes things a tad more complicated.

I think we now need 2 readiness/liveness probes for each of the processes in the agent.

Additionally, this makes it a breaking change for the chart: it will only work with a certain version of the agent and up.

cc @ejanusevicius
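
To make the two-port situation concrete, here is a rough sketch of how the chart could wire both ports through. The `healthPort` values key and the `8080` default for `metrics.port` are assumptions for illustration; the `8081` default and the `AGENT_HEALTH_PORT` variable come from the code comment quoted above. The actual value the template passes for `AGENT_HEALTH_PORT` is not visible in the excerpt in this thread.

```yaml
# values.yaml (sketch; key names and the 8080 default are assumptions)
metrics:
  port: 8080
healthPort: 8081   # matches the agent's documented default for AGENT_HEALTH_PORT
---
# templates/deployment.yaml (excerpt of the container spec, sketch only)
env:
  - name: "AGENT_SERVER_PORT"
    value: "{{ .Values.metrics.port }}"
  - name: "AGENT_HEALTH_PORT"
    value: "{{ .Values.healthPort }}"   # hypothetical wiring
ports:
  - name: metrics
    containerPort: {{ .Values.metrics.port }}
  - name: health
    containerPort: {{ .Values.healthPort }}
```

With both ports declared by name, probes can target `health` while metrics scraping keeps using `metrics`. Agent versions that do not serve the health endpoints would fail such probes, which is why this is discussed as a breaking change.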

Contributor

Hmm, what if we had a single endpoint but had one agent call the other to confirm readiness?

Contributor

@pelzerim pelzerim Mar 9, 2026

Let's just do a breaking-change release of this chart? I'd rather not leak this problem into the runners.

Contributor

@ejanusevicius ejanusevicius left a comment

Can we remove the .DS_Store?

3 participants