Problem
When a user deploys an application via the dashboard, the application detail view shows a single high-level status (Progressing, Ready, Failed). For long-running installs that get stuck at the Kubernetes layer (Pod cannot be scheduled, PVC stays Pending, image pull fails, init container crashes), the dashboard gives no signal beyond Progressing. The user has to drop to kubectl describe pod / kubectl get events to find out what is actually wrong.
Reproduction
- Pick any external app from the marketplace that requests heavy resources (e.g. a vLLM-style chart with 16 GiB memory request)
- Deploy with default spec on a cluster that cannot satisfy the request — e.g. a 3-node cluster with no GPU and modest memory
- Watch the application detail view in the dashboard
What the user sees
Status: Progressing
Age: 28m
Version: 0.1.1+...
Message: Running 'install' action with timeout of 5m0s
The status remains Progressing indefinitely — Flux's helm-controller spins through 5-minute install timeouts and retries while the underlying Pod can never be scheduled.
What is actually happening (visible only via kubectl)
$ kubectl describe pod -l app.kubernetes.io/instance=<release>
Events:
FailedScheduling: 0/3 nodes are available:
1 node(s) had untolerated taint {drbd.linbit.com/lost-quorum: }
2 Insufficient memory
Expected behavior
The application detail view should surface:
- Pod-level events for resources created by the chart: at minimum the most recent FailedScheduling, FailedMount, BackOff, Unhealthy events
- HelmRelease's lastAttemptedHelmInstall message: the Helm logs currently buried in kubectl describe helmrelease
- A derived Reason field summarising the most actionable problem (e.g. "Insufficient memory on all nodes", "PVC not bound: no provisioner", "Image pull error: …")
This makes the chain CR → HelmRelease → Resources → Pod legible without leaving the dashboard.
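As a rough illustration of the derived Reason field, the sketch below condenses a FailedScheduling event like the one in this report into a one-line summary. It is written in Python for brevity and works on plain dicts shaped like core/v1 Event objects. The function name derive_reason and the output wording are hypothetical and not part of the dashboard or Cozystack.

```python
import re


def derive_reason(event: dict) -> str:
    """Condense one Kubernetes event into a short, human-readable reason.

    `event` is assumed to be a dict carrying the `reason` and `message`
    fields of a core/v1 Event. The output wording is illustrative only.
    """
    reason = event.get("reason", "")
    message = event.get("message", "")

    if reason == "FailedScheduling":
        # Scheduler messages start with "<fit>/<total> nodes are available".
        total = re.search(r"\d+/(\d+) nodes are available", message)
        # Resource shortages appear as "<count> Insufficient <resource>".
        shortages = re.findall(r"(\d+) Insufficient ([\w./-]+)", message)
        if total and shortages:
            parts = [
                f"insufficient {resource.rstrip('.')} on {count} of {total.group(1)} nodes"
                for count, resource in shortages
            ]
            return "Unschedulable: " + ", ".join(parts)

    # Fallback for everything else: keep the original reason and message.
    return f"{reason}: {message}" if message else reason


# Sample event built from the kubectl output quoted in this report.
event = {
    "reason": "FailedScheduling",
    "message": (
        "0/3 nodes are available: "
        "1 node(s) had untolerated taint {drbd.linbit.com/lost-quorum: }, "
        "2 Insufficient memory"
    ),
}
print(derive_reason(event))
# Unschedulable: insufficient memory on 2 of 3 nodes
```

Events that the sketch does not recognise fall back to "<reason>: <message>", so nothing is hidden; only the cases worth condensing get special handling.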
Proposed solution
UI changes
- Add an Events panel to the application detail view that lists events for all resources labelled with this release (filtered by app.kubernetes.io/instance or the HelmRelease's chart-managed-by label)
- Add a Helm output panel showing the tail of status.history[-1] and status.lastAttemptedHelmInstall.message from the corresponding HelmRelease
- Promote the most recent Warning-level event into the top status bar as a Reason field
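The selection logic behind the Events panel and the promoted Reason could look roughly like the sketch below. It assumes the dashboard backend already has the namespace's Pods and Events as plain dicts in the core/v1 shape (metadata.labels, type, involvedObject, lastTimestamp). The helper names release_pod_names and latest_warning, and the sample Pod names, are hypothetical.

```python
from datetime import datetime

INSTANCE_LABEL = "app.kubernetes.io/instance"


def release_pod_names(pods: list, release: str) -> set:
    """Names of the Pods that carry the release's instance label."""
    return {
        pod["metadata"]["name"]
        for pod in pods
        if pod["metadata"].get("labels", {}).get(INSTANCE_LABEL) == release
    }


def latest_warning(events: list, pod_names: set):
    """Most recent Warning event whose involvedObject is one of the Pods."""
    candidates = [
        e for e in events
        if e.get("type") == "Warning"
        and e.get("involvedObject", {}).get("kind") == "Pod"
        and e.get("involvedObject", {}).get("name") in pod_names
        and e.get("lastTimestamp")  # can be unset on some events; skipped here
    ]
    return max(
        candidates,
        key=lambda e: datetime.strptime(e["lastTimestamp"], "%Y-%m-%dT%H:%M:%SZ"),
        default=None,
    )


# Made-up sample data for illustration.
pods = [
    {"metadata": {"name": "vllm-0", "labels": {INSTANCE_LABEL: "vllm"}}},
    {"metadata": {"name": "other-0", "labels": {INSTANCE_LABEL: "other"}}},
]
events = [
    {"type": "Normal", "reason": "Pulled",
     "involvedObject": {"kind": "Pod", "name": "other-0"},
     "lastTimestamp": "2024-01-01T10:05:00Z"},
    {"type": "Warning", "reason": "FailedScheduling",
     "involvedObject": {"kind": "Pod", "name": "vllm-0"},
     "lastTimestamp": "2024-01-01T10:00:00Z"},
    {"type": "Warning", "reason": "FailedScheduling",
     "involvedObject": {"kind": "Pod", "name": "vllm-0"},
     "lastTimestamp": "2024-01-01T10:03:00Z"},
    {"type": "Warning", "reason": "BackOff",
     "involvedObject": {"kind": "Pod", "name": "other-0"},
     "lastTimestamp": "2024-01-01T10:04:00Z"},
]

top = latest_warning(events, release_pod_names(pods, "vllm"))
print(top["reason"], top["lastTimestamp"])
```

This sketch only covers Pod events. Events on PVCs or other chart-managed resources would need the same lookup keyed on their own kind and name.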
Controller-side (optional, separate change)
- The ApplicationDefinition controller could propagate a condensed Reason from the underlying HelmRelease into the CR's status.conditions[].message. The dashboard would then not need to chase HelmRelease → Pod on every view, because the CR itself would carry enough information for the high-level view.
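A minimal sketch of that propagation step follows, shown in Python on plain dicts rather than in the controller's own language. It assumes both objects expose a standard status.conditions list and that the Ready condition is the one to mirror. The names propagate_reason and MAX_MESSAGE_LEN are hypothetical, and a real controller would also maintain lastTransitionTime and observedGeneration, which are omitted here.

```python
MAX_MESSAGE_LEN = 256  # hypothetical cap that keeps the condition message short


def find_condition(conditions: list, cond_type: str):
    """Return the condition of the given type, or None if it is absent."""
    return next((c for c in conditions if c.get("type") == cond_type), None)


def propagate_reason(app_status: dict, helm_status: dict, cond_type: str = "Ready") -> dict:
    """Mirror one HelmRelease condition into the application CR's status.

    Mutates and returns `app_status`. Only status, reason and a truncated
    message are copied; everything else on the target is left untouched.
    """
    source = find_condition(helm_status.get("conditions", []), cond_type)
    if source is None:
        return app_status  # nothing to propagate yet

    conditions = app_status.setdefault("conditions", [])
    target = find_condition(conditions, cond_type)
    if target is None:
        target = {"type": cond_type}
        conditions.append(target)

    target["status"] = source.get("status", "Unknown")
    target["reason"] = source.get("reason", "")
    target["message"] = source.get("message", "")[:MAX_MESSAGE_LEN]
    return app_status


# Sample HelmRelease status using the message quoted earlier in this report.
helm_status = {
    "conditions": [
        {
            "type": "Ready",
            "status": "Unknown",
            "reason": "Progressing",
            "message": "Running 'install' action with timeout of 5m0s",
        }
    ]
}
app_status = {"conditions": []}
propagate_reason(app_status, helm_status)
print(app_status["conditions"][0]["message"])
```

Copying the HelmRelease message alone would still only show the install timeout. To be useful, the controller would need to write a condensed Pod-level reason into the same field.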
Workarounds today
Operators currently rely on:
kubectl describe pod -n <tenant-ns> -l app.kubernetes.io/instance=<release>
kubectl get events -n <tenant-ns> --sort-by=.lastTimestamp | tail -20
kubectl describe helmrelease -n <tenant-ns> <release>
This is acceptable for operators familiar with Kubernetes internals but defeats the purpose of having a dashboard for application lifecycle management.
Context
Observed during the development of an external-apps catalog (vLLM, ComfyUI, JupyterHub, Langflow, n8n, Open WebUI, HolmesGPT) registered as ApplicationDefinitions in Cozystack 1.4. Several of these apps have hard resource requirements (GPU, memory) that cannot be satisfied on smaller smoke-test clusters. The dashboard's silent Progressing state actively misleads operators into waiting for hours before they discover that the install can never succeed.