bug(chart): VLAN sub-interface missing from talm template output on v1.12 multi-doc path #140

@lexfrei

Description

What

A VLAN sub-interface that is configured on a Talos node is not emitted as a VLANConfig document by talm template on the v1.12 multi-doc rendering path. The node ends up with the VLAN silently dropped from its rendered config.

Repro

The node has a VLAN sub-interface configured (in the reporter's setup it carries the default route). Then run:

talm template -t templates/controlplane.yaml -n <NODE_IP> -e <NODE_IP> > nodes/cozy-01.yaml

Expected: nodes/cozy-01.yaml contains an apiVersion: v1alpha1, kind: VLANConfig document describing the VLAN.

Observed: no VLANConfig in the output. The VLAN information is absent.
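
A quick way to check the expected/observed condition is to count VLANConfig documents in the rendered file. The helper below is only a sketch: the function name count_vlanconfig is hypothetical, and it assumes that kind: appears at column 0 of each document, as it does for top-level keys in a multi-doc Talos config.

```shell
#!/bin/sh
# Sketch: count VLANConfig documents in a rendered multi-doc config.
# count_vlanconfig is a hypothetical helper name, not part of talm.
# Assumption: each document declares "kind: <Name>" at column 0.
count_vlanconfig() {
    # grep -c prints the number of matching lines. It exits non-zero on
    # zero matches, so "|| true" keeps scripts running under "set -e".
    n=$(grep -c '^kind: VLANConfig[[:space:]]*$' "$1" 2>/dev/null || true)
    n=${n:-0}   # unreadable or missing file: treat as zero documents
    printf '%s\n' "$n"
    # Exit status 0 only when at least one VLANConfig is present.
    [ "$n" -gt 0 ]
}
```

Usage: count_vlanconfig nodes/cozy-01.yaml prints the number of VLANConfig documents and returns non-zero when there are none, which is the observed state in this report.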

Affected versions

Regression introduced by the migration to the Talos v1.12 multi-doc config format in #116 (v0.24.0). The legacy v1alpha1 single-doc render path is not affected.

Candidate root causes (need confirmation against the live repro)

The multi-doc renderer in charts/generic/templates/_helpers.tpl (talos.config.network.multidoc, lines 86-213) and the cozystack mirror in charts/cozystack/templates/_helpers.tpl emit a VLANConfig only when both of the following hold:

  1. talm.discovered.default_link_name_by_gateway resolves to a link, AND
  2. that link has spec.kind == "vlan" per the live COSI links resource.

Three independent paths drop a configured VLAN:

  1. Secondary VLANs are dropped unconditionally. The renderer handles only the gateway-bearing link. A VLAN that does not carry the default route (storage / mgmt) is never emitted, regardless of discovery state. See charts/generic/templates/_helpers.tpl:121-184. Tracked separately as a coverage gap.

  2. existing_interfaces_configuration is intentionally bypassed in multi-doc mode (charts/generic/templates/_helpers.tpl:87-90). The legacy path consults lookup "machineconfig" "" "v1alpha1" and copies user-declared interfaces verbatim; the multi-doc path explicitly does not. Any VLAN declared in the existing machine config (legacy schema) is silently discarded. Tracked separately.

  3. Discovery may not see the VLAN at template time. is_vlan reads $link.spec.kind == "vlan" from the live COSI links resource (charts/talm/templates/_helpers.tpl:262-268). If the VLAN link does not exist yet (first apply, or replay after a wipe), or if an older Talos on the live node represents the link differently, the check fails and no VLANConfig is emitted.
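
To test candidate 3 against a live node, it helps to see which links discovery reports and what kind each one has. The following is a rough sketch, not the chart's logic: link_kinds is a hypothetical helper, and it assumes the saved YAML lists each link with an indented id: under metadata and an indented kind: under spec. If the real output is laid out differently, the awk patterns need adjusting.

```shell
#!/bin/sh
# Sketch: print "<link id> <spec.kind>" for every link in a saved
# "get links --output yaml" dump. link_kinds is a hypothetical name.
# Assumption: "id:" and "kind:" are indented keys, one pair per link,
# with "id:" appearing before "kind:" inside each document.
link_kinds() {
    awk '
        /^[[:space:]]+id:/ { id = $2 }
        /^[[:space:]]+kind:/ {
            kind = $2
            gsub(/"/, "", kind)                    # strip quotes around empty kinds
            print id, (kind == "" ? "-" : kind)    # "-" marks an empty kind
        }
    ' "$1"
}
```

Usage: save the output of talosctl --nodes <NODE_IP> get links --output yaml to a file, then run link_kinds on that file. A VLAN link that is missing from the list, or that shows a kind other than vlan, would be consistent with the is_vlan check failing.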

Information needed to narrow it down

  • Output of talm template ... for the affected node (IPs may be redacted).
  • talosctl --nodes <NODE_IP> get links --output yaml to confirm what discovery actually sees, and the value of spec.kind for the VLAN link.
  • Talos version reported by talosctl --nodes <NODE_IP> version vs talosVersion declared in values.yaml / Chart.yaml.
  • Where the VLAN was declared on the cluster: in the running MachineConfig (legacy machine.network.interfaces[].vlans[]), or already as a separate VLANConfig document?
  • Whether the VLAN carries the default route.
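
The command outputs requested above can be gathered in one pass. This is a minimal sketch that only wraps the commands already quoted in this issue; the function name collect_vlan_diagnostics, the output directory, and the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: collect the requested diagnostics for one node into a directory.
# collect_vlan_diagnostics and the output file names are hypothetical;
# the talm and talosctl invocations are the ones quoted in this issue.
collect_vlan_diagnostics() {
    node=$1             # node IP, used for both -n and -e as in the repro
    template=$2         # e.g. templates/controlplane.yaml
    out=${3:-vlan-diag} # output directory (hypothetical default)
    rc=0

    mkdir -p "$out" || return 1

    # Rendered config for the affected node.
    talm template -t "$template" -n "$node" -e "$node" \
        > "$out/rendered.yaml" || rc=1

    # What discovery sees, including spec.kind for the VLAN link.
    talosctl --nodes "$node" get links --output yaml \
        > "$out/links.yaml" || rc=1

    # Talos version on the node, to compare with the declared talosVersion.
    talosctl --nodes "$node" version > "$out/version.txt" || rc=1

    printf 'wrote diagnostics to %s\n' "$out"
    return "$rc"
}
```

Usage: collect_vlan_diagnostics <NODE_IP> templates/controlplane.yaml. Redact IPs in the resulting files before attaching them. The last two bullets (where the VLAN was declared, and whether it carries the default route) still need to be answered by hand.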

Metadata

Assignees

No one assigned

    Labels

    area/chart: Issues or PRs related to charts/ (Chart.yaml, helpers, templates)
    area/networking: Issues or PRs related to networking (interfaces, VIP, routes)
    kind/bug: Categorizes issue or PR as related to a bug
    kind/regression: Categorizes issue or PR as related to a regression from a prior release
    priority/critical-urgent: Highest priority. Must be actively worked on as someone's top priority right now
    priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release
    triage/accepted: Indicates an issue is ready to be actively worked on
