fix(charts): rewrite v1.12 multi-doc network renderer for full link coverage#147
Conversation
There was a problem hiding this comment.
Code Review
This pull request updates the Talos v1.12+ multi-document network configuration rendering logic in the cozystack and generic charts. The changes ensure that all configurable links—including physical NICs, bonds, and VLANs—are correctly processed with MTU support, while bond slaves are filtered out to prevent configuration conflicts. It also introduces guardrails to fail fast when legacy network schemas are detected or when unsupported bridge configurations carry the default route. Feedback was provided regarding the README.md, where the collapsing of paragraphs into single long lines was identified as a readability and maintainability issue for the source file.
| > **Note:** The output format depends on the Talos version configured in `Chart.yaml` (`templateOptions.talosVersion`) or via the `--talos-version` CLI flag. | ||
| > For Talos < v1.12, the output is a single YAML document with `machine.network` and `machine.registries` sections (as shown above). | ||
| > For Talos >= v1.12, the output uses the multi-document format with separate typed documents instead of the deprecated monolithic fields. `HostnameConfig`, `ResolverConfig` and a network interface document (`LinkConfig`, `BondConfig`, or `VLANConfig` — depending on topology) are always emitted; `Layer2VIPConfig` appears on controlplane nodes when `floatingIP` is set; `RegistryMirrorConfig` is emitted only by the cozystack chart. | ||
| > **Note:** The output format depends on the Talos version configured in `Chart.yaml` (`templateOptions.talosVersion`) or via the `--talos-version` CLI flag. For Talos < v1.12, the output is a single YAML document with `machine.network` and `machine.registries` sections (as shown above). For Talos >= v1.12, the output uses the multi-document format with separate typed documents instead of the deprecated monolithic fields. `HostnameConfig` and `ResolverConfig` are always emitted; one network interface document is emitted per configurable link on the node (`LinkConfig` for physical NICs, `BondConfig` for bond masters, `VLANConfig` for VLAN sub-interfaces) — multi-NIC nodes therefore produce one document per NIC, not one document total. The link carrying the IPv4 default route gets the gateway entry on its document; every other link is emitted gateway-less. Both IPv4 and IPv6 global-scope addresses on a link are surfaced in its document. Bond slaves are filtered out so they do not collide with the master's `BondConfig`. Bridges are deliberately not auto-emitted as `BridgeConfig` yet — a non-gateway bridge is skipped (declare it via a per-node body overlay if needed); a bridge that carries the default route fails the render with a clear migration hint. The operator-declared `floatingIP` is stripped from per-link addresses so the VIP currently held by a leader does not leak into the static `LinkConfig`. `Layer2VIPConfig` appears on controlplane nodes when `floatingIP` is set; `RegistryMirrorConfig` is emitted only by the cozystack chart. |
There was a problem hiding this comment.
This line and several others in this file have been collapsed into very long single lines of text. While this doesn't affect the rendered output, it harms the readability and maintainability of the source Markdown file. Please consider re-wrapping these long paragraphs to a conventional line length (e.g., 80-120 characters) to improve readability for future contributors.
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…U, bond, bridge, and legacy guardrail Add MultiDoc subtests covering the headline coverage gaps in the v1.12 multi-doc renderer: - TestMultiDocRendersAllConfigurableLinks: fixture with eth0 (routed) + eth1 (storage). Renderer must emit a LinkConfig per configurable link instead of stopping at the gateway-bearing one. - TestMultiDocLinkConfigCarriesMTU: same fixture with non-default MTU on eth1 (jumbo). LinkConfig must surface mtu so a re-render does not silently drop the value. - TestMultiDocEmitsVLANConfigForDiscoveredVLAN: fixture with eth0.4000 carrying the default route via a VLAN over eth0. The renderer must emit LinkConfig (parent eth0) + VLANConfig (eth0.4000 with parent + vlanID) and place the gateway on the VLAN. - TestMultiDocBondSlavesNotEmittedAsLinkConfig: fixture with two physical NICs enrolled into a bond master. Slaves must not appear as standalone LinkConfig documents alongside the master's BondConfig — Talos rejects the conflicting declarations. - TestMultiDocBridgeSkipsLinkConfigBranch: fixture with a discovered bridge link. Bridges must not fall through to LinkConfig (wrong document kind) — they are skipped until BridgeConfig support lands. - TestMultiDocBondAndVLANCarryMTU: pin that MTU surfaces on BondConfig and VLANConfig, not just LinkConfig. - TestMultiDocLayer2VIPLinkOnVLANDefaultGateway: pin that the discovery-derived Layer2VIPConfig.link points at the VLAN sub-interface when a VLAN carries the default route. - TestMultiDocBondConfigOmitsMissingBondMasterFields: pin graceful degradation when discovery returns a bond with partial bondMaster — the renderer must omit unset fields rather than emit bondMode: <nil> and break YAML. - TestMultiDocFailsOnLegacyInterfacesInRunningConfig: fixture with non-empty machine.network.interfaces[] in the running MachineConfig. The renderer must fail with a clear migration message rather than silently dropping the legacy block on first re-render after upgrade from a chart that emitted the legacy schema. Also update TestCozystackChartRendersIPv4GatewayOnDualStack's two-NIC subtest: with the multi-link renderer landing in the same change, eth1 must now appear as its own LinkConfig (gateway-less, IPv6-only addresses), not be silently dropped. The single-NIC subtest contract is unchanged. Six new fixtures support the new tests: multiNicLookup, multiNicWithVLANLookup, bondWithSlavesLookup, bondWithoutBondMasterLookup, bridgeLookup, legacyInterfacesInRunningConfigLookup. All fixtures return the resource map for per-id lookups (mirroring real discovery shape) so the slave-filter and bridge-skip branches hit production-like inputs. All new tests fail on the previous chart; the renderer rewrite in the follow-up commit makes them green. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
…overage The previous multi-doc renderer resolved a single $defaultLinkName via talm.discovered.default_link_name_by_gateway and emitted exactly one network document for it: BondConfig, VLANConfig, or LinkConfig depending on the gateway link kind. Every other link on the node — secondary physical NIC, VLAN sub-interface on a non-gateway parent, additional bond — was silently dropped from the rendered config. Replace the single-link block with a range over talm.discovered.configurable_link_names, emitting the appropriate document per link kind (bond / vlan / physical). The IPv4 default-route gateway is placed only on the link that carries it; every other link gets addresses without a default route. MTU is surfaced on each document when discovery reports a value, so non-default-MTU links (jumbo frames, GRE) survive a re-render. Also harden three coverage edges that the simple range exposes: - Filter bond slaves out of configurable_link_names. Slaves match the physical-NIC regex but are members of a bond master via spec.slaveKind; emitting a standalone LinkConfig for a slave alongside the master's BondConfig produces conflicting declarations Talos rejects on controller convergence. - Skip kind=bridge in the renderer. The chart does not yet emit BridgeConfig, so falling through to LinkConfig would land a wrong-kind document. The branch is reserved for a future PR that adds bridge support. - Gate every BondMaster sub-field on its presence. A bond returned with a partial or absent bondMaster (real Talos shape on freshly- created bonds before the master controller fills the spec) used to render bondMode: <nil> and break the parse. Surface a hard-fail in the multi-doc renderer when the running MachineConfig still carries non-empty machine.network.interfaces[]. The renderer cannot translate those entries today, and silently dropping them on the first re-render after a chart upgrade from the legacy schema would erase user-declared interfaces, addresses, and VLANs. The fail message points the operator at the concrete migration path: move the interfaces into per-node body overlays as v1.12 typed documents, or pin templateOptions.talosVersion to v1.11 until the translator lands. README updated to describe the per-link emission, the bond-slave filter, and the upgrade-from-legacy guardrail with the fail message text and migration path. Together this closes the multi-doc renderer coverage gaps: single-link-only emission, dropped non-gateway VLAN sub-interfaces, silent loss of legacy machine.network.interfaces during upgrade, and missing MTU on every multi-doc document kind. Assisted-By: Claude <noreply@anthropic.com> Signed-off-by: Aleksei Sviridkin <f@lex.la>
79be083 to
e64dfd2
Compare
Maxim Kitsunoff (kitsunoff)
left a comment
There was a problem hiding this comment.
LGTM
What changed
The v1.12 multi-doc network renderer in cozystack and generic charts now emits one document per configurable link on the node instead of stopping at the gateway-bearing one. Multi-NIC nodes therefore produce multiple
LinkConfig/BondConfig/VLANConfigdocuments — a node with a routed NIC plus a storage NIC ends up with both NICs configured in the rendered config.The IPv4 default-route gateway is placed only on the link that carries it; every other link is emitted gateway-less. MTU is surfaced on
LinkConfig,BondConfig, andVLANConfigwhen discovery reports a value, so non-default-MTU links (jumbo frames, GRE) survive a re-render. Both IPv4 and IPv6 global-scope addresses on a link are surfaced.Guardrails:
configurable_link_namesso they do not collide with the master'sBondConfig.BridgeConfigyet — a non-gateway bridge is skipped (declare it via per-node body); a bridge that carries the default route fails the render with a clear migration hint.parentorvlanIDfails the render rather than emitting a document Talos rejects on apply.bondMastergracefully degrades — unset fields are omitted instead of producingbondMode: <nil>.floatingIPis stripped from per-link addresses so the VIP currently held by a leader does not leak into the staticLinkConfig.machine.network.interfaces[]in the running MachineConfig triggers a hard fail with a migration message, preventing silent loss of user network declarations during upgrade from a chart that emitted the legacy schema.Tests
12 new
MultiDocsubtests pin every guardrail and every emit path: multi-link iteration, MTU on all three doc kinds, VLAN emission, bond slaves filtered, bridge skip + bridge-as-gateway fail, VLAN-without-parent fail, VLAN-without-vlanID fail, partial-bondMaster gracefully degraded, floatingIP stripped from addresses, dual-stack IPv6 addresses surfaced, Layer2VIPConfig.link on bond and VLAN gateway, legacy-interfaces guardrail. Six new fixtures mirror real Talos COSI shape (slaveKind, masterIndex, linkIndex). README updated to describe per-link emission, the bond-slave filter, the bridge contract, the dual-stack contract, the VIP-strip, and the upgrade-from-legacy guardrail.Closes #140.
Closes #141.
Closes #142.
Closes #143.