Adopt controllers should wait for ASO CRDs instead of crash-looping on fresh install #6337

@mboersma

Background

Since #6304 ("Let ASO v2.17.0 manage its own CRDs"), CAPZ no longer bundles the ASO CRDs in infrastructure-components.yaml. ASO installs them at runtime via --crd-pattern. Previously these 11 CRDs (including managedclusters and managedclustersagentpools) were applied during clusterctl init before the CAPZ manager started.

Problem

Two controllers each set up a static watch on an ASO-installed CRD at manager startup:

  • ManagedClusterAdoptReconciler: For(&asocontainerservicev1.ManagedCluster{})
  • AgentPoolAdoptReconciler: For(&asocontainerservicev1.ManagedClustersAgentPool{})

Both are gated behind the ASOAPI feature gate, which defaults to true. On a fresh install, if the CAPZ manager starts before ASO has applied these CRDs, the typed informers fail their initial cache sync, mgr.Start returns an error, and the manager pod exits (os.Exit(1)) and restarts. The manager self-heals once ASO creates the CRDs, but in the meantime it shows transient CrashLoopBackOff and a startup delay of up to one cache-sync timeout (~2m).

(The AzureASOManaged{Cluster,ControlPlane,MachinePool} controllers are unaffected — they watch concrete ASO types dynamically via external.ObjectTracker in resource_reconciler.go.)

Proposed fix

Add a small manager.Runnable that waits (with backoff, honoring the manager's context) for the ManagedCluster and ManagedClustersAgentPool REST mappings to resolve via mgr.GetRESTMapper(), then registers the two adopt controllers via SetupWithManager. The default dynamic RESTMapper reloads on miss, so the mapping resolves once ASO applies the CRDs. Adding informers after mgr.Start() is supported by controller-runtime.

No new dependencies; ~50–90 lines plus a unit test, localized to main.go and the two adopt controllers.

Acceptance criteria

  • Fresh clusterctl init with default feature gates brings the CAPZ manager to Running without CrashLoopBackOff attributable to missing ASO CRDs.
  • make kind-create tilt-up comes up cleanly.
  • The two adopt controllers begin reconciling once the CRDs exist.
