kops create cluster panics on empty or sign-only --kubernetes-feature-gates values #18407

@fallintoplace

/kind bug

1. What kops version are you running? The command kops version will display
this information.

Still reproducible in the current master checkout at 432f70c764.

History note: this behavior appears to have been introduced by commit f60df9b95506 (merged via PR #14577, "Add option for setting Kubernetes feature gates"). The first released tag containing that commit appears to be v1.26.0.

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

Not cluster-specific. This happens during kops create cluster before a cluster is created.

3. What cloud provider are you using?

Any. The panic is in the generic cluster creation path before provider-specific behavior matters.

4. What commands did you run? What is the simplest way to reproduce this issue?

A minimal repro is to pass a sign-only feature gate value:

kops create cluster example.com \
  --cloud=aws \
  --zones=us-east-1a \
  --networking=cni \
  --dry-run -oyaml \
  --kubernetes-feature-gates=+

The same code path is also reachable if KubernetesFeatureGates contains an empty string.

5. What happened after the commands executed?

kops create cluster can panic in upup/pkg/fi/cloudup/new_cluster.go when it iterates over opt.KubernetesFeatureGates.

The loop indexes featureGate[0] before validating that the string is non-empty, and then indexes featureGate[0] again after stripping a leading +.

Current behavior from source inspection:

  • "" -> panic (index out of range)
  • "+" -> strips to "", then panics on the second featureGate[0] access
  • "-" -> does not panic, but stores an empty feature-gate name as disabled
  • "+Foo", "-Foo", "Foo" -> accepted
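
The behavior listed above can be illustrated with a small standalone sketch. It is reconstructed from the description in this report, not copied from new_cluster.go, so the function names parseUnchecked and panics are hypothetical and the real loop may differ in detail:

```go
package main

// parseUnchecked mirrors the indexing pattern described in this report.
// It is a reconstruction for illustration, not the kops source.
func parseUnchecked(featureGate string) (name string, enabled bool) {
	enabled = true
	if featureGate[0] == '+' { // panics when featureGate is ""
		featureGate = featureGate[1:]
	}
	if featureGate[0] == '-' { // panics when "+" was stripped to ""
		enabled = false
		featureGate = featureGate[1:]
	}
	return featureGate, enabled
}

// panics reports whether parseUnchecked(s) panics.
func panics(s string) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	parseUnchecked(s)
	return false
}
```

Under this sketch, panics("") and panics("+") both return true, while parseUnchecked("-") returns an empty name with enabled set to false, which matches the three problem cases in the list.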

The current implementation on master is here:

  • upup/pkg/fi/cloudup/new_cluster.go:287-300

6. What did you expect to happen?

Malformed values should return a validation error (for example, rejecting empty or sign-only feature gate names) instead of panicking or writing an empty feature-gate key.

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

# Minimal relevant input is the feature gate list itself.
KubernetesFeatureGates:
- "+"

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

I found this via source inspection in the current repo snapshot, so I do not have a full runtime log to attach.

9. Anything else we need to know?

A straightforward fix would be to trim/validate each entry before indexing, reject empty/sign-only values, and add regression coverage for:

  • ""
  • "+"
  • "-"
  • "+Foo"
  • "-Foo"
  • "Foo"
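
One possible shape for such a fix is sketched below. The helper name parseFeatureGate and its error message are assumptions made for illustration; this is not an existing kops function, and the actual fix may be structured differently:

```go
package main

import (
	"fmt"
	"strings"
)

// parseFeatureGate is a hypothetical helper that validates one entry of
// the feature-gate list before anything is indexed. It accepts "Foo",
// "+Foo", and "-Foo", and rejects empty or sign-only values.
func parseFeatureGate(raw string) (name string, enabled bool, err error) {
	name = strings.TrimSpace(raw)
	enabled = true

	// Consume at most one leading sign. HasPrefix is safe on "".
	switch {
	case strings.HasPrefix(name, "+"):
		name = name[1:]
	case strings.HasPrefix(name, "-"):
		name = name[1:]
		enabled = false
	}

	// Reject "", "+", "-", and whitespace-only input instead of panicking
	// or storing an empty feature-gate key.
	if name == "" {
		return "", false, fmt.Errorf("invalid feature gate %q: name must not be empty", raw)
	}
	return name, enabled, nil
}
```

With this sketch, "", "+", and "-" each return an error, while "+Foo" and "Foo" yield ("Foo", true) and "-Foo" yields ("Foo", false). Stricter checks on the name itself, such as rejecting a second leading sign, are left out here.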
