conformance: base-resource cleanup deletes the namespace without waiting, breaking re-run in one process (-count>1) #4942

@lexfrei

What happened

Running TestGatewayAPIConformance more than once in a single go test process (e.g. -count=2) fails immediately on the second iteration:

gateways.gateway.networking.k8s.io "..." is forbidden: unable to create new content in namespace gateway-conformance-infra because it is being terminated

Why

The base-resource cleanup in Applier.MustApplyWithCleanup (conformance/utils/kubernetes/apply.go) deletes each created object — including the gateway-conformance-infra Namespace — with a plain c.Delete and does not wait for deletion to finish:

t.Cleanup(func() {
	ctx, cancel = context.WithTimeout(context.Background(), timeoutConfig.DeleteTimeout)
	defer cancel()
	err = c.Delete(ctx, uObj)   // returns once deletion is requested, not complete
	...
})

Delete on a Namespace only moves it to Terminating; finalizers drain asynchronously. The cleanup returns while the namespace is still terminating, and the next iteration's setup re-applies the base manifests into it before it is gone.

Why it matters

This blocks running any conformance subtest under -count>1 in a single process — e.g. stress-looping a flaky subtest to characterize it, which is exactly how the RequestMirror timing flake (#4940) is best reproduced. Today each run needs a fresh go test invocation.

Expected

A re-run in the same process should succeed: either the cleanup waits for the namespace to be fully deleted, or setup waits out a Terminating namespace before applying.

Proposal

Either have the namespace cleanup poll until the namespace is actually gone (a WaitForDeletion after Delete), or have setup tolerate a Terminating namespace by waiting for it to clear before applying base resources.

Willing to contribute

Happy to send a PR once the preferred direction is clear.

A quick process question, since I've been turning up a number of these conformance-machinery issues lately: at this volume, what workflow do you prefer? File an issue and wait for a maintainer go-ahead before each PR, open the issue and PR together, or just send PRs directly for small, self-evident fixes? Happy to follow whatever keeps your review queue manageable.

/kind bug

/area conformance-machinery
