Skip to content

Receive: panic "Invalid name validation scheme requested: unset" when using --receive.relabel-config-file #8823

@mariadb-MarioEApostolov

Description

Thanos, Prometheus and Golang version used:

  • Thanos: v0.41.0 (docker image thanosio/thanos:v0.41.0)
  • Prometheus (sending via remote-write): v3.9.2 / v3.7.3 (both trigger the bug)
  • Go: 1.25.0 (as built in v0.41.0)
  • Dependencies: prometheus/common@v0.67.5, prometheus/prometheus@v0.308.0

Object Storage Provider: N/A (affects receive component only)

What happened:

The Thanos receiver-router panics with Invalid name validation scheme requested: unset when processing incoming remote-write requests with --receive.relabel-config-file configured. The panic causes Go's HTTP server to close the connection, resulting in downstream clients receiving EOF errors.

Panic trace

panic: Invalid name validation scheme requested: unset
goroutine 5312 [running]:
net/http.(*conn).serve.func1()
	/usr/local/go/src/net/http/server.go:1943 +0xd3
panic({0x2c21820?, 0xc00063e050?})
	/usr/local/go/src/runtime/panic.go:783 +0x132
github.com/prometheus/common/model.ValidationScheme.IsValidLabelName(0xc0015660f7?, {0xc0014b2010?, 0xc?})
	/go/pkg/mod/github.com/prometheus/common@v0.67.5/model/metric.go:203 +0xb4
github.com/prometheus/prometheus/model/relabel.relabel(0xc000390b60, 0xc00150e050)
	/go/pkg/mod/github.com/prometheus/prometheus@v0.308.0/model/relabel/relabel.go:340 +0x812
github.com/prometheus/prometheus/model/relabel.ProcessBuilder(...)
	/go/pkg/mod/github.com/prometheus/prometheus@v0.308.0/model/relabel/relabel.go:290
github.com/prometheus/prometheus/model/relabel.Process({0xc000591b08?, 0x1a?, 0x23?}, {0xc0006d3780, 0x4, 0x4?})
	/go/pkg/mod/github.com/prometheus/prometheus@v0.308.0/model/relabel/relabel.go:280 +0x5f
github.com/thanos-io/thanos/pkg/receive.(*Handler).relabel(0xc000274460, 0xc000755410)
	/app/pkg/receive/handler.go:1132 +0x108
github.com/thanos-io/thanos/pkg/receive.(*Handler).receiveHTTP(0xc000274460, {0x42d3580, 0xc001a000e0}, 0xc0002ae500)
	/app/pkg/receive/handler.go:620 +0xf8d

What you expected to happen:

Relabeling should work without panics, using UTF-8 validation as the default scheme.

How to reproduce it (as minimally and precisely as possible):

  1. Deploy Thanos v0.41.0 receiver-router with --receive.relabel-config-file pointing to a relabel config that includes a replace action with a non-default regex:
    - source_labels: [some_label]
      regex: "(.+)"
      target_label: new_label
  2. Send remote-write requests from a Prometheus instance that has some_label populated on its metrics
  3. Observe panic in receiver-router logs

Full logs to relevant components:

See panic trace above. The panic is recovered by Go's HTTP server, logged at error level, and the connection is closed (causing EOF on the client side).

Anything else we need to know:

Root Cause

In v0.41.0, the relabel.Config struct (from prometheus/prometheus@v0.308.0) has a NameValidationScheme field of type model.ValidationScheme. When relabel configs are parsed from YAML via yaml.Unmarshal in cmd/thanos/receive.go:224, this field is not present in user-supplied YAML files, so it defaults to the zero value UnsetValidation (0).

At runtime, relabel.relabel() at relabel.go:340 calls:

if !cfg.NameValidationScheme.IsValidLabelName(target) {

This panics because IsValidLabelName does not handle UnsetValidation.

The existing unit tests in handler_test.go pass because they manually set NameValidationScheme: model.UTF8Validation on each test config struct.

Why some deployments are unaffected

The panic only triggers when the Replace action's regex matches the source label value:

case Replace:
    // Fast path - skips IsValidLabelName entirely
    if val == "" && cfg.Regex == DefaultRelabelConfig.Regex && ... {
        lb.Set(cfg.TargetLabel, cfg.Replacement)
        break
    }

    indexes := cfg.Regex.FindStringSubmatchIndex(val)
    if indexes == nil {
        break  // regex didn't match - skips IsValidLabelName
    }
    target := ...
    if !cfg.NameValidationScheme.IsValidLabelName(target) {  // PANIC when UnsetValidation

If the incoming metrics don't have the source labels populated (e.g., customer_key is empty), then:

  • val = ""regex: "(.+)" doesn't match → indexes == nilbreak before reaching the panic

This means clusters where the sending Prometheus doesn't emit metrics with the relevant labels will never hit the panic, even on v0.41.0 with the same relabel config.

Suggested fix

After unmarshaling relabel configs, set NameValidationScheme = model.UTF8Validation on any config where it is unset:

for _, rc := range relabelConfig {
    if rc.NameValidationScheme == model.UnsetValidation {
        rc.NameValidationScheme = model.UTF8Validation
    }
}

Environment:

  • Kubernetes: GKE 1.33+
  • Thanos receiver mode: router

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions