receive: Active tenants not compacting due to wall clock check #8850

Draft
saswatamcode wants to merge 1 commit into thanos-io:main from saswatamcode:recwallclock

saswatamcode (Member) commented Jun 13, 2026

Adds a test that captures the scenario where an active tenant (after a new rollout) doesn't get compacted until the second tick because of the wall-clock check in doIter (now tryCompactHead).

Trying some fixes for it; will test on an actual cluster on Monday to confirm the issue. This needs to be rolled out and tested over multiple hours.

TestPeriodicHeadCompaction passes but TestWallClockCompactionBug fails.

Running tool: /usr/local/go/bin/go test -test.fullpath=true -timeout 30s -run ^TestPeriodicHeadCompaction$ github.com/thanos-io/thanos/pkg/receive

ok  	github.com/thanos-io/thanos/pkg/receive	5.863s

Running tool: /usr/local/go/bin/go test -test.fullpath=true -timeout 30s -run ^TestWallClockCompactionBug$ github.com/thanos-io/thanos/pkg/receive

level=info component=multi-tsdb tenant=test-tenant msg="opening TSDB"
level=info component=multi-tsdb tenant=test-tenant caller=head.go:681 time=2026-06-13T13:46:56.827012+01:00 msg="Replaying on-disk memory mappable chunks if any"
level=info component=multi-tsdb tenant=test-tenant caller=head.go:767 time=2026-06-13T13:46:56.827117+01:00 msg="On-disk memory mappable chunks replay completed" duration=18.958µs
level=info component=multi-tsdb tenant=test-tenant caller=head.go:775 time=2026-06-13T13:46:56.827133+01:00 msg="Replaying WAL, this may take a while"
level=info component=multi-tsdb tenant=test-tenant caller=head.go:848 time=2026-06-13T13:46:56.827374+01:00 msg="WAL segment loaded" segment=0 maxSegment=0 duration=192.125µs
level=info component=multi-tsdb tenant=test-tenant caller=head.go:885 time=2026-06-13T13:46:56.827385+01:00 msg="WAL replay completed" checkpoint_replay_duration=44.709µs wal_replay_duration=201.709µs wbl_replay_duration=42ns chunk_snapshot_load_duration=0s mmap_chunk_replay_duration=18.958µs total_replay_duration=286.25µs
level=info component=multi-tsdb tenant=test-tenant caller=db.go:2096 time=2026-06-13T13:46:56.827976+01:00 msg="Compactions disabled"
level=info component=multi-tsdb tenant=test-tenant msg="TSDB is now ready"
level=info component=multi-tsdb tenant=test-tenant msg="starting periodic head compaction" initial_delay=5m15.368s interval=2h0m0s
--- FAIL: TestWallClockCompactionBug (0.02s)
    /Users/saswatamcode/web/thanos/pkg/receive/multitsdb_test.go:689: Data span: 12600000ms (3.5h), threshold: 10800000ms (3.0h) should compact
    /Users/saswatamcode/web/thanos/pkg/receive/multitsdb_test.go:691: Wall-clock since MinTime: 4ms, which means we just skip compaction
    /Users/samukher/web/thanos/pkg/receive/multitsdb_test.go:703: multitsdb_test.go:703: "compaction should have created blocks: before=0, after=0"
        
FAIL
FAIL	github.com/thanos-io/thanos/pkg/receive	0.790s
FAIL

  • I added CHANGELOG entry for this change.
  • Change is not relevant to the end user.

Changes

Verification

Adds a test to capture the scenario where an active tenant (after a new
rollout) wouldn't get compacted due to the wall clock check in doIter (now
tryCompactHead).

Trying some fixes for it; will test on an actual cluster on Monday.

Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>