Fix missing move of query_metadata_cache in BlockIO::operator=#96995
Fix missing move of query_metadata_cache in BlockIO::operator=#96995alexey-milovidov wants to merge 5 commits intomasterfrom
Conversation
Closes #95742 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…cache Concurrent mutations, selects, and detach/attach operations exercise the race where the cache is destroyed prematurely while the storage is being removed from the database. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| # DETACH removes the storage from the database, dropping its StoragePtr. | ||
| # If the cache snapshot holds the last ref, destroying the cache will | ||
| # free the storage while parts still exist in shared_ranges_in_parts. | ||
| $CLICKHOUSE_CLIENT --query "DETACH TABLE ${TABLE}" 2>/dev/null |
Related PRs and IssuesPrevious fix attempts (all reverted)
Origin of the bug
Issues
|
…opTablesParallel` path Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
MSan stress test report — same root cause confirmedAnalysisThe MSan trace from the Destruction path shown in the MSan trace:
Why our fix addresses this: The By moving |
The `select_thread` function outputs `count()` results to stdout, polluting the test output with `0` values that don't match the reference file. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary
BlockIO::operator=(BlockIO&&)was not movingquery_metadata_cache, causing the cache to be destroyed prematurely on every query through both TCP and HTTP paths.Closes #95742
Root Cause Analysis
Bug #1:
BlockIO::operator=does NOT movequery_metadata_cacheIn
src/QueryPipeline/BlockIO.cpp:26-44:This is triggered on every query via both execution paths:
executeQuery.cpp:1970):res = executeQueryImpl(...)executeQuery.cpp:2168):streams = executeQueryImpl(...)The consequence:
executeQueryImplpacks the cache into its returnedBlockIO, butoperator=strips it out. The tempBlockIOis destroyed immediately, destroying the cache while the pipeline lives on inres.What happens when the cache is destroyed prematurely
For mutation validation queries (ALTER TABLE ... UPDATE):
MutationsInterpreter::validate()builds a validation pipeline internally, which caches aStorageSnapshotPtrviagetStorageSnapshotin theQueryMetadataCachevalidate()returns — its internal pipeline is destroyed, releasing itsStorageSnapshotPtrStorageSnapshotPtr(refcount = 1)InterpreterAlterQuery::execute()returns an emptyBlockIO(no pipeline)executeQueryImplpacks the cache into the returnedBlockIOres = executeQueryImpl(...)→operator=moves the pipeline (empty) but NOT the cacheBlockIOis destroyed → cache destroyed → lastStorageSnapshotPtrreleased~StorageSnapshot→~SnapshotData:partsreleased → parts freed →clearCaches()→ accessesstorageWhy
SnapshotData::storagedoesn't always save usWithin
SnapshotDatadestruction,parts(line 623) is freed beforestorage(line 619). So normally the storage IS alive whenclearCachesruns.But there's a complicating factor:
MergeTreeData::shared_ranges_in_parts(line 1496) may hold the sameRangesInDataPartsPtr. When bothSnapshotData::partsandshared_ranges_in_partsshare the sameshared_ptr, the destruction chain becomes:SnapshotData::partsreleased → refcount 2→1 (not freed,shared_ranges_in_partsstill holds it)SnapshotData::storagereleased → if table was DETACHED, this is the last ref →MergeTreeDatadestructor runs~MergeTreeData(),shared_ranges_in_parts(line 1496) is destroyed → parts freed →clearCaches()→ accessesthis(the MergeTreeData being destroyed)At this point we're inside MergeTreeData's destructor. The
StorageMergeTree/StorageReplicatedMergeTreedestructor has already run (derived class destructors run first). Ifshutdown()invalidated any state thatclearCachesdepends on (like context caches, virtual table pointers after derived destructor), this could SEGFAULT.Bug #2:
InterpreterOptimizeQuerymutably modifies cached snapshotIn
src/Interpreters/InterpreterOptimizeQuery.cpp:81-82:This clears
partson a cachedStorageSnapshotvia a mutable cast, whilestorageremains set. This is a data race on shared state and could cause parts to be freed unexpectedly.Historical context
Azat Khuzhin attempted a different fix in Jan 2026:
1002f7ce907— RemovedSnapshotData::storage, addedBlockIO::resetPipeline()to destroy cache before pipelined02700909db— Reordered BlockIO fields to match8c44f5e9374— Reverted both, puttingSnapshotData::storagebackThe revert restored the status quo, but the
BlockIO::operator=bug (which predates all this) was never fixed.Changelog category:
Changelog entry:
Fix crash (SEGFAULT) in
clearCachescaused byBlockIO::operator=not movingquery_metadata_cache, leading to premature destruction of cached storage snapshots and use-after-free ofMergeTreeDatastorage.🤖 Generated with Claude Code