[Enhancement] Avoid redundant directory creation when creating file blocks (backport #55716) #55878
Why I'm doing:
Since version 3.3, we support spilling operator intermediate results to remote object storage. However, the current implementation checks whether the directory exists every time a block is allocated, which results in unnecessary API calls and added latency.
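To illustrate the overhead, here is a minimal C++ sketch of per-allocation directory checks compared with remembering the last directory created. It is not the StarRocks implementation: `FakeRemoteFs`, `BlockAllocatorSketch`, and `count_dir_calls` are hypothetical names made up for this example, and only the member name `_last_created_container_dir` is taken from this PR's description. The sketch assumes non-empty directory paths and single-threaded use.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for a remote object store client; it only counts calls.
struct FakeRemoteFs {
    int create_dir_calls = 0;
    std::unordered_set<std::string> dirs;

    // Idempotent "create if missing"; each call models one remote round trip.
    void create_dir_if_missing(const std::string& path) {
        ++create_dir_calls;
        dirs.insert(path);
    }
};

// Illustrative allocator, not the real FileBlockManager.
class BlockAllocatorSketch {
public:
    BlockAllocatorSketch(FakeRemoteFs* fs, bool skip_redundant)
            : _fs(fs), _skip_redundant(skip_redundant) {}

    // Returns the path of the newly allocated block file.
    std::string allocate_block(const std::string& container_dir) {
        // With the guard enabled, the remote call is made only when the
        // directory differs from the one created most recently.
        if (!_skip_redundant || container_dir != _last_created_container_dir) {
            _fs->create_dir_if_missing(container_dir);
            _last_created_container_dir = container_dir;
        }
        return container_dir + "/block_" + std::to_string(_next_block_id++);
    }

private:
    FakeRemoteFs* _fs;
    bool _skip_redundant;
    std::string _last_created_container_dir;  // name taken from the PR description
    int _next_block_id = 0;
};

// Allocates `n` blocks in a single directory and reports how many
// directory-creation calls reached the (fake) remote store.
int count_dir_calls(bool skip_redundant, int n) {
    FakeRemoteFs fs;
    BlockAllocatorSketch alloc(&fs, skip_redundant);
    for (int i = 0; i < n; ++i) {
        alloc.allocate_block("spill/query_1");
    }
    return fs.create_dir_calls;
}
```

In this toy model, `count_dir_calls(false, 100)` issues one directory call per block, while `count_dir_calls(true, 100)` issues a single call for the whole run. Remembering only the most recent directory helps when consecutive allocations target the same container directory; a workload that alternates between directories would still repeat calls.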
What I'm doing:
This pull request includes changes to the `FileBlockManager` class to optimize the creation of directories in the file block manager. The most important changes involve adding a check to avoid redundant directory creation and introducing a new member variable to track the last created directory.
Optimization of directory creation:
- `be/src/exec/spill/file_block_manager.cpp`: Added a check to ensure that the directory is only created if it has not been created before, reducing unnecessary filesystem operations. ([be/src/exec/spill/file_block_manager.cppR237-R240](https://github.com/StarRocks/starrocks/pull/55878/files#diff-3bb4fc69ffe78e61658f4b5aa8ac78514b11550a0c87a1781c33d6a9d19a5e02R237-R240))
- `be/src/exec/spill/file_block_manager.h`: Introduced a new member variable `_last_created_container_dir` to store the path of the last created directory, which is used in the aforementioned check. ([be/src/exec/spill/file_block_manager.hR51](https://github.com/StarRocks/starrocks/pull/55878/files#diff-80a364cbfceb51fff006948ceed1b690236ad393f2fca768cf22032eb88b34e9R51))
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist: