[Enhancement] Throttle scan to wait for topn filter (backport #55660) #55875

Closed
wants to merge 1 commit into from

Conversation

mergify[bot]

@mergify mergify bot commented Feb 13, 2025

Why I'm doing:

A highly associative hash join is time-consuming because it multiplies the data volume many times over. If a TopN operator sits above the join, the TopN filter it generates can be used to reduce the join's input data volume. In testing, however, the scan operator below the hash join always feeds data to the join so quickly that the TopN filter takes effect on the scan operator too late, so the join's input data volume is not actually reduced. We therefore designed a back-pressure mechanism that works as follows:

  1. The scan operator lets 10 × (limit + offset) rows pass through to the hash join operator, where limit and offset come from the TopN operator, then waits for a short period (e.g. 100 ms). We call this period the throttle period.
  2. During the throttle period the scan operator's has_output returns false, so it transfers no data and gives the TopN operator a chance to generate a TopN filter.
  3. When the current throttle period ends, the scan operator applies the TopN filter to its output data. If the filter is highly selective, the scan operator terminates the back-pressure mechanism and simply keeps using the filter on incoming data.
  4. Otherwise, the scan operator begins another throttle period.
  5. The scan operator may go through several throttle periods; the number is controlled by the session variable back_pressure_back_rounds, and each throttle period lasts back_pressure_throttle_time_upper_bound / back_pressure_back_rounds.
  6. topn_filter_back_pressure_mode turns the back-pressure mechanism on or off.
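
The steps above can be sketched as a small state machine. This is a toy Python model for illustration only, not the code in this PR: the class name `ScanThrottle`, its methods, the `pass_ratio` argument, and the 0.1 cutoff for "highly selective" are all assumptions; only the 10 × (limit + offset) row budget and the period formula come from the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScanThrottle:
    """Toy model of the scan-side throttle; every name here is hypothetical."""

    limit_plus_offset: int          # limit + offset of the TopN operator
    upper_bound_ms: float           # models back_pressure_throttle_time_upper_bound
    rounds: int                     # models back_pressure_back_rounds
    selective_below: float = 0.1    # ASSUMED cutoff for "highly selective"
    rows_passed: int = 0
    rounds_used: int = 0
    deadline_ms: Optional[float] = None  # end of the current throttle period
    done: bool = False              # True once back pressure has ended

    def period_ms(self) -> float:
        # Step 5: each period is upper bound / number of rounds.
        return self.upper_bound_ms / self.rounds

    def has_output(self, now_ms: float) -> bool:
        # Step 2: report "no output" only while a throttle period is running.
        return self.done or self.deadline_ms is None or now_ms >= self.deadline_ms

    def on_rows_passed(self, n: int, now_ms: float) -> None:
        # Step 1: let 10 * (limit + offset) rows through, then start throttling.
        if self.done or self.deadline_ms is not None:
            return
        self.rows_passed += n
        if self.rows_passed >= 10 * self.limit_plus_offset:
            self._start_round(now_ms)

    def on_period_end(self, now_ms: float, pass_ratio: Optional[float]) -> None:
        """pass_ratio: fraction of rows the TopN filter keeps (None if no filter yet)."""
        if self.done or self.deadline_ms is None or now_ms < self.deadline_ms:
            return
        selective = pass_ratio is not None and pass_ratio <= self.selective_below
        if selective or self.rounds_used >= self.rounds:
            # Step 3 (filter is selective) or the round budget is exhausted.
            self.done = True
            self.deadline_ms = None
        else:
            # Step 4: the filter is missing or weak, so throttle again.
            self._start_round(now_ms)

    def _start_round(self, now_ms: float) -> None:
        self.rounds_used += 1
        self.deadline_ms = now_ms + self.period_ms()
```

In this model the total throttle time can never exceed `upper_bound_ms`, because at most `rounds` periods of `upper_bound_ms / rounds` each are started.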

Test

When the TopN filter back-pressure mechanism is enabled, the data volume on the left side of the hash join is reduced to 1/60:
(image)

When it is disabled:
(image)

  1. pipeline_dop=0, concurrency=20, back_pressure_back_rounds=3
+===================+==============+
| cases             | latency(sec) |
+===================+==============+
| disable opt       | 11.692       |
| enable opt(60ms)  | 5.920        |
| enable opt(100ms) | 5.853        |
| enable opt(300ms) | 5.959        |
| enable opt(600ms) | 6.279        |
+-------------------+--------------+

disable opt means the optimization is turned off;
enable opt(60ms) means the optimization is turned on with back_pressure_throttle_time_upper_bound=60, i.e. the total throttle time does not exceed 60 ms.
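
To relate the labels in this table to individual throttle periods, the short Python sketch below applies the formula from the description (upper bound divided by back_pressure_back_rounds) to the four settings tested here, and divides the disabled latency by each enabled latency. The latencies are copied from the table; the function and variable names are made up for this example.

```python
# Latencies (seconds) copied from the table above.
BASELINE_SEC = 11.692                      # "disable opt"
ROUNDS = 3                                 # back_pressure_back_rounds in this test
RUNS = {60: 5.920, 100: 5.853, 300: 5.959, 600: 6.279}  # upper bound (ms) -> latency (s)


def throttle_period_ms(upper_bound_ms: float, rounds: int) -> float:
    """Length of one throttle period: upper bound / number of rounds."""
    if rounds <= 0:
        raise ValueError("rounds must be positive")
    return upper_bound_ms / rounds


def summarize():
    """Return (upper_bound_ms, period_ms, speedup_vs_disabled) for each run."""
    return [
        (ub, throttle_period_ms(ub, ROUNDS), BASELINE_SEC / latency)
        for ub, latency in RUNS.items()
    ]


for ub, period, speedup in summarize():
    print(f"upper_bound={ub}ms  period={period:.1f}ms  speedup={speedup:.2f}x")
```

With three rounds, the 60 ms setting corresponds to 20 ms per throttle period and the 600 ms setting to 200 ms per period. In this test every enabled setting cuts latency by a little under half relative to the disabled run.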

  2. pipeline_dop=1, back_pressure_throttle_time_upper_bound=300, back_pressure_back_rounds=10
+=============+==================+=================+=========+
| concurrency | disable opt(sec) | enable opt(sec) | speedup |
+=============+==================+=================+=========+
| 1           | 0.991            | 0.953           | 1.0X    |
| 10          | 4.089            | 2.831           | 1.4X    |
| 20          | 7.735            | 5.034           | 1.5X    |
| 40          | 15.210           | 9.688           | 1.5X    |
| 60          | 22.760(OOM)      | 14.600          | 1.5X    |
+-------------+------------------+-----------------+---------+
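
The speedup column can be recomputed from the two latency columns. Judging from the numbers alone, the reported values look like the ratio disabled / enabled truncated (not rounded) to one decimal place; that reading is an inference, not something the PR states. A small Python check, with the rows copied from the table and a hypothetical helper name:

```python
import math

# (concurrency, disabled latency s, enabled latency s, reported speedup)
ROWS = [
    (1, 0.991, 0.953, "1.0X"),
    (10, 4.089, 2.831, "1.4X"),
    (20, 7.735, 5.034, "1.5X"),
    (40, 15.210, 9.688, "1.5X"),
    (60, 22.760, 14.600, "1.5X"),  # the disabled run hit OOM at this concurrency
]


def truncated_speedup(disabled_sec: float, enabled_sec: float) -> float:
    """Ratio of latencies, truncated to one decimal place."""
    return math.floor(disabled_sec / enabled_sec * 10) / 10


for concurrency, off, on, reported in ROWS:
    ratio = off / on
    print(f"concurrency={concurrency:<3} ratio={ratio:.3f} "
          f"truncated={truncated_speedup(off, on):.1f}X reported={reported}")
```

Under conventional rounding, the rows for concurrency 40 and 60 would read 1.6X rather than 1.5X, so the table slightly understates the gain at high concurrency.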

What I'm doing:

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Signed-off-by: satanson <[email protected]>
(cherry picked from commit f86795b)

# Conflicts:
#	be/src/exec/pipeline/operator.h
#	be/src/exec/pipeline/scan/scan_operator.h
#	be/src/exprs/runtime_filter_bank.cpp
#	gensrc/thrift/PlanNodes.thrift
@mergify mergify bot added the conflicts label Feb 13, 2025

mergify bot commented Feb 13, 2025

Cherry-pick of f86795b has failed:

On branch mergify/bp/branch-3.3/pr-55660
Your branch is up to date with 'origin/branch-3.3'.

You are currently cherry-picking commit f86795b85a.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   be/src/exec/connector_scan_node.cpp
	modified:   be/src/exec/olap_scan_node.cpp
	modified:   be/src/exec/pipeline/scan/scan_operator.cpp
	new file:   be/src/exec/pipeline/topn_runtime_filter_back_pressure.h
	modified:   be/src/exec/scan_node.h
	modified:   be/src/exprs/runtime_filter_bank.h
	modified:   fe/fe-core/src/main/java/com/starrocks/planner/OlapScanNode.java
	modified:   fe/fe-core/src/main/java/com/starrocks/planner/RuntimeFilterDescription.java
	modified:   fe/fe-core/src/main/java/com/starrocks/planner/SortNode.java
	modified:   fe/fe-core/src/main/java/com/starrocks/qe/SessionVariable.java
	new file:   test/sql/test_topn_filter_throttle_scan/R/test_topn_filter_throttle_scan
	new file:   test/sql/test_topn_filter_throttle_scan/T/test_topn_filter_throttle_scan

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   be/src/exec/pipeline/operator.h
	both modified:   be/src/exec/pipeline/scan/scan_operator.h
	both modified:   be/src/exprs/runtime_filter_bank.cpp
	both modified:   gensrc/thrift/PlanNodes.thrift

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

@wanpengfei-git wanpengfei-git enabled auto-merge (squash) February 13, 2025 09:34
@mergify mergify bot closed this Feb 13, 2025
auto-merge was automatically disabled February 13, 2025 09:34

Pull request was closed


mergify bot commented Feb 13, 2025

@mergify[bot]: Backport conflict, please resolve the conflict and resubmit the PR
