Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Some order by SQL didn't use window sort #5463

Open
evenyag opened this issue Jan 27, 2025 · 3 comments · May be fixed by #5543
Open

Some order by SQL didn't use window sort #5463

evenyag opened this issue Jan 27, 2025 · 3 comments · May be fixed by #5543
Labels
A-query Involves code in query path C-bug Category Bugs C-performance Category Performance

Comments

@evenyag
Copy link
Contributor

evenyag commented Jan 27, 2025

What type of bug is this?

Performance issue

What subsystems are affected?

Distributed Cluster, Standalone mode

Minimal reproduce step

Execute the following SQL:

CREATE TABLE IF NOT EXISTS `orderby_test` (
  `file` STRING NULL,
  `host` STRING NULL,
  `message` STRING NULL FULLTEXT WITH(analyzer = 'English', case_sensitive = 'false'),
  `source_type` STRING NULL,
  `timestamp` STRING NULL,
  `greptime_timestamp` TIMESTAMP(9) NOT NULL,
  TIME INDEX (`greptime_timestamp`)
)
ENGINE=mito
WITH(
  append_mode = 'true'
);

explain analyze select * from orderby_test where greptime_timestamp >= now() - '10m'::interval and greptime_timestamp < now() order by greptime_timestamp desc limit 10;

explain analyze select greptime_timestamp as timestamp, message from orderby_test order by greptime_timestamp desc limit 10;

What did you expect to see?

We expect that we should use window sort like this SQL

explain analyze select * from orderby_test order by greptime_timestamp desc limit 10;
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| stage | node | plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|     0 |    0 |  MergeScanExec: peers=[4440996184064(1034, 0), ] metrics=[output_rows: 0, greptime_exec_read_cost: 0, first_consume_time: 2662581, finish_time: 2665042, ready_time: 2489376, ]
                                                                                                                                                                                                                                                                                                                                                                |
|     1 |    0 |  GlobalLimitExec: skip=0, fetch=10 metrics=[output_rows: 0, elapsed_compute: 1, ]
  WindowedSortExec: expr=greptime_timestamp@5 DESC num_ranges=1 fetch=10 metrics=[output_rows: 0, elapsed_compute: 1, ]
    PartSortExec: expr=greptime_timestamp@5 DESC num_ranges=1 limit=10 metrics=[output_rows: 0, elapsed_compute: 2, row_replacements: 0, ]
      UnorderedScan: region=4440996184064(1034, 0), partition_count=0 (0 memtable ranges, 0 file 0 ranges) metrics=[output_rows: 0, mem_used: 0, elapsed_poll: 16375, elapsed_await: 1, ]
 |
|  NULL | NULL | Total rows: 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   |
+-------+------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.01 sec)

What did you see instead?


explain analyze select * from orderby_test where greptime_timestamp >= now() - '10m'::interval and greptime_timestamp < now() order by greptime_timestamp desc limit 10;
+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| stage | node | plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|     0 |    0 |  MergeScanExec: peers=[4440996184064(1034, 0), ] metrics=[output_rows: 0, greptime_exec_read_cost: 0, first_consume_time: 3120000, finish_time: 3122206, ready_time: 2841293, ]
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 |
|     1 |    0 |  GlobalLimitExec: skip=0, fetch=10 metrics=[output_rows: 0, elapsed_compute: 1, ]
  SortPreservingMergeExec: [greptime_timestamp@5 DESC] metrics=[output_rows: 0, elapsed_compute: 1375, ]
    CoalesceBatchesExec: target_batch_size=8192 metrics=[output_rows: 0, elapsed_compute: 1171, ]
      SortExec: expr=[greptime_timestamp@5 DESC], preserve_partitioning=[true] metrics=[output_rows: 0, elapsed_compute: 10, spill_count: 0, spilled_bytes: 0, spilled_rows: 0, ]
        CoalesceBatchesExec: target_batch_size=8192 metrics=[output_rows: 0, elapsed_compute: 2171, ]
          FilterExec: greptime_timestamp@5 >= 1737958907134709000 AND greptime_timestamp@5 < 1737959507134709000 metrics=[output_rows: 0, elapsed_compute: 10, ]
            RepartitionExec: partitioning=RoundRobinBatch(10), input_partitions=1 metrics=[fetch_time: 66418, send_time: 10, repart_time: 1, ]
              UnorderedScan: region=4440996184064(1034, 0), partition_count=0 (0 memtable ranges, 0 file 0 ranges) metrics=[output_rows: 0, mem_used: 0, elapsed_poll: 17834, elapsed_await: 1, ]
 |
|  NULL | NULL | Total rows: 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
+-------+------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.02 sec)

explain analyze select greptime_timestamp as timestamp, message from orderby_test order by greptime_timestamp desc limit 10;
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| stage | node | plan                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|     0 |    0 |  ProjectionExec: expr=[timestamp@0 as timestamp, message@1 as message] metrics=[output_rows: 0, elapsed_compute: 1, ]
  GlobalLimitExec: skip=0, fetch=10 metrics=[output_rows: 0, elapsed_compute: 1, ]
    SortPreservingMergeExec: [greptime_timestamp@2 DESC] metrics=[output_rows: 0, elapsed_compute: 875, ]
      SortExec: TopK(fetch=10), expr=[greptime_timestamp@2 DESC], preserve_partitioning=[true] metrics=[output_rows: 0, elapsed_compute: 1107626, row_replacements: 0, ]
        ProjectionExec: expr=[greptime_timestamp@5 as timestamp, message@2 as message, greptime_timestamp@5 as greptime_timestamp] metrics=[output_rows: 0, elapsed_compute: 10, ]
          MergeScanExec: peers=[4440996184064(1034, 0), ] metrics=[output_rows: 0, greptime_exec_read_cost: 0, first_consume_time: 1738376, finish_time: 1743417, ready_time: 1457499, ]
 |
|     1 |    0 |  UnorderedScan: region=4440996184064(1034, 0), partition_count=0 (0 memtable ranges, 0 file 0 ranges) metrics=[output_rows: 0, mem_used: 0, elapsed_poll: 20125, elapsed_await: 1, ]
                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    |
|  NULL | NULL | Total rows: 0                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            |
+-------+------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.03 sec)

What operating system did you use?

Unrelated

What version of GreptimeDB did you use?

v0.11.2

Relevant log output and stack trace

@evenyag evenyag added C-bug Category Bugs C-performance Category Performance A-query Involves code in query path labels Jan 27, 2025
@NiwakaDev
Copy link
Collaborator

NiwakaDev commented Jan 27, 2025

@evenyag
I'm interested in this issue.

It seems such a bug occurs when LogicalPlan::Projection contains an expression that isn't Commutativity::Commutative.

for expr in &proj.expr {
let commutativity = Self::check_expr(expr);
if !matches!(commutativity, Commutativity::Commutative) {
return commutativity;
}
}
Commutativity::Commutative

select greptime_timestamp as timestamp, message from orderby_test order by greptime_timestamp desc limit 10;

For this query, because LogicalPlan::Projection has Expr::Alias, it isn't pushed down below MergeScan.

Why Expr::Alias is Commutativity::Unimplemented?

@evenyag
Copy link
Contributor Author

evenyag commented Jan 30, 2025

Why Expr::Alias is Commutativity::Unimplemented?

As far as I know, we can add an alias for a subquery so pushing all alias expressions down isn't always correct.

Maybe we can push the expr down if it is a column.

Updated: Should we use the commutativity of alias.expr as the commutativity of alias?

@evenyag
Copy link
Contributor Author

evenyag commented Feb 6, 2025

Modifying the commutativity works in standalone mode. However, the distributed mode still has some troubles.

For example, the following SQL doesn't work in distributed mode.

CREATE TABLE test(i INTEGER, t TIMESTAMP TIME INDEX) WITH('compaction.type'='twcs', 'compaction.twcs.max_inactive_window_files'='4');

INSERT INTO test VALUES (1, 1), (NULL, 2), (1, 3);

EXPLAIN ANALYZE SELECT t as ts FROM test ORDER BY t DESC LIMIT 5;

substrait modified the plan and added a temp name t__temp__0. The windowed sort rule can't handle this currently.

ProjectionExec:        expr=[i@0 as i, t@1 as ts]
   SortPreservingMergeExec: [test.t__temp__0@2 DESC], fetch=5
     SortExec: TopK(fetch=5), expr=[test.t__temp__0@2 DESC], preserve_partitioning=[true]
       ProjectionExec: expr=[i@0 as i, t@1 as t, t@1 as test.t__temp__0]
         SeqScan: region=4398046511104(1024, 0), partition_count=4 (1 memtable ranges, 3 file 3 ranges)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-query Involves code in query path C-bug Category Bugs C-performance Category Performance
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants