feat: support intra-operator parallelism by wangrunji0408 · Pull Request #856 · risinglightdb/risinglight

wangrunji0408 · 2024-11-24T13:59:25Z

This PR adds data partitioning and intra-operator parallelism.

The performance of TPC-H improved on my M1 Pro (10 cores):

Seems to resolve #748.

Signed-off-by: Runji Wang <wangrunji0408@163.com>

skyzh · 2024-11-27T05:22:05Z

two quick questions: what is the schema plan node? and what is the definition of exchange node? is it the distribution of the child, or the expected distribution of the output node?

wangrunji0408 · 2024-11-27T14:01:44Z

what is the schema plan node?

The schema node is a virtual node that only changes the output schema of the child node. It was introduced to resolve a tricky issue in 2-phase aggregation.

Let's say we have a query: select sum(a) * 2 from t;

The original plan is:

Proj: sum(a) * 2
    Agg: sum(a)
        Scan: t(a)

After parallelization (by pushing down the ToParallel node), the Agg is transformed into a 2-phase aggregation:

Proj: sum(a) * 2
    Agg: sum(sum(a))
        Exchange: merge
            Agg: sum(a)
                Scan: t(a)
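
The rewrite to sum(sum(a)) works because a sum over all rows equals the sum of per-partition sums. The following is a minimal Rust sketch of that arithmetic, not RisingLight's executor code: the round-robin split and the names `partition_round_robin`, `partial_sums`, `partial_counts`, and `final_sum` are assumptions made for illustration.

```rust
/// Split rows into `n` partitions round-robin.
/// Hypothetical stand-in for whatever feeds the parallel workers.
fn partition_round_robin(rows: &[i64], n: usize) -> Vec<Vec<i64>> {
    assert!(n > 0, "need at least one partition");
    let mut parts: Vec<Vec<i64>> = vec![Vec::new(); n];
    for (i, &v) in rows.iter().enumerate() {
        parts[i % n].push(v);
    }
    parts
}

/// Phase 1: every partition computes its own sum(a).
fn partial_sums(parts: &[Vec<i64>]) -> Vec<i64> {
    parts.iter().map(|p| p.iter().sum::<i64>()).collect()
}

/// Phase 1 for count: every partition counts its own rows.
fn partial_counts(parts: &[Vec<i64>]) -> Vec<i64> {
    parts.iter().map(|p| p.len() as i64).collect()
}

/// Phase 2: after the merge exchange, one worker sums the partial results.
/// For sum this is sum(sum(a)); for count the partial counts are also summed,
/// not counted again.
fn final_sum(partials: &[i64]) -> i64 {
    partials.iter().sum::<i64>()
}
```

For example, splitting the values 1 through 10 into 4 partitions and calling `final_sum(&partial_sums(&parts))` gives the same 55 as a single-threaded sum, and `final_sum(&partial_counts(&parts))` gives the row count 10.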

You may notice that the output schema of the top Agg node has changed from sum(a) to sum(sum(a)). As a result, the Proj node will throw an error when it tries to resolve the physical column index of sum(a) in its expression.

So, in order to keep the schema unchanged, we can insert a Schema node between Proj and Agg:

Proj: sum(a) * 2
    Schema: sum(a)
        Agg: sum(sum(a))
            Exchange: merge
                Agg: sum(a)
                    Scan: t(a)

The Schema node is simply ignored when building executors.
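
The effect of the Schema node on column resolution can be sketched with a toy plan tree. This is only an illustration under stated assumptions: the `Plan` enum, `output_schema`, `resolve`, `strip_schema`, and `two_phase_agg` are hypothetical names, and columns are matched by their textual name, which is a simplification of how a real planner identifies expressions.

```rust
/// A toy plan tree. The variants are illustrative, not RisingLight's actual types.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { columns: Vec<String> },
    /// Aggregation whose output columns are named by `aggs`.
    Agg { aggs: Vec<String>, child: Box<Plan> },
    /// Exchange to the distribution `dist`; the schema passes through unchanged.
    Exchange { dist: String, child: Box<Plan> },
    /// Virtual node: exposes the child's output under the names in `names`.
    Schema { names: Vec<String>, child: Box<Plan> },
}

/// Column names that a node exposes to its parent.
fn output_schema(plan: &Plan) -> Vec<String> {
    match plan {
        Plan::Scan { columns } => columns.clone(),
        Plan::Agg { aggs, .. } => aggs.clone(),
        Plan::Exchange { child, .. } => output_schema(child),
        Plan::Schema { names, .. } => names.clone(),
    }
}

/// Resolve a column reference to a physical index in the child's output.
fn resolve(column: &str, child: &Plan) -> Option<usize> {
    output_schema(child).iter().position(|c| c.as_str() == column)
}

/// Building executors skips Schema nodes, since they do not change the data layout.
fn strip_schema(plan: Plan) -> Plan {
    match plan {
        Plan::Schema { child, .. } => strip_schema(*child),
        Plan::Agg { aggs, child } => Plan::Agg { aggs, child: Box::new(strip_schema(*child)) },
        Plan::Exchange { dist, child } => {
            Plan::Exchange { dist, child: Box::new(strip_schema(*child)) }
        }
        scan @ Plan::Scan { .. } => scan,
    }
}

/// The 2-phase plan below Proj for `select sum(a) * 2 from t`, without the Schema node.
fn two_phase_agg() -> Plan {
    let scan = Plan::Scan { columns: vec!["a".to_string()] };
    let local = Plan::Agg { aggs: vec!["sum(a)".to_string()], child: Box::new(scan) };
    let merged = Plan::Exchange { dist: "merge".to_string(), child: Box::new(local) };
    Plan::Agg { aggs: vec!["sum(sum(a))".to_string()], child: Box::new(merged) }
}
```

In this sketch, `resolve("sum(a)", &two_phase_agg())` returns `None`, which corresponds to the error in Proj. Wrapping the plan in `Plan::Schema { names: vec!["sum(a)".to_string()], .. }` makes the same lookup return `Some(0)`, and `strip_schema` gives back the original 2-phase plan for execution.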

wangrunji0408 · 2024-11-27T14:05:23Z

what is the definition of exchange node? is it the distribution of the child, or the expected distribution of the output node?

(exchange dist child)
where dist is the expected distribution of the output.
The child can have any distribution.
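
A rough Rust sketch of these semantics follows. It is an assumption-laden illustration rather than the PR's executor: the `Distribution` enum and `exchange` function are hypothetical, rows are reduced to a single `i64` key, and the standard library's `DefaultHasher` is used in place of whatever hasher the real implementation uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Expected distribution of an exchange's output (hypothetical type).
#[derive(Debug, Clone, PartialEq)]
enum Distribution {
    /// All rows end up in one partition, like the "merge" exchange in the plans above.
    Single,
    /// Rows are routed into `partitions` buckets by the hash of their key.
    Hash { partitions: usize },
}

/// Apply `(exchange dist child)`. `input` holds the child's partitions, in any
/// layout; only `dist` determines the layout of the output.
fn exchange(dist: &Distribution, input: Vec<Vec<i64>>) -> Vec<Vec<i64>> {
    match dist {
        Distribution::Single => vec![input.into_iter().flatten().collect()],
        Distribution::Hash { partitions } => {
            assert!(*partitions > 0, "need at least one partition");
            let mut out: Vec<Vec<i64>> = vec![Vec::new(); *partitions];
            for row in input.into_iter().flatten() {
                // DefaultHasher::new() uses fixed keys, so routing is deterministic.
                let mut hasher = DefaultHasher::new();
                row.hash(&mut hasher);
                let target = (hasher.finish() % (*partitions as u64)) as usize;
                out[target].push(row);
            }
            out
        }
    }
}
```

Because the target partition depends only on the row's key, the same rows land in the same output partitions whether the child produced them in one partition or in several.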

wangrunji0408 · 2024-11-27T14:12:59Z

By the way, after this optimization, the bottleneck of some queries (such as Q6) has shifted to the table scan.
As a next step, it is critical to support parallel partition scans in the storage layer. 🥹

wangrunji0408 added 30 commits April 20, 2024 18:25

stash

10d7faa

Signed-off-by: Runji Wang <wangrunji0408@163.com>

basic support for converting to distributed plan

f608270

Signed-off-by: Runji Wang <wangrunji0408@163.com>

rename distributed to parallel

344a5f8

Signed-off-by: Runji Wang <wangrunji0408@163.com>

hash partition executor

d51e7c4

Signed-off-by: Runji Wang <wangrunji0408@163.com>

fix

5a25a2e

Signed-off-by: Runji Wang <wangrunji0408@163.com>

Merge remote-tracking branch 'origin/main' into wrj/mpp

c8f0b68

Signed-off-by: Runji Wang <wangrunji0408@163.com>

fix metrics and improve debug info

8fe3343

Signed-off-by: Runji Wang <wangrunji0408@163.com>

add a pragma to control parallel plan

cdea038

Signed-off-by: Runji Wang <wangrunji0408@163.com>

two-phase aggregation

fe2ee6c

Signed-off-by: Runji Wang <wangrunji0408@163.com>

update rust toolchain and dependencies

67bf63b

Signed-off-by: Runji Wang <wangrunji0408@163.com>

upgrade dependencies

be64142

Signed-off-by: Runji Wang <wangrunji0408@163.com>

fix warnings

bff93fe

Signed-off-by: Runji Wang <wangrunji0408@163.com>

support keyword completion

68f0ec3

Signed-off-by: Runji Wang <wangrunji0408@163.com>

support cursor in completed line

9710023

Signed-off-by: Runji Wang <wangrunji0408@163.com>

fix clippy

c77c0b0

Signed-off-by: Runji Wang <wangrunji0408@163.com>

Merge branch 'wrj/update-toolchain' into wrj/partition

0f6f712

Signed-off-by: Runji Wang <wangrunji0408@163.com>

Merge branch 'wrj/completion' into wrj/partition

c90782b

fix to_parallel for left outer join and DDL statements

685b148

Signed-off-by: Runji Wang <wangrunji0408@163.com>

fix hash exchange

322d8f1

Signed-off-by: Runji Wang <wangrunji0408@163.com>

replace pragma enable_parallel_execution by set variable parallelism

0961aa1

Signed-off-by: Runji Wang <wangrunji0408@163.com>

fix 2-phase count agg

0588dff

Signed-off-by: Runji Wang <wangrunji0408@163.com>

enable partitioning in unit test. fix bugs

db3a019

Signed-off-by: Runji Wang <wangrunji0408@163.com>

fix DDL to parallel

7f56f92

Signed-off-by: Runji Wang <wangrunji0408@163.com>

add unit test for Expr size

87a7fb9

Signed-off-by: Runji Wang <wangrunji0408@163.com>

Merge remote-tracking branch 'origin/main' into wrj/partition

fac57a1

Signed-off-by: Runji Wang <wangrunji0408@163.com>

fix timing

85d131b

Signed-off-by: Runji Wang <wangrunji0408@163.com>

add counted instrument

4ff9450

Signed-off-by: Runji Wang <wangrunji0408@163.com>

correctly show the time of exchange operator

a47bc49

Signed-off-by: Runji Wang <wangrunji0408@163.com>

use ahash to optimize hash

cea7429

Signed-off-by: Runji Wang <wangrunji0408@163.com>

decouple rows and time of exchange operator

35c56fd

Signed-off-by: Runji Wang <wangrunji0408@163.com>

do not eliminate duplicate exchange

4a2d2ad

Signed-off-by: Runji Wang <wangrunji0408@163.com>

wangrunji0408 requested a review from skyzh November 24, 2024 13:59

wangrunji0408 added 2 commits November 24, 2024 23:01

fix clippy

321e330

Signed-off-by: Runji Wang <wangrunji0408@163.com>

fix unit test

1bc5611

Signed-off-by: Runji Wang <wangrunji0408@163.com>

wangrunji0408 force-pushed the wrj/partition branch from 3943c98 to 1bc5611 November 24, 2024 15:30

wangrunji0408 requested a review from TennyZhuang December 5, 2024 13:49

wangrunji0408 wants to merge 33 commits into main from wrj/partition
