Re: A problem about partitionwise join - Mailing list pgsql-hackers
From | Richard Guo |
---|---|
Subject | Re: A problem about partitionwise join |
Date | |
Msg-id | CAN_9JTzPTG5qEJ9MerarmuRoifhiejtS3fiRe+ZEZMyz-3NZuw@mail.gmail.com Whole thread Raw |
In response to | Re: A problem about partitionwise join (Amit Langote <amitlangote09@gmail.com>) |
Responses |
Re: A problem about partitionwise join
|
List | pgsql-hackers |
On Tue, Aug 27, 2019 at 8:51 AM Amit Langote <amitlangote09@gmail.com> wrote:
Hi Richard,
On Mon, Aug 26, 2019 at 6:33 PM Richard Guo <riguo@pivotal.io> wrote:
>
> Hi All,
>
> To generate partitionwise join, we need to make sure there exists an
> equi-join condition for each pair of partition keys, which is performed
> by have_partkey_equi_join(). This makes sense and works well.
>
> But if, let's say, one certain pair of partition keys (foo.k = bar.k)
> has formed an equivalence class containing consts, no join clause would
> be generated for it, since we have already generated 'foo.k = const' and
> 'bar.k = const' and pushed them into the proper restrictions earlier.
>
> This will make partitionwise join fail to be planned if there are
> multiple partition keys and the pushed-down restrictions 'xxx = const'
> fail to prune away any partitions.
>
> Consider the examples below:
>
> create table p (k1 int, k2 int, val int) partition by range(k1,k2);
> create table p_1 partition of p for values from (1,1) to (10,100);
> create table p_2 partition of p for values from (10,100) to (20,200);
>
> If we are joining on each pair of partition keys, we can generate
> partitionwise join:
>
> # explain (costs off)
> select * from p as foo join p as bar on foo.k1 = bar.k1 and foo.k2 = bar.k2;
> QUERY PLAN
> ----------------------------------------------------------------------
> Append
> -> Hash Join
> Hash Cond: ((foo.k1 = bar.k1) AND (foo.k2 = bar.k2))
> -> Seq Scan on p_1 foo
> -> Hash
> -> Seq Scan on p_1 bar
> -> Hash Join
> Hash Cond: ((foo_1.k1 = bar_1.k1) AND (foo_1.k2 = bar_1.k2))
> -> Seq Scan on p_2 foo_1
> -> Hash
> -> Seq Scan on p_2 bar_1
> (11 rows)
>
> But if we add another qual 'foo.k2 = const', we will be unable to
> generate partitionwise join any more, because have_partkey_equi_join()
> thinks not every partition key has an equi-join condition.
>
> # explain (costs off)
> select * from p as foo join p as bar on foo.k1 = bar.k1 and foo.k2 = bar.k2 and foo.k2 = 16;
> QUERY PLAN
> -----------------------------------------
> Hash Join
> Hash Cond: (foo.k1 = bar.k1)
> -> Append
> -> Seq Scan on p_1 foo
> Filter: (k2 = 16)
> -> Seq Scan on p_2 foo_1
> Filter: (k2 = 16)
> -> Hash
> -> Append
> -> Seq Scan on p_1 bar
> Filter: (k2 = 16)
> -> Seq Scan on p_2 bar_1
> Filter: (k2 = 16)
> (13 rows)
>
> Is this a problem?
Perhaps. Maybe it has to do with the way have_partkey_equi_join() has
been coded. If it was coded such that it figured out on its own that
the equivalence (foo.k2, bar.k2, ...) does exist, then that would
allow partitionwise join to occur, which I think would be OK to do.
But maybe I'm missing something.
classes. ECs containing consts will not be considered so we cannot
generate (foo.k2 = bar.k2) for the query above.
In addition, when generating join clauses from equivalence classes, we
only select the joinclause with the 'best score', or the first
joinclause with a score of 3. This may make us miss some joinclause on
partition keys.
Check the query below as a more illustrative example:
create table p (k int, val int) partition by range(k);
create table p_1 partition of p for values from (1) to (10);
create table p_2 partition of p for values from (10) to (100);
If we use quals 'foo.k = bar.k and foo.k = bar.val', we can generate
partitionwise join:
# explain (costs off)
select * from p as foo join p as bar on foo.k = bar.k and foo.k = bar.val;
QUERY PLAN
-----------------------------------------
Append
-> Hash Join
Hash Cond: (foo.k = bar.k)
-> Seq Scan on p_1 foo
-> Hash
-> Seq Scan on p_1 bar
Filter: (k = val)
-> Hash Join
Hash Cond: (foo_1.k = bar_1.k)
-> Seq Scan on p_2 foo_1
-> Hash
-> Seq Scan on p_2 bar_1
Filter: (k = val)
(13 rows)
But if we exchange the order of the two quals to 'foo.k = bar.val and
foo.k = bar.k', then partitionwise join cannot be generated any more,
because we only have joinclause 'foo.k = bar.val' as it first reached
score of 3. We have missed the joinclause on the partition key although
it does exist.
# explain (costs off)
select * from p as foo join p as bar on foo.k = bar.val and foo.k = bar.k;
QUERY PLAN
-----------------------------------------
Hash Join
Hash Cond: (foo.k = bar.val)
-> Append
-> Seq Scan on p_1 foo
-> Seq Scan on p_2 foo_1
-> Hash
-> Append
-> Seq Scan on p_1 bar
Filter: (val = k)
-> Seq Scan on p_2 bar_1
Filter: (val = k)
(11 rows)
Thanks
Richard
pgsql-hackers by date: