A problem about partitionwise join - Mailing list pgsql-hackers

From Richard Guo
Subject A problem about partitionwise join
Date
Msg-id CAN_9JTzo_2F5dKLqXVtDX5V6dwqB0Xk+ihstpKEt3a1LT6X78A@mail.gmail.com
Whole thread Raw
Responses Re: A problem about partitionwise join
List pgsql-hackers
Hi All,

To generate partitionwise join, we need to make sure there exists an
equi-join condition for each pair of partition keys, which is performed
by have_partkey_equi_join(). This makes sense and works well.

But if, let's say, one certain pair of partition keys (foo.k = bar.k)
has formed an equivalence class containing consts, no join clause would
be generated for it, since we have already generated 'foo.k = const' and
'bar.k = const' and pushed them into the proper restrictions earlier.

This will make partitionwise join fail to be planned if there are
multiple partition keys and the pushed-down restrictions 'xxx = const'
fail to prune away any partitions.

Consider the examples below:

create table p (k1 int, k2 int, val int) partition by range(k1,k2);
create table p_1 partition of p for values from (1,1) to (10,100);
create table p_2 partition of p for values from (10,100) to (20,200);


If we are joining on each pair of partition keys, we can generate
partitionwise join:

# explain (costs off)
select * from p as foo join p as bar on foo.k1 = bar.k1 and foo.k2 = bar.k2;
                              QUERY PLAN
----------------------------------------------------------------------
 Append
   ->  Hash Join
         Hash Cond: ((foo.k1 = bar.k1) AND (foo.k2 = bar.k2))
         ->  Seq Scan on p_1 foo
         ->  Hash
               ->  Seq Scan on p_1 bar
   ->  Hash Join
         Hash Cond: ((foo_1.k1 = bar_1.k1) AND (foo_1.k2 = bar_1.k2))
         ->  Seq Scan on p_2 foo_1
         ->  Hash
               ->  Seq Scan on p_2 bar_1
(11 rows)


But if we add another qual 'foo.k2 = const', we will be unable to
generate partitionwise join any more, because have_partkey_equi_join()
thinks not every partition key has an equi-join condition.

# explain (costs off)
select * from p as foo join p as bar on foo.k1 = bar.k1 and foo.k2 = bar.k2 and foo.k2 = 16;
               QUERY PLAN
-----------------------------------------
 Hash Join
   Hash Cond: (foo.k1 = bar.k1)
   ->  Append
         ->  Seq Scan on p_1 foo
               Filter: (k2 = 16)
         ->  Seq Scan on p_2 foo_1
               Filter: (k2 = 16)
   ->  Hash
         ->  Append
               ->  Seq Scan on p_1 bar
                     Filter: (k2 = 16)
               ->  Seq Scan on p_2 bar_1
                     Filter: (k2 = 16)
(13 rows)


Is this a problem?

Thanks
Richard

pgsql-hackers by date:

Previous
From: Rafia Sabih
Date:
Subject: Re: Creating partitions automatically at least on HASH?
Next
From: Darafei "Komяpa" Praliaskouski
Date:
Subject: Re: Yet another fast GiST build