On 11/30/20 7:43 PM, Anastasia Lubennikova wrote:
> This entry was inactive during this CF, so I've marked it as returned
> with feedback. Feel free to resubmit an updated version to a future
> commitfest.
I return the patch to commitfest. My current reason differs from reason
of origin author.
This patch can open a door for more complex optimizations in the
partitionwise join push-down technique.
I mean, we can push-down join not only of two partitioned tables with
the same partition schema, but a partitioned (sharded) table with an
arbitrary subplan that is provable independent of local resources.
Example:
CREATE TABLE p(a int) PARTITION BY HASH (a);
CREATE TABLE p1 PARTITION OF p FOR VALUES WITH (MODULUS 3, REMAINDER 0);
CREATE TABLE p2 PARTITION OF p FOR VALUES WITH (MODULUS 3, REMAINDER 1);
CREATE TABLE p3 PARTITION OF p FOR VALUES WITH (MODULUS 3, REMAINDER 2);
SELECT * FROM p, (SELECT * FROM generate_series(1,2) AS a) AS s
WHERE p.a=s.a;
Hash Join
Hash Cond: (p.a = a.a)
-> Append
-> Seq Scan on p1 p_1
-> Seq Scan on p2 p_2
-> Seq Scan on p3 p_3
-> Hash
-> Function Scan on generate_series a
But with asymmetric join feature we have the plan:
Append
-> Hash Join
Hash Cond: (p_1.a = a.a)
-> Seq Scan on p1 p_1
-> Hash
-> Function Scan on generate_series a
-> Hash Join
Hash Cond: (p_2.a = a.a)
-> Seq Scan on p2 p_2
-> Hash
-> Function Scan on generate_series a
-> Hash Join
Hash Cond: (p_3.a = a.a)
-> Seq Scan on p3 p_3
-> Hash
-> Function Scan on generate_series a
In the case of FDW-sharding it means that if we can prove that the inner
relation is independent from the execution server, we can push-down
these joins and execute it in parallel.
--
regards,
Andrey Lepikhov
Postgres Professional