Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
Date
Msg-id CA+TgmobnhO0+CGk-V92UMy2tiOvtw+7kAVs2ZvCfdHQ+oi+pNA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Rafia Sabih <rafia.sabih@enterprisedb.com>)
List pgsql-hackers
On Sun, Mar 19, 2017 at 12:15 AM, Rafia Sabih
<rafia.sabih@enterprisedb.com> wrote:
> I was trying to play around with this patch and came across following
> case when without the patch query completes in 9 secs and with it in
> 15 secs. Theoretically, I tried to capture the case when each
> partition is having good amount of rows in output and each has to
> build their own hash, in that case the cost of building so many hashes
> comes to be more costly than having an append and then join. Thought
> it might be helpful to consider this case in better designing of the
> algorithm. Please feel free to point out if I missed something.

In the non-partitionwise plan, the query planner correctly chooses to
hash the same table (prt2) and probe from the large table (prt).  In
the partition-wise plan, it generally does the opposite.  There is a
mix of merge joins and hash joins, but of the 15 children that picked
merge joins, 14 of them hashed the larger partition (in each case,
from prt) and probed from the smaller one (in each case, from prt2),
which seems like an odd strategy.  So I think the problem is not that
building lots of hash tables is slower than building just one, but
rather that for some reason it's choosing the wrong table to hash.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: "Dominick O'Dierno"
Date:
Subject: [HACKERS] Determine if an error is transient by its error code.
Next
From: Robert Haas
Date:
Subject: Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables