Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables - Mailing list pgsql-hackers

From Rafia Sabih
Subject Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
Date
Msg-id CAOGQiiPG2amSrd=RT3v2ZEN9fCGuEX8tpqWKoj2zJ8q-775k+g@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
List pgsql-hackers


On Fri, Jul 21, 2017 at 12:11 PM, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
> On Fri, Jul 21, 2017 at 11:54 AM, Rafia Sabih
> <rafia.sabih@enterprisedb.com> wrote:
>> So, does this
>> also mean that a partitioned table will not join with an unpartitioned
>> table without append of partitions?
>>
>
> Yes. When you join an unpartitioned table with a partitioned table,
> the planner will choose to append all the partitions of the
> partitioned table and then join with the unpartitioned table.
>

I tested this set of patches for TPC-H benchmark and came across following results, 
- total 7 queries were using partition-wise join, 

- Q4 attains a speedup of around 80% compared to the partitioned setup without partition-wise join, the main reason being the poor plan choice at head for partitioned database. 
When I tried this query with forced nested-loop join then it completes in some 45 seconds at head. So, basically when no partition-wise join is present because of terrible selectivity estimation optimiser picks up a hash join plan, which results poorly as the estimated number of rows are two orders of magnitude lesser than actual.
Note that this is not the effect of [1], I tried this without that patch as well.

- other queries show a good 20-30% improvement in performance. Performance numbers are as follows,

Query| un_part_head (seconds) | part_head (seconds) | part_patch (seconds) |
3 | 76 |127 | 88 |
4 |17 | 244 | 41 |
5 | 52 | 123 | 84 |
7 | 73 | 134 | 103 |
10 | 67 | 111 | 89 |
12 | 53 | 114 | 99 |
18 | 447 | 709 | 551 |

The experimental settings used were,

Partitioning: Range partitioning on lineitem and orders on l_orderkey and o_orderkey respectively. The number and range of partitions were kept same for both the tables.

Server parameters:
work_mem - 1GB
effective_cache_size - 8GB
shared_buffers - 8GB
enable_partition_wise_join = on

TPC-H setup:
scale-factor - 20

Commit id - 42171e2cd23c8307bbe0ec64e901f58e297db1c3, also, the patch at [1] was applied in all the cases.
Query plans for the above mentioned queries is attached.

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] Testlib.pm vs msys
Next
From: Rushabh Lathia
Date:
Subject: Re: [HACKERS] cache lookup failed error for partition key with custom opclass