Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
Date
Msg-id CAFjFpRedUZPa7tKbCLEGK3u5UWdDNQoN=eYfb7ieG5d0D1PbsQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Mon, Mar 20, 2017 at 10:17 PM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
>>
>> On a further testing of this patch I find another case when it is
>> showing regression, the time taken with patch is around 160 secs and
>> without it is 125 secs.
>> Another minor thing to note that is planning time is almost twice with
>> this patch, though I understand that this is for scenarios with really
>> big 'big data' so this may not be a serious issue in such cases, but
>> it'd be good if we can keep an eye on this that it doesn't exceed the
>> computational bounds for a really large number of tables.
>
> Right, planning time would be proportional to the number of partitions
> at least in the first version. We may improve upon it later.
>
>> Please find the attached .out file to check the output I witnessed and
>> let me know if anymore information is required
>> Schema and data was similar to the preciously shared schema with the
>> addition of more data for this case, parameter settings used were:
>> work_mem = 1GB
>> random_page_cost = seq_page_cost = 0.1

this doesn't look good. Why do you set both these costs to the same value?

>
> The patch does not introduce any new costing model. It costs the
> partition-wise join as sum of costs of joins between partitions. The
> method to create the paths for joins between partitions is same as
> creating the paths for joins between regular tables and then the
> method to collect paths across partition-wise joins is same as
> collecting paths across child base relations. So, there is a large
> chance that the costing for joins between partitions might have a
> problem which is showing up here. There may be some special handling
> for regular tables versus child tables that may be the root cause. But
> I have not seen that kind of code till now.
>
> Can you please provide the outputs of individual partition-joins? If
> the plans for joins between partitions are same as the ones chosen for
> partition-wise joins, we may need to fix the existing join cost
> models.

Offlist, Rafia shared the outputs of joins between partitions and join
between partitioned table. The joins between partitions look similar
to those pick up by the partition-wise join. So, it looks that some
costing error in regular joins is resulting in an costing error in
partition-wise join as suspected. Attached the SQL and the output.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

pgsql-hackers by date:

Previous
From: Etsuro Fujita
Date:
Subject: [HACKERS] postgres_fdw: correct regression test for parameterized scan forforeign table
Next
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables