Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
Date
Msg-id CA+Tgmobn_2zu7-DGBd-L1K8RkPMZAAGPW5pOiZc=6Bq08CN9+Q@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
List pgsql-hackers
On Tue, Apr 18, 2017 at 6:55 AM, Ashutosh Bapat
<ashutosh.bapat@enterprisedb.com> wrote:
> When we merge partition bounds from two relations with different
> partition key types, the merged partition bounds need to have some
> information abound the way those constants look like e.g. their
> length, structure etc. That's the reason we need to store partition
> key types of merged partitioning scheme. Consider a three way join (i4
> JOIN i8 ON i4.x = i8.x) JOIN i2 ON (i2.x = i.x). When we compare
> partition bounds of i4 and i8, we use operators for int4 and int8. The
> join i4 JOIN i8 will get partition bounds by merging those of i4 and
> i8. When we come to join with i2, we need to know which operators to
> use for comparing the partition bounds of the join with those of i2.
>
> So, if the partition key types of the joining relations differ (but
> they have matching partitioning schemes per strategy, natts and
> operator family) the partition bounds of the join are converted to the
> wider type among the partition key types of the joining tree.
> Actually, as I am explained earlier we could choose a wider outer type
> for an OUTER join and shorter type for inner join. This type is used
> as partition key type of the join. In the above case join between i4
> and i8 have its partition bounds converted to i8 (or i4) and then when
> it is joined with i2 the partition bounds of the join are converted to
> i8 (or i2).

I don't understand why you think that partition-wise join needs any
new logic here; if this were a non-partitionwise join, we'd similarly
need to use the correct operator, but the existing code handles that
just fine.  If the join is performed partition-wise, it should use the
same operators that would have been used by a non-partitionwise join
between the same tables.

I think the choice of operator depends only on the column types, and
that the "width" of those types has nothing to do with it.  For
example, if the user writes .WHERE A.x = B.x AND B.x = C.x, the
operator for an A/B join or a B/C join will be the one that appears in
the query; parse analysis will have identified which specific operator
is meant based on the types of the columns.  If the optimizer
subsequently decides to reorder the joins and perform the A/C join
first, it will go hunt down the operator with the same strategy number
in the same operator family that takes the type of A.x on one side and
the type of C.x on the other side.  No problem.  A partition-wise join
between A and C will use that same operator; again, no problem.

Your example involves joining the output of a join between i4 and i8
against i2, so it seems there is some ambiguity about what the input
type should be.  But, again, the planner already copes with this
problem.  In fact, the join is performed either using i4.x or i8.x --
I don't know what happens, or whether it depends on other details of
the query or the plan -- and the operator which can accept that value
on one side and i2.x on the other side is the one that gets used.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: [HACKERS] Unportable implementation of background worker start
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables