Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables - Mailing list pgsql-hackers

From Antonin Houska
Subject Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables
Date
Msg-id 11888.1504782175@localhost
Whole thread Raw
In response to Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables  (Ashutosh Bapat <ashutosh.bapat@enterprisedb.com>)
Responses Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables
List pgsql-hackers
Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:

> On Fri, Sep 1, 2017 at 6:05 PM, Antonin Houska <ah@cybertec.at> wrote:
> >
> >
> >
> > * get_partitioned_child_rels_for_join()
> >
> > I think the Assert() statement is easier to understand inside the loop, see
> > the assert.diff attachment.

> The assert at the end of function also checks that we have got
> child_rels lists for all the parents passed in.

Really? I can imagine that some instances of PartitionedChildRelInfo have the
child_rels list empty, while other ones have these lists long enough to
compensate for the empty lists.

> >
> >
> > * have_partkey_equi_join()
> >
> > As the function handles generic join, this comment doesn't seem to me
> > relevant:
> >
> >     /*
> >      * The equi-join between partition keys is strict if equi-join between
> >      * at least one partition key is using a strict operator. See
> >      * explanation about outer join reordering identity 3 in
> >      * optimizer/README
> >      */
> >     strict_op = op_strict(opexpr->opno);
>
> What in that comment is not exactly relevant?

Basically I don't understand why you mention join reordering here. The join
ordering questions must all have been resolved by the time
have_partkey_equi_join() is called.

> >
> > And I think the function can return true even if strict_op is false for all
> > the operators evaluated in the loop.
>
> I think it does that. Do you have a case where it doesn't?

Here I refer to this part of the comment above:

"... if equi-join between at least one partition key is using a strict
operator."

My understanding of the code (especially match_expr_to_partition_keys) is that
no operator actually needs to be strict as long as each operator involved in
the join matches at least one non-nullable expression on both sides of the
join.

> > * match_expr_to_partition_keys()
> >
> > I'm not sure this comment is clear enough:
> >
> >     /*
> >      * If it's a strict equi-join a NULL partition key on one side will
> >      * not join a NULL partition key on the other side. So, rows with NULL
> >      * partition key from a partition on one side can not join with those
> >      * from a non-matching partition on the other side. So, search the
> >      * nullable partition keys as well.
> >      */
> >     if (!strict_op)
> >             continue;
> >
> > My understanding of the problem of NULL values generated by outer join is:
> > these NULL values --- if evaluated by non-strict expression --- can make row
> > of N-th partition on one side of the join match row(s) of *other than* N-th
> > partition(s) on the other side. Thus the nullable input expressions may only
> > be evaluated by strict operators. I think it'd be clearer if you stressed that
> > (undesired) *match* of partition keys can be a problem, rather than mismatch
>
> Sorry, I am not able to understand this. To me it looks like my
> wording conveys what you are saying.

I just tried to expreess the idea in a way that is clearer to me. I think we
both mean the same. Not sure I should spend more effort on another version of
the comment.

> > If you insist on your wording, then I think you should at least move the
> > comment below to the part that only deals with strict operators.
>
> Done.

o.k.

> >
> > * map_and_merge_partitions()
> >
> > Besides a few changes proposed in map_and_merge_partitions.diff (a few of them
> > to suppress compiler warnings) I think that this part needs more thought:
> >
> >     {
> >             Assert(mergemap1[index1] != mergemap2[index2] &&
> >                        mergemap1[index1] >= 0 && mergemap2[index2] >= 0);
> >
> >             /*
> >              * Both the partitions map to different merged partitions. This
> >              * means that multiple partitions from one relation matches to one
> >              * partition from the other relation. Partition-wise join does not
> >              * handle this case right now, since it requires ganging multiple
> >              * partitions together (into one RelOptInfo).
> >              */
> >             merged_index = -1;
> >     }
> >
> > I could hit this path with the following test:
> >
> > CREATE TABLE a(i int) PARTITION BY LIST(i);
> > CREATE TABLE a_0 PARTITION OF a FOR VALUES IN (0, 2);
> > CREATE TABLE b(j int) PARTITION BY LIST(j);
> > CREATE TABLE b_0 PARTITION OF b FOR VALUES IN (1, 2);
> >
> > SET enable_partition_wise_join TO on;
> >
> > SELECT  *
> > FROM    a
> >         FULL JOIN
> >         b ON i = j;
> >
> > I don't think there's a reason not to join a_0 partition to b_0, is there?
>
> With the latest patchset I am seeing that partition-wise join is used
> in this case. I have started a new thread [1] for advanced partition
> matching patches.

What plan do you get, with the patches from

https://www.postgresql.org/message-id/CAFjFpRfdXpuSu0pxON3dKcr8WndJkaXLzHUVax_Laod0Tgc6UQ@mail.gmail.com

I still see the join above Append, not below:
                              QUERY PLAN
-------------------------------------------------------------------------Merge Full Join  (cost=359.57..860.00
rows=32512width=8)  Merge Cond: (a_0.i = b_0.j)  ->  Sort  (cost=179.78..186.16 rows=2550 width=4)        Sort Key:
a_0.i       ->  Append  (cost=0.00..35.50 rows=2550 width=4)              ->  Seq Scan on a_0  (cost=0.00..35.50
rows=2550width=4)  ->  Sort  (cost=179.78..186.16 rows=2550 width=4)        Sort Key: b_0.j        ->  Append
(cost=0.00..35.50rows=2550 width=4)              ->  Seq Scan on b_0  (cost=0.00..35.50 rows=2550 width=4) 

> Please post review comments about the last two patches on that thread.

ok, I'll do if I find any problem.

> [1] https://www.postgresql.org/message-id/CAFjFpRdjQvaUEV5DJX3TW6pU5eq54NCkadtxHX2JiJG_GvbrCA@mail.gmail.com

--
Antonin Houska
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de, http://www.cybertec.at



pgsql-hackers by date:

Previous
From: Alexey Chernyshov
Date:
Subject: Re: [HACKERS] [PATCH] Pageinspect - add functions on GIN and GiSTindexes from gevel
Next
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] WIP: Aggregation push-down