Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: [HACKERS] Partition-wise join for join between (declaratively)partitioned tables |
Date | |
Msg-id | CAFjFpRe-8P5pzfET4YZRH0Vawd0_o2TK4_zo+AjZ04mfCB3O0A@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables (Antonin Houska <ah@cybertec.at>) |
Responses |
Re: [HACKERS] Partition-wise join for join between (declaratively) partitioned tables
|
List | pgsql-hackers |
On Fri, Sep 1, 2017 at 6:05 PM, Antonin Houska <ah@cybertec.at> wrote: > Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote: > >> I originally thought to provide it along-with the changes to >> expand_inherited_rtentry(), but that thread is taking longer. Jeevan >> Chalke needs rebased patches for his work on aggregate pushdown and >> Thomas might need them for further review. So, here they are. > > Since I have related patch in the current commitfest > (https://commitfest.postgresql.org/14/1247/), I spent some time reviewing your > patch: > > * generate_partition_wise_join_paths() > > Right parenthesis is missing in the prologue. Thanks for pointing that out. Fixed. > > > * get_partitioned_child_rels_for_join() > > I think the Assert() statement is easier to understand inside the loop, see > the assert.diff attachment. The assert at the end of function also checks that we have got child_rels lists for all the parents passed in. That is not checked by your version. Furthermore, we would checked that each child_rels has at least one element while buildings paths for base relations. Checking the same again for joins doesn't add any value. > > > * have_partkey_equi_join() > > As the function handles generic join, this comment doesn't seem to me > relevant: > > /* > * The equi-join between partition keys is strict if equi-join between > * at least one partition key is using a strict operator. See > * explanation about outer join reordering identity 3 in > * optimizer/README > */ > strict_op = op_strict(opexpr->opno); What in that comment is not exactly relevant? > > And I think the function can return true even if strict_op is false for all > the operators evaluated in the loop. I think it does that. Do you have a case where it doesn't? > > > * match_expr_to_partition_keys() > > I'm not sure this comment is clear enough: > > /* > * If it's a strict equi-join a NULL partition key on one side will > * not join a NULL partition key on the other side. So, rows with NULL > * partition key from a partition on one side can not join with those > * from a non-matching partition on the other side. So, search the > * nullable partition keys as well. > */ > if (!strict_op) > continue; > > My understanding of the problem of NULL values generated by outer join is: > these NULL values --- if evaluated by non-strict expression --- can make row > of N-th partition on one side of the join match row(s) of *other than* N-th > partition(s) on the other side. Thus the nullable input expressions may only > be evaluated by strict operators. I think it'd be clearer if you stressed that > (undesired) *match* of partition keys can be a problem, rather than mismatch Sorry, I am not able to understand this. To me it looks like my wording conveys what you are saying. > > If you insist on your wording, then I think you should at least move the > comment below to the part that only deals with strict operators. Done. > > > * There are several places where lfirst_node() macro should be used. For > example > > rel = lfirst_node(RelOptInfo, lc); > > instead of > > rel = (RelOptInfo *) lfirst(lc); Thanks for that. > > > * map_and_merge_partitions() > > Besides a few changes proposed in map_and_merge_partitions.diff (a few of them > to suppress compiler warnings) I think that this part needs more thought: > > { > Assert(mergemap1[index1] != mergemap2[index2] && > mergemap1[index1] >= 0 && mergemap2[index2] >= 0); > > /* > * Both the partitions map to different merged partitions. This > * means that multiple partitions from one relation matches to one > * partition from the other relation. Partition-wise join does not > * handle this case right now, since it requires ganging multiple > * partitions together (into one RelOptInfo). > */ > merged_index = -1; > } > > I could hit this path with the following test: > > CREATE TABLE a(i int) PARTITION BY LIST(i); > CREATE TABLE a_0 PARTITION OF a FOR VALUES IN (0, 2); > CREATE TABLE b(j int) PARTITION BY LIST(j); > CREATE TABLE b_0 PARTITION OF b FOR VALUES IN (1, 2); > > SET enable_partition_wise_join TO on; > > SELECT * > FROM a > FULL JOIN > b ON i = j; > > I don't think there's a reason not to join a_0 partition to b_0, is there? With the latest patchset I am seeing that partition-wise join is used in this case. I have started a new thread [1] for advanced partition matching patches. Please post review comments about the last two patches on that thread. [1] https://www.postgresql.org/message-id/CAFjFpRdjQvaUEV5DJX3TW6pU5eq54NCkadtxHX2JiJG_GvbrCA@mail.gmail.com Attached patchset with above comments addressed. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
pgsql-hackers by date: