Re: [HACKERS] advanced partition matching algorithm forpartition-wise join - Mailing list pgsql-hackers

From Etsuro Fujita
Subject Re: [HACKERS] advanced partition matching algorithm forpartition-wise join
Date
Msg-id CAPmGK15kXh76EnRn=B64u=+qLxZOKWROd4uMjMBnWcsZPdQ_mw@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] advanced partition matching algorithm forpartition-wise join  (Etsuro Fujita <etsuro.fujita@gmail.com>)
Responses Re: [HACKERS] advanced partition matching algorithm forpartition-wise join
List pgsql-hackers
On Fri, Nov 15, 2019 at 6:10 PM Etsuro Fujita <etsuro.fujita@gmail.com> wrote:
> On Fri, Nov 15, 2019 at 2:21 PM amul sul <sulamul@gmail.com> wrote:
> > Thank you Fujita san for the patch & the enhancements.  I am fine with your
> > delta patch.
>
> OK, I'll merge the delta patch with the main one in the next version,
> if no objections from others.
>
> > I would like to share some thought for other code:
> >
> > File: partbounds.c:
> > 3298 static void
> > 3299 get_merged_range_bounds(int partnatts, FmgrInfo *partsupfuncs,
> > 3300                         Oid *partcollations, JoinType jointype,
> > 3301                         PartitionRangeBound *left_lb,
> > 3302                         PartitionRangeBound *left_ub,
> > 3303                         PartitionRangeBound *right_lb,
> > 3304                         PartitionRangeBound *right_ub,
> > 3305                         PartitionRangeBound *merged_lb,
> > 3306                         PartitionRangeBound *merged_ub)
> >
> > Can we rename these argument as inner_*  & outer_*  like we having for the
> > functions, like 0003 patch?
>
> +1  (Actually, I too was thinking about that.)
>
> > File: partbounds.c:
> > 3322
> > 3323         case JOIN_INNER:
> > 3324         case JOIN_SEMI:
> > 3325             if (compare_range_bounds(partnatts, partsupfuncs, partcollations,
> > 3326                                      left_ub, right_ub) < 0)
> > 3327                 *merged_ub = *left_ub;
> > 3328             else
> > 3329                 *merged_ub = *right_ub;
> > 3330
> > 3331             if (compare_range_bounds(partnatts, partsupfuncs, partcollations,
> > 3332                                      left_lb, right_lb) > 0)
> > 3333                 *merged_lb = *left_lb;
> > 3334             else
> > 3335                 *merged_lb = *right_lb;
> > 3336             break;
> > 3337
> >
> > How about reusing ub_cmpval & lb_cmpval  instead of calling
> > compare_range_bounds() inside get_merged_range_bounds(), like 0004 patch?
>
> Good idea!  So, +1
>
> > Apart from this, I would like to propose 0005 cleanup patch where I have
> > rearranged function arguments & code to make sure the arguments & the code
> > related to lower bound should come first and then the upper bound.
>
> +1
>
> I'll merge these changes as well into the main patch, if no objections
> of others.

Done.  I modified compare_range_partitions() as well to match its the
variable names with others.  Attached is a new version of the patch.

I reviewed the rest of the partbounds.c changes.  Here are my review comments:

* I don't think this analysis is correct:

+/*
+ * merge_null_partitions
+ *
+ * Merge NULL partitions, i.e. a partition that can hold NULL values for a lis\
t
+ * partitioned table, if any. Find the index of merged partition to which the
+ * NULL values would belong in the join result. If one joining relation has a
+ * NULL partition but not the other, try matching it with the default partitio\
n
+ * from the other relation since the default partition may have rows with NULL
+ * partition key. We can eliminate a NULL partition when it appears only on th\
e
+ * inner side of the join and the outer side doesn't have a default partition.
+ *
+ * When the equality operator used for join is strict, two NULL values will no\
t
+ * be considered as equal, and thus a NULL partition can be eliminated for an
+ * inner join. But we don't check the strictness operator here.
+ */

First of all, I think we can assume here that the equality operator is
*strict*, because 1) have_partkey_equi_join(), which is executed
before calling merge_null_partitions(), requires that the
corresponding clause is mergejoinable, and 2) currently, we assume
that mergejoinable operators are strict (see MJEvalOuterValues() and
MJEvalInnerValues()).  So I don't think it's needed to try merging a
NULL partition on one side with the default partition on the other
side as above.  (merge_null_partitions() tries merging NULL partitions
as well, but in some cases I don't think we need to do that, either.)
So I rewrote merge_null_partitions() completely.  Another change is
that I eliminated the NULL partition of the joinrel in more cases.
Attached is a patch (v28-0002-modify-merge_null_partitions.patch) for
that created on top of the main patch.  I might be missing something,
though.

Other changes in that patch:

* I fixed memory leaks in partition_list_bounds_merge() and
partition_range_bounds_merge().

* I noticed this in merge_default_partitions():

+       Assert(outer_has_default && inner_has_default);
+
+       *default_index = map_and_merge_partitions(outer_map,
+                                                 inner_map,
+                                                 outer_default,
+                                                 inner_default,
+                                                 next_index);
+       if (*default_index == -1)
+           return false;

I think the merging should be done successfully, because of 1) this in
process_outer_partition():

+   if (inner_has_default)
+   {
+       Assert(inner_default >= 0);
+
+       /*
+        * If the outer side has the default partition as well, we need to
+        * merge the default partitions (see merge_default_partitions()); give
+        * up on it.
+        */
+       if (outer_has_default)
+           return false;

and 2) this in process_inner_partition():

+   if (outer_has_default)
+   {
+       Assert(outer_default >= 0);
+
+       /*
+        * If the inner side has the default partition as well, we need to
+        * merge the default partitions (see merge_default_partitions()); give
+        * up on it.
+        */
+       if (inner_has_default)
+           return false;

So I removed the above if test (ie, *default_index == -1).  I also
modified that function a bit further, including comments.

* I simplified process_outer_partition() and process_inner_partition()
a bit, changing the APIs to match that of map_and_merge_partitions().
Also, I think this in these functions is a bit ugly;

+           /* Don't add this index to the list of merged indexes. */
+           *merged_index = -1;

so I removed it, and modified the condition on whether or not we add
merged bounds to the lists in partition_list_bounds_merge() and
partition_range_bounds_merge(), instead.

That's it.  Sorry for the delay.

Best regards,
Etsuro Fujita

Attachment

pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: base backup client as auxiliary backend process
Next
From: Andrew Dunstan
Date:
Subject: Re: TAP tests aren't using the magic words for Windows file access