Re: [HACKERS] advanced partition matching algorithm forpartition-wise join - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [HACKERS] advanced partition matching algorithm forpartition-wise join
Date
Msg-id 20200325190513.rmxr7swainc64io5@development
Whole thread Raw
In response to Re: [HACKERS] advanced partition matching algorithm forpartition-wise join  (Etsuro Fujita <etsuro.fujita@gmail.com>)
Responses Re: [HACKERS] advanced partition matching algorithm forpartition-wise join
Re: [HACKERS] advanced partition matching algorithm forpartition-wise join
List pgsql-hackers
Hi,

I've started reviewing the patch a couple days ago. I haven't done any
extensive testing, but I do have a bunch of initial comments that I can
share now.

1) I wonder if this needs to update src/backend/optimizer/README, which
does have a section about partitionwise joins. It seems formulated in a
way that that probably covers even this more advanced algorithm, but
maybe it should mention handling of default partitions etc.?

There certainly needs to be some description of the algorithm somewhere,
either in a README or before a suitable function. It doesn't have to be
particularly detailed, a rough outline of the matching would be enough,
so that readers don't have to rebuild the knowledge from pieces
scattered around various comments.

2) Do we need another GUC enabling this more complex algorithm? In PG11
the partitionwise join is disabled by default, on the grounds that it's
expensive and not worth it by default. How much more expensive is this?
Maybe it makes sense to allow enabling only the "simple" approach?

3) This comment in try_partitionwise_join is now incorrect, because the
condition may be true even for partitioned tables with (nparts == 0).

     /* Nothing to do, if the join relation is not partitioned. */
     if (joinrel->part_scheme == NULL || joinrel->nparts == 0)
         return;

Moreover, the condition used to be 

     if (!IS_PARTITIONED_REL(joinrel))
         return;

which is way more readable. I think it's net negative to replace these
"nice" macros with clear meaning with complex conditions. If needed, we
can invent new macros. There are many other places where the patch
replaces macros with less readable conditions.


4) I'm a bit puzzled how we could get here with non-partitioned rels?

     /*
      * We can not perform partitionwise join if either of the joining relations
      * is not partitioned.
      */
     if (!IS_PARTITIONED_REL(rel1) || !IS_PARTITIONED_REL(rel2))
         return;

5) I find the "merged" flag in RelOptInfo rather unclear, because it
does not clearly indicate what was merged. Maybe something like
partbounds_merged would be better?

6) The try_partitionwise_join function is getting a bit too long and
harder to understand. The whole block in

     if (joinrel->nparts == -1)
     {
         ...
     }

seems rather well isolated, so I propose to move it to a separate
function responsible only for the merging. We can simply call it on the
joinrel, and make it return right away if (joinrel->nparts == -1).

7) I'd suggest not to reference exact functions in comments unless
abolutely necessary, because it's harder to maintain and it does not
really explain purpose of the struct/code. E.g. consider this:

     /* Per-partitioned-relation data for merge_list_bounds()/merge_range_bounds() */
     typedef struct PartitionMap
     { ... }

Why does it matter where is the struct used? That's pretty trivial to
find using 'git grep' or something. Instead the comment should explain
the purpose of the struct.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Jeff Davis
Date:
Subject: Re: AllocSetEstimateChunkSpace()
Next
From: Pavel Stehule
Date:
Subject: Re: plan cache overhead on plpgsql expression