Home > mailing lists

Re: [HACKERS] advanced partition matching algorithm forpartition-wise join - Mailing list pgsql-hackers

From	Ashutosh Bapat
Subject	Re: [HACKERS] advanced partition matching algorithm forpartition-wise join
Date	April 1, 2020 20:12:01
Msg-id	CAG-ACPVkw+cLM0HNsZACJxq=M8JFn7cfH-eLLzBRZDkD9ZN+2A@mail.gmail.com Whole thread Raw
In response to	Re: [HACKERS] advanced partition matching algorithm forpartition-wise join (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses	Re: [HACKERS] advanced partition matching algorithm forpartition-wise join (Etsuro Fujita <etsuro.fujita@gmail.com>)
List	pgsql-hackers

Tree view

On Thu, 26 Mar 2020 at 00:35, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:

Hi,

I've started reviewing the patch a couple days ago. I haven't done any
extensive testing, but I do have a bunch of initial comments that I can
share now.

1) I wonder if this needs to update src/backend/optimizer/README, which
does have a section about partitionwise joins. It seems formulated in a
way that that probably covers even this more advanced algorithm, but
maybe it should mention handling of default partitions etc.?

Done. Please check the wording. It might need some word smithy.

There certainly needs to be some description of the algorithm somewhere,
either in a README or before a suitable function. It doesn't have to be
particularly detailed, a rough outline of the matching would be enough,
so that readers don't have to rebuild the knowledge from pieces
scattered around various comments.

The algorithm for list and range partitioned tables is slightly different. So, I have added separate prologue to each list_merge_bounds() and range_merge_bounds(). Please check if that serves the purpose.

2) Do we need another GUC enabling this more complex algorithm? In PG11
the partitionwise join is disabled by default, on the grounds that it's
expensive and not worth it by default. How much more expensive is this?
Maybe it makes sense to allow enabling only the "simple" approach?

We have reduced the complexity of merging bounds quite a bit so this shouldn't be costly. Further more we handle the usual case of equal bounds quickly using the merge flag so most of the cases should be fine. It's only when two partitioned tables with same partition scheme are joined but do not have merge-able bounds that this algorithm would not yield useful result - but that would be rare in the field. enable_partitionwise_join = false should take care of such scenarios easily. I am not in favour of adding another GUC which we set to false by default and then take another few releases to make it true by default.

3) This comment in try_partitionwise_join is now incorrect, because the
condition may be true even for partitioned tables with (nparts == 0).

/* Nothing to do, if the join relation is not partitioned. */
if (joinrel->part_scheme == NULL || joinrel->nparts == 0)
return;

If part_scheme = NULL, npart should be 0, fixed that in build_joinrel_partition_info(). If partscheme != NULL but bounds can not be merged, nparts = 0. So this condition is correct. Encapsulated in a macro IS_JOINREL_NOT_PARTITITIONED(). and added comments for the macro. Given that the macro is used exactly at one place, it may not be necessary to define it but it looks *nice*.

Moreover, the condition used to be

if (!IS_PARTITIONED_REL(joinrel))
return;

which is way more readable. I think it's net negative to replace these
"nice" macros with clear meaning with complex conditions. If needed, we
can invent new macros. There are many other places where the patch
replaces macros with less readable conditions.

The only other place where we have replaced a *nice* macro is in build_joinrel_partition_info(). But I think it's a legit replacement. I have added a comment there.

4) I'm a bit puzzled how we could get here with non-partitioned rels?

/*
* We can not perform partitionwise join if either of the joining relations
* is not partitioned.
*/
if (!IS_PARTITIONED_REL(rel1) || !IS_PARTITIONED_REL(rel2))
return;

See the comment I have added in build_joinrel_partition_info(). Not all joining pairs for a given relation are partitioned.

5) I find the "merged" flag in RelOptInfo rather unclear, because it
does not clearly indicate what was merged. Maybe something like
partbounds_merged would be better?

Done.

6) The try_partitionwise_join function is getting a bit too long and
harder to understand. The whole block in

if (joinrel->nparts == -1)
{
...
}

seems rather well isolated, so I propose to move it to a separate
function responsible only for the merging. We can simply call it on the
joinrel, and make it return right away if (joinrel->nparts == -1).

Looks like you have already taken care of this one in one of your patches.

7) I'd suggest not to reference exact functions in comments unless
abolutely necessary, because it's harder to maintain and it does not
really explain purpose of the struct/code. E.g. consider this:

/* Per-partitioned-relation data for merge_list_bounds()/merge_range_bounds() */
typedef struct PartitionMap
{ ... }

Why does it matter where is the struct used? That's pretty trivial to
find using 'git grep' or something. Instead the comment should explain
the purpose of the struct.

Adjusted names and comments a bit.

Best Wishes,

Ashutosh

pgsql-hackers by date:

From: Peter Geoghegan
Date: 01 April 2020, 20:03:37
Subject: Re: snapshot too old issues, first around wraparound and then more.

From: Robert Haas
Date: 01 April 2020, 20:27:56
Subject: Re: snapshot too old issues, first around wraparound and then more.

Re: [HACKERS] advanced partition matching algorithm forpartition-wise join - Mailing list pgsql-hackers

Previous

Next