Re: [HACKERS] advanced partition matching algorithm forpartition-wise join - Mailing list pgsql-hackers

From Dmitry Dolgov
Subject Re: [HACKERS] advanced partition matching algorithm forpartition-wise join
Date
Msg-id CA+q6zcWRaHhW7ZDrdJ0UHKJ_xpSyMEjE9f0F_t=vOq-AzcvvvQ@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] advanced partition matching algorithm forpartition-wise join  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [HACKERS] advanced partition matching algorithm forpartition-wise join
List pgsql-hackers
> On Fri, 27 Jul 2018 at 20:13, Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Fri, Jul 27, 2018 at 3:17 AM, Ashutosh Bapat
> <ashutosh.bapat@enterprisedb.com> wrote:
> > Apart from the complexity there's also a possibility that this
> > skipping will reduce the efficiency actually in normal cases. Consider
> > a case where A and B have exactly matching partitions. Current
> > partition matching algorithm compare any given range/bound only once
> > and we will have O(n) algorithm. If we use a binary search, however,
> > for every range comparison, it will be O(n log n) algorithm. There
> > will be unnecessary comparisons during binary search. The current
> > algorithm is always O(n), whereas binary search would be O(n log(n))
> > with a possibility of having sub-O(n) complexity in some cases. I
> > would go for an algorithm which has a consistent time complexity
> > always and which is efficient in normal cases, rather than worrying
> > about some cases which are not practical.
>
> Yeah, I think that's a good point.  The normal case here will be that
> the partition bounds are equal, or that there are a few extra
> partitions on one side that don't exist on the other.  We don't want
> other cases to be crazily inefficient, but I suspect in practice that
> if the partitioning bounds aren't pretty close to matching up exactly,
> we're going to end up failing to be able to do a partition-wise join
> at all.  It's not very likely that somebody happens to have a case
> where they've partitioned two tables in randomly different ways, but
> then they decide to join them anyway, but then it turns out that the
> partition bounds happen to be compatible enough that this algorithm
> works.

> On Mon, 23 Jul 2018 at 10:38, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote:
> 1. those cases are rare enough that we can ignore those right now. How
> many times we would encounter the case you have quoted, for example?
> Usually the ranges will be continuous only differing in the first or
> last partition e.g time-series data.

Ok, but what about list partitioning? I believe the area of application for it
can be more diverse than mostly just for time-series, and in the patch almost
the same approach is used to merge list partitions.

Other than that everything seems fine from functionality point of view, and so
far I couldn't find any situation that produces a wrong plan. Some of the joins
were not that intuitive from the first glance, but at the end everything was
according to the documentation.
Taking this into account I wonder if it's possible somehow to give any hints in
an explain result about why it wasn't possible to apply partition wise join? If
something wasn't clear for me I ended up looking at the value of "merged" flag,
and to figure out actual reason one needs to trace the entire algorithm.

Besides that I checked the performance in some simple cases, no problems on
this side (but I'll also do some checks for more complicated joins).

And I still have some complaints about readability, although I can imagine that
it's just me:

* Many functions carry some unrelated arguments just to pass them through,
  which obscures the purpose of a function.

* I want to suggest to introduce a new data structure for partitions mapping
  instead of just keeping them in arrays (was it discussed already before?).

* What is the reason that almost everywhere in the patch there is a naming
  convention "outer/inner" partition maps, and sometimes (e.g. in
  generate_matching_part_pairs) it's "map1/map2"?


pgsql-hackers by date:

Previous
From: Fabien COELHO
Date:
Subject: Re: [HACKERS] proposal: schema variables
Next
From: David Rowley
Date:
Subject: Make executor's Range Table an array instead of a List