Re: [HACKERS] advanced partition matching algorithm forpartition-wise join - Mailing list pgsql-hackers
From | Dmitry Dolgov |
---|---|
Subject | Re: [HACKERS] advanced partition matching algorithm forpartition-wise join |
Date | |
Msg-id | CA+q6zcWRaHhW7ZDrdJ0UHKJ_xpSyMEjE9f0F_t=vOq-AzcvvvQ@mail.gmail.com Whole thread Raw |
In response to | Re: [HACKERS] advanced partition matching algorithm forpartition-wise join (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: [HACKERS] advanced partition matching algorithm forpartition-wise join
|
List | pgsql-hackers |
> On Fri, 27 Jul 2018 at 20:13, Robert Haas <robertmhaas@gmail.com> wrote: > > On Fri, Jul 27, 2018 at 3:17 AM, Ashutosh Bapat > <ashutosh.bapat@enterprisedb.com> wrote: > > Apart from the complexity there's also a possibility that this > > skipping will reduce the efficiency actually in normal cases. Consider > > a case where A and B have exactly matching partitions. Current > > partition matching algorithm compare any given range/bound only once > > and we will have O(n) algorithm. If we use a binary search, however, > > for every range comparison, it will be O(n log n) algorithm. There > > will be unnecessary comparisons during binary search. The current > > algorithm is always O(n), whereas binary search would be O(n log(n)) > > with a possibility of having sub-O(n) complexity in some cases. I > > would go for an algorithm which has a consistent time complexity > > always and which is efficient in normal cases, rather than worrying > > about some cases which are not practical. > > Yeah, I think that's a good point. The normal case here will be that > the partition bounds are equal, or that there are a few extra > partitions on one side that don't exist on the other. We don't want > other cases to be crazily inefficient, but I suspect in practice that > if the partitioning bounds aren't pretty close to matching up exactly, > we're going to end up failing to be able to do a partition-wise join > at all. It's not very likely that somebody happens to have a case > where they've partitioned two tables in randomly different ways, but > then they decide to join them anyway, but then it turns out that the > partition bounds happen to be compatible enough that this algorithm > works. > On Mon, 23 Jul 2018 at 10:38, Ashutosh Bapat <ashutosh.bapat@enterprisedb.com> wrote: > 1. those cases are rare enough that we can ignore those right now. How > many times we would encounter the case you have quoted, for example? > Usually the ranges will be continuous only differing in the first or > last partition e.g time-series data. Ok, but what about list partitioning? I believe the area of application for it can be more diverse than mostly just for time-series, and in the patch almost the same approach is used to merge list partitions. Other than that everything seems fine from functionality point of view, and so far I couldn't find any situation that produces a wrong plan. Some of the joins were not that intuitive from the first glance, but at the end everything was according to the documentation. Taking this into account I wonder if it's possible somehow to give any hints in an explain result about why it wasn't possible to apply partition wise join? If something wasn't clear for me I ended up looking at the value of "merged" flag, and to figure out actual reason one needs to trace the entire algorithm. Besides that I checked the performance in some simple cases, no problems on this side (but I'll also do some checks for more complicated joins). And I still have some complaints about readability, although I can imagine that it's just me: * Many functions carry some unrelated arguments just to pass them through, which obscures the purpose of a function. * I want to suggest to introduce a new data structure for partitions mapping instead of just keeping them in arrays (was it discussed already before?). * What is the reason that almost everywhere in the patch there is a naming convention "outer/inner" partition maps, and sometimes (e.g. in generate_matching_part_pairs) it's "map1/map2"?
pgsql-hackers by date: