Re: 9.3 Pre-proposal: Range Merge Join - Mailing list pgsql-hackers

From Tom Lane
Subject Re: 9.3 Pre-proposal: Range Merge Join
Date
Msg-id 6227.1334559170@sss.pgh.pa.us
Whole thread Raw
In response to 9.3 Pre-proposal: Range Merge Join  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: 9.3 Pre-proposal: Range Merge Join
Re: 9.3 Pre-proposal: Range Merge Join
List pgsql-hackers
Jeff Davis <pgsql@j-davis.com> writes:
> Proposed solution: a modified merge join that can handle ranges.

>  1. Order the ranges on both sides by the lower bound, then upper bound.
> Empty ranges can be excluded entirely.
>  2. Left := first range on left, Right := first range on right
>  3. If Left or Right is empty, terminate.
>  4. If lower(Left) > upper(Right), discard Right, goto 2
>  5. If lower(Right) > upper(Left), discard Left, goto 2
>  6. return (Left, Right) as joined tuple
>  7. Right := next range on right
>  8. goto 3

This is surely not correct in detail.  As written, it will be impossible
for any tuple on the right side to be joined to more than one left-side
tuple.  You will need something analogous to the mark-and-restore
rewinding logic in standard mergejoin to make this work.

I can't escape the feeling that we could do this with more-or-less
standard mergejoin logic paying attention to some fuzzy notion of
equality, with the question of whether a given pair of ranges actually
overlap being deferred to a filter condition.  Not quite sure how to
make that work though.

>  * Is this idea sane? -- that is, are ranges important enough that
> people are willing to maintain a new operator?

Dunno.  It might be easier to sell the idea of adding support for range
joins in a couple of years, after we've seen how much use ranges get.

I will note that mergejoin is not only one of the more complicated
executor modules, but it accounts for a huge percentage of the planner's
complexity; which doesn't exactly make me sympathize with the notion of
let's-copy-and-paste-all-that.  It'd be a lot better if we could build
it as a small tweak to mergejoin rather than an independent thing.

(Having said that, I have no idea how the planner support could work
at all, because its treatment of mergejoins is intimately tied to
mergejoinable equality and EquivalenceClasses and PathKeys, which don't
seem readily extensible to the range situation.)
        regards, tom lane


pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: 9.3 Pre-proposal: Range Merge Join
Next
From: Nikhil Sontakke
Date:
Subject: Re: how to create a non-inherited CHECK constraint in CREATE TABLE