Re: 9.3 Pre-proposal: Range Merge Join - Mailing list pgsql-hackers

From Robert Haas
Subject Re: 9.3 Pre-proposal: Range Merge Join
Date
Msg-id CA+TgmoYSKvnGDB7ZRHaus-AEHq_dLUKkGxOcgALRpMr=it1PHw@mail.gmail.com
Whole thread Raw
In response to Re: 9.3 Pre-proposal: Range Merge Join  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: 9.3 Pre-proposal: Range Merge Join
Re: 9.3 Pre-proposal: Range Merge Join
List pgsql-hackers
On Mon, Apr 16, 2012 at 1:43 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> On Mon, 2012-04-16 at 02:52 -0400, Tom Lane wrote:
>> Jeff Davis <pgsql@j-davis.com> writes:
>> >  1. Order the ranges on both sides by the lower bound, then upper bound.
>> > Empty ranges can be excluded entirely.
>> >  2. Left := first range on left, Right := first range on right
>> >  3. If Left or Right is empty, terminate.
>> >  4. If lower(Left) > upper(Right), discard Right, goto 2
>> >  5. If lower(Right) > upper(Left), discard Left, goto 2
>> >  6. return (Left, Right) as joined tuple
>> >  7. Right := next range on right
>> >  8. goto 3
>>
>> This is surely not correct in detail.  As written, it will be impossible
>> for any tuple on the right side to be joined to more than one left-side
>> tuple.  You will need something analogous to the mark-and-restore
>> rewinding logic in standard mergejoin to make this work.
>
> Every time you discard a tuple on the left, you go to step 2, which
> rewinds the right side back to the first non-discarded tuple.
>
> So, implemented using mark and restore:
>
>  * start off with the marks at the first tuple on each side
>  * "discard" means move the mark down a tuple
>  * setting it back to the first range means restoring to the mark
>  * going to the next range (step 7) just means getting another
>    tuple, without changing the mark
>
> Only one side really needs the mark and restore logic, but it was easier
> to write the pseudocode in a symmetrical way (except step 7).

I'm actually not sure these are equivalent formulations.  Suppose one
side has [i,i] where i ranges from 1 to 10000000 and the other side
the exact same thing plus [1,10000000].  That one really big range
will come up second on the right side, and you will not be able to
discard it until you reach the end of the join.  If you just keep
rewinding the right side, you're going to end up with O(n^2) behavior,
whereas if you can discard tuples "from the middle" on the right side,
then you will get O(n) behavior, which is a big difference.  In other
words, in your original algorithm the tuples that you discard in step
4 or 5 need not be the first remaining tuple on whichever side of the
join we're talking about.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Greg Smith
Date:
Subject: Re: Bug tracker tool we need
Next
From: Robert Haas
Date:
Subject: Re: 9.3 Pre-proposal: Range Merge Join