Re: New style of hash join proposal - Mailing list pgsql-hackers

From Decibel!
Subject Re: New style of hash join proposal
Date
Msg-id 2E697850-E210-46F4-8602-BA68E33E5D29@decibel.org
In response to New style of hash join proposal  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: New style of hash join proposal  (Bruce Momjian <bruce@momjian.us>)
List pgsql-hackers
On Dec 13, 2007, at 7:13 AM, Gregory Stark wrote:
> We currently execute a lot of joins as Nested Loops which would be more
> efficient if we could batch together all the outer keys and execute a single
> inner bitmap index scan for all of them together.
>
> Essentially what I'm saying is that we're missing a trick with Hash Joins
> which currently require that we can execute the inner side once without any
> parameters from the outer side.
>
> Instead what we could do is build up the hash table, then scan the hash table
> building up an array of keys and pass them as a parameter to the inner side.
> The inner side could do a bitmap index scan to fetch them all at once and
> start returning them just as normal to the hash join.
>
> There are a couple details:
>
> 1) Batched hash joins. Actually I think this would be fairly straightforward.
>    You want to rescan the inner side once for each batch. That would actually
>    be easier than what we currently do with saving tuples to files and all
>    that.
>
> 2) How to pass the keys. This could be a bit tricky especially for
>    multi-column keys. My first thought was to build up an actual Array node
>    but that only really works for single-column keys I think. Besides it would
>    be more efficient to somehow arrange to pass over a reference to the whole
>    hash.
>
> I fear the real complexity would be (as always) in the planner rather than the
> executor. I haven't really looked into what it would take to arrange this or
> how to decide when to do it.

TODO?
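
For anyone picking this up, here is a rough Python sketch of the control flow
Greg describes: build the hash table, scan it to collect the keys, hand the
whole key array to the other side in one call, then probe as usual. It is only
an illustration, not executor code. The names batched_key_hash_join,
make_index_lookup, hashed_rows and param_lookup are made up for this example,
and a plain dict stands in for the index that a single bitmap index scan would
read.

```python
from collections import defaultdict


def batched_key_hash_join(hashed_rows, key, param_lookup):
    """Join hashed_rows against a relation reachable only via param_lookup(keys).

    param_lookup is a hypothetical stand-in for the parameterized side: it
    receives the full list of join keys once and yields (key, row) pairs.
    """
    # 1. Build the hash table from the side that needs no parameters.
    table = defaultdict(list)
    for row in hashed_rows:
        table[row[key]].append(row)

    # 2. Scan the hash table and collect its distinct keys into one array.
    keys = list(table)

    # 3. Pass the whole array to the other side in a single call; in the
    #    proposal this would be one bitmap index scan covering every key.
    for found_key, found_row in param_lookup(keys):
        # 4. Probe the hash table just as an ordinary hash join would.
        for hashed_row in table.get(found_key, ()):
            yield hashed_row, found_row


def make_index_lookup(index):
    """Return a fake index-scan function plus a log of the calls it received."""
    calls = []

    def lookup(keys):
        calls.append(list(keys))
        # Visit each requested key once, in sorted order.
        for k in sorted(set(keys)):
            for row in index.get(k, ()):
                yield k, row

    return lookup, calls
```

Single-column keys are assumed here, which sidesteps the multi-column problem
from point 2. For a batched join, the same function would be called once per
batch of hashed_rows, which matches the "rescan the inner side once for each
batch" idea in point 1.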
-- 
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828


