Re: a few crazy ideas about hash joins - Mailing list pgsql-hackers

From Greg Stark
Subject Re: a few crazy ideas about hash joins
Date
Msg-id 4136ffa0904031003vc29044dp71bf0cd3964088ca@mail.gmail.com
In response to Re: a few crazy ideas about hash joins  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: a few crazy ideas about hash joins  (Simon Riggs <simon@2ndQuadrant.com>)
Re: a few crazy ideas about hash joins  ("Lawrence, Ramon" <ramon.lawrence@ubc.ca>)
List pgsql-hackers
On Fri, Apr 3, 2009 at 5:41 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> I would be especially interested in using a shared memory hash table
> that *all* backends can use - if the table is mostly read-only, as
> dimension tables often are in data warehouse applications. That would
> give zero startup cost and significantly reduced memory.

I think that's a non-starter due to visibility issues and handling
inserts and updates. Even just reusing a hash from one execution in a
later execution of the same plan would be tricky since we would have
to expire it if the snapshot changes.
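
To make the expiry problem concrete, here is a rough Python sketch of a
per-plan cache that tags the built hash table with the snapshot it was built
under and rebuilds whenever the snapshot differs. Everything here is
hypothetical (HashCache, CachedHash, and the integer snapshot_id are
illustrative names, not anything in the PostgreSQL source), and a real
snapshot is more than a single integer, so treat it as a statement of the
problem rather than a design.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable, List, Optional


@dataclass
class CachedHash:
    snapshot_id: int                    # assumption: a snapshot reduced to one id
    table: Dict[Hashable, List[tuple]]  # join key -> matching inner rows


class HashCache:
    """Hypothetical cache that reuses a hash table across executions of a plan."""

    def __init__(self, build: Callable[[int], Dict[Hashable, List[tuple]]]):
        self._build = build             # scans the inner relation under a snapshot
        self._cached: Optional[CachedHash] = None
        self.builds = 0                 # counts how often we had to rebuild

    def get(self, snapshot_id: int) -> Dict[Hashable, List[tuple]]:
        # Reuse only if the cached table was built under the same snapshot;
        # otherwise the set of visible rows may differ, so expire and rebuild.
        if self._cached is None or self._cached.snapshot_id != snapshot_id:
            self._cached = CachedHash(snapshot_id, self._build(snapshot_id))
            self.builds += 1
        return self._cached.table


def build_dim_hash(snapshot_id: int) -> Dict[Hashable, List[tuple]]:
    # Stand-in for scanning a small dimension table: one more row becomes
    # visible from snapshot 2 onward.
    rows = [(1, "red"), (2, "green")]
    if snapshot_id >= 2:
        rows.append((3, "blue"))
    table: Dict[Hashable, List[tuple]] = {}
    for row in rows:
        table.setdefault(row[0], []).append(row)
    return table
```

Under this sketch a table that really is read-only would be built once and
reused for as long as the snapshot id stays the same, while any change forces
a full rebuild, which is exactly the cost that makes the idea tricky.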

Alternatively, you could say that what you describe is addressed by hash
indexes. The fact that they're not great performers compared to
in-memory hashes comes down to dealing with updates and vacuum, which
is pretty much the same issue.

Hm.  I wonder if we need a whole class of index algorithms to deal
specifically with read-only tables. A hash table on a read-only table
could spend a lot of time to generate a perfect or near-perfect hash
function and then pack the hash table very densely without any bucket
chains. That might make it a big winner over a dynamic structure which
has to handle inserts and so on. I'm assuming that marking the table
read-write would just mark the index invalid, and you would have to
rebuild it once you've marked the table read-only again.
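
As a rough illustration of the densely packed, chain-free table, here is a
small Python sketch using a "hash and displace" construction: keys are first
grouped into buckets, and each bucket gets a seed chosen so that all of its
keys land in distinct free slots of a table with exactly one slot per key.
This is just one well-known way to build a minimal perfect hash over a static
key set, not a proposal for the actual index format; the function names and
the use of blake2b as the seeded hash are my own assumptions.

```python
import hashlib
from typing import Any, Dict, List, Optional, Tuple


def _h(seed: int, key: str, size: int) -> int:
    # Seeded hash: blake2b with the seed passed as salt, reduced to a slot number.
    digest = hashlib.blake2b(
        key.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little") % size


def build_perfect_hash(
    items: Dict[str, Any],
) -> Tuple[List[int], List[Optional[Tuple[str, Any]]]]:
    """Build (seeds, slots) for a fixed key set; slots has exactly len(items) entries."""
    n = len(items)
    if n == 0:
        return [], []
    buckets: List[List[str]] = [[] for _ in range(n)]
    for key in items:
        buckets[_h(0, key, n)].append(key)

    seeds = [0] * n                      # 0 means "bucket is empty"
    slots: List[Optional[Tuple[str, Any]]] = [None] * n
    # Place the largest buckets first, while the table is still mostly empty.
    for b in sorted(range(n), key=lambda i: len(buckets[i]), reverse=True):
        keys = buckets[b]
        if not keys:
            break
        seed = 1
        while True:
            positions = [_h(seed, k, n) for k in keys]
            no_self_collision = len(set(positions)) == len(positions)
            if no_self_collision and all(slots[p] is None for p in positions):
                break
            seed += 1                    # expensive search, done once at build time
        seeds[b] = seed
        for k, p in zip(keys, positions):
            slots[p] = (k, items[k])
    return seeds, slots


def lookup(
    seeds: List[int], slots: List[Optional[Tuple[str, Any]]], key: str
) -> Any:
    """Two hash evaluations and one key comparison; no chain to walk."""
    n = len(slots)
    if n == 0:
        return None
    seed = seeds[_h(0, key, n)]
    if seed == 0:
        return None                      # empty bucket, so the key cannot be present
    entry = slots[_h(seed, key, n)]
    if entry is not None and entry[0] == key:
        return entry[1]
    return None
```

The seed search is where the "spend a lot of time" part goes. Nothing in the
structure can absorb a new key without redoing that search, which fits the
idea of simply invalidating the index when the table goes read-write.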

-- 
greg

