Re: a few crazy ideas about hash joins - Mailing list pgsql-hackers

From Lawrence, Ramon
Subject Re: a few crazy ideas about hash joins
Date
Msg-id 6EEA43D22289484890D119821101B1DF05190DEF@exchange20.mercury.ad.ubc.ca
In response to Re: a few crazy ideas about hash joins  (Greg Stark <stark@enterprisedb.com>)
Responses Re: a few crazy ideas about hash joins  (Grzegorz Jaskiewicz <gj@pointblue.com.pl>)
List pgsql-hackers
> > I would be especially interested in using a shared memory hash table
> > that *all* backends can use - if the table is mostly read-only, as
> > dimension tables often are in data warehouse applications. That would
> > give zero startup cost and significantly reduced memory.
>
> I think that's a non-starter due to visibility issues and handling
> inserts and updates. Even just reusing a hash from one execution in a
> later execution of the same plan would be tricky since we would have
> to expire it if the snapshot changes.

If your data set is nearly read-only, materialized views would be a
better way to go and would require no hash join changes.

The idea of perfect hash functions for dimension tables is very
interesting.  If the data set is nearly static, it is possible to compute
them once in a few minutes for a million-tuple table and then re-use them
until the data changes.  Research has shown that this is possible, but I
do not know if anyone has actually implemented it in a real DBMS.  An
implementation could be something to try if there is interest.
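
To make the idea concrete, here is a minimal sketch of a minimal perfect
hash over a static dimension table.  It uses a simple hash-and-displace
construction, chosen only for illustration; it is not taken from the
research mentioned above or from any PostgreSQL code.  The names
build_perfect_hash, lookup_slot and probe are hypothetical, and keys are
assumed to have distinct repr() values.

```python
import hashlib


def _h(key, seed, m):
    """Seeded hash of key into range(m); blake2b keeps it deterministic."""
    data = f"{seed}:{key!r}".encode()
    digest = hashlib.blake2b(data, digest_size=8).digest()
    return int.from_bytes(digest, "big") % m


def build_perfect_hash(rows):
    """Build a collision-free slot assignment for a static key -> row dict.

    Returns (seeds, table): seeds[b] is the second-level seed for
    first-level bucket b, and table[slot] holds the (key, row) pair.
    """
    n = len(rows)
    if n == 0:
        return [], []
    # First level: partition the keys into n buckets using seed 0.
    buckets = [[] for _ in range(n)]
    for key in rows:
        buckets[_h(key, 0, n)].append(key)
    seeds = [0] * n
    table = [None] * n
    # Place the largest buckets first, while most slots are still free.
    for b in sorted(range(n), key=lambda i: -len(buckets[i])):
        bucket = buckets[b]
        if not bucket:
            break  # every remaining bucket is empty
        seed = 1
        while True:
            slots = [_h(key, seed, n) for key in bucket]
            no_self_clash = len(set(slots)) == len(slots)
            if no_self_clash and all(table[s] is None for s in slots):
                break
            seed += 1  # try the next seed until the bucket fits
        seeds[b] = seed
        for key, slot in zip(bucket, slots):
            table[slot] = (key, rows[key])
    return seeds, table


def lookup_slot(key, seeds):
    """Map a key to its slot using two hash evaluations and no chaining."""
    n = len(seeds)
    return _h(key, seeds[_h(key, 0, n)], n)


def probe(key, seeds, table):
    """Return the row for key, or None if key is not in the table."""
    if not table:
        return None
    stored_key, row = table[lookup_slot(key, seeds)]
    # A key outside the build set still maps to some slot, so verify it.
    return row if stored_key == key else None
```

A probe costs two hash evaluations and one key comparison, with no
collision chains to walk.  The seed search is the expensive part, which is
why this only pays off when the table is rebuilt rarely; any insert or
update to the dimension table would require rebuilding (or otherwise
invalidating) the structure.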

--
Ramon Lawrence

