On Thu, 2009-04-02 at 22:08 -0400, Robert Haas wrote:
> 3. Avoid building the exact same hash table twice in the same query.
> This happens more often than you'd think. For example, a table may have
> two columns creator_id and last_updater_id which both reference person
> (id). If you're considering a hash join between paths A and B, you
> could conceivably check whether what is essentially a duplicate of B
> has already been hashed somewhere within path A. If so, you can reuse
> that same hash table at zero startup-cost.

This is also interesting because that approach has the potential to save
memory, which would let us set work_mem higher and avoid going
multi-batch altogether.
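
To make the reuse case concrete, the kind of query Robert describes might
look something like this (table and column names here are hypothetical):

  SELECT d.title,
         c.name AS creator,
         u.name AS last_updater
  FROM document d
  JOIN person c ON c.id = d.creator_id
  JOIN person u ON u.id = d.last_updater_id;

If the planner hashes person(id) for one of those joins, the other join
could probe the same hash table instead of building its own, roughly
halving the memory needed for the inner sides.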

I would be especially interested in a shared memory hash table that *all*
backends can use, provided the table is mostly read-only, as dimension
tables often are in data warehouse applications. That would give zero
startup cost and a significant reduction in memory use.
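
For instance, in a star schema workload many sessions might each be
running something like this (again with hypothetical names):

  SELECT p.category, sum(s.amount)
  FROM sales s
  JOIN product p ON p.id = s.product_id
  GROUP BY p.category;

Today every backend builds its own private hash table over product; a
shared, read-mostly hash table could be built once and then probed by
all of them.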
--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support