Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets - Mailing list pgsql-hackers

From Joshua Tolley
Subject Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
Date
Msg-id e7e0a2570811061522g63a06fa8o4f02972a607840eb@mail.gmail.com
In response to Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Proposed Patch to Improve Performance of Multi-Batch Hash Join for Skewed Data Sets
List pgsql-hackers
On Thu, Nov 6, 2008 at 3:52 PM, Simon Riggs <simon@2ndquadrant.com> wrote:
>
> On Thu, 2008-11-06 at 15:33 -0700, Joshua Tolley wrote:
>
>> Stay tuned.
>
> Minor question on this patch. AFAICS there is another patch that seems
> to be aiming at exactly the same use case. Jonah's Bloom filter patch.
>
> Shouldn't we have a dust off to see which one is best? Or at least a
> discussion to test whether they overlap? Perhaps you already did that
> and I missed it because I'm not very tuned in on this thread.
>
> --
>  Simon Riggs           www.2ndQuadrant.com
>  PostgreSQL Training, Services and Support

We haven't had that discussion AFAIK, and definitely should. First
glance suggests they could coexist peacefully, with proper coaxing. If
I understand things properly, Jonah's patch filters tuples early in
the join process, and this patch tries to ensure that hash join
batches are kept in RAM when they're most likely to be used. So
they're orthogonal in purpose, and the patches actually apply *almost*
cleanly together. Jonah, any comments? If I continue to have time to
devote, then once I've done all I can to review this patch I'll gladly
look at Jonah's too, FWIW.
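
To make the "orthogonal in purpose" point concrete, here is a toy sketch in
Python. It is not code from either patch and does not reflect how PostgreSQL
implements hash joins. Every name in it (BloomFilter, toy_hash_join,
n_skew_keys) is hypothetical, and picking the "hot" keys by counting the outer
input is an assumption standing in for whatever statistics a real planner
would use. It only illustrates that an early filter on outer tuples and a
dedicated in-memory table for the most frequent join values can sit in the
same join without changing its result.

```python
import hashlib
from collections import Counter


class BloomFilter:
    """Tiny Bloom filter: may give false positives, never false negatives."""

    def __init__(self, n_bits=1024, n_hashes=3):
        self.n_bits = n_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(n_bits)  # one byte per bit, for simplicity

    def _positions(self, key):
        # Derive n_hashes bit positions from salted SHA-256 digests.
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.n_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        return all(self.bits[pos] for pos in self._positions(key))


def toy_hash_join(inner, outer, n_batches=4, n_skew_keys=1):
    """Join two lists of (key, payload) tuples on key.

    Returns (results, stats); results holds (key, inner_payload, outer_payload).
    """
    # Idea 1 (early filtering): remember which keys exist on the inner side.
    bloom = BloomFilter()
    for key, _ in inner:
        bloom.add(key)

    # Idea 2 (skew): treat the most frequent outer keys as "hot".
    # Assumption: counted directly here instead of taken from statistics.
    counts = Counter(key for key, _ in outer)
    hot_keys = {key for key, _ in counts.most_common(n_skew_keys)}

    # Hot inner tuples get their own table, imagined as always in memory;
    # everything else is partitioned into batches that could spill to disk.
    hot_table = {}
    batches = [dict() for _ in range(n_batches)]
    for key, payload in inner:
        table = hot_table if key in hot_keys else batches[hash(key) % n_batches]
        table.setdefault(key, []).append(payload)

    results = []
    stats = {"filtered": 0, "hot_probes": 0, "batch_probes": 0}
    for key, payload in outer:
        if not bloom.might_contain(key):
            stats["filtered"] += 1  # dropped before touching any batch
            continue
        if key in hot_keys:
            stats["hot_probes"] += 1
            table = hot_table
        else:
            stats["batch_probes"] += 1
            table = batches[hash(key) % n_batches]
        for inner_payload in table.get(key, []):
            results.append((key, inner_payload, payload))
    return results, stats
```

Each mechanism can be removed without affecting the other: dropping the Bloom
filter only means more probes, and setting n_skew_keys=0 only means every
probe goes through the regular batches.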

- Josh

