
Re: A better way than tweaking NTUP_PER_BUCKET - Mailing list pgsql-hackers

From	Stephen Frost
Subject	Re: A better way than tweaking NTUP_PER_BUCKET
Date	June 23, 2013 16:41:26
Msg-id	CAOuzzgq0H2-CcMUy-xZwY0D-6od5JMdH8nvfR=mO_KJJveotKA@mail.gmail.com
In response to	Re: A better way than tweaking NTUP_PER_BUCKET (Simon Riggs <simon@2ndQuadrant.com>)
List	pgsql-hackers

On Sunday, June 23, 2013, Simon Riggs wrote:

On 23 June 2013 03:16, Stephen Frost <sfrost@snowman.net> wrote:

> Will think on it more.

Some other thoughts related to this...

* Why are we building a special kind of hash table? Why don't we just
use the hash table code that we use in every other place in the backend?
If that code is so bad, why do we use it everywhere else? That is
extensible, so we could try just using that. (Has anyone actually
tried?)

I've not looked at the hash table in the rest of the backend.

* We're not thinking about cache locality and set correspondence
either. If the join is expected to hardly ever match, then we should
be using a bitmap as a bloom filter rather than assuming that a very
large hash table is easily accessible.

That's what I was suggesting earlier, though I don't think it's technically a bloom filter; doesn't that require multiple hash functions? I don't think we want to require every data type to provide multiple hash functions.
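To make the distinction concrete, here is a rough sketch (not PostgreSQL code, and not something proposed in this thread) of a bitmap filter over the inner side's keys. With nprobes=1 it is the plain one-bit-per-hash bitmap; with nprobes>1 it behaves like a bloom filter, but the extra bit positions are derived from the two halves of a single hash value (double hashing), so the data type still supplies only one hash function. The names BitmapFilter, base_hash and nprobes are made up for the example, and blake2b is only a stand-in for whatever per-type hash function is already available.

```python
import hashlib


def base_hash(key: bytes) -> int:
    """Stand-in 64-bit hash (assumption: any decent per-type hash would do)."""
    return int.from_bytes(hashlib.blake2b(key, digest_size=8).digest(), "little")


class BitmapFilter:
    """Bit array probed at nprobes positions per key.

    nprobes=1 is a single-hash bitmap; nprobes>1 is bloom-filter-like,
    with every position derived from one hash value.
    """

    def __init__(self, nbits: int, nprobes: int = 1):
        self.nbits = nbits
        self.nprobes = nprobes
        self.bits = bytearray((nbits + 7) // 8)

    def _positions(self, key: bytes):
        h = base_hash(key)
        h1 = h & 0xFFFFFFFF      # low 32 bits: first probe position
        h2 = (h >> 32) | 1       # high 32 bits, forced odd: non-zero stride
        for i in range(self.nprobes):
            yield (h1 + i * h2) % self.nbits

    def add(self, key: bytes) -> None:
        for pos in self._positions(key):
            self.bits[pos >> 3] |= 1 << (pos & 7)

    def might_contain(self, key: bytes) -> bool:
        # False means the key was definitely never added;
        # True means "possibly present", so the hash table must be probed.
        return all(self.bits[pos >> 3] & (1 << (pos & 7))
                   for pos in self._positions(key))
```

The idea would be to add every inner-side key while building the hash table and to call might_contain() on each outer key first, skipping the hash table lookup when it returns False. Whether the extra probes pay for themselves depends on the match rate and on the bitmap fitting in cache, which this sketch does not measure.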

* The skew hash table will be hit frequently and would show good L2
cache usage. I think I'll try adding the skew table always to see if
that improves the speed of the hash join.

The skew table is just for common values though... To be honest, I have some doubts about that structure really being a terribly good approach for anything which is completely in memory.

Thanks,

Stephen
