Re: [HACKERS] Performance degradation in TPC-H Q18 - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: [HACKERS] Performance degradation in TPC-H Q18
Date	March 3, 2017 06:11:52
Msg-id	CA+TgmoY=NYXxYg4WQsXV5auap+e3ezPF0n3OBYqiCzeW0LXHYw@mail.gmail.com
In response to	Re: [HACKERS] Performance degradation in TPC-H Q18 (Andres Freund <andres@anarazel.de>)
Responses	Re: [HACKERS] Performance degradation in TPC-H Q18
List	pgsql-hackers

On Fri, Mar 3, 2017 at 1:22 AM, Andres Freund <andres@anarazel.de> wrote:
> the resulting hash-values aren't actually meaningfully influenced by the
> IV. Because we just xor with the IV, most hash-values that would have
> fallen into a single hash-bucket without the IV fall into a single
> hash-bucket afterwards as well; just somewhere else in the hash-range.

Wow, OK.  I had kind of assumed (without looking) that setting the
hash IV did something a little more useful than that.  Maybe we should
do something like struct blah { int iv; int hv; }; newhv =
hash_any(&blah, sizeof(blah)).

> In addition to that it seems quite worthwhile to provide an iterator
> that's not vulnerable to this.  An approach that I am, seemingly
> successfully, testing is to iterate the hashtable in multiple (in my
> case 23, because why not) passes, accessing only every nth element. That
> allows the data to be inserted in a lot less "dense" fashion.  But
> that's more an optimization, so I'll just push something like the patch
> mentioned in the thread already.
>
> Makes some sense?

Yep.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
