Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays - Mailing list pgsql-hackers

From James Coleman
Subject Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Date
Msg-id CAAaqYe956a-zbm1qR8pqz=iLbF8LW5vBrQGrzXVHXdLk3at5_Q@mail.gmail.com
Whole thread Raw
In response to Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays  (James Coleman <jtc331@gmail.com>)
Responses Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
I cc'd Andres given his commit introduced simplehash, so I figured
he'd probably have a few pointers on when each one might be useful.

On Tue, Apr 28, 2020 at 8:39 AM James Coleman <jtc331@gmail.com> wrote:
...
> > Any particular reasons to pick dynahash over simplehash? ISTM we're
> > using simplehash elsewhere in the executor (grouping, tidbitmap, ...),
> > while there are not many places using dynahash for simple short-lived
> > hash tables. Of course, that alone is a weak reason to insist on using
> > simplehash here, but I suppose there were reasons for not using dynahash
> > and we'll end up facing the same issues here.
>
> No particular reason; it wasn't clear to me that there was a reason to
> prefer one or the other (and I'm not acquainted with the codebase
> enough to know the differences), so I chose dynahash because it was
> easier to find examples to follow for implementation.

Do you have any thoughts on what the trade-offs/use-cases etc. are for
dynahash versus simple hash?

From reading the commit message in b30d3ea824c it seems like simple
hash is faster and optimized for CPU cache benefits. The comments at
the top of simplehash.h also discourage it's use in non
performance/space sensitive uses, but there isn't anything I can see
that explicitly tries to discuss when dynahash is useful, etc.

Given the performance notes in that commit message, I thinking
switching to simple hash is worth it.

But I also wonder if there might be some value in a README or comments
addition that would be a guide to what the various hash
implementations are useful for. If there's interest, I could try to
type something short up so that we have something to make the code
base a bit more discoverable.

James



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: More efficient RI checks - take 2
Next
From: Tomas Vondra
Date:
Subject: Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays