Re: Change GUC hashtable to use simplehash? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Change GUC hashtable to use simplehash?
Date
Msg-id 20231122212221.ahmlio2yfx7jla6z@awork3.anarazel.de
In response to Re: Change GUC hashtable to use simplehash?  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Change GUC hashtable to use simplehash?
List pgsql-hackers
Hi,

On 2023-11-22 15:56:21 -0500, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2023-11-21 16:42:55 +0700, John Naylor wrote:
> >> The strlen call required for hashbytes() is not free. The lack of
> >> mixing in the (probably inlined after 0001) previous hash function can
> >> be remedied directly, as in the attached:
>
> > I doubt this is a good hash function. For short strings, sure, but after
> > that...  I don't think it makes sense to reduce the internal state of a hash
> > function to something this small.
>
> GUC names are just about always short, though, so I'm not sure you've
> made your point?

By "short" I meant <= 6 characters (32 / 5 = 6.4). After that you're
overwriting bits that you previously set, without dispersing the "overwritten"
bits throughout the hash state.

It's pretty easy to create conflicts this way, even just on paper. E.g. I
think abcdefgg and cbcdefgw would have the same hash, because the accumulated
value passed to murmurhash32 is the same.

The fact that this happens when a large part of the string is the same
is bad, because it makes it more likely that prefixed strings trigger such
conflicts, and they're obviously common with GUC strings.
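
The collision can be checked with a small standalone sketch. It assumes the
accumulation step is "rotate the 32-bit state left by 5 bits, then XOR in the
next byte", which is inferred from the 32 / 5 arithmetic above; the patch
itself is not shown in this message. It also uses a local copy of the usual
32-bit murmur finalizer as a stand-in for murmurhash32. The names rotl32,
accumulate, fmix32, sketch_hash and check_collision, and the "myext." prefix,
are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits (valid for 0 < n < 32). */
static uint32_t
rotl32(uint32_t x, int n)
{
	return (x << n) | (x >> (32 - n));
}

/*
 * ASSUMED accumulation step: rotate the state left by 5 bits, then XOR in
 * the next byte.  Case folding is omitted; the inputs are lowercase.
 */
static uint32_t
accumulate(const char *s)
{
	uint32_t	state = 0;

	while (*s)
		state = rotl32(state, 5) ^ (uint32_t) (unsigned char) *s++;
	return state;
}

/* Local copy of the 32-bit murmur finalizer, standing in for murmurhash32. */
static uint32_t
fmix32(uint32_t h)
{
	h ^= h >> 16;
	h *= 0x85ebca6b;
	h ^= h >> 13;
	h *= 0xc2b2ae35;
	h ^= h >> 16;
	return h;
}

/* Accumulate the string, then apply the finalizer once at the end. */
static uint32_t
sketch_hash(const char *s)
{
	return fmix32(accumulate(s));
}

/* The two strings from the text, bare and behind a shared (made-up) prefix. */
static void
check_collision(void)
{
	/* Same accumulated value, so the finalizer cannot tell them apart. */
	assert(accumulate("abcdefgg") == accumulate("cbcdefgw"));
	assert(sketch_hash("abcdefgg") == sketch_hash("cbcdefgw"));

	/* A common prefix contributes identically to both and changes nothing. */
	assert(sketch_hash("myext.abcdefgg") == sketch_hash("myext.cbcdefgw"));
}
```

Under that assumption the arithmetic works out as follows: in an 8-character
string the first character ends up rotated by 35 bits, i.e. 3 bits mod 32.
'a' ^ 'c' is 0x02, which becomes 0x10 after that rotation, and 'g' ^ 'w' is
also 0x10, so the two differences cancel in the accumulated value. Because
rotate and XOR are both linear, any characters shared at the same positions
drop out of the comparison, which is why the shared prefix does not help.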

Greetings,

Andres Freund


