Re: Change GUC hashtable to use simplehash? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Change GUC hashtable to use simplehash?
Date
Msg-id 20231118001003.n2d7tvdgeckun2y6@awork3.anarazel.de
Whole thread Raw
In response to Re: Change GUC hashtable to use simplehash?  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Change GUC hashtable to use simplehash?
List pgsql-hackers
Hi,

On 2023-11-17 16:01:31 -0800, Jeff Davis wrote:
> On Fri, 2023-11-17 at 15:27 -0800, Andres Freund wrote:
> > At
> > first I thought that that's largely because you aren't using
> > SH_STORE_HASH.
>
> I might want to use that in the search_path cache, then. The lookup
> wasn't showing up much in the profile the last I checked, but I'll take
> a second look.

It also matters for insertions, fwiw.


> > Then I noticed that memory usage was too large when creating many
> > GUCs - a bit
> > of debugging later, I figured out that that's due to guc_name_hash()
> > being
> > terrifyingly bad. There's no bit mixing whatsoever!
>
> Wow.
>
> It seems like hash_combine() could be more widely used in other places,
> too?

I don't think hash_combine() alone helps that much - you need to actually use
a hash function for the values you are combining. Using a character value
alone as a 32bit hash value unsurprisingly leads to very distribution of bits
set in hashvalues.


> Here it seems like a worse problem because strings really need
> mixing, and maybe ExecHashGetHashValue doesn't. But it seems easier to
> use hash_combine() everywhere so that we don't have to think about
> strange cases.

Yea.


> > I think, independent of this patch, it might be worth requiring that
> > hash
> > table lookups applied the transformation before the lookup. A
> > comparison
> > function this expensive is not great...
>
> The requested name is already case-folded in most contexts. We can do a
> lookup first, and if that fails, case-fold and try again. I'll hack up
> a patch -- I believe that would be measurable for the proconfigs.

I'd just always case fold before lookups. The expensive bit of the case
folding imo is that you need to do awkward things during hash lookups.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: On non-Windows, hard depend on uselocale(3)
Next
From: Andres Freund
Date:
Subject: Re: long-standing data loss bug in initial sync of logical replication