Hi,
On 2023-11-17 14:08:56 -0800, Jeff Davis wrote:
> On Fri, 2023-11-17 at 17:04 -0500, Tom Lane wrote:
> > I can't imagine wanting to convert *every* hashtable in the system
> > to simplehash; the added code bloat would be unreasonable. So yeah,
> > I think we'll have two mechanisms indefinitely. That's not to say
> > that we might not rewrite hsearch. But simplehash was never meant
> > to be a universal solution.
>
> OK, I will withdraw the patch until/unless it provides a concrete
> benefit.
It might already in the space domain:
SELECT count(*), sum(total_bytes) total_bytes, sum(total_nblocks) total_nblocks, sum(free_bytes) free_bytes,
sum(free_chunks)free_chunks, sum(used_bytes) used_bytes
FROM pg_backend_memory_contexts
WHERE name LIKE 'GUC%';
HEAD:
┌───────┬─────────────┬───────────────┬────────────┬─────────────┬────────────┐
│ count │ total_bytes │ total_nblocks │ free_bytes │ free_chunks │ used_bytes │
├───────┼─────────────┼───────────────┼────────────┼─────────────┼────────────┤
│ 2 │ 57344 │ 5 │ 25032 │ 10 │ 32312 │
└───────┴─────────────┴───────────────┴────────────┴─────────────┴────────────┘
your patch:
┌───────┬─────────────┬───────────────┬────────────┬─────────────┬────────────┐
│ count │ total_bytes │ total_nblocks │ free_bytes │ free_chunks │ used_bytes │
├───────┼─────────────┼───────────────┼────────────┼─────────────┼────────────┤
│ 1 │ 36928 │ 3 │ 12360 │ 3 │ 24568 │
└───────┴─────────────┴───────────────┴────────────┴─────────────┴────────────┘
However, it fares less well at larger number of GUCs, performance wise. At
first I thought that that's largely because you aren't using SH_STORE_HASH.
With that, it's slower when creating a large number of GUCs, but a good bit
faster retrieving them. But that slowness didn't seem right.
Then I noticed that memory usage was too large when creating many GUCs - a bit
of debugging later, I figured out that that's due to guc_name_hash() being
terrifyingly bad. There's no bit mixing whatsoever! Which leads to very large
numbers of hash conflicts - which simplehash tries to defend against a bit by
making the table larger.
(gdb) p guc_name_hash("andres.c2")
$14 = 3798554171
(gdb) p guc_name_hash("andres.c3")
$15 = 3798554170
Fixing that makes simplehash always faster, but still doesn't win on memory
usage at the upper end - the two pointers in GUCHashEntry make it too big.
I think, independent of this patch, it might be worth requiring that hash
table lookups applied the transformation before the lookup. A comparison
function this expensive is not great...
Greetings,
Andres Freund