Re: Use simplehash.h instead of dynahash in SMgr - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Use simplehash.h instead of dynahash in SMgr
Date
Msg-id 20210426195846.2jl4nsuohemowytm@alap3.anarazel.de
Whole thread Raw
In response to Re: Use simplehash.h instead of dynahash in SMgr  (Yura Sokolov <y.sokolov@postgrespro.ru>)
List pgsql-hackers
Hi,

On 2021-04-26 22:44:13 +0300, Yura Sokolov wrote:
> Even for Robin Hood hashing 0.9 fill factor is too high. It leads to too
> much movements on insertion/deletion and longer average collision chain.

That's true for modification heavy cases - but most hash tables in PG,
including the smgr one, are quite read heavy. For workloads where
there's a lot of smgr activity, the other overheads in relation
creation/drop handling are magnitudes more expensive than the collision
handling.

Note that simplehash.h also grows when the distance gets too big and
when there are too many elements to move, not just based on the fill
factor.


I kinda wish we had a chained hashtable implementation with the same
interface as simplehash. It's very use-case dependent which approach is
better, and right now we might be forcing some users to choose linear
probing because simplehash.h is still faster than dynahash, even though
chaining would actually be more appropriate.


> Note that Robin Hood doesn't optimize average case.

Right.


> See discussion on Rust hash table fill factor when it were Robin Hood:
> https://github.com/rust-lang/rust/issues/38003

The first sentence actually confirms my point above, about it being a
question of read vs write heavy.

Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: ALTER TABLE .. DETACH PARTITION CONCURRENTLY
Next
From: Alvaro Herrera
Date:
Subject: Re: tab-complete for ALTER TABLE .. DETACH PARTITION CONCURRENTLY