Re: Shared hash table allocations - Mailing list pgsql-hackers

From: Heikki Linnakangas
Subject: Re: Shared hash table allocations
Msg-id: a47e1b92-2e88-4554-b4d3-61934173222d@iki.fi
In response to: Re: Shared hash table allocations (Matthias van de Meent <boekewurm+postgres@gmail.com>)
On 02/04/2026 13:24, Matthias van de Meent wrote:
> On Tue, 31 Mar 2026 at 23:25, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>>
>> 0003: In patch 0003 I removed that flexibility by marking them both with
>> HASH_FIXED_SIZE, and making init_size equal to max_size. That also stops
>> the hash tables from using any of the other remaining wiggle room,
>> making them truly fixed-size.
> 
> I think this patch finally gave me a good reason why PROCLOCK would've
> needed to be allocated with double the sizes of LOCK:
> 
> LOCK is (was) initialized with only 50% of its max capacity. If
> PROCLOCK was initialized with the same parameters and all spare shmem
> is then allocated to other processes, then backends wouldn't be able
> to safely use max_locks_per_transaction. To guarantee no OOMs when all
> backends use max_locks_per_transaction, PROCLOCK's size must be
> doubled to make sure PROCLOCK has sufficient space. (The same isn't
> usually an issue for LOCK, because it's very likely backends will
> operate on the same tables, and thus will be able to share most of the
> LOCK structs.)

Hmm, I don't know if that makes sense. It can happen that you have a lot 
of backends acquiring the same, smaller set of locks, growing PROCLOCK 
so that it uses up all the available wiggle room, while LOCK can never 
grow from its initial size, 1/2 * max_locks_per_transaction * 
MaxBackends. If the workload then changes so that every backend tries to 
acquire exactly max_locks_per_transaction locks, but this time each 
lock is on a different object, you will run out of shared memory at 1/2 
the size you expected.

The opposite can't happen, because PROCLOCK is always at least as large 
as LOCK. It doesn't matter what you set PROCLOCK's initial size to: it 
will grow together with LOCK, and you won't run out of shared memory 
before PROCLOCK has grown to max_locks_per_transaction * MaxBackends 
anyway.

> Now that LOCK is fully allocated, I think the size doubling can be
> removed, or possibly parameterized for those that need it.

I don't think that follows. The 2x factor is pretty arbitrary, but it's 
still a fair assumption that many backends will be acquiring locks on 
the same objects so you need more space in PROCLOCK than in LOCK.

I don't know how true that assumption is. It feels right for OLTP 
applications. But the situation where I've hit max_locks_per_transaction 
is when I've tried to create one table with thousands of partitions. Or 
rather, when I try to *drop* that table. In that situation, there's just 
one transaction acquiring all the locks, so the PROCLOCK / LOCK ratio is 1.

We could parameterize it, but I feel that's probably overkill and 
exposes too much detail to users. At the end of the day, if you hit the 
limit, you just bump up max_locks_per_transaction. If there are two 
settings, it's more complicated: which one do you change? You probably 
don't mind wasting the few MB of memory that you could gain by carefully 
tuning the LOCK / PROCLOCK factor.
- Heikki
