Hi,

On 2025-02-26 11:46:45 +0300, Maxim Orlov wrote:
> On Tue, 25 Feb 2025 at 22:44, Ekaterina Sokolova <e.sokolova@postgrespro.ru>
> wrote:
>
> > Hi, hackers!
> >
> > Historically, the checkpointer process use palloc() into
> > AbsorbSyncRequests() function. Therefore, the checkpointer does not
> > expect to receive a request larger than 1 GB.
>
>
> Yeah. And the most unpleasant thing is it won't simply fail with an error
> or helpful message suggesting a workaround (reduce the amount of shared
> memory). Checkpointer will just "stuck".
>
> AFAICS, we have a few options:
> 1. Leave it as it is, but fatal on allocation of the chunk more than 1G.
> 2. Use palloc_extended with MCXT_ALLOC_HUGE flag.
> 3. Do not use any allocation and use CheckpointerShmem->requests directly
> in case of > 1G size of the required allocation.

4) Do compaction incrementally, instead of doing it for all requests at once.

That'd probably be better, because

a) it'll take some time to compact tens to hundreds of millions of requests,
   which makes it much more likely that backends will have to perform syncs
   themselves, and the lock will be held for an extended period of time

b) allocating gigabytes of memory obviously makes it more likely that you'll
   fail with out-of-memory at runtime or even get OOM killed.
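
To illustrate the idea in 4), here is a rough standalone sketch, not PostgreSQL
code. Request, WINDOW and compact_incremental() are hypothetical names made up
for this example. It deduplicates a request array one fixed-size window at a
time, in place, so no large allocation is needed and the work done per step
(and thus the lock hold time) is bounded. The trade-off is that duplicates
spanning two windows are not removed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical batch size; tiny here so the behaviour is easy to follow. */
#define WINDOW 4

/* Hypothetical stand-in for a sync request; only a tag to compare on. */
typedef struct
{
    int file_id;
} Request;

/*
 * Compact reqs[0..n) in place, one window of WINDOW entries at a time.
 * Within a window, later duplicates of an already-kept entry are dropped.
 * Returns the number of entries that remain at the front of the array.
 *
 * No scratch memory is allocated. A real implementation would take the
 * queue lock around each window and would have to keep the queue
 * consistent between windows; that part is left out (assumption).
 */
size_t
compact_incremental(Request *reqs, size_t n)
{
    size_t out = 0;             /* next slot to write a kept entry to */

    assert(reqs != NULL || n == 0);

    for (size_t start = 0; start < n; start += WINDOW)
    {
        size_t end = (n - start > WINDOW) ? start + WINDOW : n;
        size_t win_out = out;   /* first kept entry of this window */

        /* the lock would be acquired here */
        for (size_t i = start; i < end; i++)
        {
            bool dup = false;

            /* compare only against entries kept from this window */
            for (size_t j = win_out; j < out; j++)
            {
                if (reqs[j].file_id == reqs[i].file_id)
                {
                    dup = true;
                    break;
                }
            }

            if (!dup)
                reqs[out++] = reqs[i];  /* out <= i, so this is safe */
        }
        /* the lock would be released here */
    }

    return out;
}
```

Each window costs at most WINDOW * WINDOW comparisons, so the per-step cost
stays the same no matter how large the queue gets.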

Greetings,

Andres Freund