Thanks Dmitry. Right, the coordination mechanism in v4-0006 works as expected in various tests (sorry, I misunderstood
some details initially).
I also want to report a couple of minor issues found during testing (which you may be aware of already):
1. For memory segments other than the first one ('main'), the start address passed to mmap may not be aligned to 4KB or
the huge page size (since reserved_offset may not be aligned), which causes mmap to fail.
2. Since the ratios for main/desc/iocv/checkpt/strategy in SHMEM_RESIZE_RATIO are relatively small, I think we need to
guard against the case where 'max_available_memory' is too small for the required sizes of these segments (from
CalculateShmemSize).
For example, with max_available_memory=default and shared_buffers=128kB, 'main' still needs ~109MB, but only 10% of
max_available_memory is reserved for it (~102MB). Since the start address of the next segment is calculated from
reserved_offset, the mappings overlap and cause memory problems later (I hit this after fixing 1.)
I suppose we can raise the minimum value of max_available_memory so that it is large enough, and may also adjust the
ratios in SHMEM_RESIZE_RATIO to ensure the reserved space of those segments is sufficient.
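To make both points concrete, here is a minimal sketch of the kind of check I have in mind. It is not code from the
patch: layout_segments, align_up, align_down and the percentage array are hypothetical names, and the ratios are passed
as integer percentages only for illustration. Each segment's reserved span is rounded down to the page size so every
start offset stays aligned, and the layout is rejected when a segment's required size exceeds its reserved span.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Round value up to the next multiple of align (align must be a power of two). */
static size_t
align_up(size_t value, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (value + align - 1) & ~(align - 1);
}

/* Round value down to the previous multiple of align (power of two). */
static size_t
align_down(size_t value, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return value & ~(align - 1);
}

/*
 * Hypothetical layout helper (not the patch's code): split `total` bytes of
 * reserved address space among nsegments according to ratio_pct[], keeping
 * every segment start aligned to page_size.  Returns false as soon as a
 * segment needs more than its reserved span, i.e. when the next mapping
 * would overlap it.  offset_out[i] receives the start offset of segment i
 * for every segment visited before a failure.
 */
static bool
layout_segments(size_t total, size_t page_size, int nsegments,
                const int *ratio_pct, const size_t *required,
                size_t *offset_out)
{
    size_t offset = 0;

    for (int i = 0; i < nsegments; i++)
    {
        /* Divide first to avoid overflow, then keep the span page-aligned. */
        size_t reserved = align_down(total / 100 * (size_t) ratio_pct[i],
                                     page_size);

        offset_out[i] = offset;     /* aligned: sum of aligned spans */
        if (required[i] > reserved)
            return false;           /* reservation too small */
        offset += reserved;
    }
    return true;
}
```

With a 10% ratio and a total of 1GB (an assumed value chosen to match the ~102MB figure above), a 109MB 'main' segment
is rejected, while doubling the total makes the same layout pass. For huge pages, page_size would be the huge page size
instead of 4KB.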
Regards,
Jack Ng
-----Original Message-----
From: Dmitry Dolgov <9erthalion6@gmail.com>
Sent: Monday, April 21, 2025 5:33 AM
To: Ni Ku <jakkuniku@gmail.com>
Cc: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>; pgsql-hackers@postgresql.org; Robert Haas <robertmhaas@gmail.com>
Subject: Re: Changing shared_buffers without restart
> On Thu, Apr 17, 2025 at 07:05:36PM GMT, Ni Ku wrote:
> I also have a related question about how ftruncate() is used in the patch.
> In my testing I also see that when using ftruncate to shrink a shared
> segment, the memory is freed immediately after the call, even if other
> processes still have that memory mapped, and they will hit SIGBUS if
> they try to access that memory again as the manpage says.
>
> So am I correct to think that, to support the bufferpool shrinking
> case, it would not be safe to call ftruncate in AnonymousShmemResize
> as-is, since at that point other processes may still be using pages
> that belong to the truncated memory?
> It appears that for shrinking we should only call ftruncate when we're
> sure no process will access those pages again (eg, all processes have
> handled the resize interrupt signal barrier). I suppose this can be
> done by the resize coordinator after synchronizing with all the other processes.
> But in that case it seems we cannot use the postmaster as the
> coordinator then? b/c I see some code comments saying the postmaster
> does not have waiting infrastructure... (maybe even if the postmaster
> has waiting infra we don't want to use it anyway since it can be
> blocked for a long time and won't be able to serve other requests).
There is already a coordination infrastructure, implemented in patch 0006, which will take care of this and prevent
access to the shared memory until everything is resized.
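As a rough illustration of the ordering discussed above (and not the actual code in 0006), the sketch below only calls
ftruncate once every participant has acknowledged the resize. ResizeBarrier, barrier_begin, barrier_ack and
shrink_when_safe are hypothetical names, and the counter is a plain single-process integer; a real implementation would
keep it in shared memory and use atomics or the existing barrier machinery.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the resize coordination state. */
typedef struct ResizeBarrier
{
    int pending;    /* participants that have not yet acknowledged */
} ResizeBarrier;

/* Start a resize round that nparticipants processes must acknowledge. */
static void
barrier_begin(ResizeBarrier *b, int nparticipants)
{
    b->pending = nparticipants;
}

/* Called by a participant once it no longer touches the old range. */
static void
barrier_ack(ResizeBarrier *b)
{
    assert(b->pending > 0);
    b->pending--;
}

/*
 * Shrink the backing file only when nobody can still access the pages
 * being dropped.  Returns false if the shrink was deferred or failed.
 */
static bool
shrink_when_safe(const ResizeBarrier *b, int fd, off_t new_size)
{
    if (b->pending > 0)
        return false;   /* someone may still use the tail: would risk SIGBUS */
    return ftruncate(fd, new_size) == 0;
}

/* Current size of the file behind fd, or -1 on error. */
static off_t
file_size(int fd)
{
    struct stat st;

    if (fstat(fd, &st) != 0)
        return -1;
    return st.st_size;
}
```

The point of the sketch is only the ordering: the truncation is deferred while any acknowledgement is outstanding, so
no process is left holding a mapping whose backing pages have already been freed.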