Thread: problems with "Shared Memory and Semaphores" section of docs

problems with "Shared Memory and Semaphores" section of docs

From
Nathan Bossart
Date:
(moving to a new thread)

On Thu, May 16, 2024 at 09:16:46PM -0500, Nathan Bossart wrote:
> On Thu, May 16, 2024 at 04:37:10PM +0000, Imseih (AWS), Sami wrote:
>> Also, Not sure if I am mistaken here, but the "+ 5" in the existing docs
>> seems wrong.
>>  
>> If it refers to NUM_AUXILIARY_PROCS defined in 
>> include/storage/proc.h, it should a "6"
>> 
>> #define NUM_AUXILIARY_PROCS             6
>> 
>> This is not a consequence of this patch, and can be dealt with
>> In a separate thread if my understanding is correct.
> 
> Ha, I think it should actually be "+ 7"!  The value is calculated as
> 
>     MaxConnections + autovacuum_max_workers + 1 + max_worker_processes + max_wal_senders + 6
> 
> Looking at the history, this documentation tends to be wrong quite often.
> In v9.2, the checkpointer was introduced, and these formulas were not
> updated.  In v9.3, background worker processes were introduced, and the
> formulas were still not updated.  Finally, in v9.6, it was fixed in commit
> 597f7e3.  Then, in v14, the archiver process was made an auxiliary process
> (commit d75288f), making the formulas out-of-date again.  And in v17, the
> WAL summarizer was added.
> 
> On top of this, IIUC you actually need even more semaphores if your system
> doesn't support atomics, and from a quick skim this doesn't seem to be
> covered in this documentation.

A couple of other problems I noticed:

* max_wal_senders is missing from this sentence:

    When using System V semaphores,
    <productname>PostgreSQL</productname> uses one semaphore per allowed connection
    (<xref linkend="guc-max-connections"/>), allowed autovacuum worker process
    (<xref linkend="guc-autovacuum-max-workers"/>) and allowed background
    process (<xref linkend="guc-max-worker-processes"/>), in sets of 16.

* AFAICT the discussion about the formulas in the paragraphs following the
  table doesn't explain the reason for the constant.

* IMHO the following sentence is difficult to decipher, and I can't tell if
  it actually matches the formula in the table:

    The maximum number of semaphores in the system
    is set by <varname>SEMMNS</varname>, which consequently must be at least
    as high as <varname>max_connections</varname> plus
    <varname>autovacuum_max_workers</varname> plus <varname>max_wal_senders</varname>,
    plus <varname>max_worker_processes</varname>, plus one extra for each 16
    allowed connections plus workers (see the formula in <xref
    linkend="sysvipc-parameters"/>).

At a bare minimum, we should probably fix the obvious problems, but I
wonder if we could simplify this section a bit, too.  If the exact values
are important, maybe we could introduce more GUCs like
shared_memory_size_in_huge_pages that can be consulted (instead of
requiring users to break out their calculators).

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Nathan Bossart <nathandbossart@gmail.com> writes:
> [ many, many problems in documented formulas ]

> At a bare minimum, we should probably fix the obvious problems, but I
> wonder if we could simplify this section a bit, too.

Yup.  "The definition of insanity is doing the same thing over and
over and expecting different results."  Time to give up on documenting
these things in such detail.  Anybody who really wants to know can
look at the source code.

> If the exact values
> are important, maybe we could introduce more GUCs like
> shared_memory_size_in_huge_pages that can be consulted (instead of
> requiring users to break out their calculators).

I don't especially like shared_memory_size_in_huge_pages, and I don't
want to introduce more of those.  GUCs are not the right way to expose
values that you can't actually set.  (Yeah, I'm guilty of some of the
existing ones like that, but it's still not a good thing.)  Maybe it's
time to introduce a system view for such things?  It could be really
simple, with name and value, or we could try to steal some additional
ideas such as units from pg_settings.

            regards, tom lane



Re: problems with "Shared Memory and Semaphores" section of docs

From
Nathan Bossart
Date:
On Fri, May 17, 2024 at 01:09:55PM -0400, Tom Lane wrote:
> Nathan Bossart <nathandbossart@gmail.com> writes:
>> [ many, many problems in documented formulas ]
> 
>> At a bare minimum, we should probably fix the obvious problems, but I
>> wonder if we could simplify this section a bit, too.
> 
> Yup.  "The definition of insanity is doing the same thing over and
> over and expecting different results."  Time to give up on documenting
> these things in such detail.  Anybody who really wants to know can
> look at the source code.

Cool.  I'll at least fix the back-branches as-is, but I'll see about
revamping this stuff for v18.

>> If the exact values
>> are important, maybe we could introduce more GUCs like
>> shared_memory_size_in_huge_pages that can be consulted (instead of
>> requiring users to break out their calculators).
> 
> I don't especially like shared_memory_size_in_huge_pages, and I don't
> want to introduce more of those.  GUCs are not the right way to expose
> values that you can't actually set.  (Yeah, I'm guilty of some of the
> existing ones like that, but it's still not a good thing.)  Maybe it's
> time to introduce a system view for such things?  It could be really
> simple, with name and value, or we could try to steal some additional
> ideas such as units from pg_settings.

The advantage of the GUC is that its value could be seen before trying to
actually start the server.  I don't dispute that it's not the right way to
surface this information, though.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: problems with "Shared Memory and Semaphores" section of docs

From
"Imseih (AWS), Sami"
Date:
>>> If the exact values
>>> are important, maybe we could introduce more GUCs like
>>> shared_memory_size_in_huge_pages that can be consulted (instead of
>>> requiring users to break out their calculators).
>>
>> I don't especially like shared_memory_size_in_huge_pages, and I don't
>> want to introduce more of those. GUCs are not the right way to expose
>> values that you can't actually set. (Yeah, I'm guilty of some of the
>> existing ones like that, but it's still not a good thing.) Maybe it's
>> time to introduce a system view for such things? It could be really
>> simple, with name and value, or we could try to steal some additional
>> ideas such as units from pg_settings.

I always found some of the preset GUCs [1] to be useful for writing SQLs used by
DBAs, particularly block_size, wal_block_size, server_version and server_version_num.

> The advantage of the GUC is that its value could be seen before trying to
> actually start the server. 

Only if they have a sample in postgresql.conf file, right? 
A GUC like shared_memory_size_in_huge_pages will not be.


[1] https://www.postgresql.org/docs/current/runtime-config-preset.html


Regards,

Sami 


Re: problems with "Shared Memory and Semaphores" section of docs

From
Nathan Bossart
Date:
On Fri, May 17, 2024 at 12:48:37PM -0500, Nathan Bossart wrote:
> On Fri, May 17, 2024 at 01:09:55PM -0400, Tom Lane wrote:
>> Nathan Bossart <nathandbossart@gmail.com> writes:
>>> At a bare minimum, we should probably fix the obvious problems, but I
>>> wonder if we could simplify this section a bit, too.
>> 
>> Yup.  "The definition of insanity is doing the same thing over and
>> over and expecting different results."  Time to give up on documenting
>> these things in such detail.  Anybody who really wants to know can
>> look at the source code.
> 
> Cool.  I'll at least fix the back-branches as-is, but I'll see about
> revamping this stuff for v18.

Attached is probably the absolute least we should do for the back-branches.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: problems with "Shared Memory and Semaphores" section of docs

From
Nathan Bossart
Date:
On Fri, May 17, 2024 at 06:30:08PM +0000, Imseih (AWS), Sami wrote:
>> The advantage of the GUC is that its value could be seen before trying to
>> actually start the server. 
> 
> Only if they have a sample in postgresql.conf file, right? 
> A GUC like shared_memory_size_in_huge_pages will not be.

shared_memory_size_in_huge_pages is computed at runtime and can be viewed
with "postgres -C" before actually trying to start the server [0].

[0] https://www.postgresql.org/docs/devel/kernel-resources.html#LINUX-HUGE-PAGES

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com



Re: problems with "Shared Memory and Semaphores" section of docs

From
Andres Freund
Date:
Hi,

On 2024-05-17 18:30:08 +0000, Imseih (AWS), Sami wrote:
> > The advantage of the GUC is that its value could be seen before trying to
> > actually start the server.
>
> Only if they have a sample in postgresql.conf file, right?
> A GUC like shared_memory_size_in_huge_pages will not be.

You can query gucs with -C. E.g.

postgres -D pgdev-dev -c shared_buffers=16MB -C shared_memory_size_in_huge_pages
13
postgres -D pgdev-dev -c shared_buffers=16MB -c huge_page_size=1GB -C shared_memory_size_in_huge_pages
1

Which is very useful to be able to actually configure that number of huge
pages. I don't think a system view or such would not help here.

Greetings,

Andres Freund



Re: problems with "Shared Memory and Semaphores" section of docs

From
"Imseih (AWS), Sami"
Date:
> postgres -D pgdev-dev -c shared_buffers=16MB -C shared_memory_size_in_huge_pages
> 13
> postgres -D pgdev-dev -c shared_buffers=16MB -c huge_page_size=1GB -C shared_memory_size_in_huge_pages
> 1


> Which is very useful to be able to actually configure that number of huge
> pages. I don't think a system view or such would not help here.

Oops. Totally missed the -C flag. Thanks for clarifying!

Regards,

Sami 


Re: problems with "Shared Memory and Semaphores" section of docs

From
Nathan Bossart
Date:
On Fri, May 17, 2024 at 02:21:23PM -0500, Nathan Bossart wrote:
> On Fri, May 17, 2024 at 12:48:37PM -0500, Nathan Bossart wrote:
>> Cool.  I'll at least fix the back-branches as-is, but I'll see about
>> revamping this stuff for v18.
> 
> Attached is probably the absolute least we should do for the back-branches.

Any concerns with doing something like this [0] for the back-branches?  The
constant would be 6 instead of 7 on v14 through v16.

I wrote a quick sketch for what a runtime-computed GUC might look like for
v18.  We don't have agreement on this approach, but I figured I'd post
something while we search for a better one.

[0] https://postgr.es/m/attachment/160360/v1-0001-fix-kernel-resources-docs-on-back-branches.patch

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Attachment

Re: problems with "Shared Memory and Semaphores" section of docs

From
"Imseih (AWS), Sami"
Date:
> Any concerns with doing something like this [0] for the back-branches? The
> constant would be 6 instead of 7 on v14 through v16.

As far as backpatching the present inconsistencies in the docs,
[0] looks good to me.

[0] https://postgr.es/m/attachment/160360/v1-0001-fix-kernel-resources-docs-on-back-branches.patch
<https://postgr.es/m/attachment/160360/v1-0001-fix-kernel-resources-docs-on-back-branches.patch>


Regards,

Sami



Re: problems with "Shared Memory and Semaphores" section of docs

From
Nathan Bossart
Date:
On Tue, May 21, 2024 at 11:15:14PM +0000, Imseih (AWS), Sami wrote:
>> Any concerns with doing something like this [0] for the back-branches? The
>> constant would be 6 instead of 7 on v14 through v16.
> 
> As far as backpatching the present inconsistencies in the docs,
> [0] looks good to me.

Committed.

-- 
nathan



Re: problems with "Shared Memory and Semaphores" section of docs

From
Nathan Bossart
Date:
On Mon, Jun 03, 2024 at 12:18:21PM -0500, Nathan Bossart wrote:
> On Tue, May 21, 2024 at 11:15:14PM +0000, Imseih (AWS), Sami wrote:
>> As far as backpatching the present inconsistencies in the docs,
>> [0] looks good to me.
> 
> Committed.

Of course, as soon as I committed this, I noticed another missing reference
to max_wal_senders in the paragraph about POSIX semaphores.  I plan to
commit/back-patch the attached patch within the next couple days.

-- 
nathan

Attachment

Re: problems with "Shared Memory and Semaphores" section of docs

From
Nathan Bossart
Date:
On Mon, Jun 03, 2024 at 02:04:19PM -0500, Nathan Bossart wrote:
> Of course, as soon as I committed this, I noticed another missing reference
> to max_wal_senders in the paragraph about POSIX semaphores.  I plan to
> commit/back-patch the attached patch within the next couple days.

Committed.

-- 
nathan



Re: problems with "Shared Memory and Semaphores" section of docs

From
Nathan Bossart
Date:
Here is a rebased version of the patch for v18 that adds a runtime-computed
GUC.  As I noted earlier, there still isn't a consensus on this approach.

-- 
nathan

Attachment

Re: problems with "Shared Memory and Semaphores" section of docs

From
Robert Haas
Date:
On Thu, Jun 6, 2024 at 3:21 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
> Here is a rebased version of the patch for v18 that adds a runtime-computed
> GUC.  As I noted earlier, there still isn't a consensus on this approach.

I don't really like making this a GUC, but what's the other option?
It's reasonable for people to want to ask the server how many
resources it will need to start, and -C is the only tool we have for
that right now. So I feel like this is a fair thing to do.

I do think the name could use some more thought, though.
semaphores_required would end up being the same kind of thing as
shared_memory_size_in_huge_pages, but the names seem randomly
different. If semaphores_required is right here, why isn't
shared_memory_required used there? Seems more like we ought to call
this semaphores or os_semaphores or num_semaphores or
num_os_semaphores or something.

--
Robert Haas
EDB: http://www.enterprisedb.com



Re: problems with "Shared Memory and Semaphores" section of docs

From
Nathan Bossart
Date:
On Thu, Jun 06, 2024 at 03:31:53PM -0400, Robert Haas wrote:
> I don't really like making this a GUC, but what's the other option?
> It's reasonable for people to want to ask the server how many
> resources it will need to start, and -C is the only tool we have for
> that right now. So I feel like this is a fair thing to do.

Yeah, this is how I feel, too.

> I do think the name could use some more thought, though.
> semaphores_required would end up being the same kind of thing as
> shared_memory_size_in_huge_pages, but the names seem randomly
> different. If semaphores_required is right here, why isn't
> shared_memory_required used there? Seems more like we ought to call
> this semaphores or os_semaphores or num_semaphores or
> num_os_semaphores or something.

I'm fine with any of your suggestions.  If I _had_ to pick one, I'd
probably choose num_os_semaphores because it's the most descriptive.

-- 
nathan



Re: problems with "Shared Memory and Semaphores" section of docs

From
Nathan Bossart
Date:
On Thu, Jun 06, 2024 at 02:51:42PM -0500, Nathan Bossart wrote:
> On Thu, Jun 06, 2024 at 03:31:53PM -0400, Robert Haas wrote:
>> I do think the name could use some more thought, though.
>> semaphores_required would end up being the same kind of thing as
>> shared_memory_size_in_huge_pages, but the names seem randomly
>> different. If semaphores_required is right here, why isn't
>> shared_memory_required used there? Seems more like we ought to call
>> this semaphores or os_semaphores or num_semaphores or
>> num_os_semaphores or something.
> 
> I'm fine with any of your suggestions.  If I _had_ to pick one, I'd
> probably choose num_os_semaphores because it's the most descriptive.

Here's a new version of the patch with the GUC renamed to
num_os_semaphores.

-- 
nathan

Attachment