Thread: Accommodate startup process in a separate ProcState array slot instead of in MaxBackends slots.

Hi,

While working on [1], it is found that currently the ProcState array
doesn't have entries for auxiliary processes, it does have entries for
MaxBackends. But the startup process is eating up one slot from
MaxBackends. We need to increase the size of the ProcState array by 1
at least for the startup process. The startup process uses ProcState
slot via InitRecoveryTransactionEnvironment->SharedInvalBackendInit.
The procState array size is initialized to MaxBackends in
SInvalShmemSize.

The consequence of not fixing this issue is that the database may hit
the error "sorry, too many clients already" soon in
SharedInvalBackendInit.

Attaching a patch to fix this issue. Thoughts?

[1] https://www.postgresql.org/message-id/2222ab6f-46b1-d5c0-603d-8f6680739db4%40oss.nttdata.com

Regards,
Bharath Rupireddy.

Attachment

On 2021/10/12 4:07, Bharath Rupireddy wrote:
> Hi,
> 
> While working on [1], it is found that currently the ProcState array
> doesn't have entries for auxiliary processes, it does have entries for
> MaxBackends. But the startup process is eating up one slot from
> MaxBackends. We need to increase the size of the ProcState array by 1
> at least for the startup process. The startup process uses ProcState
> slot via InitRecoveryTransactionEnvironment->SharedInvalBackendInit.
> The procState array size is initialized to MaxBackends in
> SInvalShmemSize.
> 
> The consequence of not fixing this issue is that the database may hit
> the error "sorry, too many clients already" soon in
> SharedInvalBackendInit.
> 
> Attaching a patch to fix this issue. Thoughts?

Thanks for making the patch! LGTM.
Barring any objection, I will commit it.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



On Tue, Oct 12, 2021 at 5:37 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>
> On 2021/10/12 4:07, Bharath Rupireddy wrote:
> > Hi,
> >
> > While working on [1], it is found that currently the ProcState array
> > doesn't have entries for auxiliary processes, it does have entries for
> > MaxBackends. But the startup process is eating up one slot from
> > MaxBackends. We need to increase the size of the ProcState array by 1
> > at least for the startup process. The startup process uses ProcState
> > slot via InitRecoveryTransactionEnvironment->SharedInvalBackendInit.
> > The procState array size is initialized to MaxBackends in
> > SInvalShmemSize.
> >
> > The consequence of not fixing this issue is that the database may hit
> > the error "sorry, too many clients already" soon in
> > SharedInvalBackendInit.
> >
> > Attaching a patch to fix this issue. Thoughts?
>
> Thanks for making the patch! LGTM.
> Barring any objection, I will commit it.

Thanks for reviewing. I've made a CF entry for this, just to ensure
the tests on different CF bot server passes(and yes no failures) -
https://commitfest.postgresql.org/35/3355/

Regards,
Bharath Rupireddy.




On 2021/10/12 15:46, Bharath Rupireddy wrote:
> On Tue, Oct 12, 2021 at 5:37 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
>>
>> On 2021/10/12 4:07, Bharath Rupireddy wrote:
>>> Hi,
>>>
>>> While working on [1], it is found that currently the ProcState array
>>> doesn't have entries for auxiliary processes, it does have entries for
>>> MaxBackends. But the startup process is eating up one slot from
>>> MaxBackends. We need to increase the size of the ProcState array by 1
>>> at least for the startup process. The startup process uses ProcState
>>> slot via InitRecoveryTransactionEnvironment->SharedInvalBackendInit.
>>> The procState array size is initialized to MaxBackends in
>>> SInvalShmemSize.
>>>
>>> The consequence of not fixing this issue is that the database may hit
>>> the error "sorry, too many clients already" soon in
>>> SharedInvalBackendInit.

On second thought, I wonder if this error could not happen in practice. No?
Because autovacuum doesn't work during recovery and the startup process
can safely use the ProcState entry for autovacuum worker process.
Also since the minimal allowed value of autovacuum_max_workers is one,
the ProcState array guarantees to have at least one entry for autovacuum worker.

If this understanding is right, we don't need to enlarge the array and
can just update the comment. I don't strongly oppose to enlarge
the array in the master, but I'm not sure it's worth doing that
in back branches if the issue can cause no actual error.


>>>
>>> Attaching a patch to fix this issue. Thoughts?
>>
>> Thanks for making the patch! LGTM.
>> Barring any objection, I will commit it.
> 
> Thanks for reviewing. I've made a CF entry for this, just to ensure
> the tests on different CF bot server passes(and yes no failures) -
> https://commitfest.postgresql.org/35/3355/

Thanks!

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION



On Thu, Oct 14, 2021 at 10:56 AM Fujii Masao
<masao.fujii@oss.nttdata.com> wrote:
> On 2021/10/12 15:46, Bharath Rupireddy wrote:
> > On Tue, Oct 12, 2021 at 5:37 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> >>
> >> On 2021/10/12 4:07, Bharath Rupireddy wrote:
> >>> Hi,
> >>>
> >>> While working on [1], it is found that currently the ProcState array
> >>> doesn't have entries for auxiliary processes, it does have entries for
> >>> MaxBackends. But the startup process is eating up one slot from
> >>> MaxBackends. We need to increase the size of the ProcState array by 1
> >>> at least for the startup process. The startup process uses ProcState
> >>> slot via InitRecoveryTransactionEnvironment->SharedInvalBackendInit.
> >>> The procState array size is initialized to MaxBackends in
> >>> SInvalShmemSize.
> >>>
> >>> The consequence of not fixing this issue is that the database may hit
> >>> the error "sorry, too many clients already" soon in
> >>> SharedInvalBackendInit.
>
> On second thought, I wonder if this error could not happen in practice. No?
> Because autovacuum doesn't work during recovery and the startup process
> can safely use the ProcState entry for autovacuum worker process.
> Also since the minimal allowed value of autovacuum_max_workers is one,
> the ProcState array guarantees to have at least one entry for autovacuum worker.
>
> If this understanding is right, we don't need to enlarge the array and
> can just update the comment. I don't strongly oppose to enlarge
> the array in the master, but I'm not sure it's worth doing that
> in back branches if the issue can cause no actual error.

Yes, the issue can't happen. The comment in the SInvalShmemSize,
mentioning about the startup process always having an extra slot
because the autovacuum worker is not active during recovery, looks
okay. But, is it safe to assume that always? Do we have a way to
specify that in the form an Assert(when_i_am_startup_proc &&
autovacuum_not_running) (this looks a bit dirty though)? Instead, we
can just enlarge the array in the master and be confident about the
fact that the startup process always has one dedicated slot.

Regards,
Bharath Rupireddy.



В Сб, 16/10/2021 в 16:37 +0530, Bharath Rupireddy пишет:
> On Thu, Oct 14, 2021 at 10:56 AM Fujii Masao
> <masao.fujii@oss.nttdata.com> wrote:
> > On 2021/10/12 15:46, Bharath Rupireddy wrote:
> > > On Tue, Oct 12, 2021 at 5:37 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> > > > On 2021/10/12 4:07, Bharath Rupireddy wrote:
> > > > > Hi,
> > > > > 
> > > > > While working on [1], it is found that currently the ProcState array
> > > > > doesn't have entries for auxiliary processes, it does have entries for
> > > > > MaxBackends. But the startup process is eating up one slot from
> > > > > MaxBackends. We need to increase the size of the ProcState array by 1
> > > > > at least for the startup process. The startup process uses ProcState
> > > > > slot via InitRecoveryTransactionEnvironment->SharedInvalBackendInit.
> > > > > The procState array size is initialized to MaxBackends in
> > > > > SInvalShmemSize.
> > > > > 
> > > > > The consequence of not fixing this issue is that the database may hit
> > > > > the error "sorry, too many clients already" soon in
> > > > > SharedInvalBackendInit.
> > 
> > On second thought, I wonder if this error could not happen in practice. No?
> > Because autovacuum doesn't work during recovery and the startup process
> > can safely use the ProcState entry for autovacuum worker process.
> > Also since the minimal allowed value of autovacuum_max_workers is one,
> > the ProcState array guarantees to have at least one entry for autovacuum worker.
> > 
> > If this understanding is right, we don't need to enlarge the array and
> > can just update the comment. I don't strongly oppose to enlarge
> > the array in the master, but I'm not sure it's worth doing that
> > in back branches if the issue can cause no actual error.
> 
> Yes, the issue can't happen. The comment in the SInvalShmemSize,
> mentioning about the startup process always having an extra slot
> because the autovacuum worker is not active during recovery, looks
> okay. But, is it safe to assume that always? Do we have a way to
> specify that in the form an Assert(when_i_am_startup_proc &&
> autovacuum_not_running) (this looks a bit dirty though)? Instead, we
> can just enlarge the array in the master and be confident about the
> fact that the startup process always has one dedicated slot.

But this slot wont be used for most of cluster life. It will be just
waste.

And `Assert(there_is_startup_proc && autovacuum_not_running)` has
value on its own, hasn't it? So why doesn't add it with comment.

regards,
Yura Sokolov




On Fri, Feb 11, 2022 at 7:56 PM Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
>
> В Сб, 16/10/2021 в 16:37 +0530, Bharath Rupireddy пишет:
> > On Thu, Oct 14, 2021 at 10:56 AM Fujii Masao
> > <masao.fujii@oss.nttdata.com> wrote:
> > > On 2021/10/12 15:46, Bharath Rupireddy wrote:
> > > > On Tue, Oct 12, 2021 at 5:37 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> > > > > On 2021/10/12 4:07, Bharath Rupireddy wrote:
> > > > > > Hi,
> > > > > >
> > > > > > While working on [1], it is found that currently the ProcState array
> > > > > > doesn't have entries for auxiliary processes, it does have entries for
> > > > > > MaxBackends. But the startup process is eating up one slot from
> > > > > > MaxBackends. We need to increase the size of the ProcState array by 1
> > > > > > at least for the startup process. The startup process uses ProcState
> > > > > > slot via InitRecoveryTransactionEnvironment->SharedInvalBackendInit.
> > > > > > The procState array size is initialized to MaxBackends in
> > > > > > SInvalShmemSize.
> > > > > >
> > > > > > The consequence of not fixing this issue is that the database may hit
> > > > > > the error "sorry, too many clients already" soon in
> > > > > > SharedInvalBackendInit.
> > >
> > > On second thought, I wonder if this error could not happen in practice. No?
> > > Because autovacuum doesn't work during recovery and the startup process
> > > can safely use the ProcState entry for autovacuum worker process.
> > > Also since the minimal allowed value of autovacuum_max_workers is one,
> > > the ProcState array guarantees to have at least one entry for autovacuum worker.
> > >
> > > If this understanding is right, we don't need to enlarge the array and
> > > can just update the comment. I don't strongly oppose to enlarge
> > > the array in the master, but I'm not sure it's worth doing that
> > > in back branches if the issue can cause no actual error.
> >
> > Yes, the issue can't happen. The comment in the SInvalShmemSize,
> > mentioning about the startup process always having an extra slot
> > because the autovacuum worker is not active during recovery, looks
> > okay. But, is it safe to assume that always? Do we have a way to
> > specify that in the form an Assert(when_i_am_startup_proc &&
> > autovacuum_not_running) (this looks a bit dirty though)? Instead, we
> > can just enlarge the array in the master and be confident about the
> > fact that the startup process always has one dedicated slot.
>
> But this slot wont be used for most of cluster life. It will be just
> waste.

Correct. In the standby autovacuum launcher and worker are not started
so, the startup process will always have a slot free for it to use.

> And `Assert(there_is_startup_proc && autovacuum_not_running)` has
> value on its own, hasn't it? So why doesn't add it with comment.

Assertion doesn't make sense to me now. Because the postmaster ensures
that the autovacuum launcher/workers will not get started in standby
mode and we can't reliably know in InitRecoveryTransactionEnvironment
(startup process) whether or not autovacuum launcher process has been
started.

FWIW, here's a patch just adding a comment on how the startup process
can get a free procState array slot even when SInvalShmemSize hasn't
accounted for it.

Regards,
Bharath Rupireddy.

Attachment
В Сб, 12/02/2022 в 16:56 +0530, Bharath Rupireddy пишет:
> On Fri, Feb 11, 2022 at 7:56 PM Yura Sokolov <y.sokolov@postgrespro.ru> wrote:
> > В Сб, 16/10/2021 в 16:37 +0530, Bharath Rupireddy пишет:
> > > On Thu, Oct 14, 2021 at 10:56 AM Fujii Masao
> > > <masao.fujii@oss.nttdata.com> wrote:
> > > > On 2021/10/12 15:46, Bharath Rupireddy wrote:
> > > > > On Tue, Oct 12, 2021 at 5:37 AM Fujii Masao <masao.fujii@oss.nttdata.com> wrote:
> > > > > > On 2021/10/12 4:07, Bharath Rupireddy wrote:
> > > > > > > Hi,
> > > > > > > 
> > > > > > > While working on [1], it is found that currently the ProcState array
> > > > > > > doesn't have entries for auxiliary processes, it does have entries for
> > > > > > > MaxBackends. But the startup process is eating up one slot from
> > > > > > > MaxBackends. We need to increase the size of the ProcState array by 1
> > > > > > > at least for the startup process. The startup process uses ProcState
> > > > > > > slot via InitRecoveryTransactionEnvironment->SharedInvalBackendInit.
> > > > > > > The procState array size is initialized to MaxBackends in
> > > > > > > SInvalShmemSize.
> > > > > > > 
> > > > > > > The consequence of not fixing this issue is that the database may hit
> > > > > > > the error "sorry, too many clients already" soon in
> > > > > > > SharedInvalBackendInit.
> > > > 
> > > > On second thought, I wonder if this error could not happen in practice. No?
> > > > Because autovacuum doesn't work during recovery and the startup process
> > > > can safely use the ProcState entry for autovacuum worker process.
> > > > Also since the minimal allowed value of autovacuum_max_workers is one,
> > > > the ProcState array guarantees to have at least one entry for autovacuum worker.
> > > > 
> > > > If this understanding is right, we don't need to enlarge the array and
> > > > can just update the comment. I don't strongly oppose to enlarge
> > > > the array in the master, but I'm not sure it's worth doing that
> > > > in back branches if the issue can cause no actual error.
> > > 
> > > Yes, the issue can't happen. The comment in the SInvalShmemSize,
> > > mentioning about the startup process always having an extra slot
> > > because the autovacuum worker is not active during recovery, looks
> > > okay. But, is it safe to assume that always? Do we have a way to
> > > specify that in the form an Assert(when_i_am_startup_proc &&
> > > autovacuum_not_running) (this looks a bit dirty though)? Instead, we
> > > can just enlarge the array in the master and be confident about the
> > > fact that the startup process always has one dedicated slot.
> > 
> > But this slot wont be used for most of cluster life. It will be just
> > waste.
> 
> Correct. In the standby autovacuum launcher and worker are not started
> so, the startup process will always have a slot free for it to use.
> 
> > And `Assert(there_is_startup_proc && autovacuum_not_running)` has
> > value on its own, hasn't it? So why doesn't add it with comment.
> 
> Assertion doesn't make sense to me now. Because the postmaster ensures
> that the autovacuum launcher/workers will not get started in standby
> mode and we can't reliably know in InitRecoveryTransactionEnvironment
> (startup process) whether or not autovacuum launcher process has been
> started.
> 
> FWIW, here's a patch just adding a comment on how the startup process
> can get a free procState array slot even when SInvalShmemSize hasn't
> accounted for it.

I think, comment is a good thing.
Marked as "Ready for committer".

> 
> Regards,
> Bharath Rupireddy.




On Sat, Feb 12, 2022 at 6:26 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> FWIW, here's a patch just adding a comment on how the startup process
> can get a free procState array slot even when SInvalShmemSize hasn't
> accounted for it.

I don't think the positioning of this code comment is very good,
because it's commenting on 0 lines of code. Perhaps that problem could
be fixed by making it the second paragraph of the immediately
preceding comment instead of a separate block, but I think the right
place to comment on this sort of thing is actually in the code that
sizes the data structure - i.e. SInvalShmemSize. If someone looks at
that function and says "hey, this uses GetMaxBackends(), that's off by
one!" they are not ever going to find this comment explaining the
reasoning.

-- 
Robert Haas
EDB: http://www.enterprisedb.com



On Sat, Mar 26, 2022 at 1:20 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Sat, Feb 12, 2022 at 6:26 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> > FWIW, here's a patch just adding a comment on how the startup process
> > can get a free procState array slot even when SInvalShmemSize hasn't
> > accounted for it.
>
> I don't think the positioning of this code comment is very good,
> because it's commenting on 0 lines of code. Perhaps that problem could
> be fixed by making it the second paragraph of the immediately
> preceding comment instead of a separate block, but I think the right
> place to comment on this sort of thing is actually in the code that
> sizes the data structure - i.e. SInvalShmemSize. If someone looks at
> that function and says "hey, this uses GetMaxBackends(), that's off by
> one!" they are not ever going to find this comment explaining the
> reasoning.

Thanks. It makes sense to put the comment in SInvalShmemSize.
Attaching v2 patch. Please review it.

Regards,
Bharath Rupireddy.

Attachment
On Sat, Mar 26, 2022 at 2:23 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> Thanks. It makes sense to put the comment in SInvalShmemSize.
> Attaching v2 patch. Please review it.

How about this version, which I have edited lightly for grammar?

-- 
Robert Haas
EDB: http://www.enterprisedb.com

Attachment
On Tue, Mar 29, 2022 at 12:47 AM Robert Haas <robertmhaas@gmail.com> wrote:
>
> On Sat, Mar 26, 2022 at 2:23 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> > Thanks. It makes sense to put the comment in SInvalShmemSize.
> > Attaching v2 patch. Please review it.
>
> How about this version, which I have edited lightly for grammar?

Thanks. LGTM.

Regards,
Bharath Rupireddy.



On Tue, Mar 29, 2022 at 3:21 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
> Thanks. LGTM.

Committed.

-- 
Robert Haas
EDB: http://www.enterprisedb.com