On Thu, Apr 15, 2021 at 5:28 AM Thomas Munro <thomas.munro@gmail.com> wrote:
Thanks a lot for the detailed explanation.
> On Thu, Apr 15, 2021 at 2:06 AM Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote:
> > 1) Is it really harmful to use pg_usleep in a postmaster child process
> > as it doesn't let the child process detect postmaster death?
>
> Yeah, that's a bad idea. Any long-term waiting (including short waits
> in a loop) should ideally be done with the latch infrastructure.
Agree. Along with short waits in a loop, I think we also should
replace pg_usleep with WaitLatch that has a user configurable
parameter like below:
pg_usleep(VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL * 1000L);
pg_usleep(PostAuthDelay * 1000000L);
pg_usleep(CommitDelay);
> 4) Are there any places where we need to replace pg_usleep with
> > WaitLatch/equivalent of pg_sleep to detect the postmaster death
> > properly?
>
> We definitely have replaced a lot of sleeps with latch.c primitives
> over the past few years, since we got WL_EXIT_ON_PM_DEATH and
> condition variables. There may be many more to improve... You
> mentioned autovacuum... yeah, Stephen fixed one of these with commit
> 4753ef37, but yeah it's not great to have those others in there...
I have not looked at the commit 4753ef37 previously, but it
essentially addresses the problem with pg_usleep for vacuum delay. I'm
thinking we can also replace pg_usleep in below places based on the
fact that pg_usleep should be avoided in 1) short waits in a loop 2)
when wait time is dependent on user configurable parameters. And using
WaitLatch may require us to add wait event types to WaitEventTimeout
enum, but that's okay.
1) pg_usleep(VACUUM_TRUNCATE_LOCK_WAIT_INTERVAL * 1000L); in lazy_truncate_heap
2) pg_usleep(CommitDelay); in XLogFlush
3) pg_usleep(10000L); in CreateCheckPoint
4) pg_usleep(1000000L); in do_pg_stop_backup
5) pg_usleep(1000L); in read_local_xlog_page
6) pg_usleep(PostAuthDelay * 1000000L); in AutoVacLauncherMain,
AutoVacWorkerMain, StartBackgroundWorker, InitPostgres
7) pg_usleep(100000L); in RequestCheckpoint
8) pg_usleep(1000000L); in pgarch_ArchiverCopyLoop
9) pg_usleep(PGSTAT_RETRY_DELAY * 1000L); in backend_read_statsfile
10) pg_usleep(PreAuthDelay * 1000000L); in BackendInitialize
11) pg_usleep(10000L); in WalSndWaitStopping
12) pg_usleep(standbyWait_us); in WaitExceedsMaxStandbyDelay
13) pg_usleep(10000L); in RegisterSyncRequest
I'm sure we won't be changing in all of the above places. It will be
good to review and correct the above list.
With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com