Unduly short fuse in RequestCheckpoint - Mailing list pgsql-hackers

From Tom Lane
Subject Unduly short fuse in RequestCheckpoint
Date
Msg-id 27830.1552752475@sss.pgh.pa.us
I noticed an odd buildfarm failure today:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer&dt=2019-03-16%2012%3A12%3A20

of which the key bit seems to be

2019-03-16 15:20:43.835 UTC [10879304] 003_promote.pl LOG:  received replication command: BASE_BACKUP LABEL 'pg_basebackupbase backup'    NOWAIT
2019-03-16 15:20:45.857 UTC [10879304] 003_promote.pl ERROR:  could not request checkpoint because checkpointer not running
2019-03-16 15:20:47.227 UTC [61604144] LOG:  received immediate shutdown request

Digging in the buildfarm archives finds seven other occurrences of the
same error in the past three months (I didn't look back further).

The cause of this error is that RequestCheckpoint will give up and fail
after just 2 seconds, which evidently is not long enough on slow or
heavily loaded machines.  Since there isn't any good reason why the
checkpointer wouldn't be running, I'm inclined to swing a large hammer
and kick this timeout up to 60 seconds.  Thoughts?

            regards, tom lane

