Re: [Patch] ALTER SYSTEM READ ONLY - Mailing list pgsql-hackers
From | Amul Sul |
---|---|
Subject | Re: [Patch] ALTER SYSTEM READ ONLY |
Date | |
Msg-id | CAAJ_b94rpE8s3xGoKnhEdMVgMVPriw73O+MHAJw9re_ybmREcQ@mail.gmail.com |
In response to | Re: [Patch] ALTER SYSTEM READ ONLY (Andres Freund <andres@anarazel.de>) |
Responses | Re: [Patch] ALTER SYSTEM READ ONLY |
List | pgsql-hackers |
On Thu, Dec 10, 2020 at 6:04 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2020-12-09 16:13:06 -0500, Robert Haas wrote:
> > That's not good. On a typical busy system, a system is going to be in
> > the middle of a checkpoint most of the time, and the checkpoint will
> > take a long time to finish - maybe minutes.
>
> Or hours, even. Due to the cost of FPWs it can make a lot of sense to
> reduce the frequency of that cost...
>
>
> > We want this feature to respond within milliseconds or a few seconds,
> > not minutes. So we need something better here.
>
> Indeed.
>
>
> > I'm inclined to think
> > that we should try to CompleteWALProhibitChange() at the same places
> > we AbsorbSyncRequests(). We know from experience that bad things
> > happen if we fail to absorb sync requests in a timely fashion, so we
> > probably have enough calls to AbsorbSyncRequests() to make sure that
> > we always do that work in a timely fashion. So, if we do this work in
> > the same place, then it will also be done in a timely fashion.
>
> Sounds sane, without having looked in detail.
>

Understood, and agreed that we need to change the system state as soon
as possible.

I can see that AbsorbSyncRequests() is called from four routines:
CheckpointWriteDelay(), ProcessSyncRequests(), SyncPostCheckpoint() and
CheckpointerMain(). Of those four, the first three run with interrupts
on hold. That causes a problem when we emit a barrier and then wait for
it to be absorbed by all processes, including the emitting process
itself: the wait never ends.

I think that can be fixed by teaching WaitForProcSignalBarrier() not to
wait for the calling process itself to absorb the barrier, and letting
the barrier be absorbed later, once interrupts are resumed. I assumed
that we cannot process the barrier right away, since there could be
other barriers (maybe in the future), including ours, that should not
be processed while interrupts are on hold.

>
> > I'm not 100% sure whether that introduces any other problems.
> > Certainly, we're not going to be able to finish the checkpoint once
> > we've gone read-only, so we'll fail when we try to write the WAL
> > record for that, or maybe earlier if there's anything else that tries
> > to write WAL. Either the checkpoint needs to error out, like any other
> > attempt to write WAL, and we can attempt a new checkpoint if and when
> > we go read/write, or else we need to finish writing stuff out to disk
> > but not actually write the checkpoint completion record (or any other
> > WAL) unless and until the system goes back into read/write mode - and
> > then at that point the previously-started checkpoint will finish
> > normally. The latter seems better if we can make it work, but the
> > former is probably also acceptable. What you've got right now is not.
>
> I mostly wonder which of those two has which implications for how many
> FPWs we need to redo. Presumably stalling but not cancelling the current
> checkpoint is better?
>

I also like the idea of stalling the checkpointer's work in the middle
instead of canceling it. But here we need to take care of shutdown
requests and postmaster death, either of which can cancel the stall.
If that happens, we need to make sure that no unwanted WAL insertion
happens afterward. For that, the LocalXLogInsertAllowed flag needs to
be updated correctly, because the WAL-prohibit barrier processing was
skipped for the checkpointer: it is the process that emitted the
barrier, as mentioned above.

Regards,
Amul
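
For illustration only, the self-skip idea for WaitForProcSignalBarrier()
described above can be modeled with a small standalone toy in C. This is
not PostgreSQL code and not the patch itself: every Toy*/toy_* name, the
fixed-size process array, and the skip_pid parameter are assumptions made
up for this sketch. It only shows why waiting on yourself cannot complete
while interrupts are held, and why skipping yourself lets the wait finish.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_PROCS 8

/* Hypothetical per-process slot; not a PostgreSQL structure. */
typedef struct ToyProc
{
    uint64_t absorbed_generation; /* newest barrier generation absorbed */
    bool     interrupts_held;     /* models interrupts being on hold */
} ToyProc;

/* Hypothetical shared state for the toy model. */
typedef struct ToyBarrierState
{
    uint64_t current_generation;  /* generation of the newest barrier */
    int      nprocs;              /* number of slots in use */
    ToyProc  procs[MAX_PROCS];
} ToyBarrierState;

/* Emit a new barrier and return the generation waiters must see absorbed. */
uint64_t
toy_emit_barrier(ToyBarrierState *st)
{
    return ++st->current_generation;
}

/*
 * Give process 'pid' a chance to absorb pending barriers. While its
 * interrupts are held it absorbs nothing and false is returned.
 */
bool
toy_process_interrupts(ToyBarrierState *st, int pid)
{
    ToyProc *p = &st->procs[pid];

    if (p->interrupts_held)
        return false;
    p->absorbed_generation = st->current_generation;
    return true;
}

/*
 * One polling pass of the wait: true if every process except 'skip_pid'
 * has absorbed 'generation'. Pass -1 to check all processes, which models
 * a wait that includes the emitting process itself.
 */
bool
toy_barrier_absorbed(const ToyBarrierState *st, uint64_t generation,
                     int skip_pid)
{
    assert(st->nprocs <= MAX_PROCS);

    for (int i = 0; i < st->nprocs; i++)
    {
        if (i == skip_pid)
            continue;           /* do not wait on ourselves */
        if (st->procs[i].absorbed_generation < generation)
            return false;
    }
    return true;
}
```

In this model, if slot 0 plays the checkpointer with interrupts held, a
check with skip_pid = -1 stays false no matter how often it is polled,
while a check with skip_pid = 0 becomes true as soon as the other slots
have absorbed the barrier. Slot 0 still has to absorb the barrier later,
once interrupts are resumed, which is the deferred step the mail refers to.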