Thread: [HACKERS] Fast promotion not used when doing a recovery_target PITR restore?
From
Andres Freund
Date:
Hi,

When doing a PITR style recovery, with recovery target set, we're
currently not doing a fast promotion, in contrast to the handling when
doing a pg_ctl or trigger file based promotion. That can prolong making
the server available for writes.

I can't really see a reason for this?

Greetings,

Andres Freund
Re: [HACKERS] Fast promotion not used when doing a recovery_target PITR restore?
From
Michael Paquier
Date:
On Thu, Jun 22, 2017 at 3:04 AM, Andres Freund <andres@anarazel.de> wrote:
> When doing a PITR style recovery, with recovery target set, we're
> currently not doing a fast promotion, in contrast to the handling when
> doing a pg_ctl or trigger file based promotion. That can prolong making
> the server available for writes.
>
> I can't really see a reason for this?

Yes, you are right. I see no reason either why this cannot be done.
Why not just switch fast_promote to true when using
RECOVERY_TARGET_ACTION_PROMOTE? That's a bug, not a critical one
though.
--
Michael
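To make the suggestion above concrete, here is a minimal sketch of the decision being discussed. It is a simplified, self-contained model and not the actual PostgreSQL source: the types `TargetAction`, `EndOfRecoveryMode` and `RecoveryState`, and the functions `current_behavior` and `proposed_behavior`, are hypothetical names invented for illustration. Only `fast_promote` and `RECOVERY_TARGET_ACTION_PROMOTE` are names taken from the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the recovery target action setting. */
typedef enum
{
    TARGET_ACTION_PAUSE,
    TARGET_ACTION_PROMOTE,      /* models RECOVERY_TARGET_ACTION_PROMOTE */
    TARGET_ACTION_SHUTDOWN
} TargetAction;

/* How recovery ends before the server accepts writes (hypothetical). */
typedef enum
{
    END_FULL_CHECKPOINT,        /* wait for a full checkpoint first */
    END_FAST_PROMOTE            /* accept writes first, checkpoint afterwards */
} EndOfRecoveryMode;

/* Hypothetical summary of the state at the end of WAL replay. */
typedef struct
{
    bool         promote_requested;  /* pg_ctl promote or trigger file seen */
    bool         reached_target;     /* a recovery target was reached */
    TargetAction target_action;
} RecoveryState;

/* Behavior described in the thread: only an explicit promotion is fast. */
static EndOfRecoveryMode
current_behavior(const RecoveryState *st)
{
    assert(st != NULL);
    return st->promote_requested ? END_FAST_PROMOTE : END_FULL_CHECKPOINT;
}

/* Suggested behavior: reaching a target with action "promote" is fast too. */
static EndOfRecoveryMode
proposed_behavior(const RecoveryState *st)
{
    assert(st != NULL);
    if (st->promote_requested)
        return END_FAST_PROMOTE;
    if (st->reached_target && st->target_action == TARGET_ACTION_PROMOTE)
        return END_FAST_PROMOTE;
    return END_FULL_CHECKPOINT;
}
```

In this model, a PITR restore that reaches its target with the promote action takes the slower path under `current_behavior` and the fast path under `proposed_behavior`; every other case is unchanged.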
Re: [HACKERS] Fast promotion not used when doing a recovery_target PITR restore?
From
Andres Freund
Date:
On 2017-06-22 14:04:42 +0900, Michael Paquier wrote:
> On Thu, Jun 22, 2017 at 3:04 AM, Andres Freund <andres@anarazel.de> wrote:
> > When doing a PITR style recovery, with recovery target set, we're
> > currently not doing a fast promotion, in contrast to the handling when
> > doing a pg_ctl or trigger file based promotion. That can prolong making
> > the server available for writes.
> >
> > I can't really see a reason for this?
>
> Yes, you are right. I see no reason either why this cannot be done.
> Why not just switch fast_promote to true when using
> RECOVERY_TARGET_ACTION_PROMOTE? That's a bug, not a critical one
> though.

I don't think it's really a bug - just a missed optimization. I'd
personally not be in favor of backpatching this - it'll have some chance
of screwing things up, even if I hope that chance is fairly small.

As a wider discussion, I wonder if we should keep non-fast promotion for
anything but actual crash recovery? And even there it might actually be
a pretty good idea to not force a full checkpoint - getting up fast
after a crash is kinda important..

Andres Freund
Re: [HACKERS] Fast promotion not used when doing a recovery_target PITR restore?
From
Michael Paquier
Date:
On Fri, Jun 23, 2017 at 2:34 AM, Andres Freund <andres@anarazel.de> wrote:
> I don't think it's really a bug - just a missed optimization. I'd
> personally not be in favor of backpatching this - it'll have some chance
> of screwing things up, even if I hope that chance is fairly small.

It would be better to wait until the branch for PG11 opens then.

> As a wider discussion, I wonder if we should keep non-fast promotion for
> anything but actual crash recovery?

Yes, I would push a bit forward and remove fallback_promote.

> And even there it might actually be
> a pretty good idea to not force a full checkpoint - getting up fast
> after a crash is kinda important..

But not that. Crash recovery is designed to be simple and robust, with
only the postmaster and the startup processes running when doing so.
Not having the startup process doing by itself checkpoints would
require the need of the bgwriter, which increases the likelihood of
bugs. In short, I don't think that improving performance is the matter
for crash recovery, robustness and simplicity are.
--
Michael
Re: [HACKERS] Fast promotion not used when doing a recovery_target PITR restore?
From
Andres Freund
Date:
On 2017-06-23 10:56:07 +0900, Michael Paquier wrote:
> > And even there it might actually be
> > a pretty good idea to not force a full checkpoint - getting up fast
> > after a crash is kinda important..
>
> But not that. Crash recovery is designed to be simple and robust, with
> only the postmaster and the startup processes running when doing so.
> Not having the startup process doing by itself checkpoints would
> require the need of the bgwriter, which increases the likelihood of
> bugs. In short, I don't think that improving performance is the matter
> for crash recovery, robustness and simplicity are.

I'm far from convinced by this. By now WAL replay with checkpointer,
bgwriter, etc. active is actually *more* tested than the cases without
it. The likelihood of bugs is higher in the less frequently exercised
paths, and given that replication exercises the situation with all those
processes active on a continuous basis, I'm fairly unconvinced by your
argument.

- Andres
Re: [HACKERS] Fast promotion not used when doing a recovery_target PITR restore?
From
Michael Paquier
Date:
On Wed, Jun 28, 2017 at 3:44 AM, Andres Freund <andres@anarazel.de> wrote:
> I'm far from convinced by this. By now WAL replay with checkpointer,
> bgwriter, etc. active is actually *more* tested than the cases without
> it. The likelihood of bugs is higher in the less frequently exercised
> paths, and given that replication exercises the situation with all those
> processes active on a continuous basis, I'm fairly unconvinced by your
> argument.

Crash recovery is the last thing where failures should never happen.
Don't you think that it should remain simple as it has been designed
originally? It seems to me that the argument for keeping things simple
has higher priority than performance in being able to reconnect by
delaying the checkpoint.
--
Michael
Re: [HACKERS] Fast promotion not used when doing a recovery_target PITR restore?
From
Andres Freund
Date:
On 2017-06-28 06:04:23 +0900, Michael Paquier wrote:
> On Wed, Jun 28, 2017 at 3:44 AM, Andres Freund <andres@anarazel.de> wrote:
> > I'm far from convinced by this. By now WAL replay with checkpointer,
> > bgwriter, etc. active is actually *more* tested than the cases without
> > it. The likelihood of bugs is higher in the less frequently exercised
> > paths, and given that replication exercises the situation with all those
> > processes active on a continuous basis, I'm fairly unconvinced by your
> > argument.
>
> Crash recovery is the last thing where failures should never happen.
> Don't you think that it should remain simple as it has been designed
> originally? It seems to me that the argument for keeping things simple
> has higher priority than performance in being able to reconnect by
> delaying the checkpoint.

You seem to completely argue besides my point that the replication path
is *more* robust by now? And there's plenty scenarios where a faster
startup is quite crucial for performance. The difference between an
immediate shutdown + recovery without checkpoint to a fast shutdown can
be very large, and that matters a lot for faster postgres updates etc.

Andres
Re: [HACKERS] Fast promotion not used when doing a recovery_target PITR restore?
From
Michael Paquier
Date:
On Wed, Jun 28, 2017 at 6:13 AM, Andres Freund <andres@anarazel.de> wrote:
> You seem to completely argue besides my point that the replication path
> is *more* robust by now? And there's plenty scenarios where a faster
> startup is quite crucial for performance. The difference between an
> immediate shutdown + recovery without checkpoint to a fast shutdown can
> be very large, and that matters a lot for faster postgres updates etc.

If you go that way, it seems safer to me if users had some control with
a switch, defaulting to the previous behavior. And a complete switch to
the newer behavior could be done later on depending on what has been
found.
--
Michael
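The opt-in switch proposed above could look roughly like the following sketch. This is an illustration only: `StartupOptions`, `defer_crash_recovery_checkpoint`, `default_startup_options` and `checkpoint_before_accepting_writes` are hypothetical names, and PostgreSQL has no setting called that. The point is just that the default keeps the previous behavior and the newer one must be enabled explicitly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical options struct; no such PostgreSQL parameter exists. */
typedef struct
{
    /* false = previous behavior: full checkpoint at end of crash recovery */
    bool defer_crash_recovery_checkpoint;
} StartupOptions;

/* Defaults preserve the previous behavior, as suggested in the thread. */
static StartupOptions
default_startup_options(void)
{
    StartupOptions opts = { false };
    return opts;
}

/* Must crash recovery finish a checkpoint before writes are allowed? */
static bool
checkpoint_before_accepting_writes(const StartupOptions *opts)
{
    assert(opts != NULL);
    return !opts->defer_crash_recovery_checkpoint;
}
```

Flipping the default later, once the newer path has seen enough use, would then be a one-line change in `default_startup_options`.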