Thread: Getting rid of wal_level=archive and default to hot_standby + wal_senders
Hi,

I think these days there's no reason for the split between the archive and hot_standby wal levels. The split was made out of volume and stability concerns. I think we can by now be confident about the wal_level = hot_standby changes (note I'm not proposing hot_standby = on).

So let's remove the split. It just gives users a choice between two options that don't have a meaningful difference.

Additionally I think we should change the default for wal_level to hot_standby and max_wal_senders (maybe to 5). That way users can use pg_basebackup and set up streaming standbys without having to restart the primary. I think that'd be an important step in making setup easier.

Previously there have been arguments against changing the default of wal_level because it'd mean the regression tests wouldn't exercise minimal anymore. That might be true, but then right now we just don't exercise the more complex levels. If we're really concerned we can just force a different value during the tests, just as we do for prepared xacts.

Comments?

Additionally, more complex and further into the future, I wonder if we couldn't also get rid of wal_level = logical by automatically switching to it whenever logical slots are active.

Greetings,

Andres Freund

-- 
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: Getting rid of wal_level=archive and default to hot_standby + wal_senders
From: Magnus Hagander
On Tue, Feb 3, 2015 at 1:43 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Hi,
> I think these days there's no reason for the split between the archive
> and hot_standby wal levels. The split was made out of volume and
> stability concerns. I think we can by now be confident about the
> wal_level = hot_standby changes (note I'm not proposing hot_standby =
> on).
> So let's remove the split. It just gives users choice between two options
> that don't have a meaningful difference.
+1.
> Additionally I think we should change the default for wal_level to
> hot_standby and max_wal_senders (maybe to 5). That way users can use
> pg_basebackup and setup streaming standbys without having to restart the
> primary. I think that'd be a important step in making setup easier.
Yes, please!
Those who want to optimize their WAL size can set it back to minimal, but let's make the default the one that makes life *easy* for people.
The other option, which would be more complicated (I have a semi-finished patch that I never got time to clean up), would be for pg_basebackup to be able to dynamically raise the value of wal_level during its run. It would not help with the streaming standby part, but it would help with pg_basebackup. That could be useful independently, for those who prefer using wal_level=minimal and also pg_basebackup.
> Previously there have been arguments against changing the default of
> wal_level because it'd mean the regression tests wouldn't exercise
> minimal anymore. That might be true, but then right now we just don't
> exercise the more complex levels. If we're really concerned we can just
> force a different value during the tests, just as we do for prepared
> xacts.
Seems we should focus our tests on the stuff that people actually use in reality? :) And if we change the default, then even more people will use that level.
But it would definitely be a good idea to have some buildfarm animals set up to test each one.
> Comments?
> Additionally, more complex and further into the future, I wonder if we
> couldn't also get rid of wal_level = logical by automatically switching
> to it whenever logical slots are active.
If it can be safely done online, I definitely think that would be a good goal to have. If we could do the same for hot_standby if you had physical slots, that might also be a good idea?
Re: Getting rid of wal_level=archive and default to hot_standby + wal_senders
From: Michael Paquier
On Tue, Feb 3, 2015 at 9:43 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> I think these days there's no reason for the split between the archive
> and hot_standby wal levels. The split was made out of volume and
> stability concerns. I think we can by now be confident about the
> wal_level = hot_standby changes (note I'm not proposing hot_standby =
> on).

+1.

> So let's remove the split. It just gives users choice between two options
> that don't have a meaningful difference.

The last time I mentioned something similar (purely removing archive from wal_level, CA+TgmoaTG9U4=A_bs8SbdEMM2+faPQhzUjhJ7F-nPFy+BNs_zA@mail.gmail.com), there were two additional suggestions made as well:
- Keep archive and make it mean archive <=> hot_standby
- Do nothing, to still let the users choose what they think is better and not what we think is better.

Perhaps times have changed since... I guess that you mean making both values become equivalent, right?
-- 
Michael
Re: Getting rid of wal_level=archive and default to hot_standby + wal_senders
From: Andres Freund
On 2015-02-03 13:51:25 +0100, Magnus Hagander wrote:
> Those who want to optimize their WAL size can set it back to minimal, but
> let's make the default the one that makes life *easy* for people.

Precisely. New users won't usually have tremendous stuff to load in the specific circumstances in which minimal makes a difference. And pretty much everyone uses base backups & replication outside of initial loading anyway.

> The other option, which would be more complicated (I have a semi-finished
> patch that I never got time to clean up) would be for pg_basebackup to be
> able to dynamically raise the value of wal_level during it's run. It would
> not help with the streaming standby part, but it would help with
> pg_basebackup. That could be useful independent - for those who prefer
> using wal_level=minimal and also pg_basebackup..

There are some ugly corner cases in making that happen. I'm sure we could do it, but I'm really not convinced it's worth the effort.

> > Previously there have been arguments against changing the default of
> > wal_level because it'd mean the regression tests wouldn't exercise
> > minimal anymore. That might be true, but then right now we just don't
> > exercise the more complex levels. If we're really concerned we can just
> > force a different value during the tests, just as we do for prepared
> > xacts.

> Seems we should focus our tests on the stuff that people actually use in
> reality? :) And if we change the default, then even more people will use
> that level.

Agreed. It's not my argument ;)

> But it would definitely be a good idea to have some buildfarm animals set
> up to test each one.

.oO(make it random in the buildfarm code)

> > Additionally, more complex and further into the future, I wonder if we
> > couldn't also get rid of wal_level = logical by automatically switching
> > to it whenever logical slots are active.

> If it can be safely done online, I definitely think that would be a good
> goal to have.

I think it could at least be made to work on the primary. I don't really see how on a standby (which we don't support yet anyway).

> If we could do the same for hot_standby if you had physical slots,
> that might also be a good idea?

I think it's slightly more complicated there. We'd have to delay slot creation till after a checkpoint and such, which doesn't seem that desirable. I think it's more interesting for logical anyway - there are some common workloads where it actually does imply some (not large, mind you) overhead. So we can't just change to it as a default.

Greetings,

Andres Freund

-- 
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: Getting rid of wal_level=archive and default to hot_standby + wal_senders
From: Andres Freund
On 2015-02-03 21:58:44 +0900, Michael Paquier wrote:
> On Tue, Feb 3, 2015 at 9:43 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> > I think these days there's no reason for the split between the archive
> > and hot_standby wal levels. The split was made out of volume and
> > stability concerns. I think we can by now be confident about the
> > wal_level = hot_standby changes (note I'm not proposing hot_standby =
> > on).
> +1.
>
> > So let's remove the split. It just gives users choice between two options
> > that don't have a meaningful difference.
>
> The last time I mentioned something similar (purely removing archive
> from wal_level CA+TgmoaTG9U4=A_bs8SbdEMM2+faPQhzUjhJ7F-nPFy+BNs_zA@mail.gmail.com),
> there were two additional suggestions done as well:
> - Keep archive and make it mean archive <=> hot_standby

That's actually what I was thinking of implementing. I.e. accept archive, but output hot_standby.

> - Do nothing to still let the users what they think is better and not
> what we think is better.

I'd rather remove the supporting code that takes different branches based on those.

> Perhaps times have changed since...

I don't think the arguments back then made all that much sense. But perhaps they're also less important if we change the default.

For me, removing an option that has no effect other than saving a couple of bytes every now and then and adding to the testing matrix isn't 'nannyism'. It's removing pointless choice.

Greetings,

Andres Freund

-- 
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: Getting rid of wal_level=archive and default to hot_standby + wal_senders
From: Petr Jelinek
On 03/02/15 13:51, Magnus Hagander wrote:
> On Tue, Feb 3, 2015 at 1:43 PM, Andres Freund <andres@2ndquadrant.com
> <mailto:andres@2ndquadrant.com>> wrote:
>
>     Hi,
>
>     I think these days there's no reason for the split between the archive
>     and hot_standby wal levels. The split was made out of volume and
>     stability concerns. I think we can by now be confident about the
>     wal_level = hot_standby changes (note I'm not proposing hot_standby =
>     on).
>
>     So let's remove the split. It just gives users choice between two options
>     that don't have a meaningful difference.
>
>
> +1.
>

+1 too

>     Additionally I think we should change the default for wal_level to
>     hot_standby and max_wal_senders (maybe to 5). That way users can use
>     pg_basebackup and setup streaming standbys without having to restart the
>     primary. I think that'd be a important step in making setup easier.
>
>
> Yes, please!
>
> Those who want to optimize their WAL size can set it back to minimal,
> but let's make the default the one that makes life *easy* for people.
>
> The other option, which would be more complicated (I have a
> semi-finished patch that I never got time to clean up) would be for
> pg_basebackup to be able to dynamically raise the value of wal_level
> during it's run. It would not help with the streaming standby part, but
> it would help with pg_basebackup. That could be useful independent - for
> those who prefer using wal_level=minimal and also pg_basebackup..
>

This is not that easy to do, let's do it one step at a time.

>     Comments?
>
>     Additionally, more complex and further into the future, I wonder if we
>     couldn't also get rid of wal_level = logical by automatically switching
>     to it whenever logical slots are active.
>
>
>
> If it can be safely done online, I definitely think that would be a good
> goal to have. If we could do the same for hot_standby if you had
> physical slots, that might also be a good idea?
>

+many for the logical, physical would be nice but I think it's again in the category of not so easy and maybe better as next step if at all.

-- 
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andres Freund <andres@2ndquadrant.com> writes:
> Additionally I think we should change the default for wal_level to
> hot_standby and max_wal_senders (maybe to 5). That way users can use
> pg_basebackup and setup streaming standbys without having to restart the
> primary. I think that'd be a important step in making setup easier.

I always thought the reason for defaulting to "minimal" was performance. I'd like to see proof that the impact of wal_level = hot_standby is negligible before we consider doing this.

The argument that having to change one more GUC is an undue burden while configuring hot standby seems ridiculous from here. HS is not nearly "push the EASY button and you're done", and this change wouldn't make it so.

regards, tom lane
Re: Getting rid of wal_level=archive and default to hot_standby + wal_senders
From: Andres Freund
On 2015-02-03 10:41:04 -0500, Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
> > Additionally I think we should change the default for wal_level to
> > hot_standby and max_wal_senders (maybe to 5). That way users can use
> > pg_basebackup and setup streaming standbys without having to restart the
> > primary. I think that'd be a important step in making setup easier.
>
> I always thought the reason for defaulting to "minimal" was performance.
> I'd like to see proof that the impact of wal_level = hot_standby is
> negligible before we consider doing this.

Well, it really depends on what you're doing. The cases where minimal is beneficial are when you COPY into a table that's been created in the same (sub)xact, or rewrite it. Other than that there's not really a difference?

> The argument that having to change one more GUC is an undue burden while
> configuring hot standby seems ridiculous from here. HS is not nearly
> "push the EASY button and you're done", and this change wouldn't make
> it so.

But pg_basebackup is pretty close to being that easy, isn't it? There's still pg_hba.conf to deal with, but other than that...

And with a little bit more work (safely autocreate replication slots, so wal files aren't getting removed prematurely), we can make HS simpler as well.

Greetings,

Andres Freund

-- 
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: Getting rid of wal_level=archive and default to hot_standby + wal_senders
From: Robert Haas
On Tue, Feb 3, 2015 at 7:43 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> I think these days there's no reason for the split between the archive
> and hot_standby wal levels. The split was made out of volume and
> stability concerns. I think we can by now be confident about the
> wal_level = hot_standby changes (note I'm not proposing hot_standby =
> on).
>
> So let's remove the split. It just gives users choice between two options
> that don't have a meaningful difference.
>
> Additionally I think we should change the default for wal_level to
> hot_standby and max_wal_senders (maybe to 5). That way users can use
> pg_basebackup and setup streaming standbys without having to restart the
> primary. I think that'd be a important step in making setup easier.
>
> Previously there have been arguments against changing the default of
> wal_level because it'd mean the regression tests wouldn't exercise
> minimal anymore. That might be true, but then right now we just don't
> exercise the more complex levels. If we're really concerned we can just
> force a different value during the tests, just as we do for prepared
> xacts.
>
> Comments?
>
> Additionally, more complex and further into the future, I wonder if we
> couldn't also get rid of wal_level = logical by automatically switching
> to it whenever logical slots are active.

I think my vote is to maintain the status quo. What you're basically proposing to do is ship the system half-configured for replication, and I don't see the point of that. The people who want replication still have to do the rest of the setup anyway, and the people who don't want replication are losing the benefits of wal_level=minimal for no real gain. In particular, they are losing the ability to skip WAL-logging when bulk-loading a just-created table, which is not a small thing. I'm fairly sure we have customers who benefit significantly from that behavior.

I agree that wal_level=archive doesn't serve much purpose at this point. I guess I wouldn't object to removing that, although I can't see much benefit to doing so, either.

Crazy ideas: Could we make wal_level something other than PGC_POSTMASTER? PGC_SIGHUP would be nice... Could we, maybe, even make it a derived value rather than one that is explicitly configured? Like, if you set max_wal_senders>0, you automatically get wal_level=hot_standby? If you register a logical replication slot, you automatically get wal_level=logical?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Getting rid of wal_level=archive and default to hot_standby + wal_senders
From: Andres Freund
On 2015-02-03 11:00:43 -0500, Robert Haas wrote:
> On Tue, Feb 3, 2015 at 7:43 AM, Andres Freund <andres@2ndquadrant.com> wrote:
> > I think these days there's no reason for the split between the archive
> > and hot_standby wal levels. The split was made out of volume and
> > stability concerns. I think we can by now be confident about the
> > wal_level = hot_standby changes (note I'm not proposing hot_standby =
> > on).
> >
> > So let's remove the split. It just gives users choice between two options
> > that don't have a meaningful difference.
> >
> > Additionally I think we should change the default for wal_level to
> > hot_standby and max_wal_senders (maybe to 5). That way users can use
> > pg_basebackup and setup streaming standbys without having to restart the
> > primary. I think that'd be a important step in making setup easier.
> >
> > Previously there have been arguments against changing the default of
> > wal_level because it'd mean the regression tests wouldn't exercise
> > minimal anymore. That might be true, but then right now we just don't
> > exercise the more complex levels. If we're really concerned we can just
> > force a different value during the tests, just as we do for prepared
> > xacts.
> >
> > Comments?
> >
> > Additionally, more complex and further into the future, I wonder if we
> > couldn't also get rid of wal_level = logical by automatically switching
> > to it whenever logical slots are active.
>
> I think my vote is to maintain the status quo. What you're basically
> proposing to do is ship the system half-configured for replication,
> and I don't see the point of that.

Not only replication, but also hot backup.

I think we should actually ship it fully configured for that in the long term. This is the biggest step towards that.

At the moment it's really hard to get there for a beginner. Usually it goes like
1) Try to create a base backup. Fails because of max_wal_senders.
2) Try to adjust max_wal_senders, fails because of wal_level. Set to archive.
3) New base backup is created.
4) Try to start the new base backup with hot_standby enabled, fails because of wal_level.
5) Enable wal_level=hot_standby, restart master
6) Restart standby. Still fails because it's trying to start from the checkpoint with wal_level still archive.
7) Give up here. If not earlier.

I think our out of the box experience is ridiculously tuned towards corner cases that aren't very frequent. And those cases need further tuning anyway. Many, many solutions out there are much easier to set up initially.

> The people who want replication still have to do the rest of the setup
> anyway, and the people who don't want replication are losing the
> benefits of wal_level=minimal for no real gain.

How many people, with a big enough data directory that it matters, want neither base backups nor replication? Except during initial load that's got to be a minority these days.

> In particular, they are losing the ability to skip WAL-logging when
> bulk-loading a just-created table, which is not a small thing.

Aside from restoring from a dump it's something that's not that easy to actually benefit from. The restore is a good use case - but for it you usually want/need to tune lots of other things. Without fsync=off the increased number of fsyncs and such can actually end up hurting you...

> I'm fairly sure we have customers who benefit significantly from that
> behavior.

Sure, same here. But I'll bet it's a much smaller number than those that'd have benefited from a simpler setup of backups and replication.

I'm not out to get rid of minimal, I just don't think there's much point in having it as default these days.

> Crazy ideas: Could we make wal_level something other than
> PGC_POSTMASTER? PGC_SIGHUP would be nice...

Unfortunately not easy, because it effectively takes a checkpoint to make it active. But yea, I've wondered about it as well.

> Could we, maybe, even make it a derived value rather than one that is
> explicitly configured? Like, if you set max_wal_senders>0, you automatically get
> wal_level=hot_standby?

Our experience with derived gucs isn't that great. Remember the whole effective_cache_size mess? Maybe we just need to find a better way to implement that though, instead of avoiding it from here on.

> If you register a logical replication slot, you automatically get
> wal_level=logical?

That actually shouldn't be very hard, if the level is hot_standby beforehand. At least not on the primary, on the standby it obviously can't work (not that we support decoding there yet).

Greetings,

Andres Freund

-- 
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: Getting rid of wal_level=archive and default to hot_standby + wal_senders
From: Robert Haas
On Wed, Feb 4, 2015 at 8:44 AM, Andres Freund <andres@2ndquadrant.com> wrote:
>> I think my vote is to maintain the status quo. What you're basically
>> proposing to do is ship the system half-configured for replication,
>> and I don't see the point of that.
>
> Not only replication, but also hot backup.
>
> I think we should actually should ship it fully configured for that in
> the long term. This is the biggest step towards that.
>
> At the moment it's really hard to get there for a beginner. Usually it
> goes like
> 1) Try to create a base backup. Fails because of max_wal_senders.
> 2) Try to adjust max_wal_senders, fails because of wal_level. Set to
> archive.
> 3) New base backup is created.
> 4) Try to start the new base backup with hot_standby enabled, fails
> because of wal_level.
> 5) Enable wal_level=hot_standby, restart master
> 6) Restart standby. Still fails because it's trying to start from the
> checkpoint with wal_level still archive.
> 7) Give up here. If not earlier.

In my opinion, the solution to this problem is to make this stuff simpler to configure. You might be right that the wal_level=minimal configuration is mostly useful for the initial load, but the initial load is the step a lot of people do first. I don't think we should just dismiss that as unimportant.

>> Could we, maybe, even make it a derived value rather than one that is
>> explicitly configured? Like, if you set max_wal_senders>0, you automatically get
>> wal_level=hot_standby?
>
> Our experience with derived gucs isn't that great. Remember the whole
> effective_cache_size mess? Maybe we just need to find a better way to
> implement that though, instead of avoiding it from here on.

The only thing I remember about effective_cache_size is that Bruce had a theory that multiplying a constant by the size of shared_buffers would give you an estimate of the total memory of the system, but since few people run with shared_buffers>8GB and many people have RAM>32GB, that was a lame way to estimate it. I think the auto-tuning of wal_buffers has been pretty successful.

Anyway, I'm not talking about deriving the GUC, I'm talking about deriving the WAL level which is currently controlled solely by the GUC. We do something like this for full-page writes. Even if you in general have full_page_writes=off, trying to take a hot backup forces it on. This is smart. I think we could do something similar for replication & hot backup.

Suppose we remove the wal_level GUC altogether, but there's a control file property that indicates whether replication (broadly construed to include hot backup and PITR) is enabled. Actually, more specifically, we store an LSN. If it's 0, replication features are disabled; if it's the location of the previous checkpoint, we're in the process of enabling replication features; if it precedes the location of the previous checkpoint, replication features are enabled.

Then, we add a command like this:

ALTER SYSTEM REPLICATION ENABLE;

When you do that, it sets the LSN in the control file to the location of the most recent checkpoint, and then triggers a checkpoint. When the checkpoint is complete, it returns. You can shut it off again by saying:

ALTER SYSTEM REPLICATION DISABLE;

...which just zeroes out the LSN in the control file. A copy of the control-file LSN is stored in shared memory; when it's non-zero, backends behave like wal_level=hot_standby; when zero, they behave like wal_level=minimal. Figuring out how to make sure they notice the change in a timely fashion is, uh, left as an exercise for the student. Hopefully that's a solvable problem.

max_wal_senders can be set to a non-zero value even when replication is disabled, but connections are refused with a suitable error message until you enable it. We ship with a default value of, say, max_wal_senders=3. Then your above outline gets simplified to this:

1. Try to create a base backup. Fails with "ERROR: replication must be enabled to create a base backup; HINT: Use ALTER SYSTEM REPLICATION ENABLE to enable this feature".
2. Run the command from the HINT.
3. Try again; it works.

>> If you register a logical replication slot, you automatically get
>> wal_level=logical?
>
> That actually shouldn't be very hard, if the level is hot_standby
> beforehand. At least not on the primary, on the standby it obviously
> can't work (not that we support decoding there yet).

If hot_standby_feedback=on, it would be reasonable for the standby to let the master know that it now needs wal_level=logical; or there could be ALTER SYSTEM REPLICATION ENABLE LOGICAL or whatever.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
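
For readers following along, the control-file LSN scheme Robert proposes above can be sketched roughly as follows. This is only an illustration in Python, not PostgreSQL code: LSNs are modeled as plain integers, and the names `replication_state`, `effective_wal_level`, and `INVALID_LSN` are hypothetical.

```python
from enum import Enum

# Hypothetical sentinel: an LSN of 0 in the control file means "disabled".
INVALID_LSN = 0


class ReplicationState(Enum):
    DISABLED = "disabled"
    ENABLING = "enabling"
    ENABLED = "enabled"


def replication_state(enable_lsn: int, last_checkpoint_lsn: int) -> ReplicationState:
    """Classify the stored LSN against the location of the latest checkpoint."""
    if enable_lsn == INVALID_LSN:
        return ReplicationState.DISABLED
    if enable_lsn == last_checkpoint_lsn:
        # ENABLE was issued, but the checkpoint it triggered has not finished.
        return ReplicationState.ENABLING
    if enable_lsn < last_checkpoint_lsn:
        # A newer checkpoint has completed since ENABLE was issued.
        return ReplicationState.ENABLED
    raise ValueError("stored LSN is ahead of the latest checkpoint")


def effective_wal_level(enable_lsn: int) -> str:
    """Backends only check whether the shared-memory copy is non-zero."""
    return "hot_standby" if enable_lsn != INVALID_LSN else "minimal"
```

In this model, ALTER SYSTEM REPLICATION ENABLE would copy the latest checkpoint location into `enable_lsn` (state "enabling") and then trigger a checkpoint; once that completes, the same stored value precedes the latest checkpoint and the state reads as "enabled".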
Andres Freund <andres@2ndquadrant.com> writes:
> On 2015-02-03 11:00:43 -0500, Robert Haas wrote:
>> Could we, maybe, even make it a derived value rather than one that is
>> explicitly configured? Like, if you set max_wal_senders>0, you automatically get
>> wal_level=hot_standby?

> Our experience with derived gucs isn't that great. Remember the whole
> effective_cache_size mess? Maybe we just need to find a better way to
> implement that though, instead of avoiding it from here on.

We've proven that it's a bad idea to have a GUC whose default value depends on another one. However, I thought the proposal here was to get rid of wal_level as a user-visible knob altogether. That seems like a fine idea if we can drive the decision off other GUCs instead.

regards, tom lane
Re: Getting rid of wal_level=archive and default to hot_standby + wal_senders
From: Josh Berkus
On 02/04/2015 06:48 AM, Robert Haas wrote:
> Anyway, I'm not talking about deriving the GUC, I'm talking about
> deriving the WAL level which is currently controlled solely by the
> GUC. We do something like this for full-page writes. Even if you in
> general have full_page_writes=off, trying to take a hot backup forces
> it on. This is smart. I think we could do something similar for
> replication & hot backup. Suppose we remove the wal_level GUC
> altogether, but there's a control file property that indicates whether
> replication (broadly construed to include hot backup and PITR) is
> enabled. Actually, more specifically, we store an LSN. If it's 0,
> replication features are disabled; if it's the location of the
> previous checkpoint, we're in the process of enabling replication
> features; if it precedes the location of the previous checkpoint,
> replication features are enabled.
>
> Then, we add a command like this:
>
> ALTER SYSTEM REPLICATION ENABLE;
>
> When you do that, it sets the LSN in the control file to the location
> of the most recent checkpoint, and then triggers a checkpoint. When
> the checkpoint is complete, it returns. You can shut it off again by
> saying:
>
> ALTER SYSTEM REPLICATION DISABLE;

This would be awesome, and a huge step forward in usability.

Question, though: do we want to distinguish between "hot_standby" and "logical" levels? Does this depend on anything other than the WAL log volume/speed? If not, we can do some tests.

If Robert's suggestion proves prohibitively difficult, I also +1 the idea of not having a user-visible wal_level setting at all. Specifically, I'd love the following behavior:

    if logical_replication_slots > 0
        wal_level = logical
    elif max_wal_senders > 0
        wal_level = hot_standby
    elif archiving = on
        wal_level = archive (or hot_standby)
    else
        wal_level = minimal

Given that this decision tree is the only possible decision tree, it makes you kind of wonder why we have an explicit GUC in the first place.
-- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
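
Josh's decision tree above translates almost directly into runnable code. The following is a minimal Python sketch, assuming the three inputs are simply passed in as arguments; the function name `derive_wal_level` is hypothetical, and the parameter names follow his shorthand rather than any confirmed setting names.

```python
def derive_wal_level(logical_replication_slots: int,
                     max_wal_senders: int,
                     archiving: bool) -> str:
    """Pick the lowest WAL level that satisfies the configured features."""
    if logical_replication_slots > 0:
        return "logical"
    if max_wal_senders > 0:
        return "hot_standby"
    if archiving:
        # Josh notes this branch could equally return "hot_standby".
        return "archive"
    return "minimal"
```

The branches are checked from the most demanding feature down, so a system with both logical slots and WAL senders ends up at "logical".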
Re: Getting rid of wal_level=archive and default to hot_standby + wal_senders
From: Peter Eisentraut
On 2/3/15 11:00 AM, Robert Haas wrote:
> Crazy ideas: Could we make wal_level something other than
> PGC_POSTMASTER? PGC_SIGHUP would be nice... Could we, maybe, even
> make it a derived value rather than one that is explicitly configured?
> Like, if you set max_wal_senders>0, you automatically get
> wal_level=hot_standby? If you register a logical replication slot,
> you automatically get wal_level=logical?

We could probably make wal_level changeable at run-time if we somehow recorded the point at which it was changed, as you describe later (or even brute-force it by forcing a checkpoint every time it is changed, which is not worse than what we require now (or even just write out a warning that the setting is not effective until after a checkpoint)).

But that still leaves max_wal_senders (and arguably max_replication_slots) requiring a restart before replication can start. I don't see a great plan for those on the horizon.

To me, the restart requirement is the killer. That there are so many interwoven settings isn't great either, but there will always be more options, and all we can do is manage it.