Thread: Bump default wal_level to logical
Hi, I'd like to propose $subject, as embodied in the attached patch. This makes it possible to discover and fulfill a need for logical replication that can arise at a time when bouncing the server has become impractical, i.e. when there is already high demand on it. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
Attachment
On Mon, Jun 8, 2020 at 10:08 AM David Fetter <david@fetter.org> wrote: > > Hi, > > I'd like to propose $subject, as embodied in the attached patch. This > makes it possible to discover and fulfill a need for logical > replication that can arise at a time when bouncing the server has > become impractical, i.e. when there is already high demand on it. > I think we should first do performance testing to see what is the overhead of making this default. I think pgbench read-write at various scale factors would be a good starting point. Also, we should see how much additional WAL it generates as compared to current default. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Mon, Jun 08, 2020 at 11:59:14AM +0530, Amit Kapila wrote: > I think we should first do performance testing to see what is the > overhead of making this default. I think pgbench read-write at > various scale factors would be a good starting point. Also, we should > see how much additional WAL it generates as compared to current > default. +1. I recall that changing wal_level to logical has been discussed in the past and performance was the actual take to debate on. -- Michael
Attachment
On Mon, Jun 8, 2020 at 8:46 AM Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Jun 08, 2020 at 11:59:14AM +0530, Amit Kapila wrote:
> I think we should first do performance testing to see what is the
> overhead of making this default. I think pgbench read-write at
> various scale factors would be a good starting point. Also, we should
> see how much additional WAL it generates as compared to current
> default.
+1. I recall that changing wal_level to logical has been discussed in
the past and performance was the actual take to debate on.
That was at least the discussion (long-going and multi-repeated) before we upped it from minimal to replica. There were some pretty extensive benchmarking done to prove that the difference was very small, and this was weighed against the ability to take basic backups of the system (which arguably is more important than being able to do logical replication).
I agree that we should consider changing it *if* it does not come with a substantial overhead, but that has to be shown.
Of course, what would be even neater would be if it could be changed so you don't have to bounce the server to change the wal_level. That's a bigger change though, but perhaps it is now possible once we have the "global barriers" in 13?
On Mon, Jun 08, 2020 at 11:10:38AM +0200, Magnus Hagander wrote: >On Mon, Jun 8, 2020 at 8:46 AM Michael Paquier <michael@paquier.xyz> wrote: > >> On Mon, Jun 08, 2020 at 11:59:14AM +0530, Amit Kapila wrote: >> > I think we should first do performance testing to see what is the >> > overhead of making this default. I think pgbench read-write at >> > various scale factors would be a good starting point. Also, we should >> > see how much additional WAL it generates as compared to current >> > default. >> >> +1. I recall that changing wal_level to logical has been discussed in >> the past and performance was the actual take to debate on. >> > >That was at least the discussion (long-going and multi-repeated) before we >upped it from minimal to replica. There were some pretty extensive >benchmarking done to prove that the difference was very small, and this was >weighed against the ability to take basic backups of the system (which >arguably is more important than being able to do logical replication). > >I agree that we should consider changing it *if* it does not come with a >substantial overhead, but that has to be shown. > I agree performance evaluation is necessary, and I'm willing to spend some time on it. But I don't think the difference will be much worse than for the wal_level=replica, at least for common workloads. It's certainly possible to construct workloads with significant impact, due to the extra stuff (assignments, cache invalidations and so on). In general I think the case is somewhat weaker compared to the replica case, which was required for such basic things like physical backups. >Of course, what would be even neater would be if it could be changed so >you don't have to bounce the server to change the wal_level. That's a >bigger change though, but perhaps it is now possible once we have the >"global barriers" in 13? > Yeah. That would sidestep a lot of the performance concerns, because if switching from replica to logical is fairly easy / without restart, we could keep the current default. Not sure if it's sufficient, though, because switching to logical may require bumping up number of slots, walsenders, etc. At which point you actually need a restart. Not to mention that extensions using logical decoding (like pglogical) need to allocate shared memory etc. But for the built-in logical replication that is not an issue, ofc. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Jun 08, 2020 at 11:10:38AM +0200, Magnus Hagander wrote: > On Mon, Jun 8, 2020 at 8:46 AM Michael Paquier <michael@paquier.xyz> wrote: > > > On Mon, Jun 08, 2020 at 11:59:14AM +0530, Amit Kapila wrote: > > > I think we should first do performance testing to see what is the > > > overhead of making this default. I think pgbench read-write at > > > various scale factors would be a good starting point. Also, we should > > > see how much additional WAL it generates as compared to current > > > default. > > > > +1. I recall that changing wal_level to logical has been discussed in > > the past and performance was the actual take to debate on. > > > > That was at least the discussion (long-going and multi-repeated) before we > upped it from minimal to replica. There were some pretty extensive > benchmarking done to prove that the difference was very small, and this was > weighed against the ability to take basic backups of the system (which > arguably is more important than being able to do logical replication). I'd argue this a different direction. Logical replication has been at fundamental to how a lot of systems operate since Slony came out for the very good reason that it was far and away the simplest way to accomplish a bunch of design goals. There are now, and have been for some years, both free and proprietary systems whose sole purpose is change data capture. PostgreSQL can play nicely with those systems with this switch flipped on by default. Looking into the future of PostgreSQL itself, there are things we've been unable to do thus far that logical replication makes tractable. These include: - Zero downtime version changes - Substantive changes to our on-disk representations between versions because upgrading in place places sharp limits on what we could do. > I agree that we should consider changing it *if* it does not come > with a substantial overhead, but that has to be shown. What overhead would be substantial enough to require more work than changing the default, and under what circumstances? I ask this because on a heavily loaded system, the kind where differences could be practical as opposed to merely statistically significant, statement logging at even the most basic level is a much bigger burden than the maxed-out WALs are. Any overhead those WALs might impose simply disappears in the noise. The difference is even more stark in systems subject to audit. > Of course, what would be even neater would be if it could be changed > so you don't have to bounce the server to change the wal_level. > That's a bigger change though, but perhaps it is now possible once > we have the "global barriers" in 13? As much as I would love to have this capability, I was hoping to keep the scope of this contained. As pointed out down-thread, there's lots more to doing this dynamically that just turning up the wal_level. Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On 2020-Jun-08, Tomas Vondra wrote: > Not sure if it's sufficient, though, because switching to logical may > require bumping up number of slots, walsenders, etc. At which point you > actually need a restart. Not to mention that extensions using logical > decoding (like pglogical) need to allocate shared memory etc. But for > the built-in logical replication that is not an issue, ofc. I think it's reasonable to push our default limits for slots, walsenders, max_bgworkers etc a lot higher than current value (say 10 -> 100). An unused slot wastes essentially no resources; an unused walsender is just one PGPROC entry. If we did that, and also allowed wal_level to be changed on the fly, we wouldn't need to restart in order to enable logical replication, so there would be little or no pressure to change the wal_level default. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > I think it's reasonable to push our default limits for slots, > walsenders, max_bgworkers etc a lot higher than current value (say 10 -> > 100). An unused slot wastes essentially no resources; an unused > walsender is just one PGPROC entry. If we did that, and also allowed > wal_level to be changed on the fly, we wouldn't need to restart in order > to enable logical replication, so there would be little or no pressure > to change the wal_level default. Unused PGPROC entries will still consume semaphores, which is problematic on at least some OSes. It's not really clear to me why the default for walsenders would need to be O(100) anyway. The existing default of 10 already ought to be enough to cover approximately 99.999% of use cases. If we can allow wal_level to be changed on the fly, I agree that would help reduce the pressure to make the default setting more expensive. I don't recall why it's PGC_POSTMASTER right now, but I suppose there was a reason for that ... regards, tom lane
On Mon, Jun 8, 2020 at 1:16 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > I think it's reasonable to push our default limits for slots, > walsenders, max_bgworkers etc a lot higher than current value (say 10 -> > 100). An unused slot wastes essentially no resources; an unused > walsender is just one PGPROC entry. If we did that, and also allowed > wal_level to be changed on the fly, we wouldn't need to restart in order > to enable logical replication, so there would be little or no pressure > to change the wal_level default. Wouldn't having a whole bunch of extra PGPROC entries have negative implications for the performance of GetSnapshotData() and other things that don't scale well at high connection counts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Jun 08, 2020 at 02:58:03PM -0400, Robert Haas wrote: > On Mon, Jun 8, 2020 at 1:16 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > I think it's reasonable to push our default limits for slots, > > walsenders, max_bgworkers etc a lot higher than current value (say 10 -> > > 100). An unused slot wastes essentially no resources; an unused > > walsender is just one PGPROC entry. If we did that, and also allowed > > wal_level to be changed on the fly, we wouldn't need to restart in order > > to enable logical replication, so there would be little or no pressure > > to change the wal_level default. > > Wouldn't having a whole bunch of extra PGPROC entries have negative > implications for the performance of GetSnapshotData() and other things > that don't scale well at high connection counts? > +1 I think just having the defaults raised enough to allow even a couple DB replication slots would be advantageous and allow it to be used to address spur of the moment needs for systems that need to stay up. It does seem wasteful to by default support large numbers of slots and seems to be contrary to the project stance on initial limits. Regards, Ken
On Mon, Jun 8, 2020 at 5:11 AM Magnus Hagander <magnus@hagander.net> wrote: > I agree that we should consider changing it *if* it does not come with a substantial overhead, but that has to be shown. I think the big overhead is that you log the old version of each row's primary key (or whatever the replica identity is) when performing an UPDATE or DELETE. So if you test it with integer keys probably it's not bad, and I suspect (though I haven't looked) that we don't do the extra logging when they key hasn't changed. But if you have wide text columns as keys and you update them a lot then things might not look so good. I think in the bad cases for this feature the overhead is vastly more than going from minimal to replica. As many people here probably know, I am in general skeptical of this kind of change. It's based on the premise that reconfiguring the system is either too hard for users to figure out, or too disruptive because they'll need a restart. I tend to feel that the first problem should be solved by making it easier to figure out, and the second one should be solved by not requiring a restart. I don't think that's easy engineering, because while I think barriers help, they only address one relatively small aspect of what's probably a pretty difficult engineering problem. But the real-life analogue of what's being proposed here seems to be "the people who are buying this house might not be able to figure out how to turn the lights on if they need more light, so let's just turn on all the lights before we hand over the keys, and that way if they just leave them on forever it'll be cool." To which any reasonable person would say - "if your electrical switches are that hard to locate and use, that house has got a serious design problem." -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Jun 8, 2020 at 12:09 PM Robert Haas <robertmhaas@gmail.com> wrote: > I think the big overhead is that you log the old version of each row's > primary key (or whatever the replica identity is) when performing an > UPDATE or DELETE. So if you test it with integer keys probably it's > not bad, and I suspect (though I haven't looked) that we don't do the > extra logging when they key hasn't changed. But if you have wide text > columns as keys and you update them a lot then things might not look > so good. I think in the bad cases for this feature the overhead is > vastly more than going from minimal to replica. > > As many people here probably know, I am in general skeptical of this > kind of change. It's based on the premise that reconfiguring the > system is either too hard for users to figure out, or too disruptive > because they'll need a restart. I completely agree with your analysis, and your conclusions. -- Peter Geoghegan
On 2020-Jun-08, Robert Haas wrote: > On Mon, Jun 8, 2020 at 1:16 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > I think it's reasonable to push our default limits for slots, > > walsenders, max_bgworkers etc a lot higher than current value (say 10 -> > > 100). An unused slot wastes essentially no resources; an unused > > walsender is just one PGPROC entry. If we did that, and also allowed > > wal_level to be changed on the fly, we wouldn't need to restart in order > > to enable logical replication, so there would be little or no pressure > > to change the wal_level default. > > Wouldn't having a whole bunch of extra PGPROC entries have negative > implications for the performance of GetSnapshotData() and other things > that don't scale well at high connection counts? On a quantum-mechanics level, sure, but after Andres's snapshot scalability patches, will it be measurable? (Besides, if your workload is so high that you're measurably affected by the additional unused PGPROC entries, you can always tune it lower.) -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Jun 8, 2020 at 12:28 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > On a quantum-mechanics level, sure, but after Andres's snapshot > scalability patches, will it be measurable? (Besides, if your workload > is so high that you're measurably affected by the additional unused > PGPROC entries, you can always tune it lower.) The point that Robert went on to make about the increased WAL volume from logging old primary key (or replica identity) values was a stronger argument, IMV. -- Peter Geoghegan
Hi, On 2020-06-08 14:58:03 -0400, Robert Haas wrote: > On Mon, Jun 8, 2020 at 1:16 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > I think it's reasonable to push our default limits for slots, > > walsenders, max_bgworkers etc a lot higher than current value (say 10 -> > > 100). An unused slot wastes essentially no resources; an unused > > walsender is just one PGPROC entry. If we did that, and also allowed > > wal_level to be changed on the fly, we wouldn't need to restart in order > > to enable logical replication, so there would be little or no pressure > > to change the wal_level default. > > Wouldn't having a whole bunch of extra PGPROC entries have negative > implications for the performance of GetSnapshotData() and other things > that don't scale well at high connection counts? Some things yes, but I don't think it should have a significant effect on GetSnapshotData(): We currently don't touch unused PGPROCs for it (by virtue of procArray->pgprocnos), and we wouldn't with my proposed / pending changes (because the relevant arrays contain data for connected backends at the "front"). You can have some effect if you have temporary spikes to very high connection counts, and then reduce that again. That can lead to a lot of unused PGXACT entries being interleaved with used ones, leading to higher cache miss ratios (data cache as well as tlb). But that cannot become a problem without those PGPROCs ever being used, because IIRC we otherwise ensure they're used "densely". There are a few places where we actually look over all PGPROCs, or size resources according to that, but I think most of those shouldn't be in hot paths. There are also effects like the lock hashtables being sized larger, which then can reduce the cache hit ratio. I guess we could check whether it'd make sense to charge less than max_locks_per_transaction for everything but user processes, but I'm a bit doubtful it's worth it. Greetings, Andres Freund
Hi, On 2020-06-08 13:27:50 -0400, Tom Lane wrote: > If we can allow wal_level to be changed on the fly, I agree that would > help reduce the pressure to make the default setting more expensive. > I don't recall why it's PGC_POSTMASTER right now, but I suppose there > was a reason for that ... There's reasons, but IIRC they're all solvable with reasonable effort. I think most of it boils down to only being able to rely on the new wal_level after a while. For minimal->recovery we basically need a checkpoint started after the change in configuration, and for recovery->logical we need to wait until all sessions have a) read the new config setting b) finished the transaction that used the old setting. Greetings, Andres Freund
On 2020-06-08 23:32, Andres Freund wrote: > On 2020-06-08 13:27:50 -0400, Tom Lane wrote: >> If we can allow wal_level to be changed on the fly, I agree that would >> help reduce the pressure to make the default setting more expensive. >> I don't recall why it's PGC_POSTMASTER right now, but I suppose there >> was a reason for that ... > > There's reasons, but IIRC they're all solvable with reasonable effort. I > think most of it boils down to only being able to rely on the new > wal_level after a while. For minimal->recovery we basically need a > checkpoint started after the change in configuration, and for > recovery->logical we need to wait until all sessions have a) read the > new config setting b) finished the transaction that used the old > setting. The best behavior from a user's perspective would be if the WAL level automatically switched to logical if logical replication slots are present. You might not even need 'logical' as an actual value of wal_level anymore, you just need to keep a flag in shared memory that records whether at least one logical slot exists. -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
At Tue, 9 Jun 2020 08:52:24 +0200, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote in > On 2020-06-08 23:32, Andres Freund wrote: > > On 2020-06-08 13:27:50 -0400, Tom Lane wrote: > >> If we can allow wal_level to be changed on the fly, I agree that would > >> help reduce the pressure to make the default setting more expensive. > >> I don't recall why it's PGC_POSTMASTER right now, but I suppose there > >> was a reason for that ... > > There's reasons, but IIRC they're all solvable with reasonable > > effort. I > > think most of it boils down to only being able to rely on the new > > wal_level after a while. For minimal->recovery we basically need a > > checkpoint started after the change in configuration, and for > > recovery->logical we need to wait until all sessions have a) read the > > new config setting b) finished the transaction that used the old > > setting. > > The best behavior from a user's perspective would be if the WAL level > automatically switched to logical if logical replication slots are > present. You might not even need 'logical' as an actual value of > wal_level anymore, you just need to keep a flag in shared memory that > records whether at least one logical slot exists. Currently logical slots cannot be created while wal_level < logical. Thus a database that has a logical slot must have been once executed with wal_level >= logical before the creation of the slot. It seems to me setting wal_level = logical would be better than creating a dummy logical slot while initdb.. Coulnd't we add an option to speicfy wal_level for initdb? Or an option to append arbitrary config lines to postgresql.conf? regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Tue, Jun 9, 2020 at 10:53 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
At Tue, 9 Jun 2020 08:52:24 +0200, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote in
> On 2020-06-08 23:32, Andres Freund wrote:
> > On 2020-06-08 13:27:50 -0400, Tom Lane wrote:
> >> If we can allow wal_level to be changed on the fly, I agree that would
> >> help reduce the pressure to make the default setting more expensive.
> >> I don't recall why it's PGC_POSTMASTER right now, but I suppose there
> >> was a reason for that ...
> > There's reasons, but IIRC they're all solvable with reasonable
> > effort. I
> > think most of it boils down to only being able to rely on the new
> > wal_level after a while. For minimal->recovery we basically need a
> > checkpoint started after the change in configuration, and for
> > recovery->logical we need to wait until all sessions have a) read the
> > new config setting b) finished the transaction that used the old
> > setting.
>
> The best behavior from a user's perspective would be if the WAL level
> automatically switched to logical if logical replication slots are
> present. You might not even need 'logical' as an actual value of
> wal_level anymore, you just need to keep a flag in shared memory that
> records whether at least one logical slot exists.
Currently logical slots cannot be created while wal_level <
logical. Thus a database that has a logical slot must have been once
executed with wal_level >= logical before the creation of the slot.
It seems to me setting wal_level = logical would be better than
creating a dummy logical slot while initdb..
I don't think Peter is suggesting a dummy slot. What he's suggesting is allow the creation of a logical slot even when wal_level is not set to logical, and instead automatically raise the wal level to the equivalent of logical when you do. That way, the operation becomes transparent to the user.
Coulnd't we add an option to speicfy wal_level for initdb? Or an
option to append arbitrary config lines to postgresql.conf?
That wouldn't solve the problem David raised initially. The whole reason for being able to do this is that you *didn't*k now when you did initdb that you were going to need logical replication at a later stage. So you are already in front of a running cluster with wal_level=replica, and now the cost of turning it to logical includes restarting the db and kicking all sessions out. If you know when you're setting the system up that you're going to need it, then the work of setting it isn't that big.
It might be useful to have a functionality to append arbitrary config lines in initdb, but then that's not all that different from just copying in a pre-made postgresql.auto.conf file into the newly initialized cluster after it's done -- I'm not sure it buys very much.
On Tue, Jun 9, 2020 at 2:31 PM Magnus Hagander <magnus@hagander.net> wrote: > > On Tue, Jun 9, 2020 at 10:53 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: >> >> At Tue, 9 Jun 2020 08:52:24 +0200, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote in >> > On 2020-06-08 23:32, Andres Freund wrote: >> > > On 2020-06-08 13:27:50 -0400, Tom Lane wrote: >> > >> If we can allow wal_level to be changed on the fly, I agree that would >> > >> help reduce the pressure to make the default setting more expensive. >> > >> I don't recall why it's PGC_POSTMASTER right now, but I suppose there >> > >> was a reason for that ... >> > > There's reasons, but IIRC they're all solvable with reasonable >> > > effort. I >> > > think most of it boils down to only being able to rely on the new >> > > wal_level after a while. For minimal->recovery we basically need a >> > > checkpoint started after the change in configuration, and for >> > > recovery->logical we need to wait until all sessions have a) read the >> > > new config setting b) finished the transaction that used the old >> > > setting. >> > >> > The best behavior from a user's perspective would be if the WAL level >> > automatically switched to logical if logical replication slots are >> > present. You might not even need 'logical' as an actual value of >> > wal_level anymore, you just need to keep a flag in shared memory that >> > records whether at least one logical slot exists. >> >> Currently logical slots cannot be created while wal_level < >> logical. Thus a database that has a logical slot must have been once >> executed with wal_level >= logical before the creation of the slot. >> I think the creation of slot would take a lot more time in that case as it needs to wait for existing transactions to finish which I feel could be confusing to users. Sure, the cost would have to be incurred the first time but still the user might tempt to cancel such an operation if he is not aware of the internals. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Tue, Jun 9, 2020 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
On Tue, Jun 9, 2020 at 2:31 PM Magnus Hagander <magnus@hagander.net> wrote:
>
> On Tue, Jun 9, 2020 at 10:53 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>>
>> At Tue, 9 Jun 2020 08:52:24 +0200, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote in
>> > On 2020-06-08 23:32, Andres Freund wrote:
>> > > On 2020-06-08 13:27:50 -0400, Tom Lane wrote:
>> > >> If we can allow wal_level to be changed on the fly, I agree that would
>> > >> help reduce the pressure to make the default setting more expensive.
>> > >> I don't recall why it's PGC_POSTMASTER right now, but I suppose there
>> > >> was a reason for that ...
>> > > There's reasons, but IIRC they're all solvable with reasonable
>> > > effort. I
>> > > think most of it boils down to only being able to rely on the new
>> > > wal_level after a while. For minimal->recovery we basically need a
>> > > checkpoint started after the change in configuration, and for
>> > > recovery->logical we need to wait until all sessions have a) read the
>> > > new config setting b) finished the transaction that used the old
>> > > setting.
>> >
>> > The best behavior from a user's perspective would be if the WAL level
>> > automatically switched to logical if logical replication slots are
>> > present. You might not even need 'logical' as an actual value of
>> > wal_level anymore, you just need to keep a flag in shared memory that
>> > records whether at least one logical slot exists.
>>
>> Currently logical slots cannot be created while wal_level <
>> logical. Thus a database that has a logical slot must have been once
>> executed with wal_level >= logical before the creation of the slot.
>>
I think the creation of slot would take a lot more time in that case
as it needs to wait for existing transactions to finish which I feel
could be confusing to users. Sure, the cost would have to be incurred
the first time but still the user might tempt to cancel such an
operation if he is not aware of the internals.
Yeah, I am unsure if this is doable, but I think that's what Peter was trying to explain, because that's what would be most user-friendly. But it may definitely not be worth the complexity, I'm guessing.
Being able to change wal_level on reload instead of restart would be less user friendly than that, but more user friendly than what we have now.
On Tue, Jun 9, 2020 at 4:58 PM Magnus Hagander <magnus@hagander.net> wrote: > > On Tue, Jun 9, 2020 at 1:20 PM Amit Kapila <amit.kapila16@gmail.com> wrote: >> >> On Tue, Jun 9, 2020 at 2:31 PM Magnus Hagander <magnus@hagander.net> wrote: >> > >> > On Tue, Jun 9, 2020 at 10:53 AM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: >> >> >> >> At Tue, 9 Jun 2020 08:52:24 +0200, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote in >> >> > On 2020-06-08 23:32, Andres Freund wrote: >> >> > > On 2020-06-08 13:27:50 -0400, Tom Lane wrote: >> >> > >> If we can allow wal_level to be changed on the fly, I agree that would >> >> > >> help reduce the pressure to make the default setting more expensive. >> >> > >> I don't recall why it's PGC_POSTMASTER right now, but I suppose there >> >> > >> was a reason for that ... >> >> > > There's reasons, but IIRC they're all solvable with reasonable >> >> > > effort. I >> >> > > think most of it boils down to only being able to rely on the new >> >> > > wal_level after a while. For minimal->recovery we basically need a >> >> > > checkpoint started after the change in configuration, and for >> >> > > recovery->logical we need to wait until all sessions have a) read the >> >> > > new config setting b) finished the transaction that used the old >> >> > > setting. >> >> > >> >> > The best behavior from a user's perspective would be if the WAL level >> >> > automatically switched to logical if logical replication slots are >> >> > present. You might not even need 'logical' as an actual value of >> >> > wal_level anymore, you just need to keep a flag in shared memory that >> >> > records whether at least one logical slot exists. >> >> >> >> Currently logical slots cannot be created while wal_level < >> >> logical. Thus a database that has a logical slot must have been once >> >> executed with wal_level >= logical before the creation of the slot. >> >> >> >> I think the creation of slot would take a lot more time in that case >> as it needs to wait for existing transactions to finish which I feel >> could be confusing to users. Sure, the cost would have to be incurred >> the first time but still the user might tempt to cancel such an >> operation if he is not aware of the internals. > > > Yeah, I am unsure if this is doable, but I think that's what Peter was trying to explain, because that's what would bemost user-friendly. But it may definitely not be worth the complexity, I'm guessing. > Also, I think we might need to think shall we allow wal_level to be changed back to replica? If so, how? -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Tue, Jun 9, 2020 at 3:02 AM Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2020-06-08 13:27:50 -0400, Tom Lane wrote: > > If we can allow wal_level to be changed on the fly, I agree that would > > help reduce the pressure to make the default setting more expensive. > > I don't recall why it's PGC_POSTMASTER right now, but I suppose there > > was a reason for that ... > > There's reasons, but IIRC they're all solvable with reasonable effort. I > think most of it boils down to only being able to rely on the new > wal_level after a while. For minimal->recovery we basically need a > checkpoint started after the change in configuration, and for > recovery->logical we need to wait until all sessions have a) read the > new config setting b) finished the transaction that used the old > setting. > What if we note down the highest transaction id when we set wal_level = logical and won't allow a snapshot in logical decoding to reach a consistent state till we see at least that xid as committed? I think this will mean that it won't allow to decode any transaction which is operated with wal_level < logical and that might serve the purpose. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
On Tue, Jun 9, 2020 at 6:57 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > > On Tue, Jun 9, 2020 at 3:02 AM Andres Freund <andres@anarazel.de> wrote: > > > > Hi, > > > > On 2020-06-08 13:27:50 -0400, Tom Lane wrote: > > > If we can allow wal_level to be changed on the fly, I agree that would > > > help reduce the pressure to make the default setting more expensive. > > > I don't recall why it's PGC_POSTMASTER right now, but I suppose there > > > was a reason for that ... > > > > There's reasons, but IIRC they're all solvable with reasonable effort. I > > think most of it boils down to only being able to rely on the new > > wal_level after a while. For minimal->recovery we basically need a > > checkpoint started after the change in configuration, and for > > recovery->logical we need to wait until all sessions have a) read the > > new config setting b) finished the transaction that used the old > > setting. > > > > What if we note down the highest transaction id when we set wal_level > = logical and won't allow a snapshot in logical decoding to reach a > consistent state till we see at least that xid as committed? I think > this will mean that it won't allow to decode any transaction which is > operated with wal_level < logical and that might serve the purpose. > I intend to say that if the above is possible then we don't need to wait for (b). -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Hi, On 2020-06-09 08:52:24 +0200, Peter Eisentraut wrote: > On 2020-06-08 23:32, Andres Freund wrote: > > On 2020-06-08 13:27:50 -0400, Tom Lane wrote: > > > If we can allow wal_level to be changed on the fly, I agree that would > > > help reduce the pressure to make the default setting more expensive. > > > I don't recall why it's PGC_POSTMASTER right now, but I suppose there > > > was a reason for that ... > > > > There's reasons, but IIRC they're all solvable with reasonable effort. I > > think most of it boils down to only being able to rely on the new > > wal_level after a while. For minimal->recovery we basically need a > > checkpoint started after the change in configuration, and for > > recovery->logical we need to wait until all sessions have a) read the > > new config setting b) finished the transaction that used the old > > setting. > > The best behavior from a user's perspective would be if the WAL level > automatically switched to logical if logical replication slots are present. > You might not even need 'logical' as an actual value of wal_level anymore, > you just need to keep a flag in shared memory that records whether at least > one logical slot exists. Yea, it'd be good to have that. But you'd need the same type of coordination mentioned above, no? Greetings, Andres Freund