Thread: pgsql: walreceiver uses a temporary replication slot by default
walreceiver uses a temporary replication slot by default

If no permanent replication slot is configured using
primary_slot_name, the walreceiver now creates and uses a temporary
replication slot. A new setting wal_receiver_create_temp_slot can be
used to disable this behavior, for example, if the remote instance is
out of replication slots.

Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
Discussion: https://www.postgresql.org/message-id/CA%2Bfd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V%2BnqZA%40mail.gmail.com

Branch
------
master

Details
-------
https://git.postgresql.org/pg/commitdiff/329730827848f61eb8d353d5addcbd885fa823da

Modified Files
--------------
doc/src/sgml/config.sgml                      | 20 +++++++++++
.../libpqwalreceiver/libpqwalreceiver.c       |  4 +++
src/backend/replication/walreceiver.c         | 41 ++++++++++++++++++++++
src/backend/utils/misc/guc.c                  |  9 +++++
src/backend/utils/misc/postgresql.conf.sample |  1 +
src/include/replication/walreceiver.h         |  7 ++++
6 files changed, 82 insertions(+)
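The slot selection described in the commit message can be sketched as a small decision function. This is only a model of the behavior as the message states it, not the C code from walreceiver.c; the function name and the returned labels are hypothetical, and only the two setting names come from the commit.

```python
from typing import Optional


def slot_for_walreceiver(primary_slot_name: Optional[str],
                         wal_receiver_create_temp_slot: bool = True) -> str:
    """Return which kind of slot the walreceiver would stream through.

    Hypothetical helper that mirrors the commit message only:
    a configured permanent slot always wins, otherwise a temporary
    slot is created unless the new setting turns that off.
    """
    if primary_slot_name:
        return "permanent"
    if wal_receiver_create_temp_slot:
        return "temporary"
    return "none"


for name, flag in [("standby1", True), ("", True), ("", False)]:
    print(repr(name), flag, "->", slot_for_walreceiver(name, flag))
```

The third case corresponds to the pre-commit behavior: streaming with no slot at all.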
On Tue, Jan 14, 2020 at 8:57 AM Peter Eisentraut <peter@eisentraut.org> wrote:
> walreceiver uses a temporary replication slot by default
>
> If no permanent replication slot is configured using
> primary_slot_name, the walreceiver now creates and uses a temporary
> replication slot. A new setting wal_receiver_create_temp_slot can be
> used to disable this behavior, for example, if the remote instance is
> out of replication slots.
>
> Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
> Discussion: https://www.postgresql.org/message-id/CA%2Bfd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V%2BnqZA%40mail.gmail.com

Neither the commit message for this patch nor any of the comments in
the patch seem to explain why this is a desirable change.

I assume that's probably discussed on the thread that is linked here,
but you shouldn't have to dig through the discussion thread to figure
out what the benefits of a change like this are.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2020-01-23 21:49, Robert Haas wrote:
> On Tue, Jan 14, 2020 at 8:57 AM Peter Eisentraut <peter@eisentraut.org> wrote:
>> walreceiver uses a temporary replication slot by default
>>
>> If no permanent replication slot is configured using
>> primary_slot_name, the walreceiver now creates and uses a temporary
>> replication slot. A new setting wal_receiver_create_temp_slot can be
>> used to disable this behavior, for example, if the remote instance is
>> out of replication slots.
>>
>> Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
>> Discussion: https://www.postgresql.org/message-id/CA%2Bfd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V%2BnqZA%40mail.gmail.com
>
> Neither the commit message for this patch nor any of the comments in
> the patch seem to explain why this is a desirable change.
>
> I assume that's probably discussed on the thread that is linked here,
> but you shouldn't have to dig through the discussion thread to figure
> out what the benefits of a change like this are.

You are right, this has gotten a bit lost in the big thread.

The rationale is basically the same as why client-side tools like
pg_basebackup use a temporary slot: So that the WAL data that they are
interested in doesn't disappear while they are connected.

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From: Jehan-Guillaume de Rorthais
Hello,

On Mon, 10 Feb 2020 16:37:53 +0100
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

> On 2020-01-23 21:49, Robert Haas wrote:
> > On Tue, Jan 14, 2020 at 8:57 AM Peter Eisentraut <peter@eisentraut.org>
> > wrote:
> >> walreceiver uses a temporary replication slot by default
> >>
> >> If no permanent replication slot is configured using
> >> primary_slot_name, the walreceiver now creates and uses a temporary
> >> replication slot. A new setting wal_receiver_create_temp_slot can be
> >> used to disable this behavior, for example, if the remote instance is
> >> out of replication slots.
> >>
> >> Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
> >> Discussion:
> >> https://www.postgresql.org/message-id/CA%2Bfd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V%2BnqZA%40mail.gmail.com
> >
> > Neither the commit message for this patch nor any of the comments in
> > the patch seem to explain why this is a desirable change.
> >
> > I assume that's probably discussed on the thread that is linked here,
> > but you shouldn't have to dig through the discussion thread to figure
> > out what the benefits of a change like this are.
>
> You are right, this has gotten a bit lost in the big thread.
>
> The rationale is basically the same as why client-side tools like
> pg_basebackup use a temporary slot: So that the WAL data that they are
> interested in doesn't disappear while they are connected.

In my humble opinion, I prefer the previous behavior, streaming without a
temporary slot, for one reason: primary availability.

Previously, if the standby lagged far behind the primary (no matter the
root cause), it was disconnected because of missing WAL. Worst case
scenario, we must rebuild it, hopefully from backups. Best case scenario,
it fetches WAL from the PITR backup. As soon as the latter is possible in
the stack, I consider slots a burden from the operability point of view.
If standbys cannot fetch archived WAL from PITR, then we can consider
slots.

With a temp slot created by default, if one standby lags far behind, it
can make the primary unavailable. We have nothing yet to prevent a slot
from filling the pg_wal partition. How would new users creating their
first cluster react in such a situation? I suppose the original
discussion was mostly targeting them? Recovering from this is way more
scary than building a standby.

So the default behavior might not be desirable, and maybe
wal_receiver_create_temp_slot should be off by default?

Note that Kyotaro HORIGUCHI is working on a patch to restrict the maximum
number of WAL segments kept by replication slots:

https://www.postgresql.org/message-id/flat/20190627162256.4f4872b8%40firost#6cba1177f766e7ffa5237789e748da38

Regards,
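The availability concern raised above can be put in rough numbers. What follows is a deliberately simplified model, not PostgreSQL's actual WAL recycling logic: `baseline_bytes` is a hypothetical stand-in for whatever the primary keeps without slots, LSNs are treated as plain byte offsets, and every figure in the example is made up for illustration.

```python
from typing import Sequence

MiB = 1024 ** 2
GiB = 1024 ** 3


def retained_wal_bytes(current_lsn: int,
                       slot_restart_lsns: Sequence[int],
                       baseline_bytes: int) -> int:
    """WAL the primary must keep, in bytes, under a simplified model.

    Without slots only `baseline_bytes` is kept. With slots, everything
    since the oldest slot position is kept too, however far behind
    that slot is.
    """
    if not slot_restart_lsns:
        return baseline_bytes
    oldest = min(slot_restart_lsns)
    return max(baseline_bytes, current_lsn - oldest)


def seconds_until_full(partition_bytes: int,
                       retained_bytes: int,
                       wal_bytes_per_sec: float) -> float:
    """Time left before the pg_wal partition fills if the slot never advances."""
    free = max(0, partition_bytes - retained_bytes)
    return free / wal_bytes_per_sec


# Illustrative numbers: a stalled standby holds its slot 1 GiB behind,
# pg_wal sits on a 10 GiB partition, the primary writes 5 MiB of WAL/s.
retained = retained_wal_bytes(current_lsn=3 * GiB,
                              slot_restart_lsns=[2 * GiB],
                              baseline_bytes=64 * MiB)
remaining = seconds_until_full(10 * GiB, retained, 5 * MiB)
print(f"retained={retained / GiB:.1f} GiB, full in ~{remaining / 60:.0f} min")
```

With these assumed figures the primary has about half an hour before pg_wal is full, whereas without a slot the standby would simply have been disconnected once its WAL was recycled.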
On 2020/02/12 7:53, Jehan-Guillaume de Rorthais wrote:
> Hello,
>
> On Mon, 10 Feb 2020 16:37:53 +0100
> Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
>
>> On 2020-01-23 21:49, Robert Haas wrote:
>>> On Tue, Jan 14, 2020 at 8:57 AM Peter Eisentraut <peter@eisentraut.org>
>>> wrote:
>>>> walreceiver uses a temporary replication slot by default
>>>>
>>>> If no permanent replication slot is configured using
>>>> primary_slot_name, the walreceiver now creates and uses a temporary
>>>> replication slot. A new setting wal_receiver_create_temp_slot can be
>>>> used to disable this behavior, for example, if the remote instance is
>>>> out of replication slots.
>>>>
>>>> Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
>>>> Discussion:
>>>> https://www.postgresql.org/message-id/CA%2Bfd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V%2BnqZA%40mail.gmail.com
>>>
>>> Neither the commit message for this patch nor any of the comments in
>>> the patch seem to explain why this is a desirable change.
>>>
>>> I assume that's probably discussed on the thread that is linked here,
>>> but you shouldn't have to dig through the discussion thread to figure
>>> out what the benefits of a change like this are.
>>
>> You are right, this has gotten a bit lost in the big thread.
>>
>> The rationale is basically the same as why client-side tools like
>> pg_basebackup use a temporary slot: So that the WAL data that they are
>> interested in doesn't disappear while they are connected.
>
> In my humble opinion, I prefer the previous behavior, streaming without a
> temporary slot, for one reason: primary availability.

+1

> Previously, if the standby lagged far behind the primary (no matter the
> root cause), it was disconnected because of missing WAL. Worst case
> scenario, we must rebuild it, hopefully from backups. Best case scenario,
> it fetches WAL from the PITR backup. As soon as the latter is possible in
> the stack, I consider slots a burden from the operability point of view.
> If standbys cannot fetch archived WAL from PITR, then we can consider
> slots.
>
> With a temp slot created by default, if one standby lags far behind, it
> can make the primary unavailable. We have nothing yet to prevent a slot
> from filling the pg_wal partition. How would new users creating their
> first cluster react in such a situation? I suppose the original
> discussion was mostly targeting them? Recovering from this is way more
> scary than building a standby.
>
> So the default behavior might not be desirable, and maybe
> wal_receiver_create_temp_slot should be off by default?
>
> Note that Kyotaro HORIGUCHI is working on a patch to restrict the maximum
> number of WAL segments kept by replication slots:
>
> https://www.postgresql.org/message-id/flat/20190627162256.4f4872b8%40firost#6cba1177f766e7ffa5237789e748da38

Yeah, I think it's better to disable this option until something like
Horiguchi-san's proposal has been committed, i.e., until the upper limit
on the number (or size) of WAL files that remain for slots becomes
configurable.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
On Wed, Feb 12, 2020 at 06:11:06PM +0900, Fujii Masao wrote:
> On 2020/02/12 7:53, Jehan-Guillaume de Rorthais wrote:
>> In my humble opinion, I prefer the previous behavior, streaming without a
>> temporary slot, for one reason: primary availability.
>
> +1
>
>> With a temp slot created by default, if one standby lags far behind, it
>> can make the primary unavailable. We have nothing yet to prevent a slot
>> from filling the pg_wal partition. How would new users creating their
>> first cluster react in such a situation? I suppose the original
>> discussion was mostly targeting them? Recovering from this is way more
>> scary than building a standby.
>>
>> So the default behavior might not be desirable, and maybe
>> wal_receiver_create_temp_slot should be off by default?
>>
>> Note that Kyotaro HORIGUCHI is working on a patch to restrict the maximum
>> number of WAL segments kept by replication slots:
>>
>> https://www.postgresql.org/message-id/flat/20190627162256.4f4872b8%40firost#6cba1177f766e7ffa5237789e748da38
>
> Yeah, I think it's better to disable this option until something like
> Horiguchi-san's proposal has been committed, i.e., until the upper limit
> on the number (or size) of WAL files that remain for slots becomes
> configurable.

Even with that, are we sure this extra feature would be a sufficient
reason to change the default value of this option to be enabled?

I am not sure about that either. My opinion is that this option is
useful to have and that it is not really a problem if you have slot
monitoring on the primary (or a standby for cascading). And I'd like
to believe that it is a common practice lately for base backups,
archivers based on pg_receivewal or even logical decoding, but it
could be surprising for some users who do not do that yet. So
Jehan-Guillaume's arguments also sound sensible to me (he also
maintains an automatic failover solution called PAF).

From what I can see nobody really likes the current state of things
for this option, and that does not come down only to its default
value. The default GUC value and the way the parameter is loaded by
the WAL sender are problematic, but still easy enough to fix. How do we
move on from here? I could post a patch based on what Sergei Kornilov
has sent around [1], but that's Peter's feature. Any opinions?

[1]: https://www.postgresql.org/message-id/20200122055510.GH174860@paquier.xyz
--
Michael
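The slot monitoring mentioned above could look roughly like the following check. It is a sketch under assumptions: the helper names and the threshold are hypothetical, and the inputs are assumed to have been read beforehand from the primary, for instance `slot_name` and `restart_lsn` from the `pg_replication_slots` view and the current position from `pg_current_wal_lsn()`. No database connection is made here.

```python
from typing import Dict, List, Optional


def lsn_to_int(lsn: str) -> int:
    """Convert a textual LSN such as '16/B374D848' to a byte position.

    The text form is two hexadecimal numbers: the high and the low
    32 bits of the 64-bit WAL position.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def lagging_slots(current_lsn: str,
                  slots: Dict[str, Optional[str]],
                  max_retained_bytes: int) -> List[str]:
    """Names of slots that hold back more WAL than the given threshold."""
    now = lsn_to_int(current_lsn)
    flagged = []
    for name, restart_lsn in slots.items():
        if restart_lsn is None:
            continue  # slot has not reserved any WAL yet
        if now - lsn_to_int(restart_lsn) > max_retained_bytes:
            flagged.append(name)
    return sorted(flagged)


# Hypothetical snapshot: slot name -> restart_lsn as text.
snapshot = {"standby_a": "0/0", "standby_b": "0/FFFFFF00", "unused": None}
print(lagging_slots("1/0", snapshot, max_retained_bytes=1024))
```

An alert on a non-empty result is one way to notice a stalled temporary slot before pg_wal fills, which is the situation the default setting would otherwise leave to chance.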
At Thu, 13 Feb 2020 16:48:21 +0900, Michael Paquier <michael@paquier.xyz> wrote in
> On Wed, Feb 12, 2020 at 06:11:06PM +0900, Fujii Masao wrote:
> > On 2020/02/12 7:53, Jehan-Guillaume de Rorthais wrote:
> >> In my humble opinion, I prefer the previous behavior, streaming without a
> >> temporary slot, for one reason: primary availability.
> >
> > +1
> >
> >> With a temp slot created by default, if one standby lags far behind, it
> >> can make the primary unavailable. We have nothing yet to prevent a slot
> >> from filling the pg_wal partition. How would new users creating their
> >> first cluster react in such a situation? I suppose the original
> >> discussion was mostly targeting them? Recovering from this is way more
> >> scary than building a standby.
> >>
> >> So the default behavior might not be desirable, and maybe
> >> wal_receiver_create_temp_slot should be off by default?
> >>
> >> Note that Kyotaro HORIGUCHI is working on a patch to restrict the maximum
> >> number of WAL segments kept by replication slots:
> >>
> >> https://www.postgresql.org/message-id/flat/20190627162256.4f4872b8%40firost#6cba1177f766e7ffa5237789e748da38
> >
> > Yeah, I think it's better to disable this option until something like
> > Horiguchi-san's proposal has been committed, i.e., until the upper limit
> > on the number (or size) of WAL files that remain for slots becomes
> > configurable.
>
> Even with that, are we sure this extra feature would be a sufficient
> reason to change the default value of this option to be enabled?

I think the feature (slot limit) is not going to be a reason to enable
it (temp slot). In the first place, I think we cannot determine a default
value that is generally workable.

> I am not sure about that either. My opinion is that this option is
> useful to have and that it is not really a problem if you have slot
> monitoring on the primary (or a standby for cascading). And I'd like
> to believe that it is a common practice lately for base backups,
> archivers based on pg_receivewal or even logical decoding, but it
> could be surprising for some users who do not do that yet. So
> Jehan-Guillaume's arguments also sound sensible to me (he also
> maintains an automatic failover solution called PAF).
>
> From what I can see nobody really likes the current state of things
> for this option, and that does not come down only to its default
> value. The default GUC value and the way the parameter is loaded by
> the WAL sender are problematic, but still easy enough to fix. How do we
> move on from here? I could post a patch based on what Sergei Kornilov
> has sent around [1], but that's Peter's feature. Any opinions?
>
> [1]: https://www.postgresql.org/message-id/20200122055510.GH174860@paquier.xyz

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center