Re: pgsql: walreceiver uses a temporary replication slot by default - Mailing list pgsql-committers

From Jehan-Guillaume de Rorthais
Subject Re: pgsql: walreceiver uses a temporary replication slot by default
Msg-id 20200211235326.412e3fe2@firost
In response to Re: pgsql: walreceiver uses a temporary replication slot by default  (Peter Eisentraut <peter.eisentraut@2ndquadrant.com>)
Responses Re: pgsql: walreceiver uses a temporary replication slot by default
List pgsql-committers
Hello,

On Mon, 10 Feb 2020 16:37:53 +0100
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:

> On 2020-01-23 21:49, Robert Haas wrote:
> > On Tue, Jan 14, 2020 at 8:57 AM Peter Eisentraut <peter@eisentraut.org>
> > wrote:  
> >> walreceiver uses a temporary replication slot by default
> >>
> >> If no permanent replication slot is configured using
> >> primary_slot_name, the walreceiver now creates and uses a temporary
> >> replication slot.  A new setting wal_receiver_create_temp_slot can be
> >> used to disable this behavior, for example, if the remote instance is
> >> out of replication slots.
> >>
> >> Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com>
> >> Discussion:
> >> https://www.postgresql.org/message-id/CA%2Bfd4k4dM0iEPLxyVyme2RAFsn8SUgrNtBJOu81YqTY4V%2BnqZA%40mail.gmail.com  
> > 
> > Neither the commit message for this patch nor any of the comments in
> > the patch seem to explain why this is a desirable change.
> > 
> > I assume that's probably discussed on the thread that is linked here,
> > but you shouldn't have to dig through the discussion thread to figure
> > out what the benefits of a change like this are.  
> 
> You are right, this has gotten a bit lost in the big thread.
> 
> The rationale is basically the same as why client-side tools like 
> pg_basebackup use a temporary slot: So that the WAL data that they are 
> interested in doesn't disappear while they are connected.

In my humble opinion, the previous behavior, streaming without a temporary
slot, is preferable for one reason: primary availability.

Previously, should the standby lag far behind the primary (no matter the root
cause), it was simply disconnected because of missing WAL. Worst case scenario,
we must rebuild it, hopefully from backups. Best case scenario, it fetches WAL
from the PITR backup. As long as the latter is possible in the stack, I consider
slots a burden from the operability point of view. Only if standbys cannot
fetch archived WAL from PITR should we consider slots.

With a temporary slot created by default, a single standby lagging far behind
can make the primary unavailable. We have nothing yet to prevent a slot from
filling the pg_wal partition. How would new users creating their first cluster
react in such a situation? I suppose the original discussion was mostly
targeting them? Recovering from this is far scarier than rebuilding a standby.

So the default behavior might not be desirable, and maybe
wal_receiver_create_temp_slot should be off by default?
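
For illustration, a minimal sketch of what opting out could look like on a
standby, assuming the setting keeps the name given in the commit message and
is set in the standby's postgresql.conf (the file placement is an assumption,
not something stated in this thread):

```
# postgresql.conf on the standby (illustrative sketch, not a tested setup)

# No permanent replication slot is configured.
primary_slot_name = ''

# Do not ask the primary for a temporary slot either; stream without
# any slot, as before the commit discussed here.
wal_receiver_create_temp_slot = off
```

With both settings as above, a standby that lags too far is disconnected for
missing WAL instead of forcing the primary to retain it.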

Note that Kyotaro HORIGUCHI is working on a patch to restrict the maximum
number of WAL segments kept by replication slots:

https://www.postgresql.org/message-id/flat/20190627162256.4f4872b8%40firost#6cba1177f766e7ffa5237789e748da38

Regards,


