Re: pg_basebackup --create-slot-if-not-exists? - Mailing list pgsql-hackers

From Ashwin Agrawal
Subject Re: pg_basebackup --create-slot-if-not-exists?
Date
Msg-id CAKSySwcqnaDwDzHxyNSELTy1FzQe72k4bs33B678TPh55H_e1g@mail.gmail.com
In response to Re: pg_basebackup --create-slot-if-not-exists?  ("David G. Johnston" <david.g.johnston@gmail.com>)
List pgsql-hackers
On Wed, Sep 21, 2022 at 5:34 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Wednesday, September 21, 2022, Ashwin Agrawal <ashwinstar@gmail.com> wrote:
Currently, pg_basebackup has
--create-slot to create the slot, which fails if the slot already exists, or
--slot to use an existing slot

This means the caller needs to know whether a slot with the given name already exists before invoking the command. If pg_basebackup --create-slot <> fails for some reason after creating the slot, re-running the same command fails with an ERROR saying the slot already exists. One then has to either drop the slot and retry, or add a check before retrying that tests whether the slot exists and, based on the result, omits the --create-slot option. This is inconvenient when automating on top of pg_basebackup, since checking for the slot before invoking pg_basebackup means issuing separate SQL commands. It would be really helpful for such scenarios if, similar to CREATE TABLE, pg_basebackup had IF NOT EXISTS semantics. (The limitation most likely comes from the CREATE_REPLICATION_SLOT protocol command itself.) Thoughts?
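To make the workaround described above concrete, here is a minimal Python sketch of the "check first, then decide whether to pass --create-slot" logic. The helper names (slot_exists, build_basebackup_args) and the run_sql callable are hypothetical and only for illustration; run_sql is assumed to execute one SQL statement on the primary (for example through psql -At) and return its output as text. Only the pg_replication_slots view and the existing pg_basebackup options are relied on.

```python
import re
from typing import Callable, List

# Slot names may only contain lower-case letters, digits and underscores.
_VALID_SLOT = re.compile(r"[a-z0-9_]+")


def slot_exists(slot_name: str, run_sql: Callable[[str], str]) -> bool:
    """Return True if a replication slot with this name exists on the primary.

    run_sql is a hypothetical callable (an assumption of this sketch) that
    runs a single SQL statement and returns the scalar result as text.
    """
    if not _VALID_SLOT.fullmatch(slot_name):
        raise ValueError(f"invalid replication slot name: {slot_name!r}")
    # The name was validated above, so plain interpolation is safe here.
    sql = (
        "SELECT count(*) FROM pg_replication_slots "
        f"WHERE slot_name = '{slot_name}'"
    )
    return run_sql(sql).strip() != "0"


def build_basebackup_args(target_dir: str, slot_name: str, exists: bool) -> List[str]:
    """Build the pg_basebackup command line, adding --create-slot only if needed."""
    args = [
        "pg_basebackup",
        "--pgdata", target_dir,
        "--wal-method=stream",
        "--write-recovery-conf",
        "--slot", slot_name,
    ]
    if not exists:
        # Without this guard a retry fails because the slot already exists.
        args.append("--create-slot")
    return args
```

Note that this is exactly the extra round trip the proposal wants to avoid, and the check is still racy if something else creates or drops the slot between the query and the pg_basebackup invocation.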

What’s the use case for automating pg_basebackup with a named replication slot created by the pg_basebackup command?

Greenplum runs N (some hundreds of) PostgreSQL instances to form a sharded database cluster. Hence, automation and scripts are in place to create replicas, perform failover and failback, and so on for these N instances. As Michael said, specific named replication slots are used across all these instances for predictable management and monitoring of the slots. These named replication slots are used both for pg_basebackup and for the streaming replication that follows it.

Why can you not leverage a temporary replication slot (i.e., omit --slot)? ISTM the create option is basically obsolete now.

We would be more than happy to use a temporary replication slot if it provided the full functionality. It might be a gap in my understanding, but I believe a temporary replication slot only protects WAL from removal for the duration of pg_basebackup. It doesn't cover the window between pg_basebackup completing and streaming replication starting. With the --write-recovery-conf option, "primary_slot_name" only gets written to postgresql.auto.conf if a named replication slot is provided, which ensures the same slot is used for both pg_basebackup and streaming replication and hence keeps the WAL around until the streaming replica connects after pg_basebackup. How can this window be avoided with a temp slot?
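As an illustration of the point about --write-recovery-conf, automation could verify after a base backup that the replica is actually configured to keep using the named slot. The following is a rough Python sketch; the function name is hypothetical, and it assumes the setting appears in postgresql.auto.conf as a simple single-quoted line of the form primary_slot_name = 'name'.

```python
import re
from typing import Optional

# Matches an uncommented line such as:  primary_slot_name = 'replica_1'
_SETTING = re.compile(r"^\s*primary_slot_name\s*=\s*'([^']*)'")


def primary_slot_name(conf_text: str) -> Optional[str]:
    """Return the slot configured in the given postgresql.auto.conf text, if any.

    Hypothetical helper for this sketch: it does not handle escaped quotes
    or other less common configuration syntax.
    """
    found = None
    for line in conf_text.splitlines():
        match = _SETTING.match(line)
        if match:
            # If the setting is repeated, the last occurrence takes effect.
            found = match.group(1)
    # Treat an empty value the same as "no slot configured".
    return found or None
```

If this returns None for a backup taken with a temporary slot, nothing holds WAL on the primary between the end of pg_basebackup and the replica's first connection, which is the window in question.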

-- 
Ashwin Agrawal (VMware)
