Re: pg_basebackup behavior on non-existent slot - Mailing list pgsql-bugs

From Masahiko Sawada
Subject Re: pg_basebackup behavior on non-existent slot
Date
Msg-id CAD21AoAYD5GPUmTYQGNJ+nOEs8zZGVN8UUnjCb1UZW8k81byjA@mail.gmail.com
In response to Re: pg_basebackup behavior on non-existent slot  (Michael Paquier <michael@paquier.xyz>)
List pgsql-bugs
On Mon, Aug 23, 2021 at 4:11 PM Michael Paquier <michael@paquier.xyz> wrote:
>
> On Mon, Aug 23, 2021 at 12:54:58PM +0900, Masahiko Sawada wrote:
> > On Sat, Aug 21, 2021 at 10:15 PM Gerard H. Pille <ghpille@hotmail.com> wrote:
> >> The old thread:
> >> https://postgrespro.com/list/thread-id/2337189#CAMkU=1wSxYBNFY9TzuVh3=mDLr4BBsMct6wcViNMH+-6Xon4Uw@mail.gmail.com
>
> A link to postgresql.org does the same work:
> https://www.postgresql.org/message-id/CAMkU=1wSxYBNFY9TzuVh3=mDLr4BBsMct6wcViNMH+-6Xon4Uw@mail.gmail.com
> No need to redirect that elsewhere.
>
> > It seems it's not fixed yet even in HEAD as far as I tested. There
> > were some ideas to fix that on that thread but the main point was how
> > to fix it on Windows. I guess that since it creates a transient slot
> > it's not a common case to specify a non-existent slot in pg_basebackup
> > but what is your use case? This might help motivate fixing this issue.
> >
> > BTW in that thread, there was a discussion on how to detect the
> > streamer process failure in the main process but probably we can fix
> > this by just doing an existence check for the specified name
> > replication slot before starting the streamer process?
>
> Yeah.  Honestly, I am not really excited to redesign this part of base
> backups just to take care of a side problem that is mitigated for most
> users with the use of a temporary slot.

I think that this problem can also happen when using a temporary slot.
If the slot loses WAL files due to max_slot_wal_keep_size, the streamer
process could fail even while streaming WAL. I guess that the main
process taking a base backup won't stop in this case. If that's true,
pre-checking the existence of the slot might not be enough.

> It would be simpler, as you
> say, to find a way to detect if the slot wanted exists before even
> launching START_REPLICATION with the specified slot on the second
> thread copying the WAL, but that would be a new thing as one cannot do
> catalog lookups with a physical replication session.

Good point.
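
For illustration only, the existence pre-check could be sketched from the
client side as below. This is not how pg_basebackup works today: it
assumes a separate regular (non-replication) connection is available,
since, as noted above, a physical replication session cannot do the
catalog lookup. The function name slot_exists and the cursor argument are
hypothetical; the cursor is assumed to be a Python DB-API cursor using
the %s placeholder style (as psycopg2 does). The check is also racy: the
slot can be dropped, or lose its WAL, after the check passes.

```python
def slot_exists(cur, slot_name):
    """Return True if a replication slot named slot_name is listed.

    cur is assumed to be a DB-API cursor on a regular (non-replication)
    connection; pg_replication_slots is the standard system view.
    """
    cur.execute(
        "SELECT 1 FROM pg_replication_slots WHERE slot_name = %s",
        (slot_name,),
    )
    # fetchone() returns None when no row matched.
    return cur.fetchone() is not None
```

A caller would run this before launching the WAL streamer and bail out
early if it returns False. It does nothing for the case where the
streamer fails later, while WAL is already being streamed.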

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/


