Re: pg_receivewal starting position - Mailing list pgsql-hackers

From Ronan Dunklau
Subject Re: pg_receivewal starting position
Date
Msg-id 4129596.LXfQO9unRY@aivenronan
Whole thread Raw
In response to Re: pg_receivewal starting position  (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>)
Responses Re: pg_receivewal starting position
List pgsql-hackers
Le jeudi 2 septembre 2021, 10:37:20 CEST Michael Paquier a écrit :
> On Thu, Sep 02, 2021 at 10:08:26AM +0200, Ronan Dunklau wrote:
> > I could see the use for sending active_pid for use within pg_basebackup:
> > at
> > least we could fail early if the slot is already in use. But at the same
> > time, maybe it won't be in use anymore once we need it.
>
> There is no real concurrent protection with this design.  You could
> read that the slot is not active during READ_REPLICATION_SLOT just to
> find out after in the process of pg_basebackup streaming WAL that it
> became in use in-between.  And the backend-side protections would kick
> in at this stage.
>
> Hmm.  The logic doing the decision-making with pg_receivewal may
> become more tricky when it comes to pg_replication_slots.wal_status,
> max_slot_wal_keep_size and pg_replication_slots.safe_wal_size.  The
> number of cases we'd like to consider impacts directly the amount of
> data send through READ_REPLICATION_SLOT.  That's not really different
> than deciding of a failure, a success or a retry with active_pid at an
> earlier or a later stage of a base backup.  pg_receivewal, on the
> contrary, can just rely on what the backend tells when it begins
> streaming.  So I'd prefer keeping things simple and limit the number
> of fields a minimum for this command.

Ok. Please find attached new patches rebased on master.*

0001 is yours without any modification.

0002 for pg_receivewal tried to simplify the logic of information to return,
by using a dedicated struct for this. This accounts for Bahrath's demands to
return every possible field.
In particular, an is_logical field simplifies the detection of the type of slot.
In my opinion it makes sense to simplify it like this on the client side while
being more open-minded on the server side if we ever need to provide a new
type of slot. Also, GetSlotInformation now returns an enum to be able to
handle the different modes of failures, which differ between pg_receivewal and
pg_basebackup.

0003 is the pg_basebackup one, not much changed except for the concerns you
had about the log message and handling of different failure modes.

There is still the concern raised by Bharath about being able to select the
way to fetch the replication slot information for the user, and what should or
should not be included in the documentation. I am in favor of documenting the
process of selecting the wal start, and maybe include a --start-lsn option for
the user to override it, but that's maybe for another patch.

--
Ronan Dunklau
Attachment

pgsql-hackers by date:

Previous
From: Daniel Gustafsson
Date:
Subject: Re: Trap errors from streaming child in pg_basebackup to exit early
Next
From: Dilip Kumar
Date:
Subject: Re: [Proposal] Fully WAL logged CREATE DATABASE - No Checkpoints