Re: pg_receivewal starting position - Mailing list pgsql-hackers
From | Bharath Rupireddy |
---|---|
Subject | Re: pg_receivewal starting position |
Date | |
Msg-id | CALj2ACV5daANc=Uu8jwkQ+x9CcSXB77f4win2CtnSwQTJKAfFg@mail.gmail.com Whole thread Raw |
In response to | Re: pg_receivewal starting position (Ronan Dunklau <ronan.dunklau@aiven.io>) |
Responses |
Re: pg_receivewal starting position
|
List | pgsql-hackers |
On Mon, Sep 6, 2021 at 12:20 PM Ronan Dunklau <ronan.dunklau@aiven.io> wrote: > > Le lundi 6 septembre 2021, 06:22:30 CEST Michael Paquier a écrit : > > On Fri, Sep 03, 2021 at 11:58:27AM +0200, Ronan Dunklau wrote: > > > 0002 for pg_receivewal tried to simplify the logic of information to > > > return, by using a dedicated struct for this. This accounts for Bahrath's > > > demands to return every possible field. > > > In particular, an is_logical field simplifies the detection of the type of > > > slot. In my opinion it makes sense to simplify it like this on the client > > > side while being more open-minded on the server side if we ever need to > > > provide a new type of slot. Also, GetSlotInformation now returns an enum > > > to be able to handle the different modes of failures, which differ > > > between pg_receivewal and pg_basebackup. > > > > + if (PQserverVersion(conn) < 150000) > > + return READ_REPLICATION_SLOT_UNSUPPORTED; > > [...] > > +typedef enum { > > + READ_REPLICATION_SLOT_OK, > > + READ_REPLICATION_SLOT_UNSUPPORTED, > > + READ_REPLICATION_SLOT_ERROR, > > + READ_REPLICATION_SLOT_NONEXISTENT > > +} ReadReplicationSlotStatus; > > > > Do you really need this much complication? We could treat the > > unsupported case and the non-existing case the same way: we don't know > > so just assign InvalidXlogRecPtr or NULL to the fields of the > > structure, and make GetSlotInformation() return false just on error, > > with some pg_log_error() where adapted in its internals. > > I actually started with the implementation you propose, but changed my mind > while writing it because I realised it's easier to reason about like this, > instead of failing silently during READ_REPLICATION_SLOT to fail a bit later > when actually trying to start the replication slot because it doesn't exists. > Either the user expects the replication slot to exists, and in this case we > should retry the whole loop in the hope of getting an interesting LSN, or the > user doesn't and shouldn't even pass a replication_slot name to begin with. I don't think so we need to retry the whole loop if the READ_REPLICATION_SLOT command fails as pg_receivewal might enter an infinite loop there. IMO, we should just exit(1) if READ_REPLICATION_SLOT fails. > > > There is still the concern raised by Bharath about being able to select > > > the > > > way to fetch the replication slot information for the user, and what > > > should or should not be included in the documentation. I am in favor of > > > documenting the process of selecting the wal start, and maybe include a > > > --start-lsn option for the user to override it, but that's maybe for > > > another patch. > > > > The behavior of pg_receivewal that you are implementing should be > > documented. We don't say either how the start location is selected > > when working on an existing directory, so I would recommend to add a > > paragraph in the description section to detail all that, as of: > > - First a scan of the existing archives is done. > > - If nothing is found, and if there is a slot, request the slot > > information. > > - If still nothing (aka slot undefined, or command not supported), use > > the last flush location. > > Sounds good, I will add another patch for the documentation of this. +1. > > While on it, READ_REPLICATION_SLOT returns a confirmed LSN when > > grabbing the data of a logical slot. We are not going to use that > > with pg_recvlogical as by default START_STREAMING 0/0 would just use > > the confirmed LSN. Do we have use cases where this information would > > help? There is the argument of consistency with physical slots and > > that this can be helpful to do sanity checks for clients, but that's > > rather thin. > > If we don't, we should rename the command to READ_PHYSICAL_REPLICATION_SLOT > and error out on the server if the slot is not actually a physical one to > spare the client from performing those checks. I still think it's better to > support both cases as opposed to having two completely different APIs > (READ_(PHYSICAL)_REPLICATION_SLOT for physical ones on a replication > connection, pg_replication_slots view for logical ones) as it would enable > more for third-party clients at a relatively low maintenance cost for us. -1 for READ_PHYSICAL_REPLICATION_SLOT or failing on the server. What happens if we have another slot type "PHYSIOLOGICAL" or "FOO" or "BAR" some other? IMO, READ_REPLICATION_SLOT should just return info of all slots. The clients know better how to deal with the slot type. Although, we don't have a use case for logical slots with the READ_REPLICATION_SLOT command, let's not change it. If others are still concerned about the unnecessary slot being returned, you might consider passing in a parameter to READ_REPLICATION_SLOT command, something like below. But this too looks complex to me. I would vote for what the existing patch does with READ_REPLICATION_SLOT. READ_REPLICATION_SLOT 'slot_name' 'physical'; only returns physical slot info with the given name. READ_REPLICATION_SLOT 'slot_name' 'logical'; only returns logical slot with the given name. Regards, Bharath Rupireddy.
pgsql-hackers by date: