Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Amit Kapila
---|---
Subject | Re: Synchronizing slots from primary to standby
Date |
Msg-id | CAA4eK1Ko-EBBDkea2R8V8PeveGg10PBswCF7JQdnRu+MJP+YBQ@mail.gmail.com
In response to | Re: Synchronizing slots from primary to standby (shveta malik <shveta.malik@gmail.com>)
Responses | Re: Synchronizing slots from primary to standby
List | pgsql-hackers
On Fri, Dec 15, 2023 at 11:03 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> Sorry, I missed attaching the patch. PFA v48.
>

Few comments on v48_0002
========================
1.
+static void
+slotsync_reread_config(WalReceiverConn *wrconn)
{
...
+	pfree(old_primary_conninfo);
+	pfree(old_primary_slotname);
+
+	if (restart)
+	{
+		char *msg = "slot sync worker will restart because of a parameter change";
+
+		/*
+		 * The exit code 1 will make postmaster restart the slot sync worker.
+		 */
+		slotsync_worker_exit(msg, 1 /* proc_exit code */ );
+	}
...

I don't see the need to explicitly pfree here when we are already exiting
the process, because the memory will be released anyway, and that also
lets us avoid the 'restart' variable. Also, it probably makes sense to
exit directly here and at the other place where this function is used. I
see that maybe_reread_subscription() exits with code 0 and the apply
worker still restarts, so why use a different exit code here? (A rough
sketch of what I have in mind is at the end of these comments.)

2.
+static void
+check_primary_info(WalReceiverConn *wrconn, bool *am_cascading_standby)
{
...
+	remote_in_recovery = DatumGetBool(slot_getattr(tupslot, 1, &isnull));
+	Assert(!isnull);
+
+	/* No need to check further, return that we are cascading standby */
+	if (remote_in_recovery)
+	{
+		*am_cascading_standby = true;
+		CommitTransactionCommand();
+		return;
...
}

Don't we need to clear the result and the tuple in this early-return case?
(See the sketch below.)

3. It would be a good idea to mention the requirements such as a physical
slot on the primary, hot_standby_feedback, etc. in the commit message.

4.
+static bool
+wait_for_primary_slot_catchup(WalReceiverConn *wrconn, RemoteSlot *remote_slot,
+							  bool *wait_attempts_exceeded)
{
...
+	tupslot = MakeSingleTupleTableSlot(res->tupledesc, &TTSOpsMinimalTuple);
+	if (!tuplestore_gettupleslot(res->tuplestore, true, false, tupslot))
+	{
+		ereport(WARNING,
+				(errmsg("slot \"%s\" creation aborted", remote_slot->name),
+				 errdetail("This slot was not found on the primary server")));
...
+	/*
+	 * It is possible to get null values for LSN and Xmin if slot is
+	 * invalidated on the primary server, so handle accordingly.
+	 */
+	new_invalidated = DatumGetBool(slot_getattr(tupslot, 1, &isnull));
+	Assert(!isnull);
+
+	new_restart_lsn = DatumGetLSN(slot_getattr(tupslot, 2, &isnull));
+	if (new_invalidated || isnull)
+	{
+		ereport(WARNING,
+				(errmsg("slot \"%s\" creation aborted", remote_slot->name),
+				 errdetail("This slot was invalidated on the primary server")));
...
}

a. The errdetail message should end with a full stop. Please check all
other errdetail messages in the patch and follow the same guideline.
b. I think saying the slot creation was aborted is not completely correct,
because we would already have created the slot, especially when it is in
the 'i' state. Can we change it to something like: "aborting initial sync
for slot \"%s\""? (See the sketch below.)
c. Also, if the remote_slot is invalidated, ideally we could even drop the
local slot, but it seems the patch will drop it before the next sync cycle
along with any other slots that need to be dropped. If so, can we add a
comment to indicate that?

5.
+static void
+local_slot_update(RemoteSlot *remote_slot)
+{
+	Assert(MyReplicationSlot->data.invalidated == RS_INVAL_NONE);
+
+	LogicalConfirmReceivedLocation(remote_slot->confirmed_lsn);
+	LogicalIncreaseXminForSlot(remote_slot->confirmed_lsn,
+							   remote_slot->catalog_xmin);
+	LogicalIncreaseRestartDecodingForSlot(remote_slot->confirmed_lsn,
+										  remote_slot->restart_lsn);
+
+	SpinLockAcquire(&MyReplicationSlot->mutex);
+	MyReplicationSlot->data.invalidated = remote_slot->invalidated;
+	SpinLockRelease(&MyReplicationSlot->mutex);
...
...

The invalidated flag is also updated in the caller where required, so why
do we need to update it here as well? (A sketch is below.)
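To make some of the above more concrete, here are rough, untested sketches
of what I have in mind; the names I introduce below (for example the
conninfo_changed/slotname_changed condition) are only placeholders, not the
patch's actual code. For comment 1, the restart path could perhaps be
reduced to:

	/*
	 * Sketch only: exit directly on a parameter change and let process
	 * exit release the memory, so no pfree()s and no 'restart' variable.
	 */
	if (conninfo_changed || slotname_changed)	/* placeholder condition */
	{
		ereport(LOG,
				(errmsg("slot sync worker will restart because of a parameter change")));

		/*
		 * Exit with code 0, as maybe_reread_subscription() does, assuming
		 * the worker is restarted the same way as the apply worker.
		 */
		proc_exit(0);
	}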
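For comment 2, I was thinking of something along these lines for the
early-return path (assuming the result variable is called res, as in the
other functions of the patch):

	/* No need to check further, return that we are cascading standby */
	if (remote_in_recovery)
	{
		*am_cascading_standby = true;

		/* Sketch: also release the tuple and the result on this path */
		ExecClearTuple(tupslot);
		walrcv_clear_result(res);

		CommitTransactionCommand();
		return;
	}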
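For comments 4(a) and 4(b), the warning could then read roughly like:

	ereport(WARNING,
			(errmsg("aborting initial sync for slot \"%s\"", remote_slot->name),
			 errdetail("This slot was invalidated on the primary server.")));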
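And for comment 5, I was imagining local_slot_update() shrunk to something
like the following, leaving the invalidation handling entirely to the
caller (again, only a sketch):

static void
local_slot_update(RemoteSlot *remote_slot)
{
	Assert(MyReplicationSlot->data.invalidated == RS_INVAL_NONE);

	LogicalConfirmReceivedLocation(remote_slot->confirmed_lsn);
	LogicalIncreaseXminForSlot(remote_slot->confirmed_lsn,
							   remote_slot->catalog_xmin);
	LogicalIncreaseRestartDecodingForSlot(remote_slot->confirmed_lsn,
										  remote_slot->restart_lsn);

	/* No update of data.invalidated here; the caller does that if needed. */
}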
--
With Regards,
Amit Kapila.