RE: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Zhijie Hou (Fujitsu) |
---|---|
Subject | RE: Synchronizing slots from primary to standby |
Date | |
Msg-id | OS0PR01MB57165AA821D881520E38FE2E94462@OS0PR01MB5716.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: Synchronizing slots from primary to standby (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: Synchronizing slots from primary to standby
|
List | pgsql-hackers |
On Monday, February 5, 2024 10:25 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote: lvh.no-ip.org> > Subject: Re: Synchronizing slots from primary to standby > > On Mon, Feb 5, 2024 at 8:26 PM shveta malik <shveta.malik@gmail.com> > wrote: > > > > On Mon, Feb 5, 2024 at 10:57 AM Ajin Cherian <itsajin@gmail.com> wrote: > > > > > > Just noticed that doc/src/sgml/config.sgml still refers to enable_synclot > instead of sync_replication_slots: > > > > > > The standbys corresponding to the physical replication slots in > > > <varname>standby_slot_names</varname> must configure > > > <literal>enable_syncslot = true</literal> so they can receive > > > failover logical slots changes from the primary. > > > > Thanks Ajin for pointing this out. Here are v78 patches, corrected there. > > > > Other changes are: > > > > 1) Rebased the patches as the v77-001 is now pushed. > > 2) Enabled executing pg_sync_replication_slots() on cascading-standby. > > 3) Rearranged the code around parameter validity checks. Changed > > function names and changed the way how dbname is extracted as > > suggested by Amit offlist. > > 4) Rearranged the code around check_primary_info(). Removed output > args. > > 5) Few other trivial changes. > > > > Thank you for updating the patch! Here are some comments: > > --- > Since Two processes (e.g. the slotsync worker and > pg_sync_replication_slots()) concurrently fetch and update the slot information, > there is a race condition where slot's confirmed_flush_lsn goes backward. . We > have the following check but it doesn't prevent the slot's confirmed_flush_lsn > from moving backward if the restart_lsn does't change: > > /* > * Sanity check: As long as the invalidations are handled > * appropriately as above, this should never happen. > */ > if (remote_slot->restart_lsn < slot->data.restart_lsn) > elog(ERROR, > "cannot synchronize local slot \"%s\" LSN(%X/%X)" > " to remote slot's LSN(%X/%X) as synchronization" > " would move it backwards", remote_slot->name, > LSN_FORMAT_ARGS(slot->data.restart_lsn), > LSN_FORMAT_ARGS(remote_slot->restart_lsn)); > As discussed, I added a flag in shared memory to control the concurrent slot sync. > --- > + It is recommended that subscriptions are first disabled before > + promoting > f+ the standby and are enabled back after altering the connection string. > > I think it's better to describe the reason why it's recommended to disable > subscriptions before the standby promotion. Added. > > --- > +/* Slot sync worker objects */ > +extern PGDLLIMPORT char *PrimaryConnInfo; extern PGDLLIMPORT char > +*PrimarySlotName; > > These two variables are declared also in xlogrecovery.h. Is it intentional? If so, I > think it's better to write comments. Will address. > > --- > Global functions and variables used by the slotsync worker are declared in > logicalworker.h and worker_internal.h. But is it really okay to make a > dependency between the slotsync worker and logical replication workers? IIUC > the slotsync worker is conceptually a separate feature from the logical > replication. I think the slotsync worker can have its own header file. Added. > > --- > + SELECT r.srsubid AS subid, CONCAT('pg_' || srsubid || > '_sync_' || srrelid || '_' || ctl.system_identifier) AS slotname > > and > > + SELECT (CASE WHEN r.srsubstate = 'f' THEN > pg_replication_origin_progress(CONCAT('pg_' || r.srsubid || '_' || r.srrelid), false) > > If we use CONCAT function, we can replace '||' with ','. > Will address. > --- > + Confirm that the standby server is not lagging behind the subscribers. > + This step can be skipped if > + <link > linkend="guc-standby-slot-names"><varname>standby_slot_names</varna > me></link> > + has been correctly configured. > > How can the user confirm if standby_slot_names is correctly configured? Will address after concluding. Thanks Shveta for helping addressing the comments. Best Regards, Hou zj
pgsql-hackers by date: