On Mon, Apr 29, 2024 at 11:38 AM shveta malik <shveta.malik@gmail.com> wrote:
>
> On Mon, Apr 29, 2024 at 10:57 AM Zhijie Hou (Fujitsu)
> <houzj.fnst@fujitsu.com> wrote:
> >
> > On Friday, March 15, 2024 10:45 PM Bertrand Drouvot <bertranddrouvot.pg@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > On Thu, Mar 14, 2024 at 02:22:44AM +0000, Zhijie Hou (Fujitsu) wrote:
> > > > Hi,
> > > >
> > > > Since the standby_slot_names patch has been committed, I am attaching
> > > > the last doc patch for review.
> > > >
> > >
> > > Thanks!
> > >
> > > 1 ===
> > >
> > > + continue subscribing to publications now on the new primary server
> > > without
> > > + any data loss.
> > >
> > > I think "without any data loss" should be re-worded in this context. Data loss in
> > > the sense "data committed on the primary and not visible on the subscriber in
> > > case of failover" can still occurs (in case synchronous replication is not used).
> > >
> > > 2 ===
> > >
> > > + If the result (<literal>failover_ready</literal>) of both above steps is
> > > + true, existing subscriptions will be able to continue without data loss.
> > > + </para>
> > >
> > > I don't think that's true if synchronous replication is not used. Say,
> > >
> > > - synchronous replication is not used
> > > - primary is not able to reach the standby anymore and standby_slot_names is
> > > set
> > > - new data is inserted into the primary
> > > - then not replicated to subscriber (due to standby_slot_names)
> > >
> > > Then I think the both above steps will return true but data would be lost in case
> > > of failover.
> >
> > Thanks for the comments, attach the new version patch which reworded the
> > above places.
>
> Thanks for the patch.
>
> Few comments:
>
> 1) Tested the steps, one of the queries still refers to
> 'conflict_reason'. I think it should refer 'conflicting'.
>
> 2) Will it be good to mention that in case of planned promotion, it is
> recommended to run pg_sync_replication_slots() as last sync attempt
> before we run failvoer-ready validation steps? This can be mentioned
> in high-availaibility.sgml of current patch
I recall now that with the latest fix, we cannot run
pg_sync_replication_slots() unless we disable the slot-sync worker.
Considering that, I think it will be too many steps just to run the
SQL function at the end without much value added. Thus we can skip
this point, we can rely on slot sync worker completely.
thanks
Shveta