Re: [HACKERS] make async slave to wait for lsn to be replayed - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: [HACKERS] make async slave to wait for lsn to be replayed
Date
Msg-id CAPpHfdtuiL1x4APTs7u1fCmxkVp2-ZruXcdCfprDMdnOzvdC+A@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] make async slave to wait for lsn to be replayed  (Amit Kapila <amit.kapila16@gmail.com>)
Responses Re: [HACKERS] make async slave to wait for lsn to be replayed
List pgsql-hackers
On Tue, Mar 19, 2024 at 1:51 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Mon, Mar 18, 2024 at 3:24 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:
> >
> > On Mon, Mar 18, 2024 at 5:17 AM Amit Kapila <amit.kapila16@gmail.com> wrote:
> > > > 1. First, check that it was called with non-atomic context (that is,
> > > > it's not called within a transaction). Trigger error if called with
> > > > atomic context.
> > > > 2. Release a snapshot to be able to wait without risk of WAL replay
> > > > stuck.  Procedure is still called within the snapshot.  It's a bit of
> > > > a hack to release a snapshot, but Vacuum statements already do so.
> > > >
> > >
> > > Can you please provide a bit more details with some example what is
> > > the existing problem with functions and how using procedures will
> > > resolve it? How will this this address the implicit transaction case
> > > or do we have any other workaround for those cases?
> >
> > Please check [1] and [2] for the explanation of the problem with functions.
> >
> > Also, please find a draft patch implementing the procedure.  The issue with the snapshot is addressed with the
followinglines. 
> >
> > We first ensure we're in a non-atomic context, then pop an active snapshot (tricky, but ExecuteVacuum() does the
same). Then we should have no active snapshot and it's safe to wait for lsn replay. 
> >
> >     if (context->atomic)
> >         ereport(ERROR,
> >                 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> >                  errmsg("pg_wait_lsn() must be only called in non-atomic context")));
> >
> >     if (ActiveSnapshotSet())
> >         PopActiveSnapshot();
> >     Assert(!ActiveSnapshotSet());
> >
> > The function call could be added either before the BEGIN statement or before the implicit transaction.
> >
> > CALL pg_wait_lsn('my_lsn', my_timeout); BEGIN;
> > CALL pg_wait_lsn('my_lsn', my_timeout); SELECT ...;
> >
>
> I haven't thought in detail about whether there are any other problems
> with this idea but sounds like it should solve the problems you shared
> with a function call approach. BTW, if the application has to anyway
> know the LSN till where replica needs to wait, why can't they simply
> monitor the pg_last_wal_replay_lsn() value?

Amit, thank you for your feedback.

Yes, the application can monitor pg_last_wal_replay_lsn() value,
that's our state of the art solution.  But that's rather inconvenient
and takes extra latency and network traffic. And it can't be wrapped
into a server-side function in procedural language for the reasons we
can't implement it as a built-in function.

------
Regards,
Alexander Korotkov



pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: speed up a logical replica setup
Next
From: Heikki Linnakangas
Date:
Subject: Re: BitmapHeapScan streaming read user and prelim refactoring