Re: Strange decreasing value of pg_last_wal_receive_lsn() - Mailing list pgsql-hackers

From Jehan-Guillaume de Rorthais
Subject Re: Strange decreasing value of pg_last_wal_receive_lsn()
Msg-id 20200514184457.48d58ef5@firost
In response to Re: Strange decreasing value of pg_last_wal_receive_lsn()  (godjan • <g0dj4n@gmail.com>)
Responses Re: Strange decreasing value of pg_last_wal_receive_lsn()
List pgsql-hackers
(please, the list policy is bottom-posting to keep history clean, thanks).

On Thu, 14 May 2020 07:18:33 +0500
godjan • <g0dj4n@gmail.com> wrote:

> -> Why do you kill -9 your standby?
> Hi, it’s a Jepsen test for our HA solution. It checks that we don’t lose data
> in such a situation.

OK. This test is highly useful to stress data availability and durability, of
course. However, how useful is it in the context of automatic failover for
**service** high availability?  If all your nodes are killed in the same
disaster, how and why should an automatic cluster manager take care of starting
all nodes again and picking the right node to promote?

> So, now we have updated the logic as Michael said. All alive HA standbys now
> wait until they have replayed all the WAL they have, and then we use
> pg_last_wal_replay_lsn() to choose which standby will be promoted in failover.
>
> It fixed our trouble, but there is another one. Now we have to wait until all
> alive HA hosts finish replaying WAL before failing over. It might take a while
> (for example, if the WAL contains a record about splitting a B-tree).

Indeed, this is the concern I wrote about yesterday in a second mail on this
thread.

> We are looking for options that will allow us to find a standby that contains
> all data and replay all WAL only for this standby before failover.

Note that when you promote a node, it first replays available WALs before
acting as a primary. So you can safely signal the promotion to the node and
wait for it to finish the replay and promote.
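
To illustrate picking a single candidate and promoting only that one, here is
a minimal Python sketch. It is only an illustration: the standby names and LSN
values are made up, and it assumes the reported LSNs are trustworthy (which is
not the case right after a kill -9, as discussed in this thread). LSNs in the
"X/Y" text form are two hexadecimal halves of a 64-bit position, so they must
be compared as integers, not as strings.

```python
def parse_lsn(text: str) -> int:
    """Convert a pg_lsn string such as '0/3000060' to a 64-bit integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def pick_promotion_candidate(reported: dict[str, str]) -> str:
    """Return the name of the standby reporting the highest LSN."""
    return max(reported, key=lambda name: parse_lsn(reported[name]))


if __name__ == "__main__":
    # Hypothetical values, e.g. collected on each standby with
    # SELECT pg_last_wal_receive_lsn();
    reported = {
        "standby1": "0/3000060",
        "standby2": "0/30000D8",
        "standby3": "0/2FFFFF8",
    }
    print(pick_promotion_candidate(reported))  # prints standby2
```

Only the chosen node then needs to replay its remaining WAL, which it does as
part of the promotion itself.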

> Maybe you have ideas on how to keep the last actual value of
> pg_last_wal_receive_lsn()?

Nope, no clean and elegant idea. Once your instances are killed, maybe you can
force a flush of the system cache (to secure data that is only in memory) and
read the latest received WAL using pg_waldump?
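
If you go that way, you need to know which segment file to inspect. As a rough
sketch (assuming the default 16 MB segment size; the helper name is mine), the
segment file name for a given timeline and LSN can be computed like this:

```python
WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB segments


def wal_file_name(tli: int, lsn: str, seg_size: int = WAL_SEGMENT_SIZE) -> str:
    """Return the WAL segment file name containing the given LSN."""
    high, low = lsn.split("/")
    position = (int(high, 16) << 32) | int(low, 16)
    segno = position // seg_size
    # Number of segments covered by one "high" 32-bit half of the LSN.
    segs_per_xlogid = 0x100000000 // seg_size
    return "%08X%08X%08X" % (tli, segno // segs_per_xlogid, segno % segs_per_xlogid)


if __name__ == "__main__":
    print(wal_file_name(1, "0/3000060"))  # prints 000000010000000000000003
```

Be careful that simply taking the highest file name in pg_wal is not enough:
recycled segments are renamed ahead of time and might not contain any received
data yet.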

But, what if some more data are available from archives, but not received from
streaming rep because of a high lag?

> As I understand WAL receiver doesn’t write to disk walrcv->flushedUpto.

I'm not sure I understand what you mean here.
pg_last_wal_receive_lsn() reports the actual value of walrcv->flushedUpto.
walrcv->flushedUpto reports the latest LSN force-flushed to disk.
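
To make the behaviour more concrete, here is a toy Python model of the
initialization logic from the walreceiver code quoted further down. It is not
PostgreSQL code: the class and method names are mine, the 16 MB segment size is
the assumed default, and a killed instance is modelled simply as a fresh object,
since the shared memory state is lost on restart.

```python
from dataclasses import dataclass

WAL_SEGMENT_SIZE = 16 * 1024 * 1024  # assumption: default 16 MB segments


@dataclass
class WalRcvModel:
    """Toy stand-in for the walreceiver shared state (hypothetical names)."""
    receive_start: int = 0
    received_tli: int = 0
    flushed_upto: int = 0

    def request_streaming(self, recptr: int, tli: int) -> None:
        # Streaming is requested from the beginning of the segment.
        recptr -= recptr % WAL_SEGMENT_SIZE
        # First startup (on this timeline): reset flushed_upto.
        if self.receive_start == 0 or self.received_tli != tli:
            self.flushed_upto = recptr
            self.received_tli = tli
        self.receive_start = recptr

    def flush(self, upto: int) -> None:
        # Data received and flushed to disk only moves the value forward.
        if upto > self.flushed_upto:
            self.flushed_upto = upto


if __name__ == "__main__":
    rcv = WalRcvModel()
    rcv.request_streaming(0x3000060, 1)
    rcv.flush(0x30000D8)

    # Walreceiver restarts while the instance stays up: value is kept.
    rcv.request_streaming(0x30000D8, 1)
    print(hex(rcv.flushed_upto))  # prints 0x30000d8

    # Instance killed and restarted: state is lost, value goes backward.
    rcv = WalRcvModel()
    rcv.request_streaming(0x30000D8, 1)
    print(hex(rcv.flushed_upto))  # prints 0x3000000
```

The second case is the decreasing value you observed: the data are still on
disk, but the reported position restarts from the beginning of the segment.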


> > On 13 May 2020, at 19:52, Jehan-Guillaume de Rorthais <jgdr@dalibo.com>
> > wrote:
> >
> >
> > (too bad the history has been removed to keep context)
> >
> > On Fri, 8 May 2020 15:02:26 +0500
> > godjan • <g0dj4n@gmail.com> wrote:
> >
> >> I got it, thank you.
> >> Can you recommend what to use to determine which quorum standby should be
> >> promoted in such case? We planned to use pg_last_wal_receive_lsn() to
> >> determine which has fresh data but if it returns the beginning of the
> >> segment on both replicas we can’t determine which standby confirmed that
> >> write transaction to disk.
> >
> > Wait, pg_last_wal_receive_lsn() only decreased because you killed your
> > standby.
> >
> > pg_last_wal_receive_lsn() returns the value of walrcv->flushedUpto. The
> > latter is set to the beginning of the requested segment only during the
> > first walreceiver startup or a timeline fork:
> >
> >     /*
> >      * If this is the first startup of walreceiver (on this timeline),
> >      * initialize flushedUpto and latestChunkStart to the starting point.
> >      */
> >     if (walrcv->receiveStart == 0 || walrcv->receivedTLI != tli)
> >     {
> >         walrcv->flushedUpto = recptr;
> >         walrcv->receivedTLI = tli;
> >         walrcv->latestChunkStart = recptr;
> >     }
> >     walrcv->receiveStart = recptr;
> >     walrcv->receiveStartTLI = tli;
> >
> > After a primary loss, as long as the standbys are up and running, it is
> > fine to use pg_last_wal_receive_lsn().
> >
> > Why do you kill -9 your standby? What am I missing? Could you explain the
> > use case you are working on to justify this?
> >
> > Regards,
>
>
>



--
Jehan-Guillaume de Rorthais
Dalibo


