Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Amit Kapila
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id CAA4eK1JD8h_XLRsK_o_Xh=5MhTzm+6d4Cb4_uPgFJ2wSQDah=g@mail.gmail.com
Whole thread Raw
In response to Re: Synchronizing slots from primary to standby  (shveta malik <shveta.malik@gmail.com>)
Responses Re: Synchronizing slots from primary to standby
List pgsql-hackers
On Thu, Apr 4, 2024 at 2:59 PM shveta malik <shveta.malik@gmail.com> wrote:
>
> There is an intermittent BF failure observed at [1] after this commit (2ec005b).
>

Thanks for analyzing and providing the patch. I'll look into it. There
is another BF failure [1] which I have analyzed. The main reason for
failure is the following test:

#   Failed test 'logical slots have synced as true on standby'
#   at /home/bf/bf-build/serinus/HEAD/pgsql/src/test/recovery/t/040_standby_failover_slots_sync.pl
line 198.
#          got: 'f'
#     expected: 't'

Here, we are expecting that the two logical slots (lsub1_slot, and
lsub2_slot), one created via subscription and another one via API
pg_create_logical_replication_slot() are synced. The standby LOGs
which are as follows show that the one created by API 'lsub2_slot' is
synced but the other one 'lsub1_slot':

LOG for lsub1_slot:
================
2024-04-05 04:37:07.421 UTC [3867682][client backend][0/2:0] DETAIL:
Streaming transactions committing after 0/0, reading WAL from
0/3000060.
2024-04-05 04:37:07.421 UTC [3867682][client backend][0/2:0]
STATEMENT:  SELECT pg_sync_replication_slots();
2024-04-05 04:37:07.422 UTC [3867682][client backend][0/2:0] DEBUG:
xmin required by slots: data 0, catalog 740
2024-04-05 04:37:07.422 UTC [3867682][client backend][0/2:0] LOG:
could not sync slot "lsub1_slot"

LOG for lsub2_slot:
================
2024-04-05 04:37:08.518 UTC [3867682][client backend][0/2:0] DEBUG:
xmin required by slots: data 0, catalog 740
2024-04-05 04:37:08.769 UTC [3867682][client backend][0/2:0] LOG:
newly created slot "lsub2_slot" is sync-ready now
2024-04-05 04:37:08.769 UTC [3867682][client backend][0/2:0]
STATEMENT:  SELECT pg_sync_replication_slots();

We can see from the log of lsub1_slot that its restart_lsn is
0/3000060 which means it will start reading from the WAL from that
location. Now, if we check the publisher log, we have running_xacts
record at that location. See following LOGs:

2024-04-05 04:36:57.830 UTC [3860839][client backend][8/2:0] LOG:
statement: SELECT pg_create_logical_replication_slot('lsub2_slot',
'test_decoding', false, false, true);
2024-04-05 04:36:58.718 UTC [3860839][client backend][8/2:0] DEBUG:
snapshot of 0+0 running transaction ids (lsn 0/3000060 oldest xid 740
latest complete 739 next xid 740)
....
....
2024-04-05 04:37:05.074 UTC [3854278][background writer][:0] DEBUG:
snapshot of 0+0 running transaction ids (lsn 0/3000098 oldest xid 740
latest complete 739 next xid 740)

The first running_xact record ends at 3000060 and the second one at
3000098. So, the start location of the second running_xact is 3000060,
the same can be confirmed by the following LOG line of walsender:

2024-04-05 04:37:05.144 UTC [3857385][walsender][25/0:0] DEBUG:
serializing snapshot to pg_logical/snapshots/0-3000060.snap

This shows that while processing running_xact at location 3000060, we
have serialized the snapshot. As there is no running transaction in
WAL at 3000060 so ideally we should have reached a consistent state
after processing that record on standby. But the reason standby didn't
process that LOG is that the confirmed_flush LSN is also at the same
location so the function LogicalSlotAdvanceAndCheckSnapState() exits
without reading the WAL at that location. Now, this can be confirmed
by the below walsender-specific LOG in publisher:

2024-04-05 04:36:59.155 UTC [3857385][walsender][25/0:0] DEBUG: write
0/3000060 flush 0/3000060 apply 0/3000060 reply_time 2024-04-05
04:36:59.155181+00

We update the confirmed_flush location with the flush location after
receiving the above feedback. You can notice that we didn't receive
the feedback for the 3000098 location and hence both the
confirmed_flush and restart_lsn are at the same location 0/3000060.
Now, the test is waiting for the subscriber to send feedback of the
last WAL write location by
$primary->wait_for_catchup('regress_mysub1'); As noticed from the
publisher LOGs, the query we used for wait is:

SELECT '0/3000060' <= replay_lsn AND state = 'streaming'
        FROM pg_catalog.pg_stat_replication
        WHERE application_name IN ('regress_mysub1', 'walreceiver')

Here, instead of '0/3000060' it should have used ''0/3000098' which is
the last write location. This position we get via function
pg_current_wal_lsn()->GetXLogWriteRecPtr()->LogwrtResult.Write. And
this variable seems to be touched by commit
c9920a9068eac2e6c8fb34988d18c0b42b9bf811. Though unlikely could
c9920a9068eac2e6c8fb34988d18c0b42b9bf811 be a reason for failure? At
this stage, I am not sure so just sharing with others to see if what I
am saying sounds logical. I'll think more about this.


[1] - https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=serinus&dt=2024-04-05%2004%3A34%3A27

--
With Regards,
Amit Kapila.



pgsql-hackers by date:

Previous
From: shveta malik
Date:
Subject: Re: Synchronizing slots from primary to standby
Next
From: Alvaro Herrera
Date:
Subject: Re: LogwrtResult contended spinlock