RE: Synchronizing slots from primary to standby - Mailing list pgsql-hackers
From | Zhijie Hou (Fujitsu) |
---|---|
Subject | RE: Synchronizing slots from primary to standby |
Date | |
Msg-id | OS0PR01MB57162C119597F999895E9DAA94002@OS0PR01MB5716.jpnprd01.prod.outlook.com Whole thread Raw |
In response to | Re: Synchronizing slots from primary to standby (Amit Kapila <amit.kapila16@gmail.com>) |
Responses |
Re: Synchronizing slots from primary to standby
|
List | pgsql-hackers |
On Saturday, April 6, 2024 12:43 PM Amit Kapila <amit.kapila16@gmail.com> wrote: > On Fri, Apr 5, 2024 at 8:05 PM Bertrand Drouvot > <bertranddrouvot.pg@gmail.com> wrote: > > > > On Fri, Apr 05, 2024 at 06:23:10PM +0530, Amit Kapila wrote: > > > On Fri, Apr 5, 2024 at 5:17 PM Amit Kapila <amit.kapila16@gmail.com> > wrote: > > > Thinking more on this, it doesn't seem related to > > > c9920a9068eac2e6c8fb34988d18c0b42b9bf811 as that commit doesn't > > > change any locking or something like that which impacts write positions. > > > > Agree. > > > > > I think what has happened here is that running_xact record written > > > by the background writer [1] is not written to the kernel or disk > > > (see LogStandbySnapshot()), before pg_current_wal_lsn() checks the > > > current_lsn to be compared with replayed LSN. > > > > Agree, I think it's not visible through pg_current_wal_lsn() yet. > > > > Also I think that the DEBUG message in LogCurrentRunningXacts() > > > > " > > elog(DEBUG2, > > "snapshot of %d+%d running transaction ids (lsn %X/%X oldest > xid %u latest complete %u next xid %u)", > > CurrRunningXacts->xcnt, CurrRunningXacts->subxcnt, > > LSN_FORMAT_ARGS(recptr), > > CurrRunningXacts->oldestRunningXid, > > CurrRunningXacts->latestCompletedXid, > > CurrRunningXacts->nextXid); " > > > > should be located after the XLogSetAsyncXactLSN() call. Indeed, the > > new LSN is visible after the spinlock (XLogCtl->info_lck) in > > XLogSetAsyncXactLSN() is released, > > > > I think the new LSN can be visible only when the corresponding WAL is written > by XLogWrite(). I don't know what in XLogSetAsyncXactLSN() can make it > visible. In your experiment below, isn't it possible that in the meantime WAL > writer has written that WAL due to which you are seeing an updated location? > > >see: > > > > \watch on Session 1 provides: > > > > pg_current_wal_lsn > > -------------------- > > 0/87D110 > > (1 row) > > > > Until: > > > > Breakpoint 2, XLogSetAsyncXactLSN (asyncXactLSN=8900936) at > xlog.c:2579 > > 2579 XLogRecPtr WriteRqstPtr = asyncXactLSN; > > (gdb) n > > 2581 bool wakeup = false; > > (gdb) > > 2584 SpinLockAcquire(&XLogCtl->info_lck); > > (gdb) > > 2585 RefreshXLogWriteResult(LogwrtResult); > > (gdb) > > 2586 sleeping = XLogCtl->WalWriterSleeping; > > (gdb) > > 2587 prevAsyncXactLSN = XLogCtl->asyncXactLSN; > > (gdb) > > 2588 if (XLogCtl->asyncXactLSN < asyncXactLSN) > > (gdb) > > 2589 XLogCtl->asyncXactLSN = asyncXactLSN; > > (gdb) > > 2590 SpinLockRelease(&XLogCtl->info_lck); > > (gdb) p p/x (uint32) XLogCtl->asyncXactLSN > > $1 = 0x87d148 > > > > Then session 1 provides: > > > > pg_current_wal_lsn > > -------------------- > > 0/87D148 > > (1 row) > > > > So, when we see in the log: > > > > 2024-04-05 04:37:05.074 UTC [3854278][background writer][:0] DEBUG: > > snapshot of 0+0 running transaction ids (lsn 0/3000098 oldest xid 740 > > latest complete 739 next xid 740) > > 2024-04-05 04:37:05.197 UTC [3866475][client backend][2/4:0] LOG: > statement: SELECT '0/3000060' <= replay_lsn AND state = 'streaming' > > > > It's indeed possible that the new LSN was not visible yet (spinlock > > not released?) before the query began (because we can not rely on the > > time the DEBUG message has been emitted). > > > > > Note that the reason why > > > walsender has picked the running_xact written by background writer > > > is because it has checked after pg_current_wal_lsn() query, see LOGs [2]. > > > I think we can probably try to reproduce manually via debugger. > > > > > > If this theory is correct > > > > It think it is. > > > > > then I think we will need to use injection points to control the > > > behavior of bgwriter or use the slots created via SQL API for > > > syncing in tests. > > > > > > Thoughts? > > > > I think that maybe as a first step we should move the "elog(DEBUG2," > > message as proposed above to help debugging (that could help to confirm > the above theory). > > > > I think I am missing how exactly moving DEBUG2 can confirm the above theory. > > > If the theory is proven then I'm not sure we need the extra complexity > > of injection point here, maybe just relying on the slots created via > > SQL API could be enough. > > > > Yeah, that could be the first step. We can probably add an injection point to > control the bgwrite behavior and then add tests involving walsender > performing the decoding. But I think it is important to have sufficient tests in > this area as I see they are quite helpful in uncovering the issues. Here is the patch to drop the subscription in the beginning so that the restart_lsn of the lsub1_slot won't be advanced due to concurrent xl_running_xacts from bgwriter. The subscription will be re-created after all the slots are sync-ready. I think maybe we can use this to stabilize the test as a first step and then think about how to make use of injection point to add more tests if it's worth it. Best Regards, Hou zj
Attachment
pgsql-hackers by date: