Re: Synchronizing slots from primary to standby - Mailing list pgsql-hackers

From Drouvot, Bertrand
Subject Re: Synchronizing slots from primary to standby
Date
Msg-id f3228cfb-7bf3-4bd8-8f37-c55fc4054759@gmail.com
Whole thread Raw
In response to Re: Synchronizing slots from primary to standby  (shveta malik <shveta.malik@gmail.com>)
Responses Re: Synchronizing slots from primary to standby
List pgsql-hackers
Hi,

On 10/9/23 12:30 PM, shveta malik wrote:
> PFA v22 patch-set. It has below changes:
> 
> patch 001:
> 1) Now physical walsender wakes up logical walsender(s) by using a new
> CV as suggested in [1]

Thanks!

I think that works fine as long as the standby is up and running and catching up.

The problem I see with the current WalSndWaitForStandbyConfirmation() implementation
is that if the standby is not running then:

+   for (;;)
+   {
+       ListCell   *l;
+       long        sleeptime = -1;

will loop until we reach the "terminating walsender process due to replication timeout" if we
explicitly want to end with SIGINT or friends.

For example a scenario like:

- standby down
- pg_recvlogical running

then CTRL-C on pg_recvlogical would not "respond" immediately but when we reach the replication timeout.

So it seems that we should use something like WalSndWait() instead of ConditionVariableTimedSleep() here:

+               /*
+                * Sleep until other physical walsenders awaken us or until a timeout
+                * occurs.
+                */
+               sleeptime = WalSndComputeSleeptime(GetCurrentTimestamp());
+
+               ConditionVariableTimedSleep(&WalSndCtl->wal_confirm_rcv_cv, sleeptime,
+
WAIT_EVENT_WAL_SENDER_WAIT_FOR_STANDBY_CONFIRMATION);

In that case I think that WalSndWait() should take care of the new CV WalSndCtl->wal_confirm_rcv_cv too.
The wait on the socket should allow us to stop waiting when, for example, CTRL-C on pg_recvlogical is triggered.

Then we would need to deal with this scenario: Standby down or not catching up and exited WalSndWait() due to the
socket
to break the loop or shutdown the walsender.

Thoughts?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: Daniele Varrazzo
Date:
Subject: libpq async connection and multiple hosts
Next
From: Robert Haas
Date:
Subject: Re: trying again to get incremental backup