Re: Minimal logical decoding on standbys - Mailing list pgsql-hackers

From Drouvot, Bertrand
Subject Re: Minimal logical decoding on standbys
Date
Msg-id 061eeb26-156e-6011-02b6-b4b1422e2aa6@gmail.com
Whole thread Raw
In response to Re: Minimal logical decoding on standbys  (Jeff Davis <pgsql@j-davis.com>)
Responses Re: Minimal logical decoding on standbys  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
Hi,

On 3/1/23 1:48 AM, Jeff Davis wrote:
> On Mon, 2023-02-27 at 09:40 +0100, Drouvot, Bertrand wrote:
>> Please find attached V51 tiny rebase due to a6cd1fc692 (for 0001) and
>> 8a8661828a (for 0005).
> 
> [ Jumping into this thread late, so I apologize if these comments have
> already been covered. ]
> 

Thanks for looking at it!

> Regarding v51-0004:
> 
> * Why is the CV sleep not being canceled?

I think that's an oversight, I'll look at it.

> * Comments on WalSndWaitForWal need to be updated to explain the
> difference between the flush (primary) and the replay (standby) cases.
> 

Yeah, will do.

> Overall, it seems like what you really want for the sleep/wakeup logic
> in WalSndWaitForLSN 

Typo for WalSndWaitForWal()?

> is something like this:
> 
>     condVar = RecoveryInProgress() ? replayCV : flushCV;
>     waitEvent = RecoveryInProgress() ?
>         WAIT_EVENT_WAL_SENDER_WAIT_REPLAY :
>         WAIT_EVENT_WAL_SENDER_WAIT_FLUSH;
> 
>     ConditionVariablePrepareToSleep(condVar);
>     for(;;)
>     {
>        ...
>        sleeptime = WalSndComputeSleepTime(GetCurrentTimestamp());
>        socketEvents = WL_SOCKET_READABLE;
>        if (pq_is_send_pending())
>            socketEvents = WL_SOCKET_WRITABLE;
>        ConditionVariableTimedSleepOrEvents(
>            condVar, sleeptime, socketEvents, waitEvent);
>     }
>     ConditionVariableCancelSleep();
> 
> 
> But the problem is that ConditionVariableTimedSleepOrEvents() doesn't
> exist, and I think that's what Andres was suggesting here[1].
> WalSndWait() only waits for a timeout or a socket event, but not a CV;
> ConditionVariableTimedSleep() only waits for a timeout or a CV, but not
> a socket event.
> 
> I'm also missing how WalSndWait() works currently. It calls
> ModifyWaitEvent() with NULL for the latch, so how does WalSndWakeup()
> wake it up?

I think it works because the latch is already assigned to the FeBeWaitSet
in pq_init()->AddWaitEventToSet() (for latch_pos).

> 
> Assuming I'm wrong, and WalSndWait() does use the latch, then I guess
> it could be extended by having two different latches in the WalSnd
> structure, and waking them up separately and waiting on the right one.

I'm not sure this is needed in this particular case, because:

Why not "simply" call ConditionVariablePrepareToSleep() without any call to ConditionVariableTimedSleep() later?

In that case the walsender will be put in the wait queue (thanks to ConditionVariablePrepareToSleep())
and will be waked up by the event on the socket, the timeout or the CV broadcast (since IIUC they all rely on the same
latch).

So, something like:

    condVar = RecoveryInProgress() ? replayCV : flushCV;
    ConditionVariablePrepareToSleep(condVar);
    for(;;)
    {
            ...
            sleeptime = WalSndComputeSleepTime(GetCurrentTimestamp());
            socketEvents = WL_SOCKET_READABLE;
            if (pq_is_send_pending())
              socketEvents = WL_SOCKET_WRITABLE;
       WalSndWait(wakeEvents, sleeptime, WAIT_EVENT_WAL_SENDER_WAIT_WAL);  <-- Note: the code within the loop does not
changeat all
 

          }
          ConditionVariableCancelSleep();


If the walsender is waked up by the CV broadcast, then it means the flush/replay occurred and then we should exit the
loop
right after due to:

"
         /* check whether we're done */
         if (loc <= RecentFlushPtr)
             break;

"

meaning that in this particular case there is only one wake up due to the CV broadcast before exiting the loop.

That looks weird to use ConditionVariablePrepareToSleep() without actually using ConditionVariableTimedSleep()
but it looks to me that it would achieve the same goal: having the walsender being waked up
by the event on the socket, the timeout or the CV broadcast.

In that case we would be missing the WAIT_EVENT_WAL_SENDER_WAIT_REPLAY and/or the WAIT_EVENT_WAL_SENDER_WAIT_FLUSH
wait events thought (and we'd just provide the WAIT_EVENT_WAL_SENDER_WAIT_WAL one) but I'm not sure that's a big
issue.

What do you think?

Regards,

-- 
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com



pgsql-hackers by date:

Previous
From: "Daniel Verite"
Date:
Subject: Re: psql's FETCH_COUNT (cursor) is not being respected for CTEs
Next
From: Pavel Luzanov
Date:
Subject: Re: psql: Add role's membership options to the \du+ command