Re: walsender performance regression due to logical decoding on standby changes - Mailing list pgsql-hackers
From | Drouvot, Bertrand |
---|---|
Subject | Re: walsender performance regression due to logical decoding on standby changes |
Date | |
Msg-id | ddf14508-282d-b8a8-0a31-764ffccbeb04@gmail.com Whole thread Raw |
In response to | Re: walsender performance regression due to logical decoding on standby changes (Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>) |
Responses |
Re: walsender performance regression due to logical decoding on standby changes
|
List | pgsql-hackers |
Hi, On 5/10/23 8:36 AM, Bharath Rupireddy wrote: > On Wed, May 10, 2023 at 12:33 AM Andres Freund <andres@anarazel.de> wrote: >> >> Unfortunately I have found the following commit to have caused a performance >> regression: >> >> commit e101dfac3a53c20bfbf1ca85d30a368c2954facf >> >> The problem is that, on a standby, after the change - as needed to for the >> approach to work - the call to WalSndWakeup() in ApplyWalRecord() happens for >> every record, instead of only happening when the timeline is changed (or WAL >> is flushed or ...). >> >> WalSndWakeup() iterates over all walsender slots, regardless of whether in >> use. For each of the walsender slots it acquires a spinlock. >> >> When replaying a lot of small-ish WAL records I found the startup process to >> spend the majority of the time in WalSndWakeup(). I've not measured it very >> precisely yet, but the overhead is significant (~35% slowdown), even with the >> default max_wal_senders. If that's increased substantially, it obviously gets >> worse. > Thanks Andres for the call out! I do agree that this is a concern. >> The only saving grace is that this is not an issue on the primary. > > Yeah. +1 > >> My current guess is that mis-using a condition variable is the best bet. I >> think it should work to use ConditionVariablePrepareToSleep() before a >> WalSndWait(), and then ConditionVariableCancelSleep(). I.e. to never use >> ConditionVariableSleep(). The latch set from ConditionVariableBroadcast() >> would still cause the necessary wakeup. Yeah, I think that "mis-using" a condition variable is a valid option. Unless I'm missing something, I don't think there is anything wrong with using a CV that way (aka not using ConditionVariableTimedSleep() or ConditionVariableSleep() in this particular case). > > How about something like the attached? Recovery and subscription tests > don't complain with the patch. Thanks Bharath for looking at it! I launched a full Cirrus CI test with it but it failed on one environment (did not look in details, just sharing this here): https://cirrus-ci.com/task/6570140767092736 Also I have a few comments: @@ -1958,7 +1959,7 @@ ApplyWalRecord(XLogReaderState *xlogreader, XLogRecord *record, TimeLineID *repl * ------ */ if (AllowCascadeReplication()) - WalSndWakeup(switchedTLI, true); + ConditionVariableBroadcast(&WalSndCtl->cv); I think the comment above this change needs to be updated. @@ -3368,9 +3370,13 @@ WalSndWait(uint32 socket_events, long timeout, uint32 wait_event) WaitEvent event; ModifyWaitEvent(FeBeWaitSet, FeBeWaitSetSocketPos, socket_events, NULL); + + ConditionVariablePrepareToSleep(&WalSndCtl->cv); if (WaitEventSetWait(FeBeWaitSet, timeout, &event, 1, wait_event) == 1 && (event.events & WL_POSTMASTER_DEATH)) proc_exit(1); + + ConditionVariableCancelSleep(); May be worth to update the comment above WalSndWait() too? (to explain why a CV handling is part of it). @@ -108,6 +109,8 @@ typedef struct */ bool sync_standbys_defined; + ConditionVariable cv; Worth to give it a more meaning full name? (to give a clue as when it is used) I think we also need to update the comment above WalSndWakeup(): " * For cascading replication we need to wake up physical walsenders separately * from logical walsenders (see the comment before calling WalSndWakeup() in * ApplyWalRecord() for more details). " Regards, -- Bertrand Drouvot PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
pgsql-hackers by date: