Re: walsender performance regression due to logical decoding on standby changes - Mailing list pgsql-hackers

From Andres Freund
Subject Re: walsender performance regression due to logical decoding on standby changes
Date
Msg-id 20230521161046.3qcrunn5xxiwfhhu@awork3.anarazel.de
In response to Re: walsender performance regression due to logical decoding on standby changes  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses RE: walsender performance regression due to logical decoding on standby changes
List pgsql-hackers
Hi,

On 2023-05-19 12:07:56 +0900, Kyotaro Horiguchi wrote:
> At Thu, 18 May 2023 20:11:11 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in 
> > > > +             ConditionVariableInit(&WalSndCtl->physicalWALSndCV);
> > > > +             ConditionVariableInit(&WalSndCtl->logicalWALSndCV);
> > >
> > > It's not obvious to me that it's worth having two CVs, because it's more
> > > expensive to find no waiters in two CVs than to find no waiters in one CV.
> > 
> > I disagree. In the tight per-WAL record recovery loop, WalSndWakeup
> > wakes up logical walsenders for every WAL record, but it wakes up
> > physical walsenders only if the applied WAL record causes a TLI
> > switch. Therefore, the extra cost of spinlock acquire-release for per
> > WAL record applies only for logical walsenders. On the other hand, if
> > we were to use a single CV, we would be unnecessarily waking up (if at
> > all they are sleeping) physical walsenders for every WAL record -
> > which is costly IMO.
> 
> As I was reading this, I started thinking that one reason for the
> regression could be the excessive frequency of wakeups during logical
> replication. In physical replication, we make sure to avoid excessive
> wakeups when the stream is tightly packed.  I'm just wondering why
> logical replication doesn't (or can't) do the same thing, IMHO.

We could possibly reduce the frequency by issuing wakeups only at specific
points. The most obvious approach would be to wake walsenders only when the
startup process is about to wait for more WAL, or when replay crosses a page
boundary. Unfortunately that could easily lead to deadlocks: the startup
process might be blocked waiting for a lock held by a backend doing logical
decoding, and that backend can't make progress until the startup process
wakes it up.

So I don't think this is a promising avenue in the near term.

Greetings,

Andres Freund


