RE: walsender performance regression due to logical decoding on standby changes - Mailing list pgsql-hackers

From Zhijie Hou (Fujitsu)
Subject RE: walsender performance regression due to logical decoding on standby changes
Date
Msg-id OS0PR01MB57166082BFFE53FDE1DBE14A94439@OS0PR01MB5716.jpnprd01.prod.outlook.com
Whole thread Raw
In response to Re: walsender performance regression due to logical decoding on standby changes  (Andres Freund <andres@anarazel.de>)
Responses Re: walsender performance regression due to logical decoding on standby changes
List pgsql-hackers
Hi,

On Monday, May 22, 2023 12:11 AM Andres Freund <andres@anarazel.de> wrote:
> On 2023-05-19 12:07:56 +0900, Kyotaro Horiguchi wrote:
> > At Thu, 18 May 2023 20:11:11 +0530, Bharath Rupireddy
> <bharath.rupireddyforpostgres@gmail.com> wrote in
> > > > > +
> ConditionVariableInit(&WalSndCtl->physicalWALSndCV);
> > > > > +             ConditionVariableInit(&WalSndCtl->logicalWALSndCV);
> > > >
> > > > It's not obvious to me that it's worth having two CVs, because it's more
> > > > expensive to find no waiters in two CVs than to find no waiters in one CV.
> > >
> > > I disagree. In the tight per-WAL record recovery loop, WalSndWakeup
> > > wakes up logical walsenders for every WAL record, but it wakes up
> > > physical walsenders only if the applied WAL record causes a TLI
> > > switch. Therefore, the extra cost of spinlock acquire-release for per
> > > WAL record applies only for logical walsenders. On the other hand, if
> > > we were to use a single CV, we would be unnecessarily waking up (if at
> > > all they are sleeping) physical walsenders for every WAL record -
> > > which is costly IMO.
> >
> > As I was reading this, I start thinking that one reason for the
> > regression could be to exccessive frequency of wakeups during logical
> > replication. In physical replication, we make sure to avoid exccessive
> > wakeups when the stream is tightly packed.  I'm just wondering why
> > logical replication doesn't (or can't) do the same thing, IMHO.
>
> It's possible we could try to reduce the frequency by issuing wakeups only at
> specific points. The most obvious thing to do would be to wake only when
> waiting for more WAL or when crossing a page boundary, or such.
> Unfortunately
> that could easily lead to deadlocks, because the startup process might be
> blocked waiting for a lock, held by a backend doing logical decoding - which
> can't progress until the startup process wakes the backend up.

Just out of curiosity about the mentioned deadlock scenario, no comment on the
committed patch.

About "a backend doing logical decoding", do you mean the case when a user
start a backend and invoke pg_logical_slot_get_changes() to do the logical
decoding ? If so, it seems the logical decoding in a backend won't be waked up
by startup process because the backend won't be registered as a walsender so
the backend won't be found in WalSndWakeup().

Or do you mean the deadlock between the real logical walsender and startup
process ? (I might miss something) I think the logical decoding doesn't lock
the target user relation when decoding because it normally can get the needed
information from WAL. Besides, the walsender sometimes will lock the system
table(e.g. use RelidByRelfilenumber() to get the relid) but it will unlock it
after finishing systable scan.

So, if possible, would you be able to share some details about the deadlock
case you mentioned earlier? it's helpful as it can prevent similar problems in
the future development.

Best Regards,
Hou zj



pgsql-hackers by date:

Previous
From: Nathan Bossart
Date:
Subject: Re: createuser --memeber and PG 16
Next
From: Peter Eisentraut
Date:
Subject: Re: Order changes in PG16 since ICU introduction