RE: walsender performance regression due to logical decoding on standby changes - Mailing list pgsql-hackers

From Zhijie Hou (Fujitsu)
Subject RE: walsender performance regression due to logical decoding on standby changes
Msg-id OS0PR01MB57165FA7A7B24BEF34BAE75194749@OS0PR01MB5716.jpnprd01.prod.outlook.com
In response to walsender performance regression due to logical decoding on standby changes (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Wednesday, May 10, 2023 3:03 AM Andres Freund <andres@anarazel.de> wrote:
> Hi,
> 
> Unfortunately I have found the following commit to have caused a
> performance
> regression:
> 
> commit e101dfac3a53c20bfbf1ca85d30a368c2954facf
> Author: Andres Freund <andres@anarazel.de>
> Date:   2023-04-08 00:24:24 -0700
> 
>     For cascading replication, wake physical and logical walsenders separately
> 
>     Physical walsenders can't send data until it's been flushed; logical
>     walsenders can't decode and send data until it's been applied. On the
>     standby, the WAL is flushed first, which will only wake up physical
>     walsenders; and then applied, which will only wake up logical
>     walsenders.
> 
>     Previously, all walsenders were awakened when the WAL was flushed. That
>     was fine for logical walsenders on the primary; but on the standby the
>     flushed WAL would not have been applied yet, so logical walsenders were
>     awakened too early.
> 
>     Per idea from Jeff Davis and Amit Kapila.
> 
>     Author: "Drouvot, Bertrand" <bertranddrouvot.pg@gmail.com>
>     Reviewed-By: Jeff Davis <pgsql@j-davis.com>
>     Reviewed-By: Robert Haas <robertmhaas@gmail.com>
>     Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
>     Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
>     Discussion: https://postgr.es/m/CAA4eK1+zO5LUeisabX10c81LU-fWMKO4M9Wyg1cdkbW7Hqh6vQ@mail.gmail.com
> 
> The problem is that, on a standby, after the change - as needed for the
> approach to work - the call to WalSndWakeup() in ApplyWalRecord() happens for
> every record, instead of only happening when the timeline is changed (or WAL
> is flushed or ...).
> 
> WalSndWakeup() iterates over all walsender slots, regardless of whether they
> are in use. For each of the walsender slots it acquires a spinlock.
> 
> When replaying a lot of small-ish WAL records I found the startup process to
> spend the majority of the time in WalSndWakeup(). I've not measured it very
> precisely yet, but the overhead is significant (~35% slowdown), even with the
> default max_wal_senders. If that's increased substantially, it obviously gets
> worse.
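
To make that cost structure concrete, here is a small self-contained C model I put together (my own illustration, not PostgreSQL source; SlotModel, N_SLOTS and wakeup_all are made-up names, and the real WalSndWakeup() additionally distinguishes physical from logical walsenders). It only mirrors the shape described above: one wakeup pass walks every slot and locks each one even when it is unused, and on a standby that pass now runs once per applied WAL record.

/*
 * Toy model of the wakeup pattern described above (my own illustration,
 * not PostgreSQL code).  In the server, the slot array lives in shared
 * memory, the per-slot lock is a spinlock, and "waking" a walsender
 * means setting its latch.
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

#define N_SLOTS 100             /* stands in for max_wal_senders */

typedef struct
{
    atomic_int lock;            /* 0 = free, 1 = held; stands in for the slot's spinlock */
    bool       in_use;          /* stands in for "a walsender is attached here" */
} SlotModel;

static SlotModel slots[N_SLOTS];    /* zero-initialized: all locks free, no slot in use */

static int
wakeup_all(void)
{
    int         woken = 0;

    for (int i = 0; i < N_SLOTS; i++)
    {
        /* the lock is taken and released even for unused slots */
        while (atomic_exchange(&slots[i].lock, 1))
            ;                   /* spin until the lock is free */
        if (slots[i].in_use)
            woken++;            /* the server would set the walsender's latch here */
        atomic_store(&slots[i].lock, 0);
    }
    return woken;
}

int
main(void)
{
    slots[0].in_use = true;     /* pretend a single walsender is connected */

    /* on a standby, the real equivalent of this runs once per applied record */
    printf("woke %d walsender(s), locked %d slots\n", wakeup_all(), N_SLOTS);
    return 0;
}

The per-record cost is therefore proportional to max_wal_senders, independent of how many walsenders are actually connected.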

I did some simple tests to see the performance impact of this on streaming
replication; sharing the results here for reference.

1) Synchronous primary-standby setup: load data on the primary and measure the time
   spent on replication. The degradation becomes more pronounced as the value of
   max_wal_senders increases.

max_wal_senders    before (ms)    after (ms)     degradation
100                13394.4013     14141.2615      5.58%
200                13280.8507     14597.1173      9.91%
300                13535.0232     16735.7379     23.65%

2) Similar to 1), but measure the time the standby startup process spent
   replaying WAL (via gprof).

10 senders
===========
before
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
  4.12      0.45     0.11        1     0.11     2.46  PerformWalRecovery

after
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 17.99      0.59     0.59  4027383     0.00     0.00  WalSndWakeup
  8.23      0.86     0.27        1     0.27     3.11  PerformWalRecovery

100 senders
===========
before
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
  5.56      0.36     0.18        1     0.18     2.91  PerformWalRecovery

after
  %   cumulative   self              self     total           
 time   seconds   seconds    calls   s/call   s/call  name    
 64.65      4.39     4.39  4027383     0.00     0.00  WalSndWakeup
  2.95      4.59     0.20        1     0.20     6.62  PerformWalRecovery
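
As a back-of-the-envelope check (my own estimate, not a measurement): each WalSndWakeup() call takes one spinlock per walsender slot, so the call count above multiplied by max_wal_senders approximates the number of spinlock acquire/release pairs the startup process performed during replay.

#include <stdio.h>

int
main(void)
{
    long    calls = 4027383;            /* WalSndWakeup call count from the gprof output above */
    int     senders[] = {10, 100, 300}; /* sender counts used in the tests in this mail */

    for (int i = 0; i < 3; i++)
        printf("max_wal_senders = %3d -> ~%.0f million spinlock acquire/release pairs\n",
               senders[i], calls * (double) senders[i] / 1e6);
    return 0;
}

With 100 senders that is roughly 400 million lock round-trips in a replay whose profiled run time is only a few seconds, which is consistent with WalSndWakeup() dominating the profile.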

I will run the same tests again after applying the latest patch in this thread.

Best Regards,
Hou zj
