Hi,
On 2023-05-11 09:42:39 +0000, Zhijie Hou (Fujitsu) wrote:
> I did some simple tests for this to see the performance impact on
> the streaming replication, just share it here for reference.
>
> 1) sync primary-standby setup, load data on primary and count the time spent on
> replication. the degradation will be more obvious as the value of max_wal_senders
> increases.
FWIW, using syncrep likely under-estimates the overhead substantially, because
that includes a lot of overhead on the WAL generating side. I saw well over 20%
overhead for the default max_wal_senders=10.

To measure that, I created a standby, shut it down, ran a deterministically-sized
workload on the primary, then started the standby and measured how long it took
to catch up. I just used the timestamps of the log messages to measure the time.
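
For reference, a minimal sketch of that kind of log-based timing. It is not the script used for the numbers above. It assumes log_line_prefix starts with '%m ' (a millisecond timestamp), and the helper name catchup_seconds and its marker arguments are made up for illustration. Which log messages delimit the catch-up depends on the setup, so they are left as parameters.

```python
from datetime import datetime


def catchup_seconds(log_lines, start_marker, end_marker):
    """Seconds between the first line containing start_marker and the
    first later line containing end_marker.

    Assumption: every line begins with a '%m'-style timestamp such as
    '2023-05-11 09:42:39.123 UTC'; only the first 23 characters are parsed,
    so the timezone abbreviation is ignored.
    """
    def timestamp(line):
        return datetime.strptime(line[:23], "%Y-%m-%d %H:%M:%S.%f")

    start = None
    for line in log_lines:
        if start is None:
            if start_marker in line:
                start = timestamp(line)
        elif end_marker in line:
            return (timestamp(line) - start).total_seconds()
    raise ValueError("start/end markers not found in that order")
```

Usage would be something like catchup_seconds(open(path), start_text, end_text), where start_text and end_text are whatever messages bracket the catch-up in the standby's log.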
> 2) Similar as 1) but count the time that the standby startup process spent on
> replaying WAL(via gprof).
I don't think that's the case here, but IME gprof's overhead is so high that
it can move bottlenecks quite drastically. The problem is that it adds code to
every function entry/exit - for simple functions, that overhead is much higher
than the "original" cost of the function.
gprof-style instrumentation is good for things like code coverage, but for
performance evaluation it's normally better to use a sampling profiler like
perf. That also causes slowdowns, but largely only in places that already take
up substantial execution time.
Greetings,
Andres Freund