[PATCH] fix a performance issue with multiple logical-decoding walsenders - Mailing list pgsql-hackers

From: Pierre Ducroquet
Subject: [PATCH] fix a performance issue with multiple logical-decoding walsenders
Msg-id: 2931018.Vxl9zapr77@pierred-pdoc
Responses: Re: [PATCH] fix a performance issue with multiple logical-decoding walsenders
List: pgsql-hackers
Hello

Our current setup uses logical replication to build a BI replication server 
alongside our primary clusters (running PG 10.10 so far). This implies having 
one logical replication slot per database. After some analysis, we identified 
two hot spots behind this performance issue. Fixing them gave us a 10-fold 
improvement in decoding speed.


We noticed that the load on our primary had increased considerably since the 
introduction of this replication, with spikes in system time whenever a lot 
of WAL was being written (for instance when creating GIN indexes).

The first hot spot is PostmasterIsAlive. The implementation reads from a pipe 
to know whether the postmaster is still alive, but this is very expensive 
kernel-wise. When switching the implementation to a much more primitive (and 
probably wrong) one:

    bool
    PostmasterIsAliveInternal(void)
    {
        /* assumes this process is a direct child of the postmaster */
        return getppid() == PostmasterPid;
    }

we stopped seeing spikes in system time.
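For reference, the stock implementation in src/backend/storage/ipc/pmsignal.c 
does roughly the following on non-Windows platforms (error paths abbreviated 
here), which costs a read() system call on every invocation:

    bool
    PostmasterIsAlive(void)
    {
        char        c;
        ssize_t     rc;

        /* non-blocking read on the pipe the postmaster holds open */
        rc = read(postmaster_alive_fds[POSTMASTER_FD_WATCH], &c, 1);
        if (rc < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return true;        /* nothing to read: still alive */

        /* EOF means the postmaster is gone; other cases are error paths */
        return false;
    }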
But after doing that, the CPU time used by our walsenders increased. We had 
reached a second hot spot, this time in XLogSendLogical, where each walsender 
was using 100% of user CPU for minutes. After checking with perf, it turned 
out all the decoders were fighting over GetFlushRecPtr.
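GetFlushRecPtr itself (in src/backend/access/transam/xlog.c) is tiny, but 
every call takes the XLogCtl->info_lck spinlock, so a couple dozen walsenders 
calling it for every record turn that lock into a contention point. Roughly, 
as of PG 10/12:

    XLogRecPtr
    GetFlushRecPtr(void)
    {
        XLogwrtResult result;

        /* every caller serializes on this spinlock */
        SpinLockAcquire(&XLogCtl->info_lck);
        result = XLogCtl->LogwrtResult;
        SpinLockRelease(&XLogCtl->info_lck);

        return result.Flush;
    }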

On PostgreSQL 12, the call to PostmasterIsAlive is no longer present in 
WalSndLoop (thanks to commit cfdf4dc4f), so only the second hot spot is still 
present, with the same effects.

Attached to this email are two patches. 

The first one, specific to PG 10 for our use case, simplifies the 
PostmasterIsAlive function as described above. I don't know whether this 
implementation is valid, but it was needed to uncover the performance issue 
in XLogSendLogical. Would it be possible to remove the systematic call to 
PostmasterIsAlive in WalSndLoop? We are not certain of the behaviour.

The second one was tested on PG 10 and PG 12 (with a 48-line offset). On 
PG 12 it has the same effect it has on PG 10 with the isAlive patch applied. 
Instead of calling GetFlushRecPtr every time, we call it only once decoding 
has reached the value returned by the previous call. This way, while the 
senders are busy decoding, they no longer fight over a spinlock to read the 
flush pointer.
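A minimal sketch of the idea, following the structure of XLogSendLogical in 
src/backend/replication/walsender.c (names are taken from there; this 
illustrates the approach rather than reproducing the patch text):

    static void
    XLogSendLogical(void)
    {
        XLogRecord *record;
        char       *errm;

        /* cache the last flush pointer we saw, across calls */
        static XLogRecPtr flushPtr = InvalidXLogRecPtr;

        WalSndCaughtUp = false;

        record = XLogReadRecord(logical_decoding_ctx->reader,
                                logical_startptr, &errm);
        logical_startptr = InvalidXLogRecPtr;

        if (errm != NULL)
            elog(ERROR, "%s", errm);

        if (record != NULL)
        {
            LogicalDecodingProcessRecord(logical_decoding_ctx,
                                         logical_decoding_ctx->reader);
            sentPtr = logical_decoding_ctx->reader->EndRecPtr;
        }

        /*
         * Refresh the cached flush pointer only on first use, or once
         * decoding has caught up with the previously seen value.  As
         * long as we are behind the cached value, we already know we
         * are not caught up, so there is no need to take the spinlock
         * again.
         */
        if (flushPtr == InvalidXLogRecPtr ||
            logical_decoding_ctx->reader->EndRecPtr >= flushPtr)
            flushPtr = GetFlushRecPtr();

        if (logical_decoding_ctx->reader->EndRecPtr >= flushPtr)
            WalSndCaughtUp = true;
    }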

Here are some benchmark results.
On PG 10, decoding our replication stream went from 3m 43s to over 5 minutes 
with only the first hot spot removed (the walsenders, no longer stalled on 
the postmaster pipe, spent that time fighting over GetFlushRecPtr), and then 
down to 22 seconds with both patches.
On PG 12, we had to change the benchmark (GIN index creation being more 
optimized there), so we cannot compare directly with our previous run; we 
went from 15m 11s down to 59 seconds.
If needed, we can provide scripts to reproduce this situation. It is quite 
simple: set up ~20 walsenders doing logical replication in database A, then 
generate a lot of data in database B. The walsenders will be woken up by the 
activity on database B, but since they have nothing to send, they keep 
hitting the same locks.

Regards


