Re: Slow standby snapshot - Mailing list pgsql-hackers

From Michail Nikolaev
Subject Re: Slow standby snapshot
Date
Msg-id CANtu0oiPoSdQsjRd6Red5WMHi1E83d2+-bM9J6dtWR3c5Tap9g@mail.gmail.com
In response to Re: Slow standby snapshot  (Simon Riggs <simon.riggs@enterprisedb.com>)
List pgsql-hackers
Hello everyone.

> However ... I tried to reproduce the original complaint, and
> failed entirely.  I do see KnownAssignedXidsGetAndSetXmin
> eating a bit of time in the standby backends, but it's under 1%
> and doesn't seem to be rising over time.  Perhaps we've already
> applied some optimization that ameliorates the problem?  But
> I tested v13 as well as HEAD, and got the same results.

> Hmm.  I wonder if my inability to detect a problem is because the startup
> process does keep ahead of the workload on my machine, while it fails
> to do so on the OP's machine.  I've only got a 16-CPU machine at hand,
> which probably limits the ability of the primary to saturate the standby's
> startup process.

Yes, the optimization by Andres Freund made things much better, but the
impact is still noticeable.

I was also using 16-CPU machines - but two of them (primary and standby).

Here are the scripts (1) I was using for the benchmark - maybe they could help.


> Nowadays we've *got* those primitives.  Can we get rid of
> known_assigned_xids_lck, and if so would it make a meaningful
> difference in this scenario?

I already tried that, but was unable to find any real benefit from it.
A WIP patch is attached.

Hmm, I see I sent it to the list, but it is absent from the archives... So,
just quoting from it:

> The first potential positive effect I could see is the locking in
> (TransactionIdIsInProgress -> KnownAssignedXidsSearch), but it seems
> like it is not on the standby hot path.

> The second one is the locking for KnownAssignedXidsGetAndSetXmin
> (snapshot building). But I was unable to measure any impact. It wasn’t
> visible separately in the (3) test.

> Maybe someone knows a scenario that causes known_assigned_xids_lck or
> TransactionIdIsInProgress to become a bottleneck on standby?

That last question is still relevant :)

> I think it might be a bigger effect than one might immediately think. Because
> the spinlock will typically be on the same cacheline as head/tail, and because
> every spinlock acquisition requires the cacheline to be modified (and thus
> owned exclusively) by the current core, uses of head/tail will very commonly
> be cache misses even in workloads without a lot of KAX activity.

I tried to find some way to achieve a noticeable impact here in
practice, but without success.

>> But yeah, it does feel like the proposed
>> approach is only going to be optimal over a small range of conditions.

> In particular, it doesn't adapt at all to workloads that don't replay all that
> much, but do compute a lot of snapshots.

The approach (2) was optimized to avoid any additional work for all
processes except the startup process (the approach uses offsets to skip
gaps while building the snapshot).


[1]: https://gist.github.com/michail-nikolaev/e1dfc70bdd7cfd1b902523dbb3db2f28
[2]:
https://www.postgresql.org/message-id/flat/CANtu0ogzo4MsR7My9%2BNhu3to5%3Dy7G9zSzUbxfWYOn9W5FfHjTA%40mail.gmail.com#341a3c3b033f69b260120b3173a66382

--
Michail Nikolaev

Attachment
