Hi,
On 2021-08-03 22:23:58 +0300, Michail Nikolaev wrote:
> > I'm not sure that we need to care as much about the cost of
> > KnownAssignedXidsGetAndSetXmin() - for one, the caching we now have makes that
> > less frequent.
>
> It is still about 5-7% of CPU for a typical workload, a considerable
> amount for me.
I'm not saying we shouldn't optimize things. Just that it's less pressing. And
what kind of price we're willing to optimize may have changed.
> And a lot of systems still work on 12 and 13.
I don't see us backporting performance improvements around this to 12 and 13,
so I don't think that matters much... We've done that a few times, but usually
when there's some accidentally quadratic behaviour or such.
> > But more importantly, it'd not be hard to maintain an
> > occasionally (or opportunistically) maintained denser version for
> > GetSnapshotData() - there's only a single writer, making the concurrency
> > issues a lot simpler.
>
> Could you please explain it in more detail?
> Single writer and GetSnapshotData() already exclusively hold
> ProcArrayLock at the moment of KnownAssignedXidsRemove,
> KnownAssignedXidsGetAndSetXmin, and sometimes KnownAssignedXidsAdd.
GetSnapshotData() only locks ProcArrayLock in shared mode.
The problem is that we don't want to add a lot of work
KnownAssignedXidsAdd/Remove, because very often nobody will build a snapshot
for that moment and building a sorted, gap-free, linear array of xids isn't
cheap. In my experience it's more common to be bottlenecked by replay CPU
performance than on replica snapshot building.
During recovery, where there's only one writer to the procarray / known xids,
it might not be hard to opportunistically build a dense version of the known
xids whenever a snapshot is built.
Greetings,
Andres Freund