Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold < - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <
Date
Msg-id 20160413173544.wccprqbilckmoacb@alap3.anarazel.de
In response to Re: Re: [COMMITTERS] pgsql: Avoid extra locks in GetSnapshotData if old_snapshot_threshold <  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2016-04-13 13:25:14 -0400, Robert Haas wrote:
> > With -c old_snapshot_threshold=0:
> >
> > latency average = 0.218 ms
> > latency stddev = 0.154 ms
> > tps = 584666.289753 (including connections establishing)
> > tps = 584867.785569 (excluding connections establishing)
> >
> >
> > With -c old_snapshot_threshold=10:
> >
> > latency average = 1.112 ms
> > latency stddev = 1.246 ms
> > tps = 114883.528964 (including connections establishing)
> > tps = 114905.555943 (excluding connections establishing)
> >
> >
> > With 848ef42bb8c7909c9d7baa38178d4a209906e7c1 (and followups) reverted:
> > latency average = 0.210 ms
> > latency stddev = 0.050 ms
> > tps = 607734.407158 (including connections establishing)
> > tps = 607918.118566 (excluding connections establishing)
>
> Yuck.  Aside from the fact that performance tanks when the feature is
> turned on

A quick look at the former shows that it's primarily contention around
the new OldSnapshotTimeMapLock and not, on that hardware in that
workload, the spinlock. That isn't surprising, because it adds an
exclusive lock to a path which doesn't contain any other exclusive locks
these days...

I have to say, I'm *highly* doubtful that it's ok to add an exclusive
lock in a read-only workload to such a hot path, without any clear path
forward on how to fix that scalability issue. This doesn't appear to
require just a bit of elbow grease, but a fair bit more.


> it seems that there is a significant effect even with it turned off.

It looks that way, but I'd rather run somewhat more careful, repeated
tests to make sure about that part. At a factor of 5, as with the on/off
tests, per-run variations don't play a large role, but at smaller
percentages it's worthwhile to put more care into it.  If possible it'd
be helpful to avoid a VM too...
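
For reference, the two gaps being compared can be worked out from the tps
figures quoted above (the "excluding connections establishing" lines). This
is only a sketch of the arithmetic; the constant and function names are made
up for illustration and are not part of pgbench or PostgreSQL.

```python
# tps figures (excluding connections establishing) copied from the quoted runs.
TPS_THRESHOLD_OFF = 584867.785569   # old_snapshot_threshold=0
TPS_THRESHOLD_ON = 114905.555943    # old_snapshot_threshold=10
TPS_REVERTED = 607918.118566        # 848ef42bb8c7... (and followups) reverted


def slowdown_factor(baseline_tps: float, other_tps: float) -> float:
    """How many times lower `other_tps` is than `baseline_tps`."""
    return baseline_tps / other_tps


def relative_drop(baseline_tps: float, other_tps: float) -> float:
    """Fraction of throughput lost relative to `baseline_tps`."""
    return (baseline_tps - other_tps) / baseline_tps


# Feature on vs. feature off: the large gap, robust against per-run noise.
on_off_factor = slowdown_factor(TPS_THRESHOLD_OFF, TPS_THRESHOLD_ON)

# Feature off vs. commit reverted: the small gap that needs repeated runs.
off_vs_reverted = relative_drop(TPS_REVERTED, TPS_THRESHOLD_OFF)

print(f"feature on vs. off:       {on_off_factor:.2f}x lower tps")
print(f"feature off vs. reverted: {off_vs_reverted:.1%} lower tps")
```

The first number is the factor-of-5 case; the second is a difference of a
few percent, which is the range where a single run on a VM can't really be
trusted.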

Andres

