Re: pgsql: Add the "snapshot too old" feature - Mailing list pgsql-hackers

From Kevin Grittner
Subject Re: pgsql: Add the "snapshot too old" feature
Date
Msg-id CACjxUsMwh980686LfmqGOGv58XsTpyUqqHPcsD3NGF5CrnXs+A@mail.gmail.com
Whole thread Raw
In response to Re: [COMMITTERS] pgsql: Add the "snapshot too old" feature  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: pgsql: Add the "snapshot too old" feature  (Kevin Grittner <kgrittn@gmail.com>)
List pgsql-hackers
On Mon, Apr 11, 2016 at 8:20 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:

> 1) We uglify buffer manager interface a lot.  This patch adds 3 more
> arguments to BufferGetPage().  It looks weird.  Should we try to find less
> invasive way for doing this?

As already pointed out, I originally touched about 450 fewer places
in the code, and did not change the signature of BufferGetPage().
I was torn on the argument that we needed a "forced choice" on
checking the snapshot age built into BufferGetPage() -- it is a bit
annoying, but it might prevent a later bug which could silently
return incorrect data.  In the end, I caved to the argument that
the annoyance was worth the improved chances of avoiding such a
bug.

> 2) Did you ever try to examine performance regression?

Yes.  Our customer used big machines for extensive performance
testing -- although they used "paced" input to simulate real users,
and saw nothing but gains.  On my own i7 box, your test scaled to
100 (so it would fit in memory) yielded this:

unpatched:

number of transactions actually processed: 40534737
latency average = 0.739 ms
latency stddev = 0.359 ms
tps = 134973.430374 (including connections establishing)
tps = 135039.395041 (excluding connections establishing)

with the "snapshot too old" patch:

number of transactions actually processed: 40679654
latency average = 0.735 ms
latency stddev = 0.486 ms
tps = 135463.614244 (including connections establishing)
tps = 135866.160933 (excluding connections establishing)

> I tried simple read
> only test on 4x18 Intel server.
> pgbench -s 1000 -c 140 -j 100 -M prepared -S -T 300 -P 1 postgres (data fits
> to shared_buffers)
>
> master                    -   193 740.3 TPS
> snapshot too old reverted - 1 065 732.6 TPS
>
> So, for read-only benchmark this patch introduces more than 5 times
> regression on big machine.

I did not schedule a saturation test on a big machine.  I guess I
should have done.

I'm confident this can be fixed; your suggestion for a wrapping
"if" is probably sufficient.  I am looking at this and the misuse
of "volatile" now.

Are you able to easily test that or should I book time on one (or
more) of our big machines on my end?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: "Shulgin, Oleksandr"
Date:
Subject: Re: PQsendQuery+PQgetResult+PQsetSingleRowMode limitations and support
Next
From: "Karl O. Pinc"
Date:
Subject: Re: PQsendQuery+PQgetResult+PQsetSingleRowMode limitations and support