Re: BUG #8470: 9.3 locking/subtransaction performance regression - Mailing list pgsql-bugs

From Alvaro Herrera
Subject Re: BUG #8470: 9.3 locking/subtransaction performance regression
Date
Msg-id 20150410161704.GH4369@alvh.no-ip.org
Whole thread Raw
In response to Re: BUG #8470: 9.3 locking/subtransaction performance regression  (Bruce Momjian <bruce@momjian.us>)
Responses Re: BUG #8470: 9.3 locking/subtransaction performance regression  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-bugs
Bruce Momjian wrote:
> On Mon, Jan  5, 2015 at 02:23:18PM -0300, Alvaro Herrera wrote:
> > I just got around to merging this patch to 9.5 sources.  I'm glad I
> > did it because in the course of doing so I noticed a bug in a recent
> > patch, which led to commit d5e3d1e969d2f65009f718d3100d6565f47f9112
> > (back-patched to 9.3).
> >
> > I'm now more confident in this code and will probably push this a few
> > days, but only to 9.5 at least for now.  It probably won't apply cleanly
> > to 9.3 due to other changes in the area, such as 05315498012530d44cd89a2
> > and df630b0dd5ea2de52972d456f5978a012436115e and others.
>
> Where are we on this?

That's indeed the question.  I gave this a look yesterday and came up
with two patches that "fix" the issue.  (I had to fix one additional bug
in the formulation of the complex patch that I posted in January).  I
leave the details of the bug as exercise for the interested readers
(hint: run the isolation tests).  First some timings.  Ran the script
against unpatched master and each of the patches five times.  Results:

unpatched:
real    0m8.192s
real    0m8.230s
real    0m8.233s
real    0m8.187s
real    0m8.212s

simple patch:
real    0m0.741s
real    0m0.728s
real    0m0.729s
real    0m0.738s
real    0m0.731s

complex patch:
real    0m0.732s
real    0m0.723s
real    0m0.730s
real    0m0.725s
real    0m0.726s

In 9.2 the time is about 0.150s, so the regression is not completely
resolved, but it's a huge improvement.

The main catch of the "simple" formulation of the patch is that we do
the new GetMultiXactIdMembers call with the buffer lock held, which is a
horrible idea from a concurrency point of view; it will make many cases
where the optimization doesn't apply a lot slower.  I think with some
extra contortions we could fix that problem, but it's already quite ugly
that we have a duplicate check for the are-we-already-a-multixact-locker
so I reject the idea that this seemingly simple patch is any good.  I
much prefer the complex formulation, which is what I had to start with,
and makes thing a bit less unclear(*).

There was some traction to the idea of backpatching this, but I'm no
longer on board with that.  If somebody wants to, I would like some
commitment to a huge testing effort.


(*) In a scale 1 to 10 with 10 being most unclear, the original code is
about 12-unclear, and with the patch is 11-unclear.  So it's an
improvement anyhow.

PS: Apologies for unified diff.  This is one case where filterdiff
dropped some hunks from the patch produced by git diff.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

pgsql-bugs by date:

Previous
From: Michael Paquier
Date:
Subject: Re: PQexec() hangs on OOM
Next
From: Alvaro Herrera
Date:
Subject: Re: BUG #8470: 9.3 locking/subtransaction performance regression