Bruce Momjian wrote:
> On Mon, Jan 5, 2015 at 02:23:18PM -0300, Alvaro Herrera wrote:
> > I just got around to merging this patch to 9.5 sources. I'm glad I
> > did it because in the course of doing so I noticed a bug in a recent
> > patch, which led to commit d5e3d1e969d2f65009f718d3100d6565f47f9112
> > (back-patched to 9.3).
> >
> > I'm now more confident in this code and will probably push this a few
> > days, but only to 9.5 at least for now. It probably won't apply cleanly
> > to 9.3 due to other changes in the area, such as 05315498012530d44cd89a2
> > and df630b0dd5ea2de52972d456f5978a012436115e and others.
>
> Where are we on this?
That's indeed the question. I gave this a look yesterday and came up
with two patches that "fix" the issue. (I had to fix one additional bug
in the formulation of the complex patch that I posted in January). I
leave the details of the bug as exercise for the interested readers
(hint: run the isolation tests). First some timings. Ran the script
against unpatched master and each of the patches five times. Results:
unpatched:
real 0m8.192s
real 0m8.230s
real 0m8.233s
real 0m8.187s
real 0m8.212s
simple patch:
real 0m0.741s
real 0m0.728s
real 0m0.729s
real 0m0.738s
real 0m0.731s
complex patch:
real 0m0.732s
real 0m0.723s
real 0m0.730s
real 0m0.725s
real 0m0.726s
In 9.2 the time is about 0.150s, so the regression is not completely
resolved, but it's a huge improvement.
The main catch of the "simple" formulation of the patch is that we do
the new GetMultiXactIdMembers call with the buffer lock held, which is a
horrible idea from a concurrency point of view; it will make many cases
where the optimization doesn't apply a lot slower. I think with some
extra contortions we could fix that problem, but it's already quite ugly
that we have a duplicate check for the are-we-already-a-multixact-locker
so I reject the idea that this seemingly simple patch is any good. I
much prefer the complex formulation, which is what I had to start with,
and makes thing a bit less unclear(*).
There was some traction to the idea of backpatching this, but I'm no
longer on board with that. If somebody wants to, I would like some
commitment to a huge testing effort.
(*) In a scale 1 to 10 with 10 being most unclear, the original code is
about 12-unclear, and with the patch is 11-unclear. So it's an
improvement anyhow.
PS: Apologies for unified diff. This is one case where filterdiff
dropped some hunks from the patch produced by git diff.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services