Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request) - Mailing list pgsql-bugs

From Olivier Dony
Subject Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request)
Msg-id CAP4GjTKdFn4MF+GndusWWxOJNF2aQ=DYrPYoNRytEKkm2twUZg@mail.gmail.com
In response to Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Responses Re: BUG #13681: Serialization failures caused by new multixact code of 9.3 (back-patch request)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-bugs
On Fri, Dec 18, 2015 at 7:53 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
>
> Alvaro Herrera wrote:
>
> > diff --git a/src/backend/access/heap/heapam.c b/src/backend/access/heap/heapam.c
> > index 559970f..aaf8e8e 100644
> > --- a/src/backend/access/heap/heapam.c
> > +++ b/src/backend/access/heap/heapam.c
> > @@ -4005,7 +4005,7 @@ l3:
> >               UnlockReleaseBuffer(*buffer);
> >               elog(ERROR, "attempted to lock invisible tuple");
> >       }
> > -     else if (result == HeapTupleBeingUpdated)
> > +     else if (result == HeapTupleBeingUpdated || result == HeapTupleUpdated)
> >       {
> >               TransactionId xwait;
> >               uint16          infomask;
> >
> > I think heap_lock_rows had that shape (only consider BeingUpdated as a
> > reason to check/wait) only because it was possible to lock a row that
> > was being locked by someone else, but it wasn't possible to lock a row
> > that had been updated by someone else -- which became possible in 9.3.
> > So this patch is necessary, and not just to fix this one bug.

I was surprised as well to see that the initial patch, supposed to be
an optimization, would be fixing this bug.
It's starting to make more sense with your analysis.
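
To spell out how I read the one-line change, here is a toy model in
standalone C. It is not PostgreSQL code: the enum type, the ToyOther
value and both function names are made up for illustration, and only
the two result names are taken from the diff above. It shows nothing
more than which results reach the check/wait branch before and after
the patch.

```c
#include <stdbool.h>

/* Toy stand-in for the result of the tuple visibility check.
 * Only HeapTupleBeingUpdated and HeapTupleUpdated come from the diff;
 * ToyOther is a hypothetical placeholder for every other outcome. */
typedef enum
{
    ToyOther,
    HeapTupleBeingUpdated,
    HeapTupleUpdated
} ToyResult;

/* Condition as it stood before the patch: only a tuple that is still
 * being updated/locked by someone else reaches the check/wait branch. */
static bool
reaches_wait_branch_before(ToyResult result)
{
    return result == HeapTupleBeingUpdated;
}

/* Condition after the patch: a tuple already updated by someone else
 * is also routed through the same check/wait branch. */
static bool
reaches_wait_branch_after(ToyResult result)
{
    return result == HeapTupleBeingUpdated || result == HeapTupleUpdated;
}
```

The only input whose outcome differs between the two functions is
HeapTupleUpdated, which matches the case described above as having
become possible in 9.3.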


> (...)
> I have a hard time convincing myself that it's acceptable to back-patch
> such a change, in any case.  I observed no other regression failure, but
> what did change does make me a bit uncomfortable.

I'm afraid I won't be of much help in assessing the consequences of
the patch; the PG source code and internal data structures are still
pretty new to me.

Would it count for something in the balance that this use case worked
fine in 9.2 and appears consistent with the documented behavior of
row-level locks and the REPEATABLE READ isolation level, yet suddenly
stopped working in 9.3/9.4? I suppose 9.3 is an old story, but 9.4 has
the same problem and will be around for a while, being the default
version in most LTS/stable distributions.

Thanks so much to both of you for looking further into this bug!

PS: I can confirm that the patch works, but you didn't need my confirmation ;-)

--
Olivier
