Re: Update with subselect sometimes returns wrong result - Mailing list pgsql-bugs

From Andres Freund
Subject Re: Update with subselect sometimes returns wrong result
Date
Msg-id 20131201114514.GG18793@alap2.anarazel.de
Whole thread Raw
In response to Re: Update with subselect sometimes returns wrong result  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Update with subselect sometimes returns wrong result  (Andres Freund <andres@2ndquadrant.com>)
Re: Update with subselect sometimes returns wrong result  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-bugs
On 2013-12-01 09:19:19 +0100, Andres Freund wrote:
> On 2013-12-01 01:41:02 -0500, Tom Lane wrote:
> > Now, the thing about this is that the tuple heap_lock_tuple is rejecting
> > in the second pass is the one that we just updated, up at the ModifyTable
> > plan node.  So I can't find it exactly surprising that it says
> > HeapTupleSelfUpdated.  But tracing through tqual.c shows that the tuple
> > has got the HEAP_XMAX_IS_MULTI bit set, which might be thought a bit
> > peculiar.  There's not multiple transactions involved.
>
> Hm. Haven't looked at that, but if so that seems like an independent bug.

Afaics it's a missed optimization, nothing more. There's a branch in
compute_new_xmax_infomask() where it considers xmax == add_to_xmax with
imo somewhat strange conditions for the optimization:

/*
 * If the existing lock mode is identical to or weaker than the new
 * one, we can act as though there is no existing lock, so set
 * XMAX_INVALID and restart.
 */
if (xmax == add_to_xmax)
{
    LockTupleMode old_mode = TUPLOCK_from_mxstatus(status);
    bool        old_isupd = ISUPDATE_from_mxstatus(status);

    /*
     * We can do this if the new LockTupleMode is higher or equal than
     * the old one; and if there was previously an update, we need an
     * update, but if there wasn't, then we can accept there not being
     * one.
     */
    if ((mode >= old_mode) && (is_update || !old_isupd))
    {
        /*
         * Note that the infomask might contain some other dirty bits.
         * However, since the new infomask is reset to zero, we only
         * set what's minimally necessary, and that the case that
         * checks HEAP_XMAX_INVALID is the very first above, there is
         * no need for extra cleanup of the infomask here.
         */
        old_infomask |= HEAP_XMAX_INVALID;
        goto l5;
    }
}

Several things:
a) If the old lockmode is stronger than the new one, we can just promote
   the new one. That's fine.
b) the old xmax cannot be an update, we wouldn't see the row version in that
   case. And in any way, ISUPDATE_from_mxstatus() isn't sufficient to
   determine whether the old row was an update, one needs to look at
   LOCK_ONLY as well, no?
c) Any reason we can't apply this optimization for subtransactions in
   some scenarios?

a), b) are relatively easy. Patch attached. Being a clear regression, I
think it should be backpatched, but I'm not sure if it has to be this
point release. It's simple enough, but ...

The reason we didn't hit the (mode == old_mode && is_update) case above
is that the UPDATE uses only LockTupleNoKeyExclusive since there are no
keys, but FOR UPDATE was used before.

I am not awake enough to think about c) and it seems somewhat
more complicated. I think we can only optimize it if the previous lock
was in a subcommited subtransaction and not if it's a transaction higher
up the chain.

On the note of superflous MultiXactIdCreate() calls, the latter contains
the following assert:
   Assert(!TransactionIdEquals(xid1, xid2) || (status1 != status2));
Is there any reason we *ever* should add two times the same xid? Afaics
there's several code paths to get there today, but imo we should work
towards that never happening.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Attachment

pgsql-bugs by date:

Previous
From: Andres Freund
Date:
Subject: Re: Update with subselect sometimes returns wrong result
Next
From: Andres Freund
Date:
Subject: Re: Update with subselect sometimes returns wrong result