Re: Update with subselect sometimes returns wrong result - Mailing list pgsql-bugs

From Tom Lane
Subject Re: Update with subselect sometimes returns wrong result
Date
Msg-id 22220.1385880062@sss.pgh.pa.us
Whole thread Raw
In response to Re: Update with subselect sometimes returns wrong result  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Update with subselect sometimes returns wrong result  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Update with subselect sometimes returns wrong result  (Andres Freund <andres@2ndquadrant.com>)
List pgsql-bugs
Andres Freund <andres@2ndquadrant.com> writes:
> Slightly hacked up (probably python 2 only) version of the test script
> attached.

Ah, deleting and reinserting the rows each time like that makes it a
lot more reproducible.  I don't have the full story but it's got something
to do with screwed-up tuple info flags.  The plan that's generated looks
like

 Update on t1
   ->  Nested Loop
         Join Filter: (t1.id = subset.id)
         ->  Seq Scan on t1
         ->  Subquery Scan on subset
               ->  Limit
                     ->  LockRows
                           ->  Seq Scan on t1 t1_1

so the subquery scan will be executed twice, once for each row in t1.
The first time through, heap_lock_tuple looks at the first tuple and
locks it.  It doesn't get called on the second tuple because the LIMIT
node stops evaluation.  But in the second scan, heap_lock_tuple returns
HeapTupleSelfUpdated for the first tuple, so ExecLockRows ignores it,
moves on to the second tuple, and returns that.  The LIMIT is happy cause
it just got one row back; it doesn't know it's not the same row as
before.  And this row of course passes the join qual with the second
row from the outer scan of t1, so we perform a second update.

Now, the thing about this is that the tuple heap_lock_tuple is rejecting
in the second pass is the one that we just updated, up at the ModifyTable
plan node.  So I can't find it exactly surprising that it says
HeapTupleSelfUpdated.  But tracing through tqual.c shows that the tuple
has got the HEAP_XMAX_IS_MULTI bit set, which might be thought a bit
peculiar.  There's not multiple transactions involved.

Anyway, at this point I'm not so much wondering why it fails as why it
(seems to) work *any* of the time.  And how is it that VACUUM sometimes
manages to flip it from working state to not-working state?  (Once you're
in the state where the UPDATE will say it updated two rows, it's 100%
reproducible.)

Anyway, it seems pretty clear that the explanation is down somewhere in
the tuple visibility and multixact logic that you and Alvaro have been
hacking on with such vim lately.  I'm out of steam for tonight, over
to you ...

            regards, tom lane

pgsql-bugs by date:

Previous
From: David Johnston
Date:
Subject: Re: Update with subselect sometimes returns wrong result
Next
From: Tom Lane
Date:
Subject: Re: Update with subselect sometimes returns wrong result