Re: Update with subselect sometimes returns wrong result - Mailing list pgsql-bugs

From Andres Freund
Subject Re: Update with subselect sometimes returns wrong result
Date
Msg-id 20131201122434.GA26364@alap2.anarazel.de
Whole thread Raw
In response to Re: Update with subselect sometimes returns wrong result  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Update with subselect sometimes returns wrong result  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
On 2013-12-01 02:03:55 -0500, Tom Lane wrote:
> The apparent dependency on VACUUM is probably coming from updating the
> table's relpages/reltuples counts to new values in a way that causes the
> planner to think one version or the other is a bit cheaper.

Hah, didn't realize that for a good bit... Even though I had reproduced
the problem with just concurrently ANALYZEing the table. Things could
have clicked at that point...

> I'd still kind of like to know how HEAP_XMAX_IS_MULTI is getting
> involved,

Hopefully answered nearby.

> but it seems that the fundamental problem here is we haven't thought
> through what the interactions of LockRows and ModifyTable operations in
> the same query ought to be.

I think it's more that we haven't actually thought about the case where
both happen in the same plan at all ;). I think most of that code is
from the time where it was only possible to get there when using UPDATE
... WHERE CURRENT OF cursor_name because FOR .. wasn't allowed in
subselects.

The reason it reproducably fails is:
/*
 * The target tuple was already updated or deleted by the
 * current command, or by a later command in the current
 * transaction.  We *must* ignore the tuple in the former
 * case, so as to avoid the "Halloween problem" of repeated
 * update attempts.  In the latter case it might be sensible
 * to fetch the updated tuple instead, but doing so would
 * require changing heap_lock_tuple as well as heap_update and
 * heap_delete to not complain about updating "invisible"
 * tuples, which seems pretty scary.  So for now, treat the
 * tuple as deleted and do not process.
 */
goto lnext;

in ExecLockRows(), right? Is there actually a real "Halloween problem"
type of situation here?

But either way, even if we would manage to finagle some meaning into
that case, that query would still not be safe in any way, since there's
no determinism in which row the subselect will return.
On a green field, I'd say we should forbid using FOR UPDATE below an
UPDATE/DELETE and just allow combining them via a CTE. But that's
probably hard to do now.

Greetings,

Andres Freund

--
 Andres Freund                       http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

pgsql-bugs by date:

Previous
From: Andres Freund
Date:
Subject: Re: Update with subselect sometimes returns wrong result
Next
From: David Johnston
Date:
Subject: Re: Update with subselect sometimes returns wrong result