Thread: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Stanislav Grozev
Date:
Hello,

I'd like to report a bug in the UPDATE trigger invocation when using the new INSERT ON CONFLICT UPDATE (UPSERT) functionality.

In short, if an UPDATE trigger is invoked by the ON CONFLICT DO UPDATE clause of an UPSERT statement - it receives the new values in both the OLD and NEW variables.  Whereas if invoked by a normal UPDATE statement - it correctly gets the respective values in OLD and NEW.

Here's a short test case to reproduce it:

test=# CREATE OR REPLACE FUNCTION public.test_trigger() RETURNS trigger AS $$
    BEGIN
        RAISE NOTICE '%:%: old: %, new: %', TG_NAME, TG_OP, OLD, NEW;
        IF OLD.value IS DISTINCT FROM NEW.value THEN
            RAISE WARNING '%:%: Values are different!', TG_NAME, TG_OP;
        ELSE
            RAISE NOTICE '%:%: Values are the same', TG_NAME, TG_OP;
        END IF;
        RETURN NEW;
    END;
$$ LANGUAGE plpgsql;

test=# CREATE table test(id serial primary key, value varchar(32));

test=# CREATE TRIGGER test_trigger after UPDATE on test for each row execute procedure test_trigger();

test=# INSERT INTO test (value) VALUES('initial value');

test=# SELECT * FROM test;

┌────┬───────────────┐
│ id │     value     │
├────┼───────────────┤
│  1 │ initial value │
└────┴───────────────┘
(1 row)

Now, if we do an UPDATE, everything is as expected:

test=# UPDATE test SET value='plain update value' WHERE id=1;
NOTICE:  00000: test_trigger:UPDATE: old: (1,"initial value"), new: (1,"plain update value")
LOCATION:  exec_stmt_raise, pl_exec.c:3216
WARNING:  01000: test_trigger:UPDATE: Values are different!
LOCATION:  exec_stmt_raise, pl_exec.c:3216
UPDATE 1

If we do an UPSERT instead, watch how OLD and NEW are the same (NEW):

test=# INSERT INTO test (id, value) VALUES(1, 'upserted value') ON CONFLICT ON CONSTRAINT test_pkey DO UPDATE SET value='upserted value';
NOTICE:  00000: test_trigger:UPDATE: old: (1,"upserted value"), new: (1,"upserted value")
LOCATION:  exec_stmt_raise, pl_exec.c:3216
NOTICE:  00000: test_trigger:UPDATE: Values are the same
LOCATION:  exec_stmt_raise, pl_exec.c:3216
INSERT 0 1

We have traced the problem to be in the src/backend/executor/nodeModifyTable.c file, more specifically in the ExecOnConflictUpdate function. The attached simple patch appears to fix the issue. Now, the UPSERT behaves correctly (at least what we think should be correct):

test=# INSERT INTO test (id, value) VALUES(1, 'upserted value') ON CONFLICT ON CONSTRAINT test_pkey DO UPDATE SET value='upserted value';
NOTICE:  00000: test_trigger:UPDATE: old: (1,"initial value"), new: (1,"upserted value")
LOCATION:  exec_stmt_raise, pl_exec.c:3216
WARNING:  01000: test_trigger:UPDATE: Values are different!
LOCATION:  exec_stmt_raise, pl_exec.c:3216
INSERT 0 1

We have verified this behaviour with PostgreSQL 9.5 beta1, beta2 and Git head.

Thanks.
--


-S

Attachment

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Michael Paquier
Date:
On Mon, Nov 30, 2015 at 8:43 PM, Stanislav Grozev wrote:
> If we do an UPSERT instead, watch how OLD and NEW are the same (NEW):

AFAIK, that's the expected behavior. AFTER UPDATE triggers firing for
ON CONFLICT DO UPDATE will see the same NEW and OLD values. Comments
from others?
--
Michael

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Stanislav Grozev
Date:
I fail to see the logic and why would it be useful for an UPDATE trigger to
get the same values. Especially when it would differ from a normal AFTER
UPDATE firing for an UPDATE query?

On Thu, Dec 3, 2015 at 9:30 AM Michael Paquier <michael.paquier@gmail.com>
wrote:

> On Mon, Nov 30, 2015 at 8:43 PM, Stanislav Grozev wrote:
> > If we do an UPSERT instead, watch how OLD and NEW are the same (NEW):
>
> AFAIK, that's the expected behavior. AFTER UPDATE triggers firing for
> ON CONFLICT DO UPDATE will see the same NEW and OLD values. Comments
> from others?
> --
> Michael
>
--


-S
On 3 December 2015 at 07:30, Michael Paquier <michael.paquier@gmail.com> wrote:
> On Mon, Nov 30, 2015 at 8:43 PM, Stanislav Grozev wrote:
>> If we do an UPSERT instead, watch how OLD and NEW are the same (NEW):
>
> AFAIK, that's the expected behavior. AFTER UPDATE triggers firing for
> ON CONFLICT DO UPDATE will see the same NEW and OLD values. Comments
> from others?

That's not what I would expect. I would expect it to be consistent
with AFTER triggers firing after a plain UPDATE, and also with BEFORE
triggers for ON CONFLICT DO UPDATE.

Regards,
Dean

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Michael Paquier
Date:
On Thu, Dec 3, 2015 at 5:19 PM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> On 3 December 2015 at 07:30, Michael Paquier <michael.paquier@gmail.com> wrote:
>> On Mon, Nov 30, 2015 at 8:43 PM, Stanislav Grozev wrote:
>>> If we do an UPSERT instead, watch how OLD and NEW are the same (NEW):
>>
>> AFAIK, that's the expected behavior. AFTER UPDATE triggers firing for
>> ON CONFLICT DO UPDATE will see the same NEW and OLD values. Comments
>> from others?
>
> That's not what I would expect. I would expect it to be consistent
> with AFTER triggers firing after a plain UPDATE, and also with BEFORE
> triggers for ON CONFLICT DO UPDATE.

triggers.out is telling that this may be intended to work this way:
WARNING:  after update (old): (5,"updated green trig modified")
WARNING:  after update (new): (5,"updated green trig modified")
Though I agree that it is not instinctive...

Btw, the patch provided fails on an assertion with regression tests:
   2426        /* Determine lock mode to use */
   2427        lockmode = ExecUpdateLockMode(estate, relinfo);
   2428
-> 2429        Assert(HeapTupleIsValid(fdw_trigtuple) ^
ItemPointerIsValid(tupleid));
   2430        if (fdw_trigtuple == NULL)

Peter, Andres, thoughts?
--
Michael

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Peter Geoghegan
Date:
On Thu, Dec 3, 2015 at 4:34 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> triggers.out is telling that this may be intended to work this way:
> WARNING:  after update (old): (5,"updated green trig modified")
> WARNING:  after update (new): (5,"updated green trig modified")
> Though I agree that it is not instinctive...
>
> Btw, the patch provided fails on an assertion with regression tests:
>    2426        /* Determine lock mode to use */
>    2427        lockmode = ExecUpdateLockMode(estate, relinfo);
>    2428
> -> 2429        Assert(HeapTupleIsValid(fdw_trigtuple) ^
> ItemPointerIsValid(tupleid));
>    2430        if (fdw_trigtuple == NULL)
>
> Peter, Andres, thoughts?

I agree with Stanislav that the behavior is wrong. The fix is more
complicated, though. As it says at the top of ExecUpdate():

 * When updating a table, tupleid identifies the tuple to
 * update and oldtuple is NULL.  When updating a view, oldtuple

Since the ON CONFLICT DO UPDATE variant is always updating a table, it
is always supposed to pass oldtuple = NULL. The problem is that unlike
a regular update, it cannot reconstruct the original/target tuple for
an AFTER UPDATE trigger *correctly* -- the GetTupleForTrigger() logic
depends on executor state, which is not consistent with a regular
update.

I'll need to think about a fix.

--
Peter Geoghegan

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Michael Paquier
Date:
On Fri, Dec 4, 2015 at 6:10 AM, Peter Geoghegan <pg@heroku.com> wrote:
> On Thu, Dec 3, 2015 at 4:34 AM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> triggers.out is telling that this may be intended to work this way:
>> WARNING:  after update (old): (5,"updated green trig modified")
>> WARNING:  after update (new): (5,"updated green trig modified")
>> Though I agree that it is not instinctive...
>>
>> Btw, the patch provided fails on an assertion with regression tests:
>>    2426        /* Determine lock mode to use */
>>    2427        lockmode = ExecUpdateLockMode(estate, relinfo);
>>    2428
>> -> 2429        Assert(HeapTupleIsValid(fdw_trigtuple) ^
>> ItemPointerIsValid(tupleid));
>>    2430        if (fdw_trigtuple == NULL)
>>
>> Peter, Andres, thoughts?
>
> I agree with Stanislav that the behavior is wrong. The fix is more
> complicated, though. As it says at the top of ExecUpdate():
>
>  * When updating a table, tupleid identifies the tuple to
>  * update and oldtuple is NULL.  When updating a view, oldtuple
>
> Since the ON CONFLICT DO UPDATE variant is always updating a table, it
> is always supposed to pass oldtuple = NULL. The problem is that unlike
> a regular update, it cannot reconstruct the original/target tuple for
> an AFTER UPDATE trigger *correctly* -- the GetTupleForTrigger() logic
> depends on executor state, which is not consistent with a regular
> update.
>
> I'll need to think about a fix.

I am adding that to the list of open items for 9.5.
--
Michael

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Peter Geoghegan
Date:
On Thu, Dec 3, 2015 at 3:53 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> I am adding that to the list of open items for 9.5.

I did already.

--
Peter Geoghegan

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Peter Geoghegan
Date:
On Thu, Dec 3, 2015 at 1:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
> I'll need to think about a fix.

The problem was with the pointer we pass to ExecUpdate().

It's a pointer to the target tuple in shared memory. So the field
"tuple.t_data->t_ctid" within ExecOnConflictUpdate() starts out
pointing to an ItemPointerData with the correct ctid (when it
initially points to the current/target tuple, since as an
about-to-be-upserted tuple the "t_ctid" field must be a pointer to the
self-same tuple). Then, it is modified in-place in shared memory by
heap_update(), within its critical section.

The fix is to take a deep copy (pass a pointer to an ItemPointerData
on the stack), as in the attached. I've also fixed up the tests, which
should have caught this, but didn't. Mea culpa.

Many thanks to Stanislav for the report! While I didn't adopt his
suggestion, he certainly almost had it right.

--
Peter Geoghegan

Attachment

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Michael Paquier
Date:
On Fri, Dec 4, 2015 at 10:34 AM, Peter Geoghegan <pg@heroku.com> wrote:
> On Thu, Dec 3, 2015 at 1:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
>> I'll need to think about a fix.
>
> The problem was with the pointer we pass to ExecUpdate().
>
> It's a pointer to the target tuple in shared memory. So the field
> "tuple.t_data->t_ctid" within ExecOnConflictUpdate() starts out
> pointing to an ItemPointerData with the correct ctid (when it
> initially points to the current/target tuple, since as an
> about-to-be-upserted tuple the "t_ctid" field must be a pointer to the
> self-same tuple). Then, it is modified in-place in shared memory by
> heap_update(), within its critical section.
>
> The fix is to take a deep copy (pass a pointer to an ItemPointerData
> on the stack), as in the attached. I've also fixed up the tests, which
> should have caught this, but didn't. Mea culpa.
>
> Many thanks to Stanislav for the report! While I didn't adopt his
> suggestion, he certainly almost had it right.

Cool, that looks right and there are no assertion problems. In the
case of a plain UPDATE the old tuple referenced in both BEFORE/AFTER
triggers is the same, so things are indeed consistent this way
(attached is simple test case I played with to check manually the
patch).
--
Michael

Attachment

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Stanislav Grozev
Date:
Peter, thank you very much. We have adopted your patch in our test
environment and I can confirm it works as expected.

Cheers.

On Fri, Dec 4, 2015 at 3:34 AM Peter Geoghegan <pg@heroku.com> wrote:

> On Thu, Dec 3, 2015 at 1:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
> > I'll need to think about a fix.
>
> The problem was with the pointer we pass to ExecUpdate().
>
> It's a pointer to the target tuple in shared memory. So the field
> "tuple.t_data->t_ctid" within ExecOnConflictUpdate() starts out
> pointing to an ItemPointerData with the correct ctid (when it
> initially points to the current/target tuple, since as an
> about-to-be-upserted tuple the "t_ctid" field must be a pointer to the
> self-same tuple). Then, it is modified in-place in shared memory by
> heap_update(), within its critical section.
>
> The fix is to take a deep copy (pass a pointer to an ItemPointerData
> on the stack), as in the attached. I've also fixed up the tests, which
> should have caught this, but didn't. Mea culpa.
>
> Many thanks to Stanislav for the report! While I didn't adopt his
> suggestion, he certainly almost had it right.
>
> --
> Peter Geoghegan
>
--


-S
On 2015-12-03 17:34:12 -0800, Peter Geoghegan wrote:
> On Thu, Dec 3, 2015 at 1:10 PM, Peter Geoghegan <pg@heroku.com> wrote:
> > I'll need to think about a fix.
>
> The problem was with the pointer we pass to ExecUpdate().
>
> It's a pointer to the target tuple in shared memory. So the field
> "tuple.t_data->t_ctid" within ExecOnConflictUpdate() starts out
> pointing to an ItemPointerData with the correct ctid (when it
> initially points to the current/target tuple, since as an
> about-to-be-upserted tuple the "t_ctid" field must be a pointer to the
> self-same tuple). Then, it is modified in-place in shared memory by
> heap_update(), within its critical section.

Hm. But why it correct to use t_ctid in the first place? I mean if there
was a previously-failed UPDATE t_ctid will point somewhere meaningless?
Shouldn't we use tuple->t_self, or conflictTid here?

Doing that reveals one change in the regression tests. Namely
    -- This shows an attempt to update an invisible row, which should really be
    -- reported as a cardinality violation, but it doesn't seem worth fixing:
    WITH simpletup AS (
      SELECT 2 k, 'Green' v),
    upsert_cte AS (
      INSERT INTO z VALUES(2, 'Blue') ON CONFLICT (k) DO
        UPDATE SET (k, v) = (SELECT k, v FROM simpletup WHERE simpletup.k = z.k)
        RETURNING k, v)
    INSERT INTO z VALUES(2, 'Red') ON CONFLICT (k) DO
    UPDATE SET (k, v) = (SELECT k, v FROM upsert_cte WHERE upsert_cte.k = z.k)
    RETURNING k, v;
out of with.sql doesn't fail anymore, and instead returns no rows.

At that point in the regression tests there's a conflicting tuple for
'2'. Rewriting the statement to

WITH simpletup AS (
  SELECT 2 k, 'Green' v),
upsert_cte AS (
  UPDATE z SET (k, v) = (SELECT k, v FROM simpletup WHERE simpletup.k = z.k)
  WHERE k = 2
  RETURNING k, v)
UPDATE z SET (k, v) = (SELECT k, v FROM upsert_cte WHERE upsert_cte.k = z.k)
WHERE k = 2
RETURNING k, v;

does *not* error out. That's because it hits the HeapTupleSelfUpdated
block in ExecUpdate(). So, working as designed.

The reason the upsert variant gets an error with master/your patch is
solely that t_ctid already points to the new version of the tuple -
which surely is wrong! t_ctid could point to nearly arbitrary things
here (if the previous target for t_ctid was hot pruned and then replaced
with new contents), unless I miss something.


It sounds to me like the real fix would be to just use
&tuple.t_self. That'll "break" the above regression test. But the reason
for that seems to be entirely independent: Namely that in this case the
tuple is only modified after the heap_lock_tuple(), in the course of the
ExecProject() computing the new tuple version - the SELECT FROM
upsert_cte...

Nasty.


ISTM, that if we really want to protect against UpdatedSelf we need to
to do so after ExecQual(), ExecWCE(), and ExecProject(). Each of those
can trigger such an update.


Am I missing something major here?


Greetings,

Andres Freund

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Peter Geoghegan
Date:
On Tue, Dec 8, 2015 at 6:44 AM, Andres Freund <andres@anarazel.de> wrote:
> It sounds to me like the real fix would be to just use
> &tuple.t_self. That'll "break" the above regression test. But the reason
> for that seems to be entirely independent: Namely that in this case the
> tuple is only modified after the heap_lock_tuple(), in the course of the
> ExecProject() computing the new tuple version - the SELECT FROM
> upsert_cte...

I think you're right. I had a feeling that there was some unanswered
question about that particular regression test.

FWIW, t_ctid could not really point to arbitrary things, because
heap_lock_tuple() is not asked to follow the update chain, and we
start from the first phase at the first sign of a conflict within
ExecOnConflictUpdate() (it succeeds only when it locks the
*definitive* conflict tuple, with no future tuple version). As you say
t_ctid could still point to a dead tuple, unfortunately, because that
doesn't count as a new version, which would result in raising an
error. Maybe we should revisit the idea of making jjanes_upsert do
fault injection. After all, the whole origin of that tool is Jeff's
crash recovery tester, which artificially simulated torn pages, went
through recovery, and reconciled the state of the database after
recovery with a known-good state that the tool kept track of.

> ISTM, that if we really want to protect against UpdatedSelf we need to
> to do so after ExecQual(), ExecWCE(), and ExecProject(). Each of those
> can trigger such an update.

Hmm. You mean having changed things to pass &tuple.t_self to
ExecUpdate() instead?

One tricky point here is that things are different between
ExecOnConflictUpdate() and ExecUpdate(). The return value
HeapTupleSelfUpdated is a "can't happen" state within
ExecOnConflictUpdate(), for reasons that I think can be well isolated,
and understood relatively easily (see comments), whereas
HeapTupleSelfUpdated is the "I guess we'll just ignore a second
attempt to update tuple" state within ExecUpdate().

I think you might be confusing ExecUpdate()'s use of
HeapTupleSelfUpdated with ExecOnConflictUpdate()'s similar use of
HeapTupleInvisible (where it's almost the same to the user -- it's the
"I guess we'll throw a cardinality violation error to reject a second
attempt at updating the tuple" state there). And, contrariwise,
ExecUpdate() has HeapTupleInvisible as its own "can't happen" error.
Of course, these differences are due to the different types of
snapshots used in each case (dirty snapshot for
ExecOnConflictUpdate(), MVCC snapshot for ExecUpdate() -- at least
prior to 9.5/this bug).

The reason that the with.sql regression test fails when you change the
code to pass &tuple.t_self to ExecUpdate() is that, as you say, we get
the historic ExecUpdate()/plain update tolerance of would-be
"cardinality violations" (the ExecUpdate() HeapTupleSelfUpdated case).

What I don't see is why you suggest we need to worry about each case
(e.g. ExecQual(), ExecWCE(), and ExecProject()) specifically. Could we
just instruct ExecUpdate() that the caller is ExecOnConflictUpdate(),
and so it's appropriate to throw a cardinality violation for its
HeapTupleSelfUpdated case? After all, that case already discriminates
against updates that are not from the same command in the xact (e.g.
due to weird before triggers) by throwing an error [1]. I don't think
we need to throw a cardinality violation unless an UPSERT attempts to
update a tuple a second time, but not if that occurs within a command
that happens to contain an UPSERT not directly doing the updating.

[1] Commit 6868ed74
--
Peter Geoghegan

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Peter Geoghegan
Date:
On Tue, Dec 8, 2015 at 1:17 PM, Peter Geoghegan <pg@heroku.com> wrote:
> I think you might be confusing ExecUpdate()'s use of
> HeapTupleSelfUpdated with ExecOnConflictUpdate()'s similar use of
> HeapTupleInvisible (where it's almost the same to the user -- it's the
> "I guess we'll throw a cardinality violation error to reject a second
> attempt at updating the tuple" state there). And, contrariwise,
> ExecUpdate() has HeapTupleInvisible as its own "can't happen" error.
> Of course, these differences are due to the different types of
> snapshots used in each case (dirty snapshot for
> ExecOnConflictUpdate(), MVCC snapshot for ExecUpdate() -- at least
> prior to 9.5/this bug).

Note that it's possible for ExecOnConflictUpdate() to never see a
HeapTupleSelfUpdated case (it is, as I said, a "can't happen"
condition), and yet for its ExecUpdate call to see a
HeapTupleSelfUpdated case with odd queries like the with.sql test
Andres highlighted.

This is only because, as Andres said, ExecProject() and so on can
change things. But things cannot change between finding a conflict TID
in the first stage, and returning from heap_lock_tuple() as part of
the second, such that HeapTupleSelfUpdated is seen as the
heap_lock_tuple() return value (within ExecOnConflictUpdate()). There
is no contradiction.

--
Peter Geoghegan
On 2015-12-08 13:17:15 -0800, Peter Geoghegan wrote:
> On Tue, Dec 8, 2015 at 6:44 AM, Andres Freund <andres@anarazel.de> wrote:
> > It sounds to me like the real fix would be to just use
> > &tuple.t_self. That'll "break" the above regression test. But the reason
> > for that seems to be entirely independent: Namely that in this case the
> > tuple is only modified after the heap_lock_tuple(), in the course of the
> > ExecProject() computing the new tuple version - the SELECT FROM
> > upsert_cte...
>
> I think you're right. I had a feeling that there was some unanswered
> question about that particular regression test.
>
> FWIW, t_ctid could not really point to arbitrary things, because
> heap_lock_tuple() is not asked to follow the update chain, and we
> start from the first phase at the first sign of a conflict within
> ExecOnConflictUpdate() (it succeeds only when it locks the
> *definitive* conflict tuple, with no future tuple version). As you say
> t_ctid could still point to a dead tuple, unfortunately, because that
> doesn't count as a new version, which would result in raising an
> error.

I'm not 100% sure, but I do think it could point to arbitrary tuples:
Consider what happens if a previous update to the same tuple failed and
the new version of the tuple is pruned away, and that space is then
reused for something else. Bang. That's probably only possible if the
xmax is a multixact, as heap_lock_tuple() otherwise would clear out
t_ctid - but that's perfectly possible.

> > > ISTM, that if we really want to protect against UpdatedSelf we need to
> > to do so after ExecQual(), ExecWCE(), and ExecProject(). Each of those
> > can trigger such an update.
>
> Hmm. You mean having changed things to pass &tuple.t_self to
> ExecUpdate() instead?

Yes, after that. Passing t_data->t_ctid is just plain wrong.

> I think you might be confusing ExecUpdate()'s use of
> HeapTupleSelfUpdated with ExecOnConflictUpdate()'s similar use of
> HeapTupleInvisible (where it's almost the same to the user -- it's the
> "I guess we'll throw a cardinality violation error to reject a second
> attempt at updating the tuple" state there). And, contrariwise,
> ExecUpdate() has HeapTupleInvisible as its own "can't happen" error.
> Of course, these differences are due to the different types of
> snapshots used in each case (dirty snapshot for
> ExecOnConflictUpdate(), MVCC snapshot for ExecUpdate() -- at least
> prior to 9.5/this bug).

I don't think so. Check the attached, very rough, WIP patch.

> What I don't see is why you suggest we need to worry about each case
> (e.g. ExecQual(), ExecWCE(), and ExecProject()) specifically.

I don't mean individually, just after all of those, but before calling ExecUpdate()

> Could we
> just instruct ExecUpdate() that the caller is ExecOnConflictUpdate(),
> and so it's appropriate to throw a cardinality violation for its
> HeapTupleSelfUpdated case? After all, that case already discriminates
> against updates that are not from the same command in the xact (e.g.
> due to weird before triggers) by throwing an error [1]. I don't think
> we need to throw a cardinality violation unless an UPSERT attempts to
> update a tuple a second time, but not if that occurs within a command
> that happens to contain an UPSERT not directly doing the updating.

I'm inclined to do the check in ExecOnConflictUpdate() instead - istm
that's easier to understand.

Greetings,

Andres Freund

Attachment

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Peter Geoghegan
Date:
On Tue, Dec 8, 2015 at 3:12 PM, Andres Freund <andres@anarazel.de> wrote:
> I'm not 100% sure, but I do think it could point to arbitrary tuples:
> Consider what happens if a previous update to the same tuple failed and
> the new version of the tuple is pruned away, and that space is then
> reused for something else. Bang. That's probably only possible if the
> xmax is a multixact, as heap_lock_tuple() otherwise would clear out
> t_ctid - but that's perfectly possible.

 That's an alarming possibility. Either way though, it's wrong, and
the fix is fairly clear.

>> Hmm. You mean having changed things to pass &tuple.t_self to
>> ExecUpdate() instead?
>
> Yes, after that. Passing t_data->t_ctid is just plain wrong.

I agree, obviously.

>> Could we
>> just instruct ExecUpdate() that the caller is ExecOnConflictUpdate(),
>> and so it's appropriate to throw a cardinality violation for its
>> HeapTupleSelfUpdated case? After all, that case already discriminates
>> against updates that are not from the same command in the xact (e.g.
>> due to weird before triggers) by throwing an error [1]. I don't think
>> we need to throw a cardinality violation unless an UPSERT attempts to
>> update a tuple a second time, but not if that occurs within a command
>> that happens to contain an UPSERT not directly doing the updating.
>
> I'm inclined to do the check in ExecOnConflictUpdate() instead - istm
> that's easier to understand.

We're on the same page. I just happen to think we might as well put
the check beside the existing special case check for weird before
triggers -- within ExecUpdate()'s HeapTupleSelfUpdated case. That
avoids an extra HeapTupleSatisfiesUpdate() call for every UPSERT
update. We mostly only call that routine from heapam.c as things are.
But I'm hardly going to insist on that.

I actually think that there is an argument to be made for doing
nothing here, and not allowing a cardinality violation at all. It
doesn't seem too inconsistent to follow the old behavior (just accept
a redundant second update) in the case where you update the row within
an UPSERT after the UPSERT locks the row, but before your UPSERT has a
chance to do its UPDATE of that same row (in other words, something
happened to your tuple in between). I do prefer to do what we've
agreed to, though (whether or not it happens in ExecUpdate()), mostly
because it's closer to the other ExecUpdate() HeapTupleSelfUpdated
special case.

Note the similarity between that case, and this new one.

--
Peter Geoghegan

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Peter Geoghegan
Date:
On Tue, Dec 8, 2015 at 3:32 PM, Peter Geoghegan <pg@heroku.com> wrote:
> Note the similarity between that case, and this new one.

It might be best if this new ereport() site doesn't use errcode
ERRCODE_CARDINALITY_VIOLATION at all. Pretty sure it shouldn't be
ERRCODE_TRIGGERED_DATA_CHANGE_VIOLATION either, though.

--
Peter Geoghegan

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Peter Geoghegan
Date:
On Tue, Dec 8, 2015 at 3:32 PM, Peter Geoghegan <pg@heroku.com> wrote:
> We're on the same page. I just happen to think we might as well put
> the check beside the existing special case check for weird before
> triggers -- within ExecUpdate()'s HeapTupleSelfUpdated case. That
> avoids an extra HeapTupleSatisfiesUpdate() call for every UPSERT
> update.

It would also be nice to "Assert(!isOnConflict)" within the
HeapTupleUpdated case within ExecUpdate(), if only to document that
that's not expected or possible. Adding a new isOnConflict argument to
ExecUpdate() (so that it can potentially raise an error to deal with
this case) also makes this possible.

--
Peter Geoghegan

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Peter Geoghegan
Date:
On Tue, Dec 8, 2015 at 3:32 PM, Peter Geoghegan <pg@heroku.com> wrote:
> We're on the same page. I just happen to think we might as well put
> the check beside the existing special case check for weird before
> triggers -- within ExecUpdate()'s HeapTupleSelfUpdated case. That
> avoids an extra HeapTupleSatisfiesUpdate() call for every UPSERT
> update. We mostly only call that routine from heapam.c as things are.
> But I'm hardly going to insist on that.

Attached is a worked out patch, with the small differences I went
into. Take from it what you want.

--
Peter Geoghegan

Attachment
On 2015-12-08 15:32:03 -0800, Peter Geoghegan wrote:
> I actually think that there is an argument to be made for doing
> nothing here, and not allowing a cardinality violation at all.

Works for me. I'll add a short comment before the ExecUpdate() detailing
that the row could, in rather uncommon cases, be updated by that
point. Adding an extra parameter to ExecUpdate() + additional concerns
to it's already nontrivial HTSV codepath doesn't seem worth it to
me.

Greetings,

Andres Freund

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Peter Geoghegan
Date:
On Wed, Dec 9, 2015 at 1:43 PM, Andres Freund <andres@anarazel.de> wrote:
> Works for me. I'll add a short comment before the ExecUpdate() detailing
> that the row could, in rather uncommon cases, be updated by that
> point. Adding an extra parameter to ExecUpdate() + additional concerns
> to it's already nontrivial HTSV codepath doesn't seem worth it to
> me.

HeapTupleSatisfiesVacuum? What of it?

--
Peter Geoghegan
On 2015-12-09 13:52:35 -0800, Peter Geoghegan wrote:
> On Wed, Dec 9, 2015 at 1:43 PM, Andres Freund <andres@anarazel.de> wrote:
> > Works for me. I'll add a short comment before the ExecUpdate() detailing
> > that the row could, in rather uncommon cases, be updated by that
> > point. Adding an extra parameter to ExecUpdate() + additional concerns
> > to it's already nontrivial HTSV codepath doesn't seem worth it to
> > me.
>
> HeapTupleSatisfiesVacuum? What of it?

Err, HTSU, via heap_update()'s result.

Re: Incorrect UPDATE trigger invocation in the UPDATE clause of an UPSERT statement.

From
Peter Geoghegan
Date:
On Wed, Dec 9, 2015 at 1:43 PM, Andres Freund <andres@anarazel.de> wrote:
> On 2015-12-08 15:32:03 -0800, Peter Geoghegan wrote:
>> I actually think that there is an argument to be made for doing
>> nothing here, and not allowing a cardinality violation at all.
>
> Works for me. I'll add a short comment before the ExecUpdate() detailing
> that the row could, in rather uncommon cases, be updated by that
> point. Adding an extra parameter to ExecUpdate() + additional concerns
> to it's already nontrivial HTSV codepath doesn't seem worth it to
> me.

I would prefer to throw an error (not a cardinality violation error),
since there is a critical change between locking the tuple, and the
corresponding DO UPDATE call to ExecUpdate().

I'm not going to insist on that, though (just as I didn't insist on
the exact style of the fix). I only insist that this new
HealTupleSatisfiesSelf-in-ExecUpdate() case should not be treated as a
cardinality violation specifically, and should be separately
acknowledge at some level.

--
Peter Geoghegan
Hi,

On 2015-11-30 11:43:12 +0000, Stanislav Grozev wrote:
> I'd like to report a bug in the UPDATE trigger invocation when using the
> new INSERT ON CONFLICT UPDATE (UPSERT) functionality.
>
> In short, if an UPDATE trigger is invoked by the ON CONFLICT DO UPDATE
> clause of an UPSERT statement - it receives the new values in both the OLD
> and NEW variables.  Whereas if invoked by a normal UPDATE statement - it
> correctly gets the respective values in OLD and NEW.

I've pushed a fix for this. Thanks for the report!

- Andres