Thread: "Triggered data change violation", once again

"Triggered data change violation", once again

From: Tom Lane

I have been looking at the way that deferred triggers slow down when the
same row is updated multiple times within a transaction.  The problem
appears to be entirely due to calling deferredTriggerGetPreviousEvent()
to find the trigger list entry for the previous update of the row: we
do a linear search, so the behavior is roughly O(N^2) when there are N
updated rows.
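
To make the cost concrete, here is a minimal sketch in Python rather than the backend's C. It is a toy model under stated assumptions: a plain list stands in for the deferred trigger event list, `find_previous_event` is a hypothetical stand-in for deferredTriggerGetPreviousEvent(), and the scan is assumed to start at the head of the list. It only counts comparisons when one row is updated repeatedly in a single transaction.

```python
# Toy model only: names and data layout are hypothetical, not backend code.

def find_previous_event(events, tuple_version):
    """Linear scan for the event that created tuple_version.

    Returns (event or None, number of comparisons made).
    """
    comparisons = 0
    for event in events:
        comparisons += 1
        if event["new_version"] == tuple_version:
            return event, comparisons
    return None, comparisons


def count_comparisons(n_updates):
    """Update the same row n_updates times in one transaction."""
    events = []    # one entry per update, kept until end of transaction
    version = 0    # version 0 was created by an earlier transaction
    total = 0
    for _ in range(n_updates):
        if version > 0:
            # The old tuple was created by this transaction, so the
            # previous event has to be looked up.
            _, comparisons = find_previous_event(events, version)
            total += comparisons
        version += 1
        events.append({"new_version": version})
    return total


print(count_comparisons(4))     # 0 + 1 + 2 + 3 comparisons
print(count_comparisons(1000))  # grows as N * (N - 1) / 2
```

Under these assumptions the total is N * (N - 1) / 2 comparisons, which is where the quadratic behavior comes from.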

The only reason we do this is to enforce the "triggered data change
violation" restriction of the spec.  However, I think we've
misinterpreted the spec.  The code prevents an RI referenced value from
being changed more than once in a transaction, but what the spec
actually says is thou shalt not change it more than once per
*statement*.  We have discussed this several times in the past and
I think people have agreed that the current behavior is wrong,
but nothing's been done about it.

I think all we need to do to implement things correctly is to consider a
previous event only if both xmin and cmin of the old tuple match the
current xact & command IDs, rather than considering it on the basis of
xmin alone.
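
As a rough sketch of the difference (Python, illustrative only; `TupleHeader` and both function names are hypothetical and do not exist in the backend), the two tests could be written as:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TupleHeader:
    xmin: int  # transaction that created this tuple version
    cmin: int  # command within that transaction that created it


def previous_event_check_current(old, current_xid):
    """Behavior described above: match on xmin alone (per transaction)."""
    return old.xmin == current_xid


def previous_event_check_proposed(old, current_xid, current_cid):
    """Proposed behavior: match on xmin and cmin (per statement)."""
    return old.xmin == current_xid and old.cmin == current_cid


# A row last written by command 3 of transaction 100, now being updated
# by command 5 of the same transaction.
old_tuple = TupleHeader(xmin=100, cmin=3)
print(previous_event_check_current(old_tuple, 100))      # True
print(previous_event_check_proposed(old_tuple, 100, 5))  # False
```

With the xmin-only test the second update in the transaction is treated as a repeat change; with the xmin-and-cmin test it is not, because it comes from a different command.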

Aside from being correct, this will make a significant difference in
performance.  If we were doing it per spec then
deferredTriggerGetPreviousEvent would never be called in typical
operations, and so its speed wouldn't be an issue.  Moreover, if we do
it per spec then completed trigger event records could be removed from
the trigger list at end of statement, rather than keeping them till end
of transaction, which'd save memory space.

Comments?
        regards, tom lane


Re: "Triggered data change violation", once again

From: Stephan Szabo

On Wed, 24 Oct 2001, Tom Lane wrote:

> The only reason we do this is to enforce the "triggered data change
> violation" restriction of the spec.  However, I think we've
> misinterpreted the spec.  The code prevents an RI referenced value from
> being changed more than once in a transaction, but what the spec
> actually says is thou shalt not change it more than once per
> *statement*.  We have discussed this several times in the past and
> I think people have agreed that the current behavior is wrong,
> but nothing's been done about it.
> 
> I think all we need to do to implement things correctly is to consider a
> previous event only if both xmin and cmin of the old tuple match the
> current xact & command IDs, rather than considering it on the basis of
> xmin alone.

Are there any things that might update the command ID during the execution
of the statement from inside functions that are being run?  I really don't
understand the details of how that works (which is the biggest reason I
haven't yet tackled some of the big remaining broken stuff in the
referential actions, because AFAICT we need to be able to update the rows
that matched at the beginning of the statement, not the ones that match
at the time the triggers run).



Re: "Triggered data change violation", once again

From: Tom Lane

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
>> I think all we need to do to implement things correctly is to consider a
>> previous event only if both xmin and cmin of the old tuple match the
>> current xact & command IDs, rather than considering it on the basis of
>> xmin alone.

> Are there any things that might update the command ID during the execution
> of the statement from inside functions that are being run?

Functions can run new commands that get new command ID numbers within
the current transaction --- but on return from the function, the current
command number is restored.  I believe rows inserted by such a function
would look "in the future" to us at the outer command, and would be
ignored.

Actually, now that I think about it, the MVCC rules are that tuples with
xmin = currentxact are not visible unless they have cmin < currentcmd.
Not equal to.  This seems to render the entire "triggered data change"
test moot --- I rather suspect that we cannot have such a condition
as old tuple cmin = currentcmd at all, and so we could just yank all
that code entirely.
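
A small sketch of that argument, in Python and heavily simplified: it models only the cmin test for tuples created by the current transaction and ignores xmax, commit status, and everything else the real visibility code looks at. The function name is hypothetical.

```python
def own_xact_tuple_is_visible(tuple_cmin, current_cmd):
    """Simplified rule: a tuple with xmin = current xact is visible
    only if it was created by an earlier command (cmin < currentcmd)."""
    return tuple_cmin < current_cmd


current_cmd = 5
visible_cmins = [
    cmin for cmin in range(10)
    if own_xact_tuple_is_visible(cmin, current_cmd)
]

print(visible_cmins)                 # cmin values 0 through 4
print(current_cmd in visible_cmins)  # False
```

If an update can only pick up tuples that are visible to it, then under this rule the old tuple can never have cmin = currentcmd, which is the condition the per-statement test would be looking for.
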
        regards, tom lane


Re: "Triggered data change violation", once again

From: Stephan Szabo

On Wed, 24 Oct 2001, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> >> I think all we need to do to implement things correctly is to consider a
> >> previous event only if both xmin and cmin of the old tuple match the
> >> current xact & command IDs, rather than considering it on the basis of
> >> xmin alone.
> 
> > Are there any things that might update the command ID during the execution
> > of the statement from inside functions that are being run?
> 
> Functions can run new commands that get new command ID numbers within
> the current transaction --- but on return from the function, the current
> command number is restored.  I believe rows inserted by such a function
> would look "in the future" to us at the outer command, and would be
> ignored.
> 
> Actually, now that I think about it, the MVCC rules are that tuples with
> xmin = currentxact are not visible unless they have cmin < currentcmd.
> Not equal to.  This seems to render the entire "triggered data change"
> test moot --- I rather suspect that we cannot have such a condition
> as old tuple cmin = currentcmd at all, and so we could just yank all
> that code entirely.

I'm not sure if this sequence would be an example of something that
would be disallowed, but if I do something like:

Make a plpgsql function that does update table1 set key=1 where key=2;
Make that an after update trigger on table1
Put a key=1 row into table1
Update table1 to set key to 2

I end up with a 1 in the table. I'm not sure, but I think that such
a case would be possible through the fk stuff with triggers that modify 
the primary key table (right now it might "work" due to the problems
of checking intermediate states). Wouldn't this be the kind of thing
the "triggered data change" is supposed to prevent?  I may be just
misunderstanding the intent of the spec.
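
For reference, the sequence above can be mimicked with a toy model in Python. This is only an illustration of the order of events, assuming the trigger fires once per changed row after the statement; it has no MVCC, no snapshots, and no real trigger machinery, and every name in it is hypothetical.

```python
def update_key(table, old_key, new_key, after_update_trigger=None):
    """Toy 'UPDATE table SET key = new_key WHERE key = old_key'.

    After the rows are changed, the AFTER UPDATE trigger (if any) is
    called once per changed row. Returns the number of rows changed.
    """
    changed = [row for row in table if row["key"] == old_key]
    for row in changed:
        row["key"] = new_key
    if after_update_trigger is not None:
        for _ in changed:
            after_update_trigger(table)
    return len(changed)


def reset_trigger(table):
    """Stand-in for the plpgsql function: set key=1 where key=2."""
    update_key(table, 2, 1, after_update_trigger=reset_trigger)


table1 = [{"key": 1}]                                          # key=1 row
update_key(table1, 1, 2, after_update_trigger=reset_trigger)   # set key to 2
final_keys = [row["key"] for row in table1]
print(final_keys)  # [1]
```

The outer update sets the key to 2, the trigger's nested update sets it back to 1, and the nested update's own trigger call finds no key=2 row, so the recursion stops there.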



Re: "Triggered data change violation", once again

From: Hiroshi Inoue

Tom Lane wrote:
> 
> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> >> I think all we need to do to implement things correctly is to consider a
> >> previous event only if both xmin and cmin of the old tuple match the
> >> current xact & command IDs, rather than considering it on the basis of
> >> xmin alone.
> 
> > Are there any things that might update the command ID during the execution
> > of the statement from inside functions that are being run?
> 
> Functions can run new commands that get new command ID numbers within
> the current transaction --- but on return from the function, the current
> command number is restored.  I believe rows inserted by such a function
> would look "in the future" to us at the outer command, and would be
> ignored.

I doubt that this is reasonable. If those changes are ignored, when
are they taken into account?  ISTM deferred constraints have to see
the latest tuples and take the changes into account.

regards,
Hiroshi Inoue