Re: Deferred constraint trigger semantics - Mailing list pgsql-general
From | Bryn Llewellyn |
---|---|
Subject | Re: Deferred constraint trigger semantics |
Date | |
Msg-id | 5BB967C0-E64E-4848-8817-6CDAC960CF82@yugabyte.com Whole thread Raw |
In response to | Re: Deferred constraint trigger semantics ("David G. Johnston" <david.g.johnston@gmail.com>) |
Responses |
Re: Deferred constraint trigger semantics
Re: Deferred constraint trigger semantics |
List | pgsql-general |
david.g.johnston@gmail.com wrote:bryn@yugabyte.com wrote:
Thanks for the point-by-point reply, David.
...makes no mention of what you might expect to see in an AFTER EACH ROW trigger.
...the absence of a comment declaring a guarantee of order means that, like the comment for the row-level BEFORE trigger, the row-level AFTER row ordering is not guaranteed (even if one cannot produce a counter-example in today's codebase).
Got it!
...unless anybody contradicts me.
Caveat emptor...? I wouldn't be surprised that doing so is technically possible in all cases - as to whether a particular algorithm is sound, to some extent, isn't something we try to predict. We do try to describe all the known interactions though - and let the user work within what those mean for them.
Got it again!
...implies that there's no such unpredictability in the AFTER EACH ROW cases.
I would not read it that way. In general, absence of mention of predictability like this means there is none - that some other sentence goes into more detail doesn't change that.
OK.
But there has to be a subtle caveat here for the deferred constraint trigger when the txn changes two or more tables, all of which participate in the query that the trigger function issues… The "raise info" output below illustrates my point (n changes from 5 to 8).
I'm failing to see the deferral aspect of that example. First statement finishes, sees the 5 inserts, next statement finishes, sees 3 more inserts. Not, both statements finish, triggers fire, triggers see all 8 inserts (which I suspect they will if you actually perform deferral).
Oops. I did a copy-and-paste error on going from my test env. to email and missed out the "deferral" that I'd intended. For completeness, here's the test that I meant:
create table t1(k serial primary key, v int not null);
create table t2(k serial primary key, v int not null);
create function trg_fn()
returns trigger
language plpgsql
as $body$
declare
n int not null := 0;
begin
n := (select count(*) from t1) + (select count(*) from t2);
raise info 'trg fired. new.v = %, n = %', new.v, n;
return new;
end;
$body$;
create constraint trigger trg
after insert on t1
for each row
execute function trg_fn();
create constraint trigger trg
after insert on t2
initially deferred
for each row
execute function trg_fn();
set default_transaction_isolation = 'read committed';
do $body$
begin
insert into t1(v)
values (10), (20), (30), (40), (50);
insert into t2(v)
values (60), (70), (80);
end;
$body$;
create table t2(k serial primary key, v int not null);
create function trg_fn()
returns trigger
language plpgsql
as $body$
declare
n int not null := 0;
begin
n := (select count(*) from t1) + (select count(*) from t2);
raise info 'trg fired. new.v = %, n = %', new.v, n;
return new;
end;
$body$;
create constraint trigger trg
after insert on t1
for each row
execute function trg_fn();
create constraint trigger trg
after insert on t2
initially deferred
for each row
execute function trg_fn();
set default_transaction_isolation = 'read committed';
do $body$
begin
insert into t1(v)
values (10), (20), (30), (40), (50);
insert into t2(v)
values (60), (70), (80);
end;
$body$;
It adds the "initially deferred" decoration to the "create constraint trigger" statement. This is (still) the result:
INFO: trg fired. new.v = 10, n = 5
INFO: trg fired. new.v = 20, n = 5
INFO: trg fired. new.v = 30, n = 5
INFO: trg fired. new.v = 40, n = 5
INFO: trg fired. new.v = 50, n = 5
INFO: trg fired. new.v = 60, n = 8
INFO: trg fired. new.v = 70, n = 8
INFO: trg fired. new.v = 80, n = 8
INFO: trg fired. new.v = 20, n = 5
INFO: trg fired. new.v = 30, n = 5
INFO: trg fired. new.v = 40, n = 5
INFO: trg fired. new.v = 50, n = 5
INFO: trg fired. new.v = 60, n = 8
INFO: trg fired. new.v = 70, n = 8
INFO: trg fired. new.v = 80, n = 8
Even though both inserts have completed by commit time, only the trigger firing caused by the second statement sees the final state that obtains the moment before commit. The first statement sees only the state after it finishes and before the second statement executes. You said « I suspect [that both statements will see the final state] if you actually perform deferral ». My test shows that this is not the case.
Did I misunderstand you? Or does this result surprise you? If it does, do you think that this is a bug?
...Your test case for the deferred constraint, that supposedly allows for the insertion of invalid data per the specification of the constraint trigger, isn't something I've worked through yet; and as written reads like a bug report.
It's not a report of a PG bug. Rather, it shows how an application programmer might write a bug in *their* code. When "set constraints all immediate" is used, it opens a race condition window between its execution and the commit. I'd speculated on that earlier. So I felt that I should show a self-contained demo of this possible trap for the application programmer. As long as "set constraints all immediate" is not used, the demo shows proper behavior. Of course, it moves the constraint violation error to commit time—and this means that PL/pgSQL code cannot handle it (as discussed at length elsewhere).
I appreciate that using "set constraints all immediate" is useful in many scenarios to allow handling the violation error in PL/pgSQL. (Thanks again for that tip, David.) For example, and as I reason it, the famous "mandatory one-to-one relationship" use-case is safe when you use this technique. This use-case needs mutual FK constraints between the two tables. But you have to insert one of the two rows (call it "main") that are so related before the other (call it "extra"). And so the mutual FK constraints cannot be satisfied until both new rows are in place. Therefore the "main" table's FK constraint to the "extra" table (when the insertion order that I described is used) must be deferred. But, because this is insert, no other session can see your uncommitted work in progress. So there's no race condition window. And once the new row-pair is committed, other non-deferred constraints can ensure that either deleting the "extra" row is RESTRICTed or deleting the "main" row CASCADEs. So the business rule is easily enforced once the new row-pair is in place.
It does take a fair effort of reasoning, for each specific use-case, to determine the safety of using "set constraints all immediate". But programming is just like that!
Is there a fundamental reason why a deferred AFTER EACH STATEMENT constraint trigger is not allowed? Nothing in what is explained in the "Overview of Trigger Behavior" and "Visibility of Data Changes" sections lets me see why the present restriction is needed.
I imagine having to keep around a working set of what are the changed records is both memory intensive and also problematic should a future statement make yet more changes to the table… And given that constraints are defined per-row everywhere else there is a pull to not push the envelope of our extension too far.
At the conceptual level, there are both per-row constraints and the kind that can be precisely specified (albeit not implemented) using SQL Assertion syntax.
With respect to « having to keep around a working set of what are the changed records » I think that the complexity that you envisaged is avoided by the (emergent) rule that an AFTER EACH STATEMENT trigger cannot see "old" and "new" values. In other words, all you can sensibly do in its function is ordinary SQL that sees the current state at the moment it fires.
To my surprise, it *is* legal to write code that accesses "old" and "new" values. But, because many rows can be affected by a single statement, and the trigger fires just once, the meanings of "old" and "new" are undefined. I've seen that, in any test that I do, both are always set to NULL (which seems reasonable). For example:
create function trg_fn()
returns trigger
language plpgsql
as $body$
begin
raise info 'old.v is %', coalesce(old.v::text, 'null as expected');
raise info 'new.v is %', coalesce(new.v::text, 'null as expected');
return new;
end;
$body$;
create trigger trg
after insert on t
for each statement
execute function trg_fn();
set default_transaction_isolation = 'read committed';
insert into t(v)
values (10), (20), (30), (40), (50);
It produces this output:
INFO: old.v is null as expected
INFO: new.v is null as expected
INFO: new.v is null as expected
A deferred AFTER EACH STATEMENT constraint trigger would be a good tool for the kind of business rule that a SQL Assertion specifies. Yes, it would fire more times than is actually needed. But nevertheless, the timing point better expresses the intention—and there would, at least, be fewer firings than with an AFTER EACH ROW trigger.
If what I pointed out above (not all deferred constraint triggers see the final state just before commit) is regarded as a bug and fixed, then the application programming would be straightforward. As things are now, tricky (but feasible, I believe) programming would be needed in the multi-table case to make only the very last deferred trigger function execution perform the actual test.
pgsql-general by date: