Thread: [PATCH] Allow queries in WHEN expression of FOR EACH STATEMENT triggers
Hi hackers, Attached is a patch for supporting queries in the WHEN expression of statement triggers. It is restricted so that the expression can reference only the transition tables and the table to which the trigger is attached. This seemed to make the most sense in that it follows what you can do in the per row triggers. I did have a look in the standards document about triggers, and couldn't see any restrictions mentioned, but nevertheless thought it made most sense. One possibility controversial aspect is that the patch doesn't use SPI to evaluate the expression; it constructs a Query instead and passes it to the executor. Don't know what people's thoughts are on doing that? -Joe
Attachment
Re: [PATCH] Allow queries in WHEN expression of FOR EACH STATEMENT triggers
> On 17 Jul 2020, at 00:22, Joe Wildish <joe@lateraljoin.com> wrote: > Attached is a patch for supporting queries in the WHEN expression of statement triggers.at? Hi!, Please create an entry for this patch in the 2020-09 commitfest to make sure it's properly tracked: https://commitfest.postgresql.org/29/ cheers ./daniel
Re: [PATCH] Allow queries in WHEN expression of FOR EACH STATEMENT triggers
Hi Joe,
This is my review of your patch
Hi hackers,
Attached is a patch for supporting queries in the WHEN expression of
statement triggers.
- Currently, <literal>WHEN</literal> expressions cannot contain
- subqueries.
subqueries in row trigger's is not supported in your patch so the the documentation have to reflect it
+ </literal>UPDATE</literal> triggers are able to refer to both </literal>OLD</literal>
+ and <literal>NEW</literal>
Opening and ending tag mismatch on UPDATE and OLD literal so documentation build fails and please update the documentation on server programming section too
+ /*
+ * Plan the statement. No need to rewrite as it can only refer to the
+ * transition tables OLD and NEW, and the relation which is being
+ * triggered upon.
+ */
+ stmt = pg_plan_query(query, trigger->tgqual, 0, NULL);
+ dest = CreateDestReceiver(DestTuplestore);
+ store = tuplestore_begin_heap(false, false, work_mem);
+ tupdesc = CreateTemplateTupleDesc(1);
+ whenslot = MakeSingleTupleTableSlot(tupdesc, &TTSOpsMinimalTuple);
Instead of planning every time the trigger fire I suggest to store plan or prepared statement node so planning time can be saved
There are server crash on the following sequence of command
CREATE TABLE main_table (a int unique, b int);
CREATE FUNCTION trigger_func() RETURNS trigger LANGUAGE plpgsql AS '
BEGIN
RAISE NOTICE ''trigger_func(%) called: action = %, when = %, level = %'', TG_ARGV[0], TG_OP, TG_WHEN, TG_LEVEL;
RETURN NULL;
END;';
INSERT INTO main_table DEFAULT VALUES;
CREATE TRIGGER after_insert AFTER INSERT ON main_table
REFERENCING NEW TABLE AS NEW FOR EACH STATEMENT
WHEN (500 <= ANY(SELECT b FROM NEW union SELECT a FROM main_table))
EXECUTE PROCEDURE trigger_func('after_insert');
INSERT INTO main_table (a, b) VALUES
(101, 498),
(102, 499);
server crashed
regards
Surafel
Re: [PATCH] Allow queries in WHEN expression of FOR EACH STATEMENT triggers
On Thu, Sep 03, 2020 at 09:22:31PM +0300, Surafel Temesgen wrote: > server crashed That's a problem. As this feedback has not been answered after two weeks, I am marking the patch as returned with feedback. -- Michael
Attachment
Re: [PATCH] Allow queries in WHEN expression of FOR EACH STATEMENT triggers
Hi Surafel, On 3 Sep 2020, at 19:22, Surafel Temesgen wrote: > This is my review of your patch Thanks for the review. > subqueries in row trigger's is not supported in your patch so the the > documentation have to reflect it It is still the case that the documentation says this. But, that may have been unclear as the documentation wouldn't compile (as you noted), so it wasn't possible to read it in the rendered form. > > + </literal>UPDATE</literal> triggers are able to refer to both > </literal>OLD</literal> > > + and <literal>NEW</literal> > > Opening and ending tag mismatch on UPDATE and OLD literal so > documentation > build fails and please update the documentation on server programming > section too Fixed. I've also amended the server programming section to accurately reflect how WHEN conditions can be used. > Instead of planning every time the trigger fire I suggest to store > plan or > prepared statement node so planning time can be saved Yes, that would make sense. I'll look in to what needs to be done. Do you know if there are other areas of the code that cache plans that could act as a guide as to how best to achieve it? > There are server crash on the following sequence of command > > ... > > INSERT INTO main_table (a, b) VALUES > > (101, 498), > > (102, 499); > > server crashed Thanks. It was an incorrect Assert about NULL returns. Fixed. -Joe
Attachment
Re: [PATCH] Allow queries in WHEN expression of FOR EACH STATEMENT triggers
Attachment
"Joe Wildish" <joe@lateraljoin.com> writes: > The main change is a switch to using SPI for expression evaluation. The plans are also cached along the same lines asthe RI trigger plans. I really dislike this implementation technique. Aside from the likely performance hit for existing triggers, I think it opens serious security holes, because we can't fully guarantee that deparse-and-reparse doesn't change the semantics. For comparison, the RI trigger code goes to ridiculous lengths to force exact parsing of the queries it makes, and succeeds only because it needs just a very stylized subset of SQL. With a generic user-written expression, we'd be at risk for several inherently-ambiguous SQL constructs such as IS DISTINCT FROM (see relevant reading at [1]). You could argue that the same hazards apply if the user writes the same query within the body of the trigger, and you'd have a point. But we've made a policy decision that users are on the hook to write their functions securely. No such decision has ever been taken with respect to pre-parsed expression trees. In general, users may expect that once those are parsed by the accepting DDL command, they'll hold still, not get re-interpreted at runtime. > a. I originally disallowed functions and table-valued functions from appearing in the expression as they could potentiallydo anything and everything. However, I noticed that we allow functions in FOR EACH ROW triggers so we are alreadyin that position. Do we want to continue allowing that in FOR EACH STATEMENT triggers? If so, then the choice torestrict the expression to just OLD, NEW and the table being triggered against might be wrong. Meh ... users have always been able to write CHECK constraints, WHEN clauses, etc, that have side-effects --- they just have to bury that inside a function. It's only their own good taste and the lack of predictability of when the side-effects will happen that stop them. I don't see much point in enforcing restrictions that are easily evaded by making a function. (Relevant to that, I wonder why this patch is only concerned with WHEN clauses and not all the other places where we forbid subqueries for implementation simplicity.) > b. If a WHEN expression is defined as "n = (SELECT ...)", there is the possibility that a user gets the error "more thanone row returned by a subquery used as an expression" when performing DML, which would be rather cryptic if they didn'tknow there was a trigger involved. To avoid this, we could disallow scalar expressions, with a hint to use the ANY/ALLquantifiers. How is that any more cryptic than any other error that can occur in a WHEN expression? regards, tom lane [1] https://www.postgresql.org/message-id/10492.1531515255%40sss.pgh.pa.us
Re: [PATCH] Allow queries in WHEN expression of FOR EACH STATEMENT triggers
> The main change is a switch to using SPI for expression evaluation. The plans are also cached along the same lines as the RI trigger plans.I really dislike this implementation technique. Aside from the likelyperformance hit for existing triggers, I think it opens serious securityholes, because we can't fully guarantee that deparse-and-reparse doesn'tchange the semantics. For comparison, the RI trigger code goes toridiculous lengths to force exact parsing of the queries it makes,and succeeds only because it needs just a very stylized subset of SQL.With a generic user-written expression, we'd be at risk for severalinherently-ambiguous SQL constructs such as IS DISTINCT FROM (seerelevant reading at [1]).
In general, users may expect thatonce those are parsed by the accepting DDL command, they'll hold still,not get re-interpreted at runtime.
...(Relevant to that, I wonder why this patch is only concerned withWHEN clauses and not all the other places where we forbid subqueriesfor implementation simplicity.)
> b. If a WHEN expression is defined as "n = (SELECT ...)", there is the possibility that a user gets the error "more than one row returned by a subquery used as an expression" when performing DML, which would be rather cryptic if they didn't know there was a trigger involved. To avoid this, we could disallow scalar expressions, with a hint to use the ANY/ALL quantifiers.How is that any more cryptic than any other error that can occurin a WHEN expression?
Joe Wildish" <joe@lateraljoin.com> writes: > On Wed, 22 Sep 2021, at 17:09, Tom Lane wrote: > The main change is a switch to using SPI for expression evaluation. The plans are also cached along the same lines asthe RI trigger plans. >> >> I really dislike this implementation technique. Aside from the likely >> performance hit for existing triggers, I think it opens serious security >> holes, because we can't fully guarantee that deparse-and-reparse doesn't >> change the semantics. > Where do you consider the performance hit to be? The deparse/reparse cost might not be negligible, and putting SPI into the equation where it was not before is certainly going to add overhead. Now maybe those things are negligible in context, but I wouldn't believe it without seeing some performance numbers. > Do you have a suggestion for an alternative? I guess it would be go to the planner/executor directly with the node tree? What I'd be thinking about is what it'd take to extend expression_planner and related infrastructure to allow expressions containing SubLinks. I fear there are a lot of moving parts there though, since the restriction has been in place so long. >> (Relevant to that, I wonder why this patch is only concerned with >> WHEN clauses and not all the other places where we forbid subqueries >> for implementation simplicity.) > I don't know how many other places WHEN clauses are used. Rules, perhaps? I'm thinking of things like CHECK constraints. Grepping for calls to expression_planner would give you a clearer idea of the scope. > The short answer is this patch was written to solve a specific problem I had rather than it being a more general attemptto remove all "subquery forbidden" restrictions. I won't carp too much if the initial patch only removes the restriction for WHEN; but I'd like to see it lay the groundwork to remove the restriction elsewhere as well. regards, tom lane
On Thu, Sep 23, 2021 at 5:34 AM Joe Wildish <joe@lateraljoin.com> wrote: > Regarding the deparse-and-reparse --- if I understand correctly, the core problem is that we have no way of going froma node tree to a string, such that the string is guaranteed to have the same meaning as the node tree? (I did try justnow to produce such a scenario with the patch but I couldn't get ruleutils to emit the wrong thing). Moreover, we couldn'tstore the string for use with SPI, as the string would be subject to trigger-time search path lookups. That prettymuch rules out SPI for this then. Do you have a suggestion for an alternative? I guess it would be go to the planner/executordirectly with the node tree? I think hoping that you can ever make deparse and reparse reliably produce the same result is a hopeless endeavor. Tom mentioned hazards related to ambiguous constructs, but there's also often the risk of concurrent DDL. Commit 5f173040e324f6c2eebb90d86cf1b0cdb5890f0a is a cautionary tale, demonstrating that you can't even count on schema_name.table_name to resolve to the same OID for the entire duration of a single DDL command. The same hazard exists for functions, operators, and anything else that gets looked up in a system catalog. I don't know what all of that means for your patch, but just wanted to get my $0.02 in on the general topic. -- Robert Haas EDB: http://www.enterprisedb.com