Re: Command Triggers - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Command Triggers
Date
Msg-id 201112011732.11126.andres@anarazel.de
Whole thread Raw
In response to Command Triggers  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Responses Re: Command Triggers  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
List pgsql-hackers
Hi Dimitri, Hi all,

On Tuesday, November 08, 2011 06:47:13 PM Dimitri Fontaine wrote:
> As proposed by Jan Wieck on hackers and discussed in Ottawa at the
> Clusters Hackers Meeting, the most (only?) workable way to ever have DDL
> triggers is to have "command triggers" instead.
> Rather than talking about catalogs and the like, it's all about firing a
> registered user-defined function before or after processing the utility
> command, or instead of processing it.  This naturally involves a new
> catalog (pg_cmdtrigger) and some new subcommands of CREATE and DROP
> TRIGGER, and some support code for calling the function at the right
> time.
Great ;)

fwiw I think thats the way forward as well.

The patch obviously isn't thought to be commitable yet, so I am not going to 
do a very detailed code level review. Attached is a patch with low level 
targeted comments and such I noticed while reading it.

At this state the important things are highlevel so I will try to concentrate 
on those:

== the trigger function ==
I don't like the set of options parsed to the trigger functions very much. To 
cite an example of yours those currently are:
* cmd_string     text
* cmd_nodestring text
* schemaname     text
* relname        text

I think at least a
* command_tag    text
is missing. Sure, you can disambiguate it by creating a function for every 
command type but that seems pointlessly complex for many applications. And 
adding it should be trivial as its already tracked ;)

Currently the examples refer to relname as relname which I dislike as that 
seems to be too restricted to one use case. The code is already doing it 
correctly (as objectname) though ;)

Why is there explicit documentation of not checking the arguments of said 
trigger function? That seems to be a bit strange to me.

schemaname currently is mightily unusable because whether its sent or not 
depends wether the table was created with a schemaname specified or not. That 
doesn't seem to be a good approach.

That brings me to the next point:

== normalization of statements ==
Currently the normalization is a bit confusing. E.g. for:

CREATE SCHEMA barf;
SET search_path = barf;
CREATE TYPE bart AS ENUM ('a', 'b');
CREATE TABLE bar(a int, b bigint, c serial, d text, e varchar, "F" text, g 
bart, h int4);

one gets

CREATE TABLE bar (a pg_catalog.int4,b pg_catalog.int8,c serial,d text,e 
pg_catalog.varchar,F text,g bart,h int4);

Which points out two problems:
* inconsistent schema prefixing
* no quoting

Imo the output should fully schema qualify anything including operators. Yes, 
thats rather ugly but anything else seems to be too likely to lead to 
problems.

As an alternative it would be possible to pass the current search path but 
that doesn't seem to the best approach to me;

Currently CHECK() constraints are not decodable, but inside those the same 
quoting/qualifying rules should apply.


Then there is nice stuff like CREATE TABLE .... (LIKE ...);
What shall that return in future? I vote for showing it as the plain CREATE 
TABLE where all columns are specified.

That touches another related topic. Dim's work in adding support for utility 
cmd support in nodeToString and friends possibly will make the code somewhat 
command trigger specific. Is there any other usage we envision?

== grammar == 

* multiple actions:
I think it would sensible to allow multiple actions on which to trigger to be 
specified just as you can for normal triggers. I also would like an ALL option

* options:
Currently the grammar allows options to be passed to command triggers. Do we 
want to keep that? If yes, with the same arcane set of datatypes as in data 
triggers?
If yes it needs to be wired up.

* schema qualification:
An option to add triggers only to a specific schema would be rather neat for 
many scenarios.
I am not totally sure if its worth the hassle of defining what happens in the 
edge cases of e.g. moving from one into another schema though.

== locking ==

Currently there is no locking strategy at all. Which e.g. means that there is 
no protection against two sessions changing stuff concurrently or such.

Was that just left out - which is ok - or did you miss that?

I think it would be ok to just always grab row level locks there.

== permissions ==

Command triggers should only be allowed for the database owner. 

== recursion ==
I contrast to data triggers command triggers cannot change what is done. They 
can either prevent it from running or not. Which imo is fine.
The question I have is whether it would be a good idea to make it easy to 
prevent recursion.... Especially if the triggers wont be per schema.


== calling the trigger ==

Imo the current callsite in ProcessUtility isn't that good. I think it would 
make more sense moving it into standard_ProcessUtility. It should be *after* 
the check_xact_readonly check in there in my opinion.

I am also don't think its a good idea to run the 
ExecBeforeOrInsteadOfCommandTriggers stuff there because thats allso the path 
that COMMIT/BEGIN and other stuff take. Those are pretty "fast path".

I suggest making two switches in standard_ProcessUtility, one for the non-
command trigger stuff which returns on success and one after that. Only after 
the first triggers are checked.

I wonder whether its the right choice to run triggers on untransformed input. 
I.e. CREATE TABLE ... (LIKE ...) seems to only make sense to me after 
transforming.

Also youre very repeatedly transforming the parstree into a string. It would 
be better doing that only once instead of every trigger...


Ok, thats the first round of high level stuff...

Cool Work!

Some lower level annotations:

* many functions should be static but are not. Especially in cmdtrigger.c
* Either tgenable's datatype or its readfunc is wrong (watch the compiler ;))
* ruleutils.c: * generic routine for IF EXISTS, CASCADE, ...



Greetings,

Andres

pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: [PATCH] PostgreSQL fails to build with 32bit MinGW-w64
Next
From: Peter Geoghegan
Date:
Subject: Re: Inlining comparators as a performance optimisation