Re: Command Triggers - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Command Triggers |
Date | |
Msg-id | CA+TgmoZ2WrhL1kFg=A86NDUb=vrjNZd2h3Wxp47=QadOaEFqpg@mail.gmail.com Whole thread Raw |
In response to | Re: Command Triggers (Dimitri Fontaine <dimitri@2ndQuadrant.fr>) |
Responses |
Re: Command Triggers
Re: Command Triggers Re: Command Triggers |
List | pgsql-hackers |
On Sun, Dec 11, 2011 at 1:55 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Andres Freund <andres@anarazel.de> writes: >> Hm. I just noticed a relatively big hole in the patch: The handling of >> deletion of dependent objects currently is nonexistant because they don't go >> through ProcessUtility... > > That's the reason why we're talking about “command triggers” rather than > “DDL triggers”. We don't intend to fire the triggers at each DDL > operation happening on the server, but for each command. > > This restriction still allows us to provide a very useful feature when > checked against the main use cases we target here. Those are auditing, > and replication (the replay will also CASCADEs), and a generic enough > SUDO facility (because the trigger function can well be SECURITY > DEFINER). I haven't yet thought about your specific proposal here in enough to have a fully-formed opinion, but I am a little nervous that this may turn out to be one of those cases where the obvious API ends up working less well than might have been supposed. For example, consider the "auditing" use case, and suppose that the user wants to log a message (to a table, or some other separate logging facility) every time someone creates an index. In order to do that, they're going to need to trap not only CREATE INDEX but also CREATE TABLE and ALTER CONSTRAINT, and in the latter two cases they're then going to have to then write some grotty logic to grovel through the statement or its node tree in order to figure out what indexes are being created. Also, if they want to log the name of the new index in cases where the user doesn't specify it, they're going to have to duplicate the backend logic which generates the index name, which is going to be a pain in the ass and a recipe for future bugs (e.g. because we might change the algorithm at some point, or the trigger might see a different view of the world than the actual command, due to intervening DDL). This is something that has come up a lot in the context of sepgsql: it's not really enough to know what the toplevel command is in some cases, because whether or not the operation is allowed depends on exactly what it's doing, which depends on things like which schema gets used to create the table, and which tablespaces get used to create the table and its indexes, and perhaps (given Peter's pending work) what types are involved. You can deduce all of these things from the command text, but it requires duplicating nontrivial amounts of backend code, which is undesirable from a maintainability point of view. I think the same issues are likely to arise in the context of auditing, as in the example above, and even for replication: if, for example, the index name that is chosen on the master happens to exist on the standby with a different definition, replaying the actual command text might succeed or fail depending on how the command is phrased. Maybe it's OK for the slave to just choose a different name for that index, but now a subsequent DROP INDEX command that gets replayed across all servers might end up dropping logically unrelated indexes on different machines. Now, on the other hand, the conceptual simplicity of this approach is very appealing, and I can certainly see people who would never consider writing a ProcessUtility_hook using this type of facility. However, I'm worried that it will be difficult to use as a foundation for robust, general-purpose tools. Is that enough reason to reject the whole approach? I'm not sure. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
pgsql-hackers by date: