Re: Command Triggers patch v18 - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: Command Triggers patch v18
Date:
Msg-id: CA+TgmoYZGsk7uS0iM0UamzqRMXCQkQW44S5B3EkDTE40A=ZuoA@mail.gmail.com
In response to: Re: Command Triggers patch v18 (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Responses: Re: Command Triggers patch v18
List: pgsql-hackers
On Fri, Mar 30, 2012 at 4:32 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> I did that another way in previous incarnations of the patch, which was
> to allow for an INSTEAD OF event trigger backed by a SECURITY DEFINER
> function. When the extension is whitelisted, prevent recursion, then
> CREATE EXTENSION in the security definer function, then signal that the
> execution should now be aborted.
>
> That was too dangerous given the lack of policy about where exactly the
> user code is fired, but I think we could now implement that for some of
> the event timing specs we're listing. Only some of them, I guess only
> those that are happening before we lock the objects.

Oh, right: I remember that now. I still think it's a bad way to do it,
because the trigger potentially has a lot of work to do to reconstruct
a working command string, and the command still ends up getting
executed by the wrong user. For CREATE EXTENSION it's not that bad,
because the arguments to the command are so simple, but of course any
time we extend the CREATE EXTENSION syntax, the trigger needs to know
about it too, whether it's security-relevant or not, and doing
something similar with, say, ALTER TABLE would be ridiculously
complicated. I think there is a use case for what you called an
INSTEAD OF trigger, but I don't believe in this one.

It seems to me that there's a lot of power in being able to *just*
intercept the security decision and then let the rest of the command go
about its business. Of course, you have to avoid confusing security
checks (like, you must own the table in order to drop a column) with
integrity checks (like, you can't drop a column from pg_class), but I
think that's not very hard to get right.
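To spell out the mechanism you're describing, here it is written as a
plain SECURITY DEFINER function rather than the INSTEAD OF trigger form
(a sketch only; the whitelist table and the function name are made up
for illustration):

-- Sketch only: "extension_whitelist" and the function name are
-- illustrative, not part of the patch.
CREATE TABLE extension_whitelist (extname text PRIMARY KEY);

CREATE FUNCTION create_whitelisted_extension(ext_name text)
RETURNS void
LANGUAGE plpgsql
SECURITY DEFINER  -- runs with the owner's privileges, not the caller's
AS $$
BEGIN
    IF NOT EXISTS (SELECT 1 FROM extension_whitelist
                   WHERE extname = ext_name) THEN
        RAISE EXCEPTION 'extension "%" is not whitelisted', ext_name;
    END IF;
    -- The trigger variant has to rebuild the command string by hand,
    -- which is the maintenance problem: every new piece of CREATE
    -- EXTENSION syntax has to be reconstructed here as well.
    EXECUTE format('CREATE EXTENSION %I', ext_name);
END;
$$;

Note that even this toy version already drops the user's SCHEMA and
VERSION options on the floor, which is exactly the kind of drift I'm
worried about.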
>> More generally, my thought on the structure of this is that you're
>> going to have certain toplevel events, many of which will happen at
>> only a single place in the code, like "an object got dropped" or "a
>> DDL command started" or "a DDL command ended". So we give those
>> names, like sql_drop, ddl_command_start, and ddl_command_end. Inside
>
> I really dislike mixing sql_drop and ddl_command_start as being the
> same kind of objects here, even if I can bend my head in the right
> angle and see that it's a fair view when looking at how it's
> implemented. I can't see a way to explain that to users without
> having to explain to them how drop cascade is implemented.
>
> So my proposal here is to “fake” a “proper” subcommand thanks to the
> new context variable. If you DROP TYPE foo CASCADE and that in turn
> drops a function foo_in(), then an event trigger is fired with
>
>   context = 'DROP TYPE'
>   tag = 'DROP FUNCTION'
>
> Same idea when you DROP TABLE … CASCADE and a SEQUENCE and a bunch of
> indexes need to disappear too: you get a usual event trigger fired
> with the context set to 'DROP TABLE' this time.
>
> I don't think we need to arrange for explicitly publishing the
> context-specific information here. If we need to, we have to find the
> right timing spec where we can guarantee still being in the top level
> command and where we already have the details filled in, then users
> can attach a trigger here and register the information for themselves.

I'm not sure I understand how you're using the words "context" and
"tag". I think for a drop trigger I would want the function to receive
this information: type of object dropped, OID of object dropped, column
number in the case of a column drop, flag indicating whether it's a
toplevel drop or a cascaded drop. I wouldn't object to also making the
currently-in-context toplevel command tag available, but I think most
drop triggers wouldn't really care, so I wouldn't personally spend much
implementation effort on it if it turns out to be hard.
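To make that concrete, here is roughly the shape I have in mind for a
drop trigger (a sketch, not what the patch currently provides: the
event_trigger function type, the sql_drop event, the set-returning
function, and its column names are all assumptions on my part):

CREATE FUNCTION log_dropped_objects() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    obj record;
BEGIN
    -- One row per dropped object, cascaded drops included.
    FOR obj IN SELECT * FROM pg_event_trigger_dropped_objects()
    LOOP
        RAISE NOTICE 'dropped % with OID %, column %, toplevel: %',
            obj.object_type,  -- type of object dropped
            obj.objid,        -- OID of the object dropped
            obj.objsubid,     -- column number, for column drops
            obj.original;     -- true if named in the command, false if cascaded
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER log_drops ON sql_drop
    EXECUTE PROCEDURE log_dropped_objects();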
But in general, I don't really know what a "proper" subcommand is or
why some subcommands should be more proper than others, or why we
should even be concerned about whether something is a subcommand at
all. I think it's fine and useful to have triggers that fire in those
kinds of places, but I don't see why we should limit ourselves to that.
For applications like replication, auditing, and enhanced security, the
parse tree and subcommand/non-subcommand status of a particular
operation are irrelevant. What you need is an exact description of the
operation that got performed (e.g. the default on table X column Y got
dropped); you might be able to reverse-engineer that from the parse
tree, but it's much better to have the system pass you the information
you need more directly. Certainly, there are cases where you might want
to have the parse tree, or even the raw command text, available, but
I'm not even convinced that those cases will be the most commonly used
ones unless, of course, they're the only ones we offer, in which case
everyone will go down that path by necessity.

>> your trigger procedure, the set of magic variables that is available
>> will depend on which toplevel event you set the trigger on, but
>> hopefully all firings of that toplevel event can provide the same
>> magic variables. For example, at ddl_command_start time, you're just
>> gonna get the command tag, but at ddl_command_end time you will get
>> the command tag plus maybe some other stuff.
>
> With my proposal above, you could get the same set of information when
> being called as a toplevel event or a subevent (one where the context
> is not null). That would mean adding object name and schema name
> lookups in the drop cascade code, though. We can also decide not to do
> that extra lookup and just publish the object id, which we certainly
> do have.
>
> This way, the timing spec of a sub-event can still be of the same kind
> as the top-level event ones, we still have before and after lock entry
> points, same with lookup if we add that feature, etc.

Again, I'm not understanding the distinction between toplevel events
and sub-events. I don't see any need for such a distinction. I think
there are just events, and some of them happen at command start/end and
others happen somewhere in the middle. As long as it's a safe and
useful place to fire a trigger, who cares?

> Given the scope of this mini expression language, we can easily bypass
> calling the executor in v1 here, and reconsider later if we want to
> allow calling a UDF in the WHEN clause… I don't think it's an easy
> feature to add in, though.

Or a necessary one. AFAICS, the main benefit of WHEN clauses on regular
triggers is that you can prevent the AFTER trigger queue from getting
huge, and maybe save a little on the cost of invoking a trigger
function just to exit again. But neither of those should be relevant
here - nobody does that much DDL, and anybody writing command triggers
should understand that this is advanced magic not intended for
beginners. Wizards below level 10 need not apply.

BTW, in reference to your other email, the reason I suggested
ddl_command_start rather than just command_start is because we already
know that there are cases here we'll never be able to handle, like
before-begin. I think it's better to be explicit about what the scope
of the feature is (i.e. DDL) rather than just calling it command_start
and, oh, by the way, the commands we either don't know how to support
or didn't get around to supporting yet aren't included. The latter
approach is basically a guaranteed compatibility break every time we
get around to adding a few more commands, and I'm not keen on that.
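Putting those last two points together, the sort of thing I expect
people to write looks roughly like this (again just a sketch: the
tg_tag variable and the exact CREATE EVENT TRIGGER syntax are
assumptions about where the patch ends up, not what it implements
today); any filtering by command tag simply happens inside the
function, and the event itself is explicitly scoped to DDL:

CREATE FUNCTION log_ddl_start() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- tg_tag is the command tag of the DDL statement about to run.
    IF tg_tag IN ('CREATE EXTENSION', 'ALTER EXTENSION') THEN
        RAISE NOTICE 'about to execute: %', tg_tag;
    END IF;
END;
$$;

CREATE EVENT TRIGGER ddl_start_logger ON ddl_command_start
    EXECUTE PROCEDURE log_ddl_start();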
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company