Re: Command Triggers patch v18 - Mailing list pgsql-hackers
From | Dimitri Fontaine |
---|---|
Subject | Re: Command Triggers patch v18 |
Date | |
Msg-id | 87vclm5w0t.fsf@hi-media-techno.com Whole thread Raw |
In response to | Re: Command Triggers patch v18 (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Command Triggers patch v18
|
List | pgsql-hackers |
Robert Haas <robertmhaas@gmail.com> writes: > I'm thinking of things like extension whitelisting. When some > unprivileged user says "CREATE EXTENSION harmless", and harmless is > marked as superuser-only, we might like to have a hook that gets > called *at permissions-checking time* and gets to say, oh, well, that > extension is on the white-list, so we're going to allow it. I think > you can come up with similar cases for other commands, where in > general the operation is restricted to superusers or database owners > or table owners but in specific cases you want to allow others to do > it. I did that another way in previous incarnations of the patch, which was to allow for INSTEAD OF event trigger backed by a SECURITY DEFINER function. When the extension is whitelisted, prevent against recursion then CREATE EXTENSION in the security definer function, then signal that the execution should now be aborted. That was too dangerous given the lack of policy about where exactly the user code is fired, but I think we could now implement that for some of the event timing specs we're listing. Only some of them, I guess only those that are happening before we lock the objects. I would then prefer using the INSTEAD OF words that are way more easy to grasp than AT. >>>>> CREATE EVENT TRIGGER name ON event_name (event_subtype_name [, ...]) >>>>> EXECUTE PROCEDURE function_name(args); >>>> >>>> create event trigger prohibit_some_ddl >>>> preceding <timing spec> >>>> when tag in ('CREATE TABLE', 'ALTER TABLE') >>>> execute procedure throw_an_error(); > > I guess that would make sense if you think there would ever be more > than one choice for <trigger variable>. I'm not immediately seeing a > use case for that, though - I was explicitly viewing the syntax foo So, the variables in question are tag, objectid, objectname, schemaname and from a very recent email context. On reflexion, I think the variable here would only be either tag or context, and that's it. > More generally, my thought on the structure of this is that you're > going to have certain toplevel events, many of which will happen at > only a single place in the code, like "an object got dropped" or "a > DDL command started" or "a DDL command ended". So we give those > names, like sql_drop, ddl_command_start, and ddl_command_end. Inside I really dislike mixing sql_drop and ddl_command_start as being the same kind of objects here, even if I can bend my head in the right angle and see that it's a fair view when looking at how it's implemented. I can't see a way to explain that to users without having to explain them how drop cascade is implemented. So my proposal here is to “fake” a “proper“ subcommand thanks to the new context variable. If you DROP TYPE foo CASCADE and that in turn drops a function foo_in(), then an event trigger is fired with context = 'DROP TYPE' tag = 'DROP FUNCTION' Same idea when you DROP TABLE … CASCADE and a SEQUENCE and a bunch of index need to disappear too, you get an usual event trigger fired with the context set to 'DROP TABLE' this time. I don't think we need to arrange for explicitly publishing the context specific information here. If we need to, we have to find the right timing spec where we can guarantee still being in the top level command and where we already have the details filled in, then users can attach a trigger here and register the information for themselves. > your trigger procedure, the set of magic variables that is available > will depend on which toplevel event you set the trigger on, but > hopefully all firings of that toplevel event can provide the same > magic variables. For example, at ddl_command_start time, you're just > gonna get the command tag, but at ddl_command_end time you will get > the command tag plus maybe some other stuff. With my proposal above, you could get the same set of information when being called as a toplevel event or a subevent (one where the context is not null). That would mean adding object name and schema name lokkups in the drop cascade code, though. We can also decide not to do that extra lookup and just publish the object id which we certainly do have. This way, the timing spec of a sub-event can still be of the same kind as the top-level event ones, we still have before and after lock entry points, same with lookup if we add that feature, etc. > Now, we COULD stop there. I mean, we could document that you can > create a trigger on ddl_command_start and every DDL command will fire > that trigger, and if the trigger doesn't care about some DDL > operations, then it can look at the command tag and return without > doing anything for the operations it doesn't care about. The only > real disadvantage of doing it that way is speed, and maybe a bit of > code complexity within the trigger. So my further thought was that Within my “context proposal”, you also lose the ability to refer to sub events as plain events with a context, which I find so much cleaner. > we'd allow you to specify an optional filter list to restrict which > events would fire the trigger, and the exact meaning of that filter > list would vary depending on the toplevel event you've chosen - i.e. > for ddl_command_start, the filter would be a list of commands, but for > sql_drop it would be a list of object types. We could make that more > complicated if we think that an individual toplevel event will need > more than one kind of filtering. For example, if you wanted to filter > sql_drop events based on the object type AND/OR the schema name, then > the syntax I proposed would obviously not be adequate. I'm just not > convinced we need that, especially because you'd then need to set up a > dependency between the command trigger and the schema, since obviously > the command trigger must be dropped if the schema is. Maybe there are > better examples; I just can't think of any. This WHEN syntax allows us to add support for more things in the way you're describing down the road, and as you say for first version of that we can limit ourselves to only supporting two variables, context and tag, and a single operation, in against a list of string literals. Thinking some more about it we need to add the AND operator in between several expressions here. We don't need to implement OR because that's easy to implement by creating more than one event trigger that calls the same function. We could choose to implement not in, too. Given the scope of this mini expression language, we can easily bypass calling the executor in v1 here, and reconsider later if we want to allow calling a UDF in the WHEN clause… I don't think it's an easy feature to add in, though. >> The namespace is often resolved first, sometime even before permission >> checks are done. See CreateConversionCommand() for an example of that, >> but really that happens about everywhere IIRC. > > Ah! I was thinking of ALTER commands, but you're right: CREATE > commands are different. In many ways a CREATE command is more like an > operation on the schema than it is an operation on the object itself > (since that doesn't exist yet). Those all need to have permissions > checks and locking *on the schema* intertwined with the schema lookup, > as RangeVarGetAndCheckCreationNamespace() does. Unfortunately, the > equivalent code for non-relations is broken, and I didn't have time to > fix it in time for 9.2. I think that would be a good thing to get > done for 9.3, though: it's only about 2 days worth of work, I think. Meanwhile we can easily enough refrain from exposing those timing specs to CREATE command events, though. > Yeah. This might even be a case where we should write the > documentation first and then make the implementation match it, rather > than the other way around. I still think we might want to double check how the code is actually written in each case (create vs alter vs drop, relation commands vs other types of objects, etc), so that will end up intertwined. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
pgsql-hackers by date: