Re: Event Triggers: adding information - Mailing list pgsql-hackers
From: Robert Haas
Subject: Re: Event Triggers: adding information
Msg-id: CA+TgmoYxRtC-w1R1HTLqE3s0CB6NtRY04JK+TSM1cCLbtrQAZg@mail.gmail.com
In response to: Re: Event Triggers: adding information (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Responses: Re: Event Triggers: adding information
List: pgsql-hackers
On Thu, Jan 17, 2013 at 4:43 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> Goal: Every time an ALTER command is used on object *that actually
>> exists*, we will call some user-defined function and pass the object
>> type, the OID of the object, and some details about what sort of
>> alteration the user has requested.
>
> Ok, in current terms of the proposed patch, it's a ddl_command_end event
> trigger, and you can choose when to fire it in utility.c. The current
> place is choosen to be just after the AlterTable() call,

/me scratches my head. Depends. What I described would fire before the command was executed, not after, so it's not quite the same thing, but it might be just as good for your purposes. It wasn't so much intended as "we should do this exact thing" as much as "this is the kind of thought process that you need to go through if you want to safely run an event trigger in the middle of another command".

Now, maybe we don't actually need to do that to serve your purposes. As already discussed, it seems like passing information to ddl_command_end is for most purposes sufficient for what you need - and we can do that without any special gymnastics, because that way it runs completely after the command, not in the middle. If I understand correctly, and I may not, there are basically two problems with that approach:

(1) It's not exactly clear how to pass the information about the statement through to the ddl_command_end trigger. I know you've proposed a couple of things, but none of them seem completely satisfying. At least not to me.

(2) If the method of transmission is "the OIDs of the affected objects", which I believe is your current proposal, the ddl_command_end trigger can't go look those objects up in the catalog, because the catalog entries are already gone by that point.

It's possible we could solve both of those problems without running event triggers in the middle of commands at all.
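For readers following the thread, the ddl_command_end shape under discussion can be made concrete with a minimal sketch in the event-trigger syntax of this patch series (the function and trigger names here are illustrative, not from the patch):

```sql
-- A minimal ddl_command_end event trigger: the function runs after the
-- DDL command has completed, so catalog changes made by the command are
-- already visible - and, for drops, the entries are already gone.
CREATE OR REPLACE FUNCTION log_ddl_end() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- TG_TAG carries the command tag, e.g. 'ALTER TABLE'
    RAISE NOTICE 'ddl_command_end fired for: %', tg_tag;
END;
$$;

CREATE EVENT TRIGGER log_ddl
    ON ddl_command_end
    EXECUTE PROCEDURE log_ddl_end();
```

Point (2) above is visible directly in this sketch: by the time the function body runs, a dropped object's OID no longer resolves against the catalogs.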
In which case, my whole example is moot for now. But I think it would be smart to get ddl_command_end (without any additional information) committed first, and then argue about these details, so that we at least make some definable progress here.

> because that
> sounded logical if you believe in CommandCounterIncrement.

I'm somewhat bemused by this comment. I don't think CommandCounterIncrement() is an article of faith. If you execute a command to completion, and do a CommandCounterIncrement(), then whatever you do next will look just like a new command, so it should be safe to run user-provided code there. But if you stop in the middle of a command, do a CommandCounterIncrement(), run some user-provided code, and then continue on, the CommandCounterIncrement() by itself protects you from nothing. If the code is not expecting arbitrary operations to be executed at that point, and you execute them, you get to keep both pieces. Indeed, there are some places in the code where inserting a CommandCounterIncrement() *by itself* could be unsafe. I don't believe that's a risk in ProcessUtility(), but that doesn't mean there aren't any risks in ProcessUtility().

>> So, instead, what we need to do is go into each function that
>> implements ALTER, and modify it so that after it does the dance where
>> we check permissions, take locks, and look up names, we call the
>> trigger function. That's a bit of a nuisance, because we probably
>> have a bunch of call sites to go fix, but in theory it should be
>> doable. The problem of course, is that it's not intrinsically safe to
>> call user-defined code there. If it drops the object and creates a
>> new one, say, hilarity will probably ensue.
>
> You're trying to find a dangerous point when to fire the trigger here,

Yes, I am. I'm not asking you to implement what I proposed there - I'm just giving an example of how to make a dangerous place safe. You've ALSO found a dangerous point when to fire the trigger.
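The safe ordering being argued for - run the command to completion, then CommandCounterIncrement(), then user code - can be sketched in C-style pseudocode (the first two names follow the backend's actual functions; RunEventTriggers is a hypothetical stand-in, not the real call sequence):

```c
/* Pseudocode sketch of the "safe" ordering: user-provided trigger code
 * runs only after the command is fully complete and a
 * CommandCounterIncrement() has made its effects visible, so the trigger
 * sees exactly the world a subsequent command would see. */
standard_ProcessUtility(parsetree, ...);  /* run the command to completion */
CommandCounterIncrement();                /* its effects are now visible   */
RunEventTriggers(DDL_COMMAND_END, ...);   /* user code runs as if it were
                                           * a brand-new command           */
```

The unsafe variant is the same three calls with the last two hoisted into the middle of standard_ProcessUtility(): the CommandCounterIncrement() still happens, but the surrounding command is not expecting arbitrary catalog changes at that point.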
The difference is that my example dangerous point is probably fixable to be safe, whereas your actual dangerous point is probably unfixable, because there are many current paths of execution that go through that function in an extremely ad-hoc fashion, and there may be more in the future. ddl_command_start and ddl_command_end are safe enough *when restricted to toplevel commands*, but that's about as far as it goes. Perhaps there are other special cases that are also safe, but just throwing everything into the pot with no analysis and no future-proofing certainly isn't.

>> Now, there is a further problem: all of the information about the
>> ALTER statement that we want to provide to the trigger may not be
>> available at that point. Simple cases like ALTER .. RENAME won't
>> require anything more than that, but things like ALTER TABLE .. ADD
>> FOREIGN KEY get tricky, because while at this point we have a solid
>> handle on the identity of the relation to which we're adding the
>> constraint, we do NOT yet have knowledge of or a lock on the TARGET
>> relation, whose OID could still change on us up to much later in the
>> computation. To get reliable information about *that*, we'll need to
>> refactor the sequence of events inside ALTER TABLE.
>
> The only valid answer here has already been given by Tom. You can only
> provide the information if it's available in the catalogs. So again,
> it's a ddl_command_end event. It can not happen before.

I don't see why you say that. It's perfectly possible to have that information available at the right time; the work just hasn't been done yet.

>> Hopefully you can see what I'm driving toward here. In a design like
>> this, we can pretty much prove - after returning from the event
>> trigger - that we're in a state no different from what would have been
>> created by executing a series of commands - lock table, then SELECT
>> event_trigger_func(), then the actual ALTER in sequence.
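One way to keep a trigger's exposure narrow while this analysis is pending is the patch series' tag filter: a trigger can be limited to particular command tags rather than firing for everything. A sketch (again with illustrative names):

```sql
-- Sketch: scope an event trigger to specific command tags, so it fires
-- only for the ALTER variants under discussion rather than for all DDL.
CREATE OR REPLACE FUNCTION log_ddl_start() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    RAISE NOTICE 'ddl_command_start fired for: %', tg_tag;
END;
$$;

CREATE EVENT TRIGGER audit_alters
    ON ddl_command_start
    WHEN TAG IN ('ALTER TABLE', 'ALTER SEQUENCE')
    EXECUTE PROCEDURE log_ddl_start();
```

Note that WHEN TAG IN narrows *which* commands fire the trigger; it does not by itself address the toplevel-versus-nested safety question raised above.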
>> Maybe
>> there's a flaw in that analysis - that's what patch review is for -
>> but it sounds pretty darn tight to me.
>
> The only case when we need to do a lookup BEFORE actually running the
> command is when that command is a DROP, because that's the only timing
> when the information we want is still in the catalogs.
>
> So that's the only case where we do a double object lookup, and we take
> a ShareUpdateExclusiveLock lock on the object when doing so, so that we
> can't lose it from another concurrent activity.

As noted, I don't think that's sufficient to really guarantee no shenanigans. One idea would be to save off all the names of the objects actually dropped at the time the drops are done, and then pass it to the event trigger afterward. But frankly I think that's way beyond the scope of what we should try to get done for 9.3.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
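As a historical footnote for archive readers: this "save the dropped objects' names at drop time, report them afterward" idea is essentially the shape that did ship later in the 9.3 cycle, as the sql_drop event and pg_event_trigger_dropped_objects(). A sketch in that released interface:

```sql
-- Sketch: an sql_drop trigger runs after the drops are done, but the
-- backend has saved the identities of the dropped objects, which the
-- trigger reads back via pg_event_trigger_dropped_objects().
CREATE OR REPLACE FUNCTION report_drops() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    obj record;
BEGIN
    FOR obj IN SELECT * FROM pg_event_trigger_dropped_objects()
    LOOP
        RAISE NOTICE 'dropped % %', obj.object_type, obj.object_identity;
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER report_drops ON sql_drop
    EXECUTE PROCEDURE report_drops();
```

This sidesteps the concurrency concern above: the trigger never re-looks-up the object, it only consumes the list the backend recorded while performing the drop.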