Thread: sql_drop Event Trigger
Hi, I took the liberty of creating a new thread for $subject, because the main comment I've been receiving about Event Triggers at this point is how difficult it is to follow our discussions about them. So that everybody interested can easily get the important bits of information from this patch series and its review, I'm now going to work on a wiki page that I will update as needed. The important messages to refer to for more context in this thread are: http://www.postgresql.org/message-id/m21udesaz3.fsf@2ndQuadrant.fr http://www.postgresql.org/message-id/CA+TgmoZz6MXQ5zX6dopc_xaHVkdwhEhgDFJeAWsRNs+N7e_ueA@mail.gmail.com So please find attached to this email an implementation of the sql_drop event trigger, which refrains from exposing any new information to the users. COLUMNS=72 git diff --stat master.. doc/src/sgml/event-trigger.sgml | 98 +++++++- doc/src/sgml/func.sgml | 47 +++- src/backend/catalog/dependency.c | 7 + src/backend/commands/event_trigger.c | 233 ++++++++++++++++++- src/backend/tcop/utility.c | 23 +- src/backend/utils/cache/evtcache.c | 2 + src/include/catalog/pg_proc.h | 4 +- src/include/commands/event_trigger.h | 18 ++ src/include/utils/builtins.h | 3 + src/include/utils/evtcache.h | 3 +- src/test/regress/expected/event_trigger.out | 53 +++++ src/test/regress/sql/event_trigger.sql | 37 +++ 12 files changed, 519 insertions(+), 9 deletions(-) The implementation follows Robert's ideas in that we accumulate information about the objects we are dropping and then provide it to the Event Trigger User Function. The way to provide it is through a Set Returning Function called pg_dropped_objects(), which is only available while running a "sql_drop" event trigger. 
This function returns a set of classid, objid, objsubid (as in pg_depend), but you have to remember that you can't use them for catalog lookups, as the objects are already dropped: as far as I understand it, we can't simultaneously decide not to add any concurrency hazards, new lookups, locking, or extra work in dependency.c *and* get at the OIDs of the DROP CASCADEd objects before the drop happens. I hope to complement the information available in a follow-up patch, where I intend to provide object name, id, kind, schema name, and operation name for all supported operations. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
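To make the proposed interface concrete, here is a hedged sketch of how a user might consume pg_dropped_objects() from a sql_drop event trigger, under the assumption that the function returns (classid, objid, objsubid) columns as described above. The table name "audit_dropped" and the exact column list are illustrative, not part of the patch:

```sql
-- Hypothetical usage sketch of the proposed pg_dropped_objects() SRF.
-- Assumes the patch under discussion is applied; names are illustrative.
CREATE TABLE audit_dropped (
    classid   oid,
    objid     oid,
    objsubid  integer,
    dropped_at timestamptz DEFAULT now()
);

CREATE OR REPLACE FUNCTION log_dropped_objects()
RETURNS event_trigger LANGUAGE plpgsql AS $$
BEGIN
    -- pg_dropped_objects() is only callable from a sql_drop trigger;
    -- the OIDs it returns can NOT be used for catalog lookups, since
    -- the objects are already gone by the time this fires.
    INSERT INTO audit_dropped (classid, objid, objsubid)
    SELECT classid, objid, objsubid FROM pg_dropped_objects();
END;
$$;

CREATE EVENT TRIGGER log_drops ON sql_drop
    EXECUTE PROCEDURE log_dropped_objects();
```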
Attachment
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes: > So please find attached to this email an implementation of the sql_drop > event trigger, that refrains on exposing any new information to the > users. Already a v2 of that patch: per comments from Álvaro, I reuse the ObjectAddresses facility rather than building my own List of object addresses. Note that with that in place it's now easy to also add support for the DROP OWNED BY command, but that's left for a future patch, as I expect some amount of discussion about it. Also, I removed the code that was doing deduplication of the object addresses we collect, now trusting performMultipleDeletions() not to screw us up. There's a use case that needs particular attention here, though: DROP TABLE foo, foo; I'm not sure we want to deduplicate foo in the pg_dropped_objects() output in that case, so I've not done so in this version of the patch. Also, Álvaro is concerned that the cost of deduplicating might be higher than what we want to take on here. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Attachment
On Wed, Jan 30, 2013 at 12:59 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes: >> So please find attached to this email an implementation of the sql_drop >> event trigger, that refrains on exposing any new information to the >> users. > > Already a v1 of that patch, per comments from Álvaro I reuse the > ObjectAddresses facility rather than building my own List of object > addresses. > > Note that with that in place it's now easy to also add support for the > DROP OWNED BY command, but that's left for a future patch as I expect > some amount of discussion to go about it. > > Also, I removed the code that was doing de-deduplication of the object > addresses we collect, now trusting performMultipleDeletions() not to > screw us up. There's a use case that needs particular attention here, > though: > > DROP TABLE foo, foo; > > I'm not sure we want to deduplicate foo in the pg_dropped_objects() > output in that case, so I've not done so in this version of the patch. > Also, Álvaro is concerned that the cost of deduplicating might be higher > than what we want to take here. Taking a first look at this, I think the idea of pg_dropped_objects() is really pretty clever. I like it. I assume we will end up with several functions of this type eventually, so it might be good to adopt some kind of distinguishing naming convention for this type of function. pg_event_trigger_context_dropped_objects() seems far too verbose, but that's the idea. With this approach, there's no real need to introduce a new event type. We could just make ddl_command_end triggers able to use this, and we're done. The point of sql_drop was that it would have been called once *per dropped object*, not once per command. But, actually, thinking about this, for something like Slony, exposing pg_dropped_objects() to ddl_command_end triggers should be just as good, and maybe a whole lot better, than what I was proposing. 
Does it work for you to rip out sql_drop and just make pg_dropped_objects() - perhaps renamed - available in ddl_command_end? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Taking a first look at this, I think the idea of pg_dropped_objects() > is really pretty clever. I like it. I assure we will end up with > several functions of this type eventually, so it might be good to > adopt some kind of distinguishing naming convention for this type of > function. pg_event_trigger_context_dropped_objects() seems far too > verbose, but that's the idea. Thanks. Agreed that we will have more of them. In the attached version 3 of the patch, it got renamed to pg_event_trigger_dropped_objects(). > With this approach, there's no real need to introduce a new event > type. We could just make ddl_command_end triggers able to use this, > and we're done. The point of sql_drop was that it would have been > called once *per dropped object*, not once per command. But, Well, from the beginning of the sql_drop discussion, it's been clear that it's meant to allow users to easily attach their function to any drop that might happen, whatever the command at the origin of that drop. So in the attached, I've made sql_drop an event called once per object, which means that currently you don't even know whether an object is being dropped as part of a DROP TABLE command. > actually, thinking about this, for something like Slony, exposing > pg_dropped_objects() to ddl_command_end triggers should be just as > good, and maybe a whole lot better, than what I was proposing. It also changes the protocol for getting at the information related to the objects. I think we will have to grow the out parameters of the function to include the additional information we're going to make available as TG_* variables in the next patch of the series. > Does it work for you to rip out sql_drop and just make > pg_dropped_objects() - perhaps renamed - available in ddl_command_end? 
I did rename pg_dropped_objects() and it's now available in ddl_command_end, but I kept the new "sql_drop" event, which is now called once per object dropped, as intended (but without any useful information yet). What do you think? -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
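For reviewers following along, the difference between the two event types being discussed can be sketched as follows. This is an illustrative example assuming the v3 patch semantics described above (function renamed to pg_event_trigger_dropped_objects(), available in ddl_command_end; sql_drop fired once per dropped object); it is not taken from the patch itself:

```sql
-- Sketch, assuming v3 patch semantics.

-- Fires once per command; the SRF reports every object the command dropped,
-- including objects removed via DROP ... CASCADE.
CREATE OR REPLACE FUNCTION report_drops_at_end()
RETURNS event_trigger LANGUAGE plpgsql AS $$
DECLARE r record;
BEGIN
    FOR r IN SELECT * FROM pg_event_trigger_dropped_objects() LOOP
        RAISE NOTICE 'dropped: classid=% objid=% objsubid=%',
                     r.classid, r.objid, r.objsubid;
    END LOOP;
END;
$$;
CREATE EVENT TRIGGER report_drops ON ddl_command_end
    EXECUTE PROCEDURE report_drops_at_end();

-- By contrast, a sql_drop trigger (as proposed) fires once per dropped
-- object, regardless of which command caused the drop.
CREATE OR REPLACE FUNCTION on_each_drop()
RETURNS event_trigger LANGUAGE plpgsql AS $$
BEGIN
    RAISE NOTICE 'an object was dropped';
END;
$$;
CREATE EVENT TRIGGER per_object_drop ON sql_drop
    EXECUTE PROCEDURE on_each_drop();
```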
Attachment
On Mon, Feb 4, 2013 at 11:59 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Thanks. Agreed that we will have more of them. In the attached version 3 > of the patch, it got renamed to pg_event_trigger_dropped_objects(). Works for me. >> With this approach, there's no real need to introduce a new event >> type. We could just make ddl_command_end triggers able to use this, >> and we're done. The point of sql_drop was that it would have been >> called once *per dropped object*, not once per command. But, > > Well, from the beginning of the sql_drop discussion, it's been clear > that it's meant to allow for users to easily attach their function to > any drop that might appear, whatever the command at origin of that drop. What precludes us from doing that in ddl_command_end? ISTM we can just extend the ddl_command_start/end triggers to a slightly broader range of commands and be done with it. >> actually, thinking about this, for something like Slony, exposing >> pg_dropped_objects() to ddl_command_end triggers should be just as >> good, and maybe a whole lot better, than what I was proposing. > > It also changes the protocol to use for getting at the information > related to the objects. I think we will have to have the out parameters > of the function to grow to include the next information we're going to > make available to TG_* variables in the next patch of the series. > >> Does it work for you to rip out sql_drop and just make >> pg_dropped_objects() - perhaps renamed - available in ddl_command_end? > > I did rename pg_dropped_objects() and it's now available in > ddl_command_end, but I kept the new "sql_drop" event, which is now > called once per object dropped, as intended (but without any useful > information yet). > > What do you think? 
Well, having spent a year or more trying to convince you that we need sql_drop - mostly because of the complexities of passing an array of arguments to the trigger function - I now think we don't, because the pg_event_trigger_dropped_objects() bit solves that problem rather elegantly. It seems to me that with just a little bit of hacking we should be able to make this work by adding that function and changing nothing else. I might be wrong, of course. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Well, having spent a year or more trying to convince you that we need > sql_drop - mostly because of the complexities of passing an array of > arguments to the trigger function - I now think we don't, because the > pg_event_trigger_dropped_objects() bit solves that problem rather > elegantly. It seems to me with just a little bit of hacking we should > be able to make this work by adding that function and changing nothing > else. I might be wrong, of course. It's not exactly true that we don't need to add anything other than the function to support the feature, but it's not that much either. Please see attached. doc/src/sgml/event-trigger.sgml | 15 +- doc/src/sgml/func.sgml | 48 ++++++- src/backend/access/transam/xact.c | 8 +- src/backend/catalog/dependency.c | 38 +---- src/backend/commands/event_trigger.c | 123 ++++++++++++++++- src/backend/tcop/utility.c | 28 +++- src/include/catalog/dependency.h | 31 ++++- src/include/catalog/pg_proc.h | 4 +- src/include/commands/event_trigger.h | 21 +++ src/include/utils/builtins.h | 3 + src/test/regress/expected/event_trigger.out | 67 ++++++++- src/test/regress/sql/event_trigger.sql | 39 +++++- 12 files changed, 374 insertions(+), 51 deletions(-) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Attachment
And already a v2. Álvaro did spot a line I removed by mistake in the docs, plus some extra whitespace changes that pgindent will change anyway and that as such I shouldn't force you to read and discard. It's a 3-line change set from before. Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes: > Robert Haas <robertmhaas@gmail.com> writes: >> Well, having spent a year or more trying to convince you that we need >> sql_drop - mostly because of the complexities of passing an array of >> arguments to the trigger function - I now think we don't, because the >> pg_event_trigger_dropped_objects() bit solves that problem rather >> elegantly. It seems to me with just a little bit of hacking we should >> be able to make this work by adding that function and changing nothing >> else. I might be wrong, of course. > > It's not exactly like we don't need to add anything else than just the > function to support the feature, but it's not that much either. Please > see attached. > > doc/src/sgml/event-trigger.sgml | 15 +- > doc/src/sgml/func.sgml | 48 ++++++- > src/backend/access/transam/xact.c | 8 +- > src/backend/catalog/dependency.c | 38 +---- > src/backend/commands/event_trigger.c | 123 ++++++++++++++++- > src/backend/tcop/utility.c | 28 +++- > src/include/catalog/dependency.h | 31 ++++- > src/include/catalog/pg_proc.h | 4 +- > src/include/commands/event_trigger.h | 21 +++ > src/include/utils/builtins.h | 3 + > src/test/regress/expected/event_trigger.out | 67 ++++++++- > src/test/regress/sql/event_trigger.sql | 39 +++++- > 12 files changed, 374 insertions(+), 51 deletions(-) > > Regards, > -- > Dimitri Fontaine > http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Attachment
Robert Haas escribió: > On Mon, Feb 4, 2013 at 11:59 AM, Dimitri Fontaine > <dimitri@2ndquadrant.fr> wrote: > > Thanks. Agreed that we will have more of them. In the attached version 3 > > of the patch, it got renamed to pg_event_trigger_dropped_objects(). > > Works for me. > > >> With this approach, there's no real need to introduce a new event > >> type. We could just make ddl_command_end triggers able to use this, > >> and we're done. The point of sql_drop was that it would have been > >> called once *per dropped object*, not once per command. But, > > > > Well, from the beginning of the sql_drop discussion, it's been clear > > that it's meant to allow for users to easily attach their function to > > any drop that might appear, whatever the command at origin of that drop. > > What precludes us from doing that in ddl_command_end? ISTM we can > just extend the ddl_command_start/end triggers to a slightly broader > range of commands and be done with it. I thought there was the idea that the list of objects to drop was to be acquired before actually doing the deletion; so that the trigger function could, for instance, get the name of the table being dropped. I don't see that it works if we only provide pg_event_trigger_dropped_objects to ddl_command_end events. Am I missing something? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Dimitri Fontaine escribió: > And already a v1. > > Álvaro did spot a line I did remove by mistake in the docs, and some > extra whitespace changes that pgindent will change anyway and that as > such I shouldn't force you to read and discard. The bigger change I mentioned was the stuff in dependency.c -- I wasn't too happy about exposing the whole ObjectAddresses stuff to the outside world. The attached version only exposes simple accessors to let an external user iterate over such arrays. Some more commentary is probably needed on those new functions. Also, if we're going to extend things in this way, we probably need to get "extra" out alongside the object array itself. A larger issue with the patch is the handling of subxacts. A quick test doesn't reveal any obvious misbehavior, but holding the list of dropped objects in a global variable might be problematic. What if, say, the event trigger function does something funny in an EXCEPTION block and it fails? Some clever test case is needed here, I think. Also, if we reset the variable at EOXact, do we also need to do something at EOSubXact? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > I thought there was the idea that the list of objects to drop was to be > acquired before actually doing the deletion; so that the trigger > function could, for instance, get the name of the table being dropped. > I don't see that it works if we only provide > pg_event_trigger_dropped_objects to ddl_command_end events. Am I > missing something? Tom and Robert have been rightfully insisting on how delicate it has been to come up with the right behavior for performMultipleDeletions, and that's not something we can easily reorganise. So the only way to get at the information seems to be what Robert insisted that I do, that is, storing the information about the objects being dropped for later processing. Of course, later processing means that the objects are already dropped and that you can't do much. The idea is to provide more than just the OID of each object; we have yet to decide whether adding a catalog cache lookup within performMultipleDeletions() is ok. If it is, we will extend the pg_event_trigger_dropped_objects() definition to also return the object name and its schema name, at a minimum. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > The bigger change I mentioned was the stuff in dependency.c -- I wasn't > too happy about exposing the whole ObjectAddresses stuff to the outside > world. The attached version only exposes simple accessors to let an > external user of that to iterate on such arrays. Some more commentary > is probably needed on those new functions. Also, if we're going to > extend things in this way we probably need to get "extra" out alongside > the object array itself. Thanks for that. I added some comments on those new functions, and made it so that we call get_object_addresses_numelements() only once per loop rather than at each step in the loop. See attached. > A larger issue with the patch is handling of subxacts. A quick test > doesn't reveal any obvious misbehavior, but having the list of objects > dropped by a global variable might be problematic. What if, say, the > event trigger function does something funny in an EXCEPTION block and it > fails? Some clever test case is needed here, I think. Also, if we > reset the variable at EOXact, do we also need to do something at > EOSubXact? Now that you mention it, the case I'd be worried about is: BEGIN; SAVEPOINT a; DROP TABLE foo; ROLLBACK TO SAVEPOINT a; DROP TABLE bar; COMMIT; As we currently have no support for on-commit triggers or the like, the user function is going to run "as part of" the DROP TABLE foo; command, and its effects are all going to disappear at the next ROLLBACK TO SAVEPOINT anyway. If the event trigger user function fails in an EXCEPTION block, I suppose that the whole transaction is going to get a ROLLBACK, which will call AbortTransaction() or CleanupTransaction(), which will reset the static variable EventTriggerSQLDropInProgress. And the list itself goes away with the memory context reset. I think the only missing detail is to force EventTriggerSQLDropList back to NULL from within AtEOXact_EventTrigger(), and I've done so in the attached. 
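The subtransaction scenario above can be spelled out as a self-contained script, which could serve as a starting point for the "clever test case" Álvaro asks for. This is a sketch assuming the patch's sql_drop trigger is installed and merely raises a NOTICE; the trigger and table names are illustrative:

```sql
-- Sketch of a subxact test case for the sql_drop machinery.
-- Assumes an event trigger on sql_drop that raises a NOTICE per drop.
BEGIN;
CREATE TABLE foo (i int);
CREATE TABLE bar (i int);

SAVEPOINT a;
DROP TABLE foo;              -- trigger fires here, inside the subxact
ROLLBACK TO SAVEPOINT a;     -- foo is back; the trigger's side effects
                             -- (and the drop list) must be gone too

DROP TABLE bar;              -- trigger must fire again, seeing only bar,
                             -- with no leftover entry for foo
COMMIT;
```

The point to verify is that rolling back the subtransaction leaves no stale entries in the global dropped-objects list for the subsequent DROP to pick up.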
As we're only looking at the list when the protecting boolean is true, I don't think it offers anything other than clarity, which makes it worthwhile already. You will find both the patch-on-top-of-your-patch (named .2.b) and the new whole patch attached (named .3), as that makes things way easier IME. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Attachment
Dimitri Fontaine escribió: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > I thought there was the idea that the list of objects to drop was to be > > acquired before actually doing the deletion; so that the trigger > > function could, for instance, get the name of the table being dropped. > > I don't see that it works if we only provide > > pg_event_trigger_dropped_objects to ddl_command_end events. Am I > > missing something? > > Tom and Robert have been rightfully insisting on how delicate it has > been to come up with the right behavior for performMultipleDeletions, > and that's not something we can easily reorganise. Well, I don't necessarily suggest that. But how about something like this in performMultipleDeletions:

	/*
	 * Fire event triggers for all objects to be dropped
	 */
	if (EventTriggerSQLDropInProgress)
	{
		for (i = 0; i < targetObjects->numrefs; i++)
		{
			ObjectAddress *thisobj;

			thisobj = targetObjects->refs + i;

			if (EventTriggerSQLDropInProgress &&
				EventTriggerSupportsObjectType(getObjectClass(thisobj)))
			{
				add_exact_object_address(thisobj, EventTriggerSQLDropList);
			}
		}

		/* invoke sql_drop triggers */
		EventTriggerSQLDrop();

		/* EventTriggerSQLDropList remains set for ddl_command_end triggers */
	}

	/* and delete them */
	for (i = 0; i < targetObjects->numrefs; i++)
	{
		ObjectAddress *thisobj;

		thisobj = targetObjects->refs + i;

		deleteOneObject(thisobj, &depRel, flags);
	}

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Well, I don't necessarily suggest that. But how about something like > this in performMultipleDeletions: [edited snippet of code] > /* invoke sql_drop triggers */ > EventTriggerSQLDrop(); > > /* EventTriggerSQLDropList remains set for ddl_command_end triggers */ > } > > /* and delete them */ > for (i = 0; i < targetObjects->numrefs; i++) ... > deleteOneObject(thisobj, &depRel, flags); My understanding of Tom's and Robert's comments is that it is very unsafe to run random user code at this point, so that cannot be an Event Trigger call point. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine escribió: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Well, I don't necessarily suggest that. But how about something like > > this in performMultipleDeletions: > > [edited snippet of code] > > > /* invoke sql_drop triggers */ > > EventTriggerSQLDrop(); > > > > /* EventTriggerSQLDropList remains set for ddl_command_end triggers */ > > } > > > > /* and delete them */ > > for (i = 0; i < targetObjects->numrefs; i++) > ... > > deleteOneObject(thisobj, &depRel, flags); > > My understanding of Tom and Robert comments is that it is very unsafe to > run random user code at this point, so that can not be an Event Trigger > call point. Hmm, quoth http://www.postgresql.org/message-id/23345.1358476518@sss.pgh.pa.us : > I'd really like to get to a point where we can > define things as happening like this: > > * collect information needed to interpret the DDL command > (lookup and lock objects, etc) > * fire "before" event triggers, if any (and if so, recheck info) > * do the command > * fire "after" event triggers, if any Note that in the snippet I posted above, objects have already been looked up and locked (first phase above). One thing worth considering, of course, is the "if so, recheck info" parenthetical remark; for example, what happens if the called function decides to, say, add a column to every dropped table? Or, worse, what happens if the event trigger function adds a column to table "foo_audit" when table "foo" is dropped, and you happen to drop both in the same command. Also, what if the function decides to drop table "foo_audit" when table "foo" is dropped? I think these things are all worth adding to a regression test for this feature, to ensure that behavior is sane, and also to verify that the system catalogs afterwards remain consistent (for example, that we don't leave dangling pg_attribute rows, etc). 
Maybe the answer to all this is to run the lookup algorithm all over again and ensure that the two lists are equal, and throw an error otherwise, i.e. all the scenarios above should be considered unsupported. That seems safest. But I don't think "code structure convenience" is the only reason to do things this way instead of postponing firing the trigger until the end. I think complete support for drop event triggers really needs the objects to be dropped still in the catalogs, so that they can be looked up; for instance, user code might want to check the names of the columns and take particular actions if particular names or particular types are present. That doesn't seem an easy thing to do if all you get is pg_dropped_objects(), because how do you know which columns belong to which tables? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Hmm, quoth > http://www.postgresql.org/message-id/23345.1358476518@sss.pgh.pa.us : > >> I'd really like to get to a point where we can >> define things as happening like this: >> >> * collect information needed to interpret the DDL command >> (lookup and lock objects, etc) >> * fire "before" event triggers, if any (and if so, recheck info) >> * do the command >> * fire "after" event triggers, if any > > Note that in the snippet I posted above objects have already been looked > up and locked (first phase above). Ok, I like what you did, but what you did is reinstall the "sql_drop" event, and it is not complete; as you mention, we have some hard problems to solve there. > But I don't think "code structure convenience" is the only reason to do > things this way instead of postponing firing the trigger until the end. > I think complete support for drop event triggers really needs to have > the objects to be dropped still in catalogs, so that they can be looked > up; for instance, user code might want to check the names of the > columns and take particular actions if particular names or particular > types are present. That doesn't seem an easy thing to do if all you get > is pg_dropped_objects(), because how do you know which columns belong to > which tables? Agreed. In terms of Robert's review, though, I think you're talking about another patch entirely, one that will get worked on in the 9.4 cycle. And in my terms, doing all that work now is useless anyway, because we are not yet exposing any object-specific information that would allow the user to do an actual catalog lookup. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Tue, Feb 5, 2013 at 10:42 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > A larger issue with the patch is handling of subxacts. A quick test > doesn't reveal any obvious misbehavior, but having the list of objects > dropped by a global variable might be problematic. What if, say, the > event trigger function does something funny in an EXCEPTION block and it > fails? Some clever test case is needed here, I think. Also, if we > reset the variable at EOXact, do we also need to do something at > EOSubXact? This is an awfully good point, although I think the issue has to do with command boundaries more than subtransactions. Suppose you create two ddl_command_end event triggers, A and B. A contains a DROP IF EXISTS command. Someone runs a toplevel DROP command. Now, A is going to fire first, and that's going to recursively invoke A (which will do nothing the second time) and then B; on return from B, we'll finish running the event triggers for the toplevel command, executing B again. If the list of dropped objects is stored in a global variable, it seems like there are a number of ways this can go wrong. I have not tested the actual behavior of the latest patch, but I think we want to define things so that the pg_event_trigger_dropped_objects() function returns, specifically, the list of objects dropped by the command which caused the event trigger to fire. In other words, in the above example, the first, recursive invocation of B should see the object removed by A's DROP-IF-EXISTS, and the second invocation should see the object removed by the toplevel command. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
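The recursion scenario Robert describes can be made concrete. The following sketch assumes the patch's semantics where pg_event_trigger_dropped_objects() reports drops for the command that caused the current trigger firing; the trigger and table names are illustrative:

```sql
-- Sketch of the recursive-firing scenario, assuming patch semantics.
-- Trigger A performs a DROP IF EXISTS; trigger B merely reports.

CREATE OR REPLACE FUNCTION trg_a() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    -- This nested DROP causes ddl_command_end triggers (A and B)
    -- to fire recursively for the inner command.
    DROP TABLE IF EXISTS side_table;
END;
$$;

CREATE OR REPLACE FUNCTION trg_b() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE n int;
BEGIN
    SELECT count(*) INTO n FROM pg_event_trigger_dropped_objects();
    RAISE NOTICE 'B sees % dropped object(s)', n;
END;
$$;

CREATE EVENT TRIGGER a ON ddl_command_end EXECUTE PROCEDURE trg_a();
CREATE EVENT TRIGGER b ON ddl_command_end EXECUTE PROCEDURE trg_b();

-- Under Robert's proposed definition: when a toplevel DROP fires A,
-- A's inner DROP IF EXISTS recursively fires B, which should see only
-- side_table; the second (outer) invocation of B should see only the
-- object dropped by the toplevel command, not side_table again.
DROP TABLE some_table;
```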
Robert Haas <robertmhaas@gmail.com> writes: > variable, it seems like there are a number of ways this can go wrong. Yeah, I think the current behavior might be surprising. > I have not tested the actual behavior of the latest patch, but I think > we want to define things so that the > pg_event_trigger_dropped_objects() function returns, specifically, the > list of objects dropped by the command which caused the event trigger > to fire. In other words, in the above example, the first, recursive > invocation of B should see the object removed by A's DROP-IF-EXISTS, > and the second invocation should see the object removed by the > toplevel command. I disagree with that. I don't see why the enclosing event trigger shouldn't be aware of all the objects dropped by the command that just ran to completion, *including* the effects of any event trigger fired recursively or not. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Wed, Feb 6, 2013 at 9:36 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> variable, it seems like there are a number of ways this can go wrong. > > Yeah, I think the current behavior might be surprising. > >> I have not tested the actual behavior of the latest patch, but I think >> we want to define things so that the >> pg_event_trigger_dropped_objects() function returns, specifically, the >> list of objects dropped by the command which caused the event trigger >> to fire. In other words, in the above example, the first, recursive >> invocation of B should see the object removed by A's DROP-IF-EXISTS, >> and the second invocation should see the object removed by the >> toplevel command. > > I disagree with that. I don't see why the enclosing event trigger > shouldn't be aware of all the objects dropped by the command that just > ran to completion, *including* the effects of any event trigger fired > recursively or not. Well, that could result in some DROP events being reported more than once, which I assume would be undesirable for someone hoping to use this for replication. (Eventually, we'll have to face the same problem for CREATE events, too.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Dimitri Fontaine escribió: > Robert Haas <robertmhaas@gmail.com> writes: > > I have not tested the actual behavior of the latest patch, but I think > > we want to define things so that the > > pg_event_trigger_dropped_objects() function returns, specifically, the > > list of objects dropped by the command which caused the event trigger > > to fire. In other words, in the above example, the first, recursive > > invocation of B should see the object removed by A's DROP-IF-EXISTS, > > and the second invocation should see the object removed by the > > toplevel command. > > I disagree with that. I don't see why the enclosing event trigger > shouldn't be aware of all the objects dropped by the command that just > ran to completion, *including* the effects of any event trigger fired > recursively or not. Not sure about that. If the trigger records objects dropped in a table, aren't they going to show up there twice if you do that? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Robert Haas <robertmhaas@gmail.com> writes: >> I disagree with that. I don't see why the enclosing event trigger >> shouldn't be aware of all the objects dropped by the command that just >> ran to completion, *including* the effects of any event trigger fired >> recursively or not. > > Well, that could result in some DROP events being reported more than > once, which I assume would be undesirable for someone hoping to use > this for replication. Any command might have an event trigger attached that performs a DROP, so you don't know where to expect it, and it's quite possible that in your example the two event triggers were installed by different tools. -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: >> I thought there was the idea that the list of objects to drop was to be >> acquired before actually doing the deletion; so that the trigger >> function could, for instance, get the name of the table being dropped. > Tom and Robert have been rightfully insisting on how delicate it has > been to come up with the right behavior for performMultipleDeletions, > and that's not something we can easily reorganise. > So the only way to get at the information seems to be what Robert > insisted that I do, that is storing the information about the objects > being dropped for later processing. I might be forgetting something, but doesn't dependency.c work by first constructing a list of all the objects it's going to drop, and only then dropping them? Could we inject a "pre deletion" event trigger call at the point where the list is completed? regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > I might be forgetting something, but doesn't dependency.c work by first > constructing a list of all the objects it's going to drop, and only then > dropping them? Could we inject a "pre deletion" event trigger call at > the point where the list is completed? What happens if the event trigger itself deletes objects? From the list? Then we have to redo all the preparatory steps, and I don't think we agreed on a way to do it. Plus, as we seem to be chasing a minimal set of features per patch, I would think that getting the list of OIDs of objects that have already been dropped is a well-enough-defined minimal feature for this patch, one that we will be in a position to extend later given some time. I still think we should think about the set of information we're going to be publishing first, because without publishing some more, all we're doing here is moot anyway. Also, for most cases that I can think of, it's not a problem for the dropped object to no longer exist in the catalogs by the time you get the information, provided you get the object's name and schema and maybe some other properties. -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine escribió: > Tom Lane <tgl@sss.pgh.pa.us> writes: > > I might be forgetting something, but doesn't dependency.c work by first > > constructing a list of all the objects it's going to drop, and only then > > dropping them? Could we inject a "pre deletion" event trigger call at > > the point where the list is completed? Yes, this is what I'm proposing. > What happens if the event trigger itself deletes objects? From the list? I replied to this above: raise an error. (How to do that is an open implementation question, but I proposed a simple idea upthread). -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Dimitri Fontaine escribió: >> Tom Lane <tgl@sss.pgh.pa.us> writes: >> > I might be forgetting something, but doesn't dependency.c work by first >> > constructing a list of all the objects it's going to drop, and only then >> > dropping them? Could we inject a "pre deletion" event trigger call at >> > the point where the list is completed? > > Yes, this is what I'm proposing. So, I add back the "sql_drop" event and implement it in dependency.c, and keep the list for later processing from "ddl_command_end", right? Robert, you were specifically opposed to "sql_drop" and I just removed it from the patch. What do you think now? Also, should that be a follow-up patch to the current one for your reviewing purposes? >> What happens if the event trigger itself deletes objects? From the list? > > I replied to this above: raise an error. (How to do that is an open > implementation question, but I proposed a simple idea upthread). Well, can the list of objects that get dropped for CASCADE change in between releases? If it ever does, that means that an event trigger that used to work before is now broken after upgrade. Is that ok? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes: > Tom Lane <tgl@sss.pgh.pa.us> writes: >> I might be forgetting something, but doesn't dependency.c work by first >> constructing a list of all the objects it's going to drop, and only then >> dropping them? Could we inject a "pre deletion" event trigger call at >> the point where the list is completed? > What happens if the event trigger itself deletes objects? From the list? Throw an error. Per previous discussion, the trigger does not get to do anything that would affect the results of "parse analysis" of the command, and that list is exactly those results. > Plus, as we seem to be chasing minimal set of > features per patch, I would think that getting the list of Objects OIDs > that are already dropped is a well enough defined set of minimal feature > for this very patch, that we will be in a position to extend later given > some time. Well, a list of object OIDs is of exactly zero use once the command has been carried out. So I don't think that that represents a useful or even very testable feature on its own, if there's no provision to fire user code while the OIDs are still in the catalogs. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > Well, a list of object OIDs is of exactly zero use once the command > has been carried out. So I don't think that that represents a useful > or even very testable feature on its own, if there's no provision to > fire user code while the OIDs are still in the catalogs. For the same reason I want to talk about which information we publish. I can see why having access to the catalogs is a more general answer here though. Now, I've just been asked to remove "sql_drop" and I don't want to be adding it again before I know that whoever will commit the patch agrees with an explicit return of that feature from the dead. Please… -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Wed, Feb 6, 2013 at 9:44 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >>> I disagree with that. I don't see why the enclosing event trigger >>> shouldn't be aware of all the objects dropped by the command that just >>> ran to completion, *including* the effects of any event trigger fired >>> recursively or not. >> >> Well, that could result in some DROP events being reported more than >> once, which I assume would be undesirable for someone hoping to use >> this for replication. > > Any command might have an event trigger attached doing a DROP, so that > you don't know where to expect it, and it's well possible that in your > example both the event triggers have been installed by different tools. It certainly is; in fact, it's likely. So let's say that B is a replication trigger. Don't you want it to hear about each drop exactly once? If not, how will you avoid errors when you go to replay the events you've captured on another machine? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
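The semantics being contested can be made concrete with a sketch (all names here are hypothetical, and this assumes the "sql_drop" firing point and the pg_dropped_objects() SRF from the patch under discussion, neither of which is committed in this form):

```sql
-- Trigger A's function (installed by tool 1) performs a drop of its own:
CREATE FUNCTION trg_a() RETURNS event_trigger LANGUAGE plpgsql AS $$
BEGIN
  DROP TABLE IF EXISTS scratch;  -- causes the drop event to fire recursively
END $$;

-- Trigger B's function (installed by tool 2) records drops for replication:
CREATE FUNCTION trg_b() RETURNS event_trigger LANGUAGE plpgsql AS $$
BEGIN
  INSERT INTO drop_audit (classid, objid, objsubid)
  SELECT classid, objid, objsubid FROM pg_dropped_objects();
END $$;
```

If B's toplevel invocation also reports the objects dropped by A's recursive DROP IF EXISTS (Dimitri's proposal), drop_audit records those objects twice; if each invocation sees only the drops of its own command (Robert's proposal), each object is recorded exactly once.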
On Wed, Feb 6, 2013 at 11:04 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes: >> Alvaro Herrera <alvherre@2ndquadrant.com> writes: >>> I thought there was the idea that the list of objects to drop was to be >>> acquired before actually doing the deletion; so that the trigger >>> function could, for instance, get the name of the table being dropped. > >> Tom and Robert have been rightfully insisting on how delicate it has >> been to come up with the right behavior for performMultipleDeletions, >> and that's not something we can easily reorganise. > >> So the only way to get at the information seems to be what Robert >> insisted that I do, that is storing the information about the objects >> being dropped for later processing. > > I might be forgetting something, but doesn't dependency.c work by first > constructing a list of all the objects it's going to drop, and only then > dropping them? Could we inject a "pre deletion" event trigger call at > the point where the list is completed? In theory, sure, but in practice, there are multiple ways that can break things. The possibility that the event trigger that runs at that point might drop one of the objects involved has already been mentioned, but there are others. For example, it might *create* a new object that depends on one of the objects to be dropped, potentially leading to an inconsistent pg_depend. It could also ALTER the object, or something that the object depends on. For example, suppose we're running an event trigger in the middle of a DROP TABLE command, and the event trigger creates a _SELECT rule on the table, transforming it into a view. Or, suppose it opens (and leaves open) a cursor scanning that table (normally, CheckTableNotInUse prevents that, but here we've already done that). Alternatively, suppose we're dropping a view which the event trigger redefines to depend on a completely different set of objects. 
I don't deny that code can be written to handle all of those cases correctly. But it's going to be a major refactoring, and the idea that we should be starting to design it in February seems ludicrous to me. It'll be May by the time we get this one patch right. Of 2014. And there are more than 70 other patches that need attention in this CommitFest. I have thus far mostly avoided getting sucked into the annual argument about which things should be evicted from this CommitFest because, frankly, I have better things to do with my time than argue about what the feature freeze date is with people who should know better. But the problem isn't going to go away because we don't talk about it. Has anyone else noticed that "final triage" is scheduled to end tomorrow, and we haven't done any triage of any kind and, at least with respect to this feature, are effectively still receiving new patch submissions? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Feb 6, 2013 at 12:05 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Feb 6, 2013 at 9:44 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>>> I disagree with that. I don't see why the enclosing event trigger >>>> shouldn't be aware of all the objects dropped by the command that just >>>> ran to completion, *including* the effects of any event trigger fired >>>> recursively or not. >>> >>> Well, that could result in some DROP events being reported more than >>> once, which I assume would be undesirable for someone hoping to use >>> this for replication. >> >> Any command might have an event trigger attached doing a DROP, so that >> you don't know where to expect it, and it's well possible that in your >> example both the event triggers have been installed by different tools. > > It certainly is; in fact, it's likely. So let's say that B is a > replication trigger. Don't you want it to hear about each drop > exactly once? If not, how will you avoid errors when you go to replay > the events you've captured on another machine? In this case, the "hygienic" change that we're thinking of making to Slony, at least initially, is for the trigger to check to see if the table is replicated, and raise an exception if it is. That forces the Gentle User to submit the Slony SET DROP TABLE command <http://slony.info/documentation/2.1/stmtsetdroptable.html>. Now, if we stipulate that imposition, then, for this kind of event, it becomes unnecessary for event triggers to get *overly* concerned about capturing more about dropping tables. 
After all, SET DROP TABLE already knows how to replicate its action, so what happens, in that case is: - User submits SET DROP TABLE - SET DROP TABLE drops the triggers for the table, cleans out Slony configuration surrounding the table, forwards request to other nodes - User submits DROP TABLE - Slony is no longer involved with that table; there's nothing special anymore about replicating this; perhaps we capture and forward it via event trigger. I'm not sure if that makes thinking about this easier, I hope so :-). -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
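The guard described above might be sketched roughly as follows (a hypothetical sketch, not actual Slony code: _cluster.sl_table stands in for Slony's catalog of replicated tables, and since the committed event trigger code exposes only the command tag, not the identity of the table being dropped, the check can only be coarse):

```sql
CREATE FUNCTION deny_unmanaged_drop() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
  -- Coarse check: without per-object information, refuse any DROP TABLE
  -- while replicated tables exist, forcing use of SET DROP TABLE first.
  IF EXISTS (SELECT 1 FROM _cluster.sl_table) THEN
    RAISE EXCEPTION 'replicated tables exist: run Slony''s SET DROP TABLE first';
  END IF;
END $$;

CREATE EVENT TRIGGER slony_drop_guard ON ddl_command_start
  WHEN TAG IN ('DROP TABLE')
  EXECUTE PROCEDURE deny_unmanaged_drop();
```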
On Wed, Feb 6, 2013 at 11:30 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: >> Dimitri Fontaine escribió: >>> Tom Lane <tgl@sss.pgh.pa.us> writes: >>> > I might be forgetting something, but doesn't dependency.c work by first >>> > constructing a list of all the objects it's going to drop, and only then >>> > dropping them? Could we inject a "pre deletion" event trigger call at >>> > the point where the list is completed? >> >> Yes, this is what I'm proposing. > > So, I add back the "sql_drop" event and implement it in dependency.c, > and keep the list for later processing from "ddl_command_end", right? > > Robert, you specifically opposed to "sql_drop" and I just removed it > from the patch. What do you think now? Also, should that be a follow-up > patch to the current one for your reviewing purposes? Well, if it has a different firing point than ddl_command_end, then there could well be some point to having it after all. But I'm far from convinced that the proposed firing point can be made safe without a major refactoring. I think this is the sort of things where "design before code" ought to be the cardinal rule. >> I replied to this above: raise an error. (How to do that is an open >> implementation question, but I proposed a simple idea upthread). > > Well, can the list of objects that get dropped for CASCADE change in > between releases? If it ever does, that means that your event trigger > that used to work before is not broken after upgrade. Is that ok? That particular problem does not bother me, personally. But the possibility of ending up with corrupt system catalog contents does. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Feb 6, 2013 at 12:28 PM, Christopher Browne <cbbrowne@gmail.com> wrote: > On Wed, Feb 6, 2013 at 12:05 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Wed, Feb 6, 2013 at 9:44 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >>> Robert Haas <robertmhaas@gmail.com> writes: >>>>> I disagree with that. I don't see why the enclosing event trigger >>>>> shouldn't be aware of all the objects dropped by the command that just >>>>> ran to completion, *including* the effects of any event trigger fired >>>>> recursively or not. >>>> >>>> Well, that could result in some DROP events being reported more than >>>> once, which I assume would be undesirable for someone hoping to use >>>> this for replication. >>> >>> Any command might have an event trigger attached doing a DROP, so that >>> you don't know where to expect it, and it's well possible that in your >>> example both the event triggers have been installed by different tools. >> >> It certainly is; in fact, it's likely. So let's say that B is a >> replication trigger. Don't you want it to hear about each drop >> exactly once? If not, how will you avoid errors when you go to replay >> the events you've captured on another machine? > > In this case, the "hygenic" change that we're thinking of making to Slony, > at least initially, is for the trigger to check to see if the table is > replicated, > and raise an exception if it is. > > That forces the Gentle User to submit the Slony SET DROP TABLE > command <http://slony.info/documentation/2.1/stmtsetdroptable.html>. > > Now, if we stipulate that imposition, then, for this kind of event, it > becomes unnecessary for event triggers to get *overly* concerned about > capturing more about dropping tables. 
After all, SET DROP TABLE > already knows how to replicate its action, so what happens, in that > case is: > > - User submits SET DROP TABLE > - SET DROP TABLE drops the triggers for the table, cleans out > Slony configuration surrounding the table, forwards request > to other nodes > > - User submits DROP TABLE > - Slony is no longer involved with that table; there's nothing special > anymore about replicating this; perhaps we capture and forward > it via event trigger. > > I'm not sure if that makes thinking about this easier, I hope so :-). Well, that means you wouldn't necessarily mind getting duplicate DROP events for the same object, but they don't benefit you in any way, either. And, I'm not sure we can conclude from this that duplicate events will be OK in every use case. For example, for a logging hook, it's pessimal, because now you're logging duplicate messages for the same drops and there's no easy way to fix it so you don't. Also, it means you're really going to need the schema name and table name for this to be useful; Tom was just complaining upthread that without that information the feature isn't useful, so perhaps we should conclude from your email that he is correct. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: >> Robert, you were specifically opposed to "sql_drop" and I just removed it >> from the patch. What do you think now? Also, should that be a follow-up >> patch to the current one for your reviewing purposes? > > Well, if it has a different firing point than ddl_command_end, then > there could well be some point to having it after all. But I'm far > from convinced that the proposed firing point can be made safe without > a major refactoring. I think this is the sort of things where "design > before code" ought to be the cardinal rule. OK, so we are in agreement here. I think we should see about getting the dropped_objects.3.patch.gz in (pending review), then continue with that patch series: - publishing some object-specific information in TG_* - deciding on a design for generated commands, maybe committing it if it happens to look a lot like what I already have done those past years - adding a function pg_get_normalized_command_string(parsetree) that takes an internal (Node *) as input and provides text as output Note: all that code exists already, in a more or less complete form, and has been around for between 1 and 2 years. I'm *not* trying to push *new* things in the current commit fest, only to make it so that the current patch series delivers a minimum set of features that is usable by itself. Have a look at my slides from FOSDEM where I tried to share my vision here. I don't have a use case for Event Triggers without them publishing object or command specific information, as is currently the case in our tree: http://tapoueh.org/images/confs/Fosdem2013_Event_Triggers.pdf Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, Feb 11, 2013 at 9:53 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >>> Robert, you specifically opposed to "sql_drop" and I just removed it >>> from the patch. What do you think now? Also, should that be a follow-up >>> patch to the current one for your reviewing purposes? >> >> Well, if it has a different firing point than ddl_command_end, then >> there could well be some point to having it after all. But I'm far >> from convinced that the proposed firing point can be made safe without >> a major refactoring. I think this is the sort of things where "design >> before code" ought to be the cardinal rule. > > Ok se we are in agreement here. I think we should see about getting the > dropped_objects.3.patch.gz in (pending review), ... Wait, I'm confused. I had a note to myself to come back and review this, but now that I look at it, I didn't think that patch was pending review. Alvaro, Tom, and I all made comments that seems to impinge upon that design rather heavily. No? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Wait, I'm confused. I had a note to myself to come back and review > this, but now that I look at it, I didn't think that patch was pending > review. Alvaro, Tom, and I all made comments that seems to impinge > upon that design rather heavily. No? The current design follows exactly your comments and design requests. Tom's and Álvaro's comments are the ones you answered, saying that it's not 9.3 material, but next release at best, subject to heavy refactoring. What did I miss? -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Thu, Feb 14, 2013 at 3:39 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Wait, I'm confused. I had a note to myself to come back and review >> this, but now that I look at it, I didn't think that patch was pending >> review. Alvaro, Tom, and I all made comments that seems to impinge >> upon that design rather heavily. No? > > The current design follows exactly your comments and design requests. > Tom and Álvaro comments are the ones you did answer to saying that it's > not 9.3 material, but next release at best, subject to heavy refactoring. > > What did I miss? Well, there's this, upon which we surely have not achieved consensus: http://www.postgresql.org/message-id/CA+TgmobQ6NGsxGuiHWqcygF0Q+7Y9zHNERePo3S1vsWKKNw2TQ@mail.gmail.com And then Tom also wrote this, which is kind of a good point, too: > Well, a list of object OIDs is of exactly zero use once the command > has been carried out. So I don't think that that represents a useful > or even very testable feature on its own, if there's no provision to > fire user code while the OIDs are still in the catalogs. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Well, there's this, upon which we surely have not achieved consensus: > > http://www.postgresql.org/message-id/CA+TgmobQ6NGsxGuiHWqcygF0Q+7Y9zHNERePo3S1vsWKKNw2TQ@mail.gmail.com Sub-Transaction Handling. I fail to come up with a regression test showing any problem here, and Álvaro is now working on the patch, so he might be finding both a failure test and a fix. > And then Tom also wrote this, which is kind of a good point, too: > >> Well, a list of object OIDs is of exactly zero use once the command >> has been carried out. So I don't think that that represents a useful >> or even very testable feature on its own, if there's no provision to >> fire user code while the OIDs are still in the catalogs. That's the reason why I've been proposing that we first add some information to the event triggers, then see about the DROP support. You might want to realize that the current event triggers implementation is not even publishing the object ID now, only the command tag and the name of the event. We still don't have any way to make any use of the whole thing, apart from maybe "block all commands" (but "all" is in fact a subset of known DDL). I don't think we still are able to solve *any* use case with what's currently committed. At all. Given that, it's really hard for me to get excited about that very point. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
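As a point of reference, a command-tag filter is essentially all the committed implementation supports today; a "block these commands" trigger is about the only working example:

```sql
-- TG_TAG is the only payload available to the trigger function.
CREATE FUNCTION abort_ddl() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
  RAISE EXCEPTION 'command % is disabled', tg_tag;
END $$;

CREATE EVENT TRIGGER block_some_ddl ON ddl_command_start
  WHEN TAG IN ('CREATE TABLE', 'DROP TABLE')
  EXECUTE PROCEDURE abort_ddl();
```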
On Sun, Feb 17, 2013 at 4:12 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >>> Well, a list of object OIDs is of exactly zero use once the command >>> has been carried out. So I don't think that that represents a useful >>> or even very testable feature on its own, if there's no provision to >>> fire user code while the OIDs are still in the catalogs. > > That's the reason why I've been proposing that we first add some > information to the event triggers, then see about the DROP support. I think the question of the interface to the data and the data to expose are pretty tightly related. You can't exactly get one right and the other one wrong and say, OK, we'll fix it later. > You might want to realize that the current event triggers implementation > is not even publishing the object ID now, only the command tag and the > name of the event. I know that. I also know that after I committed this patch in July, many months went by before we had any further discussion of next steps. I'll admit that some of this stuff was on the table for the November CommitFest, but I also won't accept complete blame for the fact that we're not further along than we are. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Here's an implementation of what we discussed: the event is now DDL_DROP and is called inside performDeletion and performMultipleDeletions; it receives the list of objects via pg_dropped_objects (though it now has a more verbose name), and the objects are still catalogued when the trigger is called. This event does *not* receive the parsetree (as opposed to ddl_command_start/end); also, there's no tag filtering support, so if you install a function in that event, it will run every time a DROP of any kind occurs. I have added some protections so that these do not fire on undesirable events (such as dropping an event trigger); also event triggers and other object types are filtered out of pg_dropped_objects, in case something like DROP OWNED happens. (So for instance you could get an empty pg_dropped_objects if you did DROP OWNED and the mentioned user only owns an event trigger). Maybe this needs to be reconsidered. There's also code to re-obtain the list of objects to drop after the event trigger functions have run; the second list is compared to the first one, and if they differ, an error is raised. It's rather annoying that performMultipleDeletions and performDeletion contain almost exactly the same code. I think maybe it's time to morph the latter into a thin layer on top of the former. (Small remaining issue: docs need updating.) -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
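Usage of the interface described above might look like this (a sketch against the patch, not committed behavior; the SRF name and its result columns — classid, objid, objsubid — follow the patch description and could still change):

```sql
CREATE TABLE dropped_log (
    classid  oid,
    objid    oid,
    objsubid integer,
    at       timestamptz DEFAULT now()
);

CREATE FUNCTION log_drops() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
  -- With this firing point the objects are still catalogued here, so
  -- lookups by OID (e.g. against pg_class) would still succeed.
  INSERT INTO dropped_log (classid, objid, objsubid)
  SELECT classid, objid, objsubid FROM pg_dropped_objects();
END $$;

CREATE EVENT TRIGGER log_all_drops ON sql_drop
  EXECUTE PROCEDURE log_drops();
```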
On Wed, Feb 20, 2013 at 10:48 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > I have added some protections so that these do not fire on undesirable > events (such as dropping an event trigger); also event triggers and > other object types are filtered out of pg_dropped_objects, in case > something like DROP OWNED happens. (So for instance you could get an > empty pg_dropped_objects if you did DROP OWNED and the mentioned user > only owns an event trigger). Maybe this needs to be reconsidered. This is tricky. The whole point of skipping the ddl_command_start/ddl_command_end triggers when the command is one that operates on event triggers is that we don't want a user to install an event trigger that prevents said user from dropping that event trigger afterwards. We're currently safe from that, but with the behavior you describe, we won't be: an event trigger that unconditionally throws an error will block all drops, even of event triggers. I am inclined to suggest that we go back to an earlier suggestion I made, which Dimitri didn't like, but which I still think might be the best way forward: add a SUSET GUC that disables event triggers. If we do that, then our answer to that problem, for the current event triggers and all we might add in the future, can be: well, if that happens, set the disable_event_triggers GUC and retry the drop. OTOH, if we DON'T do that, then this problem is going to come back up every time we add a new firing point, and it may be really tough to make sure we're secure against this in all cases. If you don't like that suggestion I'm open to other options, but I think we should try hard to preserve the invariant that a superuser can always drop an event trigger without interference from the event trigger system itself. > There's also code to re-obtain the list of objects to drop after the > event trigger functions have run; the second list is compared to the > first one, and if they differ, an error is raised. 
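The escape hatch proposed above would amount to something like this (disable_event_triggers is hypothetical, not an existing GUC; the name follows the message above):

```sql
-- As superuser: turn event trigger firing off, remove the misbehaving
-- trigger, then restore the default.
SET disable_event_triggers = on;   -- SUSET, so superuser-only
DROP EVENT TRIGGER misbehaving_trigger;
RESET disable_event_triggers;
```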
This is definitely an improvement. I am not 100% clear on whether this is sufficient, but it sure seems a lot better than punting. I think there might be a possible failure mode if somebody creates a new object that depends on one of the objects to be dropped while the event trigger is running. Since system catalogs are normally read with SnapshotNow, I think it might be possible to create a situation where the first and second searches return different results even though the trigger hasn't done anything naughty. I believe that I added some locking in 9.2 that will prevent this in the case where you create a table that depends on a concurrently-dropped schema, but there are other cases that have this problem, such as a function being added to a concurrently-dropped schema. Now, IMHO, that isn't necessarily a show-stopper for this patch, because that same phenomenon can cause catalog corruption as things stand. So in the scenario where an event trigger causes a transaction to fail because of this issue, it will actually be *preventing* catalog corruption. However, if this goes in, it might bump up the priority of fixing some of those locking issues, since it may make them more visible. > It's rather annoying that performMultipleDeletions and performDeletion > contain almost exactly the same code. I think maybe it's time to morph > the latter into a thin layer on top of the former. Yeah, or at least factor out the code you need for this into its own function. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Feb 21, 2013 at 12:25 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> There's also code to re-obtain the list of objects to drop after the >> event trigger functions have run; the second list is compared to the >> first one, and if they differ, an error is raised. > > This is definitely an improvement. I am not 100% clear on whether > this is sufficient, but it sure seems a lot better than punting. What if the object that gets whacked around is one of the named objects rather than some dependency thereof? Suppose for example that the event trigger drops the same object that the user tried to drop. We need to error out cleanly in that case, not blindly proceed with the drop. (In the worst case, somebody could create an unrelated object with the same OID and we could latch onto and drop that one. Seems remote unless the user has a way to fiddle with the OID counter, but if it happened it would be bad.) -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas escribió: > On Wed, Feb 20, 2013 at 10:48 PM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: > > I have added some protections so that these do not fire on undesirable > > events (such as dropping an event trigger); also event triggers and > > other object types are filtered out of pg_dropped_objects, in case > > something like DROP OWNED happens. (So for instance you could get an > > empty pg_dropped_objects if you did DROP OWNED and the mentioned user > > only owns an event trigger). Maybe this needs to be reconsidered. > > This is tricky. The whole point of skipping the > ddl_command_start/ddl_command_end triggers when the command is one > that operates on event triggers is that we don't want a user to > install an event trigger that prevents said user from dropping that > event trigger afterwards. We're currently safe from that, but with > the behavior you describe, we won't be: an event trigger that > unconditionally thows an error will block all drops, even of event > triggers. You're misunderstanding. If you do DROP EVENT TRIGGER, the DDL_DROP event won't fire at all. So no matter how messed up your system is, you can always fix it by simply dropping the event trigger. What I was saying is that if you have some command other than DROP EVENT TRIGGER, which happens to drop an event trigger, said event trigger will not be present in the pg_dropped_objects results. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Robert Haas escribió: > On Thu, Feb 21, 2013 at 12:25 PM, Robert Haas <robertmhaas@gmail.com> wrote: > >> There's also code to re-obtain the list of objects to drop after the > >> event trigger functions have run; the second list is compared to the > >> first one, and if they differ, an error is raised. > > > > This is definitely an improvement. I am not 100% clear on whether > > this is sufficient, but it sure seems a lot better than punting. > > What if the object that gets whacked around is one of the named > objects rather than some dependency thereof? Suppose for example that > the event trigger drops the same object that the user tried to drop. > We need to error out cleanly in that case, not blindly proceed with > the drop. An error is raised, which I think is sane. I think this peculiar situation warrants its own few lines in the new regression test. One funny thing I noticed is that if I add a column in a table being dropped, the targetObjects list does not change after the trigger has run. The reason for this is that the table's attributes are not present in the targetObjects list; instead they are dropped manually by calling DeleteAttributeTuples(). I saw that you can end up with lingering pg_attribute entries that way. > (In the worst case, somebody could create an unrelated object with the > same OID and we could latch onto and drop that one. Seems remote > unless the user has a way to fiddle with the OID counter, but if it > happened it would be bad.) Hmm. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Feb 21, 2013 at 12:47 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > You're misunderstanding. If you do DROP EVENT TRIGGER, the DDL_DROP > event won't fire at all. So no matter how messed up your system is, you > can always fix it by simply dropping the event trigger. > > What I was saying is that if you have some command other than DROP EVENT > TRIGGER, which happens to drop an event trigger, said event trigger will > not be present in the pg_dropped_objects results. Hmm. But, that means that if some other object manages to depend on an event trigger, and you drop the event trigger with CASCADE taking the other object with it, then some other event trigger being used for, say, replication might fail to see the drop. Right now that's not possible but it seems potentially fragile. Not that I have a great idea. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Feb 21, 2013 at 12:52 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: >> What if the object that gets whacked around is one of the named >> objects rather than some dependency thereof? Suppose for example that >> the event trigger drops the same object that the user tried to drop. >> We need to error out cleanly in that case, not blindly proceed with >> the drop. > > An error is raised, which I think is sane. I think this peculiar > situation warrants its own few lines in the new regression test. Definitely. > One funny thing I noticed is that if I add a column in a table being > dropped, the targetObjects list does not change after the trigger has > run. The reason for this is that the table's attributes are not present > in the targetObjects list; instead they are dropped manually by calling > DeleteAttributeTuples(). I saw that you can end up with lingering > pg_attribute entries that way. I venture to guess that this is exactly the sort of thing that made Tom argue upthread that we shouldn't be putting a firing point in the middle of the drop operation. Any slip-ups here will result in corrupt catalogs, and it's not exactly future-proof either. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas escribió: > On Thu, Feb 21, 2013 at 12:52 PM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: > > One funny thing I noticed is that if I add a column in a table being > > dropped, the targetObjects list does not change after the trigger has > > run. The reason for this is that the table's attributes are not present > > in the targetObjects list; instead they are dropped manually by calling > > DeleteAttributeTuples(). I saw that you can end up with lingering > > pg_attribute entries that way. > > I venture to guess that this is exactly the sort of thing that made > Tom argue upthread that we shouldn't be putting a firing point in the > middle of the drop operation. Any slip-ups here will result in > corrupt catalogs, and it's not exactly future-proof either. Well, is this kind of thing enough to punt the whole patch, or can we chalk it off as the user's problem? We can word the docs to state that we try to detect actions that are known to cause corruption but that some might slip past, and that it's the user's responsibility to ensure that the event trigger functions behave sanely. Another idea I just had was to scan the catalogs after the event trigger and see if the Xmin for each tuple IsCurrentTransaction(), and if so throw an error. I think that would be more accurate than the current implementation, but rather a lot of trouble. I'm open to implementing that if we consider that idea watertight enough that this patch is considered commitable. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Robert Haas escribió: >> I venture to guess that this is exactly the sort of thing that made >> Tom argue upthread that we shouldn't be putting a firing point in the >> middle of the drop operation. Any slip-ups here will result in >> corrupt catalogs, and it's not exactly future-proof either. > Well, is this kind of thing enough to punt the whole patch, or can we > chalk it off as the user's problem? I don't really think that we want any events in the first release that are defined so that a bogus trigger can cause catalog corruption. That will, for example, guarantee that we can *never* open up the feature to non-superusers. I think we'd be painting ourselves into a corner that we could not get out of. Maybe down the road we'll conclude that there's no other way and we're willing to put up with an unsafe feature, but I don't want to take that step under time pressure to ship something in 9.3. > Another idea I just had was to scan the catalogs after the event trigger > and see if the Xmin for each tuple IsCurrentTransaction(), and if so > throw an error. You mean examine every row in every catalog? Doesn't sound like a great plan. I thought the proposal was to recompute the set of drop target objects, and complain if that had changed. regards, tom lane
On Thu, Feb 28, 2013 at 2:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: >> Robert Haas escribió: >>> I venture to guess that this is exactly the sort of thing that made >>> Tom argue upthread that we shouldn't be putting a firing point in the >>> middle of the drop operation. Any slip-ups here will result in >>> corrupt catalogs, and it's not exactly future-proof either. > >> Well, is this kind of thing enough to punt the whole patch, or can we >> chalk it off as the user's problem? > > I don't really think that we want any events in the first release that > are defined so that a bogus trigger can cause catalog corruption. > That will, for example, guarantee that we can *never* open up the > feature to non-superusers. I think we'd be painting ourselves into a > corner that we could not get out of. I agree, although my feelings on this point are actually somewhat stronger than that. > Maybe down the road we'll conclude that there's no other way and we're > willing to put up with an unsafe feature, but I don't want to take that > step under time pressure to ship something in 9.3. I'm opposed to doing it under any circumstances, ever. Right now, there's a very limited number of things which can result in foreign key constraints between system catalogs being violated. So, when it happens, from a support point of view, we don't have many things to investigate. If we add "badly written event triggers" to the list, it's going to open up a can of worms so large it'll be gravitationally rounded. > I thought the proposal was to recompute the set of drop target objects, > and complain if that had changed. Alvaro did that, but it isn't sufficient to prevent catalog corruption. 
It seems to me that a better way to do this might be to look up the names of all the objects being dropped, as we get rid of them, and pass that information off to the ddl_command_end trigger via something like the pg_dropped_objects() function Dimitri proposed. Then we don't have to deal with the massive grottiness of running a command right in the middle of the deletions, which I continue to maintain will have many as-yet-unforeseen failure modes. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
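A consumer of that interface might look roughly like the following. This is a hedged sketch only: it assumes the pg_dropped_objects() name from Dimitri's proposal and the (classid, objid, objsubid) columns described earlier in the thread, and the final function name and result columns were still under discussion at this point.

```sql
-- Hypothetical: a ddl_command_end trigger that reports what was dropped.
CREATE FUNCTION report_drops() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    obj record;
BEGIN
    FOR obj IN SELECT * FROM pg_dropped_objects()
    LOOP
        -- The objects are already gone by the time ddl_command_end
        -- fires, so no catalog lookups here; we only have the
        -- identifiers captured during the drop.
        RAISE NOTICE 'dropped: classid=%, objid=%, objsubid=%',
                     obj.classid, obj.objid, obj.objsubid;
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER report_drops_trg
    ON ddl_command_end
    EXECUTE PROCEDURE report_drops();
```

Since the drop has already happened when the trigger runs, a function like this can log or replicate the event but cannot inspect the dropped objects themselves.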
Tom Lane escribió: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Robert Haas escribi�: > >> I venture to guess that this is exactly the sort of thing that made > >> Tom argue upthread that we shouldn't be putting a firing point in the > >> middle of the drop operation. Any slip-ups here will result in > >> corrupt catalogs, and it's not exactly future-proof either. > > > Well, is this kind of thing enough to punt the whole patch, or can we > > chalk it off as the user's problem? > > I don't really think that we want any events in the first release that > are defined so that a bogus trigger can cause catalog corruption. > That will, for example, guarantee that we can *never* open up the > feature to non-superusers. I think we'd be painting ourselves into a > corner that we could not get out of. Roger. > > Another idea I just had was to scan the catalogs after the event trigger > > and see if the Xmin for each tuple IsCurrentTransaction(), and if so > > throw an error. > > You mean examine every row in every catalog? Doesn't sound like a great > plan. No, I mean the rows that are part of the set of objects to be deleted. > I thought the proposal was to recompute the set of drop target objects, > and complain if that had changed. Yeah, that's what the patch I submitted upthread does. The problem is that pg_attribute rows are not in that set; they are deleted manually by heap_drop_with_catalog by calling DeleteAttributeTuples. So if you add a column to a table in the trigger function, the sets are identical and that logic doesn't detect that things are amiss. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Robert Haas escribió: > On Thu, Feb 28, 2013 at 2:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Maybe down the road we'll conclude that there's no other way and we're > > willing to put up with an unsafe feature, but I don't want to take that > > step under time pressure to ship something in 9.3. > > I'm opposed to doing it under any circumstances, ever. Right now, > there's a very limited number of things which can result in foreign > key constraints between system catalogs being violated. So, when it > happens, from a support point of view, we don't have many things to > investigate. If we add "badly written event triggers" to the list, > it's going to open up a can of worms so large it'll be gravitationally > rounded. Well, that's a good point, but I think that was a foreseeable consequence of event triggers themselves. I agree that we need to get this as robust and bulletproof as possible. > > I thought the proposal was to recompute the set of drop target objects, > > and complain if that had changed. > > Alvaro did that, but it isn't sufficient to prevent catalog corruption. > > It seems to me that a better way to do this might be to look up the > names of all the objects being dropped, as we get rid of them, and > pass that information off to the ddl_command_end trigger via something > like the pg_dropped_objects() function Dimitri proposed. Then we > don't have to deal with the massive grottiness of running a command > right in the middle of the deletions, which I continue to maintain > will have many as-yet-unforeseen failure modes. There are two points you're making here: one is about what data do we provide about objects being dropped (you argue OIDs and names are sufficient); and you're also saying calling these triggers at ddl_command_end is the right timing. 
I disagree about the first one, because I don't think object names are sufficient; Tom argued upthread about being able to look up the objects from the catalogs in the event trigger function. This in turn means that ddl_command_end is not good enough; these functions need to be called earlier than that. Which is what the new DDL_DROP event provides: a point in the execution sequence that's after we've looked up the objects, but before they are gone from the catalogs. Can we get agreement on those points? Otherwise ISTM we're going to continue to argue in circles. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Robert Haas escribió: >> It seems to me that a better way to do this might be to look up the >> names of all the objects being dropped, as we get rid of them, and >> pass that information off to the ddl_command_end trigger via something >> like the pg_dropped_objects() function Dimitri proposed. Then we >> don't have to deal with the massive grottiness of running a command >> right in the middle of the deletions, which I continue to maintain >> will have many as-yet-unforeseen failure modes. I share Robert's fear of this. > There are two points you're making here: one is about what data do we > provide about objects being dropped (you argue OIDs and names are > sufficient); and you're also saying calling these triggers at > ddl_command_end is the right timing. I disagree about the first one, > because I don't think object names are sufficient; Tom argued upthread > about being able to look up the objects from the catalogs in the event > trigger function. This in turn means that ddl_command_end is not good > enough; these functions need to be called earlier than that. Which is > what the new DDL_DROP event provides: a point in the execution sequence > that's after we've looked up the objects, but before they are gone from > the catalogs. > Can we get agreement on those points? Otherwise ISTM we're going to > continue to argue in circles. I think it's fairly obvious that (1) dealing with a DROP only after it's happened is pretty limiting; (2) allowing user-defined code to run mid-command is dangerous. What's at issue is the tradeoff we make between these inescapable facts, and I'm not sure if we can get consensus on that. On the whole, though, I'm thinking that the approach Robert suggests is the way to go, because it doesn't foreclose our doing something else later. 
Whereas if we ship 9.3 with a mid-DROP event, and we then find out that it's even more dangerous than we currently realize, we're going to be between a rock and a hard place. regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes: > I think it's fairly obvious that > (1) dealing with a DROP only after it's happened is pretty limiting; > (2) allowing user-defined code to run mid-command is dangerous. > What's at issue is the tradeoff we make between these inescapable > facts, and I'm not sure if we can get consensus on that. > > On the whole, though, I'm thinking that the approach Robert suggests > is the way to go, because it doesn't foreclose our doing something > else later. Whereas if we ship 9.3 with a mid-DROP event, and we then > find out that it's even more dangerous than we currently realize, > we're going to be between a rock and a hard place. The good news is that the patch to do that has already been sent on this list, and got reviewed in detail by Álvaro who did offer incremental changes. Version 3 of that patch is to be found in: http://www.postgresql.org/message-id/m2fw19n1hr.fsf@2ndQuadrant.fr Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine escribió: > Tom Lane <tgl@sss.pgh.pa.us> writes: > > I think it's fairly obvious that > > (1) dealing with a DROP only after it's happened is pretty limiting; > > (2) allowing user-defined code to run mid-command is dangerous. > > What's at issue is the tradeoff we make between these inescapable > > facts, and I'm not sure if we can get consensus on that. Mmm. > > On the whole, though, I'm thinking that the approach Robert suggests > > is the way to go, because it doesn't foreclose our doing something > > else later. Whereas if we ship 9.3 with a mid-DROP event, and we then > > find out that it's even more dangerous than we currently realize, > > we're going to be between a rock and a hard place. Makes sense. > The good news is that the patch to do that has already been sent on this > list, and got reviewed in details by Álvaro who did offer incremental > changes. Version 3 of that patch is to be found in: > > http://www.postgresql.org/message-id/m2fw19n1hr.fsf@2ndQuadrant.fr As I recall there were a few more fixes I wanted to do on top of that patch. I will see about that. Thanks for the pointer. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Feb 28, 2013 at 3:20 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I think it's fairly obvious that > (1) dealing with a DROP only after it's happened is pretty limiting; > (2) allowing user-defined code to run mid-command is dangerous. > What's at issue is the tradeoff we make between these inescapable > facts, and I'm not sure if we can get consensus on that. I'll note that Slony could do something useful with an ON EVENT trigger even if there's NO data provided as to what tables got dropped. We could get a "prevent dropping by accident" test that amounts to:

    if exists (select 1 from _slony_schema.sl_table
               where not exists (select 1 from pg_catalog.pg_class
                                 where oid = tab_reloid)) then
        raise exception 'You attempted to drop a replicated table. Shame on you!';
    end if;

That could be extended to precede the exception by raising a warning for each such table that was found. That's a nice "save the admin from accidentally breaking replication" check. If we're agonizing over "what more do we need to ensure it's useful?", and it's looking pretty open-ended, well, for the above test, I don't need *any* attributes to be passed to me by the event trigger in order to do something that's useful enough. I wouldn't feel horribly badly if things stopped short of having ON DROP do anything extra. If I *really* wanted to do some sophisticated tracking of things, the apropos route might be to set up both a BEFORE and an AFTER trigger, have the BEFORE part capture relevant data in memory (with some question of "how much relevant data") and then, in the AFTER trigger, refer back to the captured data in order to do something. That again doesn't require adding much of anything to the trigger attributes. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
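The guard above could indeed be packaged as an event trigger function with no per-object information at all. A hedged sketch follows; the _slony_schema.sl_table and tab_reloid names follow the Slony-style example and are illustrative only.

```sql
-- Hypothetical: block drops of replicated tables with no dropped-object
-- data, only by noticing that a tracked table has vanished from pg_class.
CREATE FUNCTION guard_replicated_tables() RETURNS event_trigger
LANGUAGE plpgsql AS $$
BEGIN
    IF EXISTS (SELECT 1 FROM _slony_schema.sl_table t
               WHERE NOT EXISTS (SELECT 1 FROM pg_catalog.pg_class c
                                 WHERE c.oid = t.tab_reloid)) THEN
        RAISE EXCEPTION 'You attempted to drop a replicated table. Shame on you!';
    END IF;
END;
$$;

CREATE EVENT TRIGGER guard_replication
    ON ddl_command_end
    EXECUTE PROCEDURE guard_replicated_tables();
```

Attached at ddl_command_end, the check runs after the drop has completed; raising the error there aborts the transaction and thereby undoes the drop.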
Dimitri Fontaine escribió: > The good news is that the patch to do that has already been sent on this > list, and got reviewed in details by Álvaro who did offer incremental > changes. Version 3 of that patch is to be found in: > > http://www.postgresql.org/message-id/m2fw19n1hr.fsf@2ndQuadrant.fr Here's a v4 of that patch. I added support for DROP OWNED, and made object name and schema name available to the pg_dropped_objects function. Since we're now in agreement that this is the way to go, I think this only needs a few more tweaks to get to a committable state, as well as some useful tests and doc changes. (v3 has docs which I didn't include here but are probably useful almost verbatim.) Do we want some more stuff provided by pg_dropped_objects? We now have classId, objectId, objectSubId, object name, schema name. One further thing I think we need is the object's type, i.e. a simple untranslated string "table", "view", "operator" and so on. AFAICT this requires a nearly-duplicate of getObjectDescription. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Dimitri Fontaine escribió: > >> The good news is that the patch to do that has already been sent on this >> list, and got reviewed in details by Álvaro who did offer incremental >> changes. Version 3 of that patch is to be found in: >> >> http://www.postgresql.org/message-id/m2fw19n1hr.fsf@2ndQuadrant.fr > > Here's a v4 of that patch. I added support for DROP OWNED, and added > object name and schema name available to the pg_dropped_objects > function. Thanks! > Do we want some more stuff provided by pg_dropped_objects? We now have > classId, objectId, objectSubId, object name, schema name. One further > thing I think we need is the object's type, i.e. a simple untranslated > string "table", "view", "operator" and so on. AFAICT this requires a > nearly-duplicate of getObjectDescription. About what missing information to add, please review: http://wiki.postgresql.org/wiki/Event_Triggers#How_to_expose_Information_to_Event_Triggers_Functions I'd like us to provide the same set of extra information from within classic Event Triggers and in the records returned by the function. Maybe a good idea would be to create a datatype for that, and publish a single TG_DETAILS variable from within Event Triggers, of that type, and have the pg_dropped_objects() function return a setof that type? About the implementation and the getObjectDescription() remark, please have a look at your latest revision of the other patch in the series, http://www.postgresql.org/message-id/20130109165829.GB4490@alvh.no-ip.org I think it provides exactly what you need here, and you already cleaned up my work to get at that… Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine escribió: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Do we want some more stuff provided by pg_dropped_objects? We now have > > classId, objectId, objectSubId, object name, schema name. One further > > thing I think we need is the object's type, i.e. a simple untranslated > > string "table", "view", "operator" and so on. AFAICT this requires a > > nearly-duplicate of getObjectDescription. > > About what missing information to add, please review: > > http://wiki.postgresql.org/wiki/Event_Triggers#How_to_expose_Information_to_Event_Triggers_Functions > > I'd like us to provide the same set of extra information from within > classic Event Triggers and in the records returned by the function. > > Maybe a good idea would be to create a datatype for that, and publish a > single TG_DETAILS variable from within Event Triggers, of that type, and > have the pg_dropped_objects() function return a setof that type? I'm very unsure about that idea. In any case the proposed name seems way too general. Maybe we could have a separate datatype for objects being dropped, which would be specific to pg_event_trigger_dropped_objects(), and not try to mix it with other events' info; but really I don't see much point in a tailored datatype when we can simply declare the right set of columns to the function in the first place. > About the implementation and the getObjectDescription() remark, please > have a look at your latest revision of the other patch in the series, > > http://www.postgresql.org/message-id/20130109165829.GB4490@alvh.no-ip.org > > I think it provides exactly what you need here, and you already did > cleanup my work for getting at that… Hmm, it includes a completely different implementation to get at the object name and schema (I didn't know I had written that previous one -- how ironic), but it doesn't include the "obtypename" (your term) which is what I want here. (BTW I don't like "obtypename" at all; how about "object_type"?) 
So we would have "object_type", "object_name", "object_schema". Another question. If I do ALTER TABLE foo DROP COLUMN bar, do we need to fire an event trigger for the dropped column? Right now we don't, ISTM we should. And if we want that, then the above set of three properties doesn't cut it. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
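For reference, the dependency machinery addresses a column as the table's pg_class entry plus an attribute number. A hedged illustration of the (classid, objid, objsubid) triple that a report for ALTER TABLE foo DROP COLUMN bar would have to carry; it must run before the drop, since the pg_attribute row is gone afterwards, and the table and column names are of course hypothetical.

```sql
-- (classid, objid, objsubid) identifying column bar of table foo,
-- in the same style as pg_depend entries.
SELECT 'pg_catalog.pg_class'::regclass AS classid,  -- catalog containing the object
       a.attrelid                      AS objid,    -- OID of table foo
       a.attnum                        AS objsubid  -- attribute number of column bar
FROM   pg_catalog.pg_attribute a
WHERE  a.attrelid = 'foo'::regclass
  AND  a.attname  = 'bar';
```

An objsubid of zero conventionally means the whole relation, which is why the three-column triple alone cannot name a dropped column without also carrying something like the column name once the attribute row is gone.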
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > I'm very unsure about that idea. In any case the proposed name seems > way too general. Maybe we could have a separate datatype for objects > being dropped, which would be specific to > pg_event_trigger_dropped_objects(), and not try to mix it with other > events' info; but really I don't see much point in a tailored datatype > when we can simply declare the right set of columns to the function in > the first place. I'm not too sure about it either, it's just an excuse to keep the two places in sync (TG_* and the pg_event_trigger_dropped_objects() return type). > Hmm, it includes a completely different implementation to get at the > object name and schema (I didn't know I had written that previous one -- > how ironic), but it doesn't include the "obtypename" (your term) which > is what I want here. (BTW I don't like "obtypename" at all; how about > "object_type"?) So we would have "object_type", "object_name", > "object_schema". IIRC obtypename is not my choice, it already is named that way in the code, maybe not in user visible places, though. > Another question. If I do ALTER TABLE foo DROP COLUMN bar, do we need > to fire an event trigger for the dropped column? Right now we don't, > ISTM we should. And if we want that, then the above set of three > properties doesn't cut it. Do we paint ourselves into a corner by not supporting columns in the first release? Table columns are proper SQL level objects in that they have their own catalog entry and OID and a set of commands to manage them, but that set of commands is in fact a *subset* of ALTER TABLE. It feels like drifting a little more into the land of exposing PostgreSQL internals in a way, so I'm not too sure about the proper answer here. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine escribió: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Another question. If I do ALTER TABLE foo DROP COLUMN bar, do we need > > to fire an event trigger for the dropped column? Right now we don't, > > ISTM we should. And if we want that, then the above set of three > > properties doesn't cut it. > > Do we paint ourselves in a corner by not supporting columns in the first > release? Well, the main distinction is that there's no space to refer to them in the current implementation, for lack of objectSubId or (more likely) column name. > Table columns are proper SQL level objects in that they have their own > catalog entry and OID and a set of commands to manage them, but that set > of command is in fact a *subset* of ALTER TABLE. Table columns do not have OIDs; you refer to them by (objectId, objectSubId). The latter is an integer that references pg_attribute.attnum. I am only wondering about ways to drop things at present, without concern for whether it's a straight DROP FOO command or ALTER, etc. > It feels like drifting > a little more into the land of exposing PostgreSQL internals in a way, > so I'm not too sure about the proper answer here. Mumble. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Dimitri Fontaine escribió: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Do we want some more stuff provided by pg_dropped_objects? We now have > > classId, objectId, objectSubId, object name, schema name. One further > > thing I think we need is the object's type, i.e. a simple untranslated > > string "table", "view", "operator" and so on. AFAICT this requires a > > nearly-duplicate of getObjectDescription. > > About what missing information to add, please review: > > http://wiki.postgresql.org/wiki/Event_Triggers#How_to_expose_Information_to_Event_Triggers_Functions That list contains the following items:

    TG_OBJECTID
    TG_OBJECTNAME
    TG_SCHEMANAME
    TG_OPERATION
    TG_OBTYPENAME
    TG_CONTEXT

Of the above, TG_OPERATION is redundant with the fact that we're building pg_dropped_objects (i.e. the user code knows it's dealing with a drop) and TG_CONTEXT is not relevant; with the attached patch, we provide all the other bits. I think this is mostly ready to go in. I'll look at your docs, and unless there are more objections will commit later or early tomorrow. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera escribió: > I think this is mostly ready to go in. I'll look at your docs, and > unless there are more objections will commit later or early tomorrow. Actually it still needs a bit more work: the error messages in pg_event_trigger_dropped_objects need to be reworked. It's a bit annoying that the function throws an error if the function is called in a CREATE command, rather than returning an empty set; or is it just me? Here's v6 with docs and regression tests too. Note the new function in objectaddress.c; without that, I was getting regression failures because catalogs such as pg_amop and pg_default_acl are not present in its supporting table. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
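With the empty-set behavior proposed here, a trigger function could call pg_event_trigger_dropped_objects() unconditionally from any firing point. A hedged sketch; only the classid/objid/objsubid columns discussed repeatedly in this thread are used, since the remaining column names were still in flux.

```sql
-- Hypothetical: safe to attach to commands that may drop nothing.
CREATE FUNCTION note_drops() RETURNS event_trigger
LANGUAGE plpgsql AS $$
DECLARE
    obj record;
BEGIN
    -- Under the empty-set behavior, this loop body simply never runs
    -- for a command that dropped nothing (e.g. CREATE TABLE), instead
    -- of the function raising an error.
    FOR obj IN SELECT * FROM pg_event_trigger_dropped_objects()
    LOOP
        RAISE NOTICE 'dropped: classid=%, objid=%, objsubid=%',
                     obj.classid, obj.objid, obj.objsubid;
    END LOOP;
END;
$$;
```

That lets one audit function serve every event it is attached to, rather than forcing users to know in advance which commands can drop objects.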
On Mon, Mar 4, 2013 at 11:13 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Another question. If I do ALTER TABLE foo DROP COLUMN bar, do we need > to fire an event trigger for the dropped column? Right now we don't, > ISTM we should. And if we want that, then the above set of three > properties doesn't cut it. +1. Similar questions arise for object-access-hooks, among other places. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Mar 4, 2013 at 4:59 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Alvaro Herrera escribió: > >> I think this is mostly ready to go in. I'll look at your docs, and >> unless there are more objections will commit later or early tomorrow. > > Actually it still needs a bit more work: the error messages in > pg_event_trigger_dropped_object need to be reworked. It's a bit > annoying that the function throws an error if the function is called in > a CREATE command, rather than returning an empty set; or is it just me? That seems like a reasonable thing to change. > Here's v6 with docs and regression tests too. Note the new function in > objectaddress.c; without that, I was getting regression failures because > catalogs such as pg_amop and pg_default_acl are not present in its > supporting table. I agree that this is reasonably close to being committable. I do have a bit of concern about the save-and-restore logic for the dropped-object list -- it necessitates that the processing of every DDL command that can potentially drop objects be bracketed with a PG_TRY/PG_CATCH block. While that's relatively easy to guarantee today (but doesn't ALTER TABLE need similar handling?), it seems to me that it might get broken in the future. How hard would it be to pass this information up and down the call stack instead of doing it this way? Or at least move the save/restore logic into something inside the deletion machinery itself so that new callers don't have to worry about it? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas escribió: > On Mon, Mar 4, 2013 at 4:59 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > Alvaro Herrera escribió: > > > >> I think this is mostly ready to go in. I'll look at your docs, and > >> unless there are more objections will commit later or early tomorrow. > > > > Actually it still needs a bit more work: the error messages in > > pg_event_trigger_dropped_object need to be reworked. It's a bit > > annoying that the function throws an error if the function is called in > > a CREATE command, rather than returning an empty set; or is it just me? > > That seems like a reasonable thing to change. I'm thinking just removing the check altogether, and if the list of objects dropped is empty, then the empty set is returned. That is, if you call the function directly in user-invoked SQL, you get empty; if you call the function in a CREATE event trigger function, you get empty. Since this makes the boolean "drop_in_progress" useless, it'd be removed as well. > I do have > a bit of concern about the save-and-restore logic for the > dropped-object list -- it necessitates that the processing of every > DDL command that can potentially drop objects be bracketed with a > PG_TRY/PG_CATCH block. While that's relatively easy to guarantee > today (but doesn't ALTER TABLE need similar handling?), it seems to me > that it might get broken in the future. How hard would it be to pass > this information up and down the call stack instead of doing it this > way? This would be rather messy, I think; there are too many layers, and too many different functions. > Or at least move the save/restore logic into something inside the > deletion machinery itself so that new callers don't have to worry > about it? Hmm, maybe this can be made to work, I'll see about it. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
> Robert Haas escribió:
> > Or at least move the save/restore logic into something inside the
> > deletion machinery itself so that new callers don't have to worry
> > about it?

I don't think that works well; precisely the point of the initialize/finalize pair of functions is to bracket the drop so that the objects reported by the deletion machinery are stored in the right list. I tried this macro:

/*
 * Wrap a code fragment that executes a command that may drop database
 * objects, so that the event trigger environment is appropriately set up.
 *
 * Note this macro will call EventTriggerDDLCommandEnd if the object type is
 * supported; caller must make sure to call EventTriggerDDLCommandStart by
 * itself.
 */
#define ExecuteDropCommand(isCompleteQuery, codeFragment, has_objtype, objtype) \
    do { \
        slist_head  _save_objlist; \
        bool        _supported; \
        \
        _supported = has_objtype ? EventTriggerSupportsObjectType(objtype) : true; \
        \
        if (isCompleteQuery) \
        { \
            EventTriggerInitializeDrop(&_save_objlist); \
        } \
        PG_TRY(); \
        { \
            codeFragment; \
            if (isCompleteQuery && _supported) \
            { \
                EventTriggerDDLCommandEnd(parsetree); \
            } \
        } \
        PG_CATCH(); \
        { \
            if (isCompleteQuery && _supported) \
            { \
                EventTriggerFinalizeDrop(_save_objlist); \
            } \
            PG_RE_THROW(); \
        } \
        PG_END_TRY(); \
        EventTriggerFinalizeDrop(_save_objlist); \
    } while (0)

This looks nice in DropOwned:

    case T_DropOwnedStmt:
        if (isCompleteQuery)
            EventTriggerDDLCommandStart(parsetree);
        ExecuteDropCommand(isCompleteQuery,
                           DropOwnedObjects((DropOwnedStmt *) parsetree),
                           false, 0);
        break;

And it works for DropStmt too:

    ExecuteDropCommand(isCompleteQuery,
                       switch (stmt->removeType)
                       {
                           case OBJECT_INDEX:
                           case OBJECT_TABLE:
                           case OBJECT_SEQUENCE:
                           case OBJECT_VIEW:
                           case OBJECT_MATVIEW:
                           case OBJECT_FOREIGN_TABLE:
                               RemoveRelations((DropStmt *) parsetree);
                               break;
                           default:
                               RemoveObjects((DropStmt *) parsetree);
                               break;
                       },
                       true, stmt->removeType);

but a rather serious problem is that pgindent refuses to work completely on this (which is understandable, IMV).
My editor doesn't like the braces inside something that looks like a function call, either. We use this pattern (a codeFragment being called by a macro) as ProcessMessageList in inval.c, but the code fragments there are much simpler. And in AlterTable, using the macro would be much uglier because the code fragment is more convoluted. I'm not really sure if it's worth having the above macro if the only caller is DropOwned. Hmm, maybe I should be considering a pair of macros instead -- UTILITY_START_DROP and UTILITY_END_DROP. I'll give this a try. Other ideas are welcome. FWIW, I noticed that the AlterTable case lacks a call to DDLCommandEnd; reporting that to Dimitri led to him noticing that DefineStmt also lacks one. This is a simple bug in the already-committed implementation which should probably be fixed separately from this patch. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera escribió: > Hmm, maybe I should be considering a pair of macros instead -- > UTILITY_START_DROP and UTILITY_END_DROP. I'll give this a try. Other > ideas are welcome. This seems to work. See attached; I like the result because there's no clutter and it supports all three cases without a problem. I also added a new output column to pg_event_trigger_dropped_objects, "subobject_name", which is NULL except when a column is being dropped, in which case it contains the column name. Also, the "object type" column now says "table column" instead of "table" when dropping a column. Another question arose in testing: this reports dropping of temp objects, too, but of course not always: particularly not when temp objects are dropped at the end of a session (or the beginning of a session that reuses a previously used temp schema). I find this rather inconsistent and I wonder if we should instead suppress reporting of temp objects. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
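As context for the temp-schema filtering discussed above: temp namespaces can be recognized with existing catalog functions. The patch itself does this check in C during drop collection, so the following SQL is illustrative only:

```sql
-- List temp schemas in pg_namespace, distinguishing this backend's own
-- temp schema from temp schemas belonging to other backends.
SELECT n.oid,
       n.nspname,
       n.oid = pg_my_temp_schema()    AS is_my_temp,
       pg_is_other_temp_schema(n.oid) AS is_other_temp
FROM pg_namespace n
WHERE n.nspname LIKE 'pg\_temp\_%'
   OR n.nspname LIKE 'pg\_toast\_temp\_%';
```

An event trigger function wanting the same behavior in userland could apply the same test to each reported schema name.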
Okay, I added a couple of lines to skip reporting dropped temp schemas, and to skip any objects belonging to any temp schema (not just my own, note). Not posting a new version because the change is pretty trivial.

Now, one last thing that comes up is what to do about objects that don't have straight names (such as pg_amop, pg_amproc, pg_default_acl etc etc): the only thing you get is a catalog OID and an object OID ... but those are pretty useless, because by the time you get to the ddl_command_end trigger the objects are gone from the catalog. Maybe we should report *something* about those. Say, perhaps the object description ... but if we want that, it should be untranslated (i.e. not just what getObjectDescription gives you, because that may be translated, so we would need to patch it so that it only translates if the caller requests it).

Another example is reporting of functions: right now you get the function name ... but if there are overloaded functions there's no way to know which one was dropped. Example:

alvherre=# create function f(int, int) returns int language sql
           as $$ select $1 + $2; $$;
CREATE FUNCTION
alvherre=# create function f(int, int, int) returns int language sql
           as $$ select $1 + $2 + $3; $$;
CREATE FUNCTION
alvherre=# drop function f(int, int);
DROP FUNCTION
alvherre=# select * from dropped_objects;
   type   | schema | object | subobj | curr_user | sess_user
----------+--------+--------+--------+-----------+-----------
 function | public | f      |        | alvherre  | alvherre

Maybe we could use the "subobject_name" field (what you see as subobj above) to store the function signature (perhaps excluding the function name), for example. So you'd get object="f" subobject="(int,int)". Or maybe we should stash the whole function signature as the name and leave subobject NULL.

The reason I'm worrying about this is that it might be important for some use cases. For instance, replication cases probably don't care about that at all.
But if we want to be able to use event triggers for auditing user activity, we need this info. Thoughts? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera escribió: > Now, one last thing that comes up is what about objects that don't have > straight names (such as pg_amop, pg_amproc, pg_default_acl etc etc), the > only thing you get is a catalog OID and an object OID ... but they are > pretty useless because by the time you get to the ddl_command_end > trigger, the objects are gone from the catalog. Maybe we should report > *something* about those. Here's another idea --- have three columns, "type", "schema" (as in the current patch and as shown above), and a third one for object identity. For tables and other objects that have simple names, the identity would be their names. For columns, it'd be <tablename>.<columnname>. For functions, it'd be the complete signature. And so on. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Mar 5, 2013 at 5:37 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Okay, I added a couple of lines to skip reporting dropped temp schemas, > and to skip any objects belonging to any temp schema (not just my own, > note). Not posting a new version because the change is pretty trivial. > > Now, one last thing that comes up is what about objects that don't have > straight names (such as pg_amop, pg_amproc, pg_default_acl etc etc), the > only thing you get is a catalog OID and an object OID ... but they are > pretty useless because by the time you get to the ddl_command_end > trigger, the objects are gone from the catalog. Maybe we should report > *something* about those. Say, perhaps the object description ... but if > we want that, it should be untranslated (i.e. not just what > getObjectDescription gives you, because that may be translated, so we > would need to patch it so that it only translates if the caller requests > it) Broadly, I suggest making the output format match as exactly as possible what commands like COMMENT and SECURITY LABEL accept as input. We've already confronted all of these notational issues there. Columns are identified as COLUMN table.name; functions as FUNCTION function_name(argtypes); etc. Of course it's fine to split the object type off into a separate column, but it should have the same name here that it does there. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Here's another idea --- have three columns, "type", "schema" (as in the > current patch and as shown above), and a third one for object identity. > > For tables and other objects that have simple names, the identity would > be their names. For columns, it'd be <tablename>.<columnname>. For > functions, it'd be the complete signature. And so on. Sounds very good as an extra column yes. Robert Haas <robertmhaas@gmail.com> writes: > Broadly, I suggest making the output format match as exactly as > possible what commands like COMMENT and SECURITY LABEL accept as > input. We've already confronted all of these notational issues there. > Columns are identified as COLUMN table.name; functions as FUNCTION > function_name(argtypes); etc. Of course it's fine to split the object > type off into a separate column, but it should have the same name here > that it does there. I would like the format to be easily copy/paste'able to things such as regclass/regtype/regprocedure casts, and apparently the COMMENT input format seems to be the same as that one, so +1 from me. -- Dimitri Fontaine 06 63 07 10 78 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
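As a quick illustration of the copy/paste-ability being asked for here: identity strings in the COMMENT-compatible format cast straight back to the corresponding OID types. (The function name below is made up, and the example assumes it exists.)

```sql
CREATE FUNCTION f(int, int) RETURNS int
    LANGUAGE sql AS $$ SELECT $1 + $2 $$;

-- A function identity written as name(argtypes) round-trips through
-- regprocedure; likewise a table name round-trips through regclass.
SELECT 'f(integer,integer)'::regprocedure;  -- resolves to the function's OID
SELECT 'pg_catalog.pg_class'::regclass;
```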
On Tue, Mar 5, 2013 at 12:45 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Hmm, maybe I should be considering a pair of macros instead -- > UTILITY_START_DROP and UTILITY_END_DROP. I'll give this a try. Other > ideas are welcome. That seems like a possibly promising idea. I do wonder how well any of this is going to scale. Presumably people are going to want similar things for CREATE and (hardest) ALTER. Seems like ProcessUtility() could get pretty messy and confusing. But I don't have a better idea, either. :-( -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas escribió: > On Tue, Mar 5, 2013 at 12:45 PM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: > > Hmm, maybe I should be considering a pair of macros instead -- > > UTILITY_START_DROP and UTILITY_END_DROP. I'll give this a try. Other > > ideas are welcome. > > That seems like a possibly promising idea. I do wonder how well any > of this is going to scale. I did follow up with a patch implementing that; did you see it? > Presumably people are going to want > similar things for CREATE and (hardest) ALTER. Seems like > ProcessUtility() could get pretty messy and confusing. But I don't > have a better idea, either. :-( Well, the first thing that we need to settle is the user interface. Normalized command strings don't seem to cut it; requiring users to write SQL parsers is rather unfriendly IMHO. The current idea of having a function that returns objects affected by the command seems relatively sensible. For drops, it seems pretty straightforward so far. For CREATE it's probably somewhat more involved, but seems doable in principle (but yes, we're going to have to sprinkle ProcessUtility() with a lot of UTILITY_START/END_CREATE calls). Not sure about ALTER; maybe we will need a completely different idea to attack that. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Mar 8, 2013 at 9:18 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Robert Haas escribió: >> On Tue, Mar 5, 2013 at 12:45 PM, Alvaro Herrera >> <alvherre@2ndquadrant.com> wrote: >> > Hmm, maybe I should be considering a pair of macros instead -- >> > UTILITY_START_DROP and UTILITY_END_DROP. I'll give this a try. Other >> > ideas are welcome. >> >> That seems like a possibly promising idea. I do wonder how well any >> of this is going to scale. > > I did follow up with a patch implementing that; did you see it? No, sorry. Which thread is it on? >> Presumably people are going to want >> similar things for CREATE and (hardest) ALTER. Seems like >> ProcessUtility() could get pretty messy and confusing. But I don't >> have a better idea, either. :-( > > Well, the first thing that we need to settle is the user interface. > Normalized command strings don't seem to cut it; requiring users to write > SQL parsers is rather unfriendly IMHO. I could not agree more. The format we're moving towards for dropped objects can easily be converted back into SQL if you happen to want to do that, but it's also much easier to process programmatically than a normalized command string. So I think we get the best of both worlds with this design. > The current idea of having a > function that returns objects affected by the command seems relatively > sensible. For drops, it seems pretty straightforward so far. For CREATE > it's probably somewhat more involved, but seems doable in principle (but > yes, we're going to have to sprinkle ProcessUtility() with a lot of > UTILITY_START/END_CREATE calls). > > Not sure about ALTER; maybe we will need a completely different idea to > attack that. I am inclined to think that putting this logic in ProcessUtility isn't scalable, even for CREATE, and even more so for ALTER, unless we can put it around everything in that function, rather than each command individually.
Suppose for example that on entry to that function we simply did this:

    if (isCompleteQuery)
        ++CompleteQueryNestingLevel;

...and at exit, we did the reverse. This could work a bit like the GUC nesting level. When an object is dropped, we find the array slot for the current nesting level, and it's, say, a List **, and we push a new element onto that list. When somebody asks for a list of dropped objects, we pull from the list for the current nesting level. When decrementing the nesting level, we flush the list associated with the old nesting level, if any:

    if (isCompleteQuery)
    {
        list_free(dropped_objects_list[CompleteQueryNestingLevel]);
        dropped_objects_list[CompleteQueryNestingLevel] = NIL;
        --CompleteQueryNestingLevel;
    }

Now, if we want to support CREATE, we can just have a created_objects_list array as well, and only minimal changes are required here. If we want to support ALTER we can have an altered_objects_list, and the only decision is what data to stuff into the list elements. I think this is a lot better than decorating each individual command, which is already unwieldy and bound to get worse. It is a bit of a definitional change, because it implies that the list of dropped objects is the list of objects dropped by the most-nearly-enclosing DDL command, rather than the list of objects dropped by the most-nearly-enclosing DDL command *that is capable of dropping objects*. However, I'm inclined to think that's a better definition anyway. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas escribió: > On Fri, Mar 8, 2013 at 9:18 AM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > > Robert Haas escribió: > >> On Tue, Mar 5, 2013 at 12:45 PM, Alvaro Herrera > >> <alvherre@2ndquadrant.com> wrote: > >> > Hmm, maybe I should be considering a pair of macros instead -- > >> > UTILITY_START_DROP and UTILITY_END_DROP. I'll give this a try. Other > >> > ideas are welcome. > >> > >> That seems like a possibly promising idea. I do wonder how well any > >> of this is going to scale. > > > > I did followup with a patch implementing that; did you see it? > > No, sorry. Which thread is it on? http://www.postgresql.org/message-id/20130305214218.GP9507@alvh.no-ip.org I think Gmail's feature of breaking threads when subject changes is an annoyance here. I somehow added a "g" at the end and later dropped it. I didn't remember that behavior of Gmail's. > > The current idea of having a > > function that returns objects affected by the command seems relatively > > sensible. For drops, it seems pretty straighforward so far. For CREATE > > it's probably somewhat more involved, but seems doable in principle (but > > yes, we're going to have to sprinkle ProcessUtility() with a lot of > > UTILITY_START/END_CREATE calls). > > > > Not sure about ALTER; maybe we will need a completely different idea to > > attack that. > > I am inclined to think that putting this logic in ProcessUtility isn't > scalable, even for CREATE, and even moreso for ALTER, unless we can > put it around everything in that function, rather than each command > individually. Suppose for example that on entry to that function we > simply did this: > > if (isCompleteQuery) > ++CompleteQueryNestingLevel; > > ...and at exit, we did the reverse. This could work a bit like the > GUC nesting level. Hmm, this seems an interesting idea to explore. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera escribió: > > I am inclined to think that putting this logic in ProcessUtility isn't > > scalable, even for CREATE, and even moreso for ALTER, unless we can > > put it around everything in that function, rather than each command > > individually. Suppose for example that on entry to that function we > > simply did this: > > > > if (isCompleteQuery) > > ++CompleteQueryNestingLevel; > > > > ...and at exit, we did the reverse. This could work a bit like the > > GUC nesting level. > > Hmm, this seems an interesting idea to explore. Okay, here's a patch implementing the spirit of this idea. We don't keep track of a nesting level per se; I tried that approach and it seemed a dead end. Instead I created a pair of macros that are pretty much the same as UTILITY_BEGIN_DROP, except that they enclose the whole of ProcessUtility instead of individual commands. These macros set up a context for event_trigger.c to record affected actions/objects. Currently this only keeps track of dropped objects (which is what this particular patch is all about), but it appears reasonably simple to extend to objects created, and subcommands of ALTER. I have also modified the return type of pg_event_trigger_dropped_objects, so that instead of object name it returns object identity (as well as object type and schema name). For example, for functions the identity is the signature; for tables, it's the name. Each object type has its own format, modelled after the COMMENT syntax (some object types don't have COMMENT support, so I had to come up with some. This might need further discussion to arrive at the best possible rendering for each object class). I had to add a nearly-duplicate of getObjectDescription for this, which needs a bit more work yet so that it doesn't schema-qualify the object names even when not in path (most objects are not qualified, but I think operators and types still are.) 
(There's no SQL-callable function to get the object identity, such as what we have as pg_describe_object. Not sure how important this is.) Also, we need to ensure that unsupported object types such as pg_default_acl rows are not reported. I have not tested this yet. Supposedly this is part of a future patch that will encompass all DCL commands (grant, revoke, create/drop roles, etc). This patch also adds event trigger support for DROP OWNED (a two-line change). -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
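For reference, the COMMENT forms that the identity output is modelled on look like this (standard PostgreSQL syntax; the object names are made up):

```sql
COMMENT ON TABLE public.foo IS 'a table: identity is the (qualified) name';
COMMENT ON COLUMN public.foo.bar IS 'a column: identity is table.column';
COMMENT ON FUNCTION f(int, int) IS 'a function: identity is the signature';
COMMENT ON OPERATOR + (int, int) IS 'an operator: identity includes arg types';
```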
Attachment
Here's a new version of this patch, rebased on top of the new pg_identify_object() stuff. Note that the regression test doesn't work yet, because I didn't adjust to the new identity output definition (the docs need work, too). But that's a simple change to do. I'm leaving that for later. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
On Wed, Mar 20, 2013 at 5:42 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > Here's a new version of this patch, rebased on top of the new > pg_identify_object() stuff. Note that the regression test doesn't work > yet, because I didn't adjust to the new identity output definition (the > docs need work, too). But that's a simple change to do. I'm leaving > that for later. I think this is getting there. A few things to think about: - pg_event_trigger_dropped_objects seems to assume that currentEventTriggerState will be pointing to the same list on every call. But is that necessarily true? I'm thinking about a case where someone opens a cursor in an event trigger and then tries to read from that cursor later in the transaction. I think you might be able to crash the server that way. - I am not wild about the idea of propagating PG_TRY/PG_CATCH blocks into yet more places. On Linux-x86 they are pretty cheap because Linux doesn't need a system call to change the signal mask and x86 has few registers that must be saved-and-restored, but elsewhere this can be a performance problem. Now maybe ProcessUtility is not a sufficiently-frequently called function for this to matter... but I'm not sure. The alternative is to teach the error handling pathways about this in somewhat greater detail, since the point of TRY/CATCH is to clean up things that the regular error handling stuff doesn't know about. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas escribió: > On Wed, Mar 20, 2013 at 5:42 PM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: > > Here's a new version of this patch, rebased on top of the new > > pg_identify_object() stuff. Note that the regression test doesn't work > > yet, because I didn't adjust to the new identity output definition (the > > docs need work, too). But that's a simple change to do. I'm leaving > > that for later. > > I think this is getting there. A few things to think about: Thanks. > - pg_event_trigger_dropped_objects seems to assume that > currentEventTriggerState will be pointing to the same list on every > call. But is that necessarily true? I'm thinking about a case where > someone opens a cursor in an event trigger and then tries to read from > that cursor later in the transaction. I think you might be able to > crash the server that way. Well, no, because it uses materialized return mode, so there's no "next time" --- the list is constructed completely before pg_event_trigger_dropped_objects returns. So there's no such hole. > - I am not wild about the idea of propagating PG_TRY/PG_CATCH blocks > into yet more places. On Linux-x86 they are pretty cheap because > Linux doesn't need a system call to change the signal mask and x86 has > few registers that must be saved-and-restored, but elsewhere this can > be a performance problem. Now maybe ProcessUtility is not a > sufficiently-frequently called function for this to matter... but I'm > not sure. The alternative is to teach the error handling pathways > about this in somewhat greater detail, since the point of TRY/CATCH is > to cleanup things that the regular error handling stuff doesn't now > about. I tried this and it doesn't work. 
The "error pathways" you speak about would be the xact.c entry points to commit and abort transactions; however, there's a problem with that because some of the commands that ProcessUtility() deals with have their own transaction management calls internally; so those would call CommitTransaction() and the event trigger state would go away, and then when control gets back to ProcessUtility there would be nothing to clean up. I think we could ignore the problem, or install smarts in ProcessUtility to avoid calling event_trigger.c when one of those commands is involved -- but this seems to me a solution worse than the problem. So all in all I feel like the macro on top of PG_TRY is the way to go.

Now there *is* one rather big performance problem in this patch, which is that it turns on collection of object dropped data regardless of there being event triggers that use the info at all. That's a serious drawback and we're going to get complaints about it. So we need to do something to fix that. One idea that comes to mind is to add some more things to the grammar, CREATE EVENT TRIGGER name ... WITH ('DROPPED OBJECTS'); or some such, so that when events happen for which any triggers have that flag enabled, *then* collecting is activated, otherwise not. This would be stored in a new column in pg_event_trigger (say "evtsupport", a char array much like proargmodes). The sequence of (ahem) events goes like this:

ProcessUtility()
  EventTriggerBeginCompleteQuery()
  EventTriggerDDLCommandStart()
    EventCacheLookup()
    EventTriggerInvoke()
  .. run whatever command we've been handed ...
  EventTriggerDDLCommandEnd()
    EventCacheLookup()
    EventTriggerInvoke()
  EventTriggerEndCompleteQuery()

So EventTriggerBeginCompleteQuery() will have to peek into the event trigger cache for any ddl_command_end triggers that might apply, and see if any of them has the flag for "dropped objects". If it's there, then enable dropped object collection.
-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Now there *is* one rather big performance problem in this patch, which > is that it turns on collection of object dropped data regardless of > there being event triggers that use the info at all. That's a serious > drawback and we're going to get complaints about it. So we need to do > something to fix that. > One idea that comes to mind is to add some more things to the grammar, > CREATE EVENT TRIGGER name ... WITH ('DROPPED OBJECTS'); Uh ... surely we can just notice whether there's a trigger on the object-drop event? I don't understand why we need *yet more* mechanism here. regards, tom lane
Tom Lane escribió: > Alvaro Herrera <alvherre@2ndquadrant.com> writes: > > Now there *is* one rather big performance problem in this patch, which > > is that it turns on collection of object dropped data regardless of > > there being event triggers that use the info at all. That's a serious > > drawback and we're going to get complaints about it. So we need to do > > something to fix that. > > > One idea that comes to mind is to add some more things to the grammar, > > CREATE EVENT TRIGGER name ... WITH ('DROPPED OBJECTS'); > > Uh ... surely we can just notice whether there's a trigger on the > object-drop event? I don't understand why we need *yet more* > mechanism here. There's no object-drop event, only ddl_command_end. From previous discussion I understood we didn't want a separate event, so that's what we've been running with. However, I think previous discussions have conflated many different things, and we've been slowly fixing them one by one; so perhaps at this point it does make sense to have a new object-drop event. Let's discuss it -- we would define it as taking place just before ddl_command_end, and firing any time a command (with matching tag?) has called performDeletion or performMultipleDeletions. Does that sound okay? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Mar 26, 2013 at 3:02 PM, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > I tried this and it doesn't work. The "error pathways" you speak about > would be the xact.c entry points to commit and abort transactions; > however, there's a problem with that because some of the commands that > ProcessUtility() deals with have their own transaction management > calls internally; so those would call CommitTransaction() and the > event trigger state would go away, and then when control gets back to > ProcessUtility there would be nothing to clean up. I think we could > ignore the problem, or install smarts in ProcessUtility to avoid calling > event_trigger.c when one of those commands is involved -- but this seems > to me a solution worse than the problem. So all in all I feel like the > macro on top of PG_TRY is the way to go. I see. :-( > Now there *is* one rather big performance problem in this patch, which > is that it turns on collection of object dropped data regardless of > there being event triggers that use the info at all. That's a serious > drawback and we're going to get complaints about it. So we need to do > something to fix that. Really? Who is going to care about that? Surely that overhead is quite trivial. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas escribió: > On Tue, Mar 26, 2013 at 3:02 PM, Alvaro Herrera > <alvherre@2ndquadrant.com> wrote: > > Now there *is* one rather big performance problem in this patch, which > > is that it turns on collection of object dropped data regardless of > > there being event triggers that use the info at all. That's a serious > > drawback and we're going to get complaints about it. So we need to do > > something to fix that. > > Really? Who is going to care about that? Surely that overhead is > quite trivial. I don't think it is, because it involves syscache lookups for each object being dropped, many extra pallocs, etc. Surely that's many times bigger than the PG_TRY overhead you were worried about. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Alvaro Herrera escribió: > However, I think previous discussions have conflated many different > things, and we've been slowly fixing them one by one; so perhaps at this > point it does make sense to have a new object-drop event. Let's discuss > it -- we would define it as taking place just before ddl_command_end, > and firing any time a command (with matching tag?) has called > performDeletion or performMultipleDeletions. Does that sound okay? Here's another version of this patch. I hope this is really final now ... but then, that's what I thought of the previous two versions, too. I have re-instated event sql_drop, which takes place just before ddl_command_end, and is called a single time per command (not once per object dropped, as the old definition would have had it). Within a function running in that event, you can call pg_event_trigger_dropped_objects(); that will give you a list of all objects that have been dropped. Since the deletion command has already been run, the objects are not in catalogs anymore. There are no magic TG_* variables about objects deleted. I'm a bit unhappy about having to add calls to EventTriggerSQLDrop() just before each call to EventTriggerDDLCommandEnd(), but I didn't see any way to make this less duplicative. Docs and regression tests have been minimally fixed. The new event is called sql_drop, note. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Alvaro Herrera escribió: > Alvaro Herrera escribió: > > > However, I think previous discussions have conflated many different > > things, and we've been slowly fixing them one by one; so perhaps at this > > point it does make sense to have a new object-drop event. Let's discuss > > it -- we would define it as taking place just before ddl_command_end, > > and firing any time a command (with matching tag?) has called > > performDeletion or performMultipleDeletions. Does that sound okay? > [1mdiff --git a/doc/src/sgml/event-trigger.sgml b/doc/src/sgml/event-trigger.sgml[m > [1mindex 71241c8..492a4ed 100644[m > [1m--- a/doc/src/sgml/event-trigger.sgml[m > [1m+++ b/doc/src/sgml/event-trigger.sgml[m Uh, seems I posted the --color version of the patch last night. Easy mistake to make. Here's a readable version (context diff, even), with a couple of bugs fixed that I noticed while writing the final docs. Some random minor cleanup too. I also took the opportunity to refactor some common code in the invoking sequence of existing events; there was way too much unnecessary duplication there. This horse is about to become corpse; last chance for beating it up. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Attachment
Pushed, with some further minor changes. One not-so-minor change I introduced was that pg_event_trigger_dropped_objects() now only works within a sql_drop event function. The reason I decided to do this was that if we don't have that protection, then it is possible to have a ddl_command_end trigger calling pg_event_trigger_dropped_objects(); and if there is an sql_drop trigger, then it'd return the list of dropped objects, but if there's no sql_drop trigger, it'd raise an error. That seemed surprising enough action-at-a-distance that some protection is warranted. Thanks for all the review. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
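To make the committed interface concrete, here is a minimal sketch of a sql_drop trigger. The function and trigger names are made up; the column names follow the pg_event_trigger_dropped_objects() definition from the final patch (EXECUTE PROCEDURE being the 9.3-era spelling):

```sql
CREATE OR REPLACE FUNCTION log_drops() RETURNS event_trigger
    LANGUAGE plpgsql AS $$
DECLARE
    obj record;
BEGIN
    -- Only callable from within a sql_drop event trigger function,
    -- per the restriction described above.
    FOR obj IN SELECT * FROM pg_event_trigger_dropped_objects()
    LOOP
        RAISE NOTICE 'dropped % %', obj.object_type, obj.object_identity;
    END LOOP;
END;
$$;

CREATE EVENT TRIGGER log_all_drops ON sql_drop
    EXECUTE PROCEDURE log_drops();

-- A subsequent DROP TABLE foo; would then emit a notice along the lines
-- of: NOTICE: dropped table public.foo
```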
Alvaro Herrera <alvherre@2ndquadrant.com> writes: > Pushed, with some further minor changes. One not-so-minor change I Thanks a lot for all the work you did put into this patch! > introduced was that pg_event_trigger_dropped_objects() now only works > within a sql_drop event function. The reason I decided to do this was > that if we don't have that protection, then it is possible to have a > ddl_command_end trigger calling pg_event_trigger_dropped_objects(); and > if there is an sql_drop trigger, then it'd return the list of dropped > objects, but if there's no sql_drop trigger, it'd raise an error. That > seemed surprising enough action-at-a-distance that some protection is > warranted. +1 I hope that we can add another such function for ddl_command_start and ddl_command_end function to get at some object information from there, now that the support code is in place. That would allow making any use of the event trigger facility in 9.3 without having to write a C coded extension… which has been the goal for the last two cycles… Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support