Thread: Event Triggers reduced, v1
Hi, Allow me to open the new season of the DML trigger series, named pg_event_trigger. This first episode is all about setting up the drama, so that next ones make perfect sense. The attached patch contains all the infrastructure for event triggers and also a first implementation of them for the event "command_start", implemented in a single place in utility.c. The infrastructure is about: - new catalog - grammar for new commands - documentation skeleton - pg_dump support - psql support - ability to actually run user triggers - written in "core languages" (pl/c, pl/pgsql, pl/python, pl/perl, pl/tcl) - limited subcommand handling The goal for this first patch is to avoid semantics issues so that we can get something technically clean in, and have more time to talk semantics next times. The main discussion to avoid is deciding if we want to fire event triggers for CREATE SEQUENCE and CREATE INDEX in a command that just did implement a SERIAL PRIMARY KEY in a table. This way of doing things is possible because we took time to set a road map together with Robert when we were both in Ottawa, and because it's early in the cycle. The complete feature still needs to happen before 9.3 is released, but any realistic progress has to be cut down. Look, it's an easy little skinny patch to review, right: git --no-pager diff --shortstat master 62 files changed, 4546 insertions(+), 108 deletions(-) This patch includes regression tests that we worked on with Thom last rounds, remember that they only run in the serial schedule, that means with `make installcheck` only. Adding noisy output at random while the parallel schedule run is a good way to break all the regression testing, so I've been avoiding that. I don't think this patch is ready as it is, by the way, I couldn't devote nearly enough time to have something that polished already. I think even with setting the goal not to embrace semantics, reviewing this patch will certainly bring some interesting design discussions. Regards, -- Dimitri Fontaine PostgreSQL DBA, Architecte
Attachment
On 15 June 2012 21:27, Dimitri Fontaine <dfontaine@hi-media.com> wrote: > The goal for this first patch is to avoid semantics issues so that we > can get something technically clean in, and have more time to talk > semantics next times. The main discussion to avoid is deciding if we > want to fire event triggers for CREATE SEQUENCE and CREATE INDEX in a > command that just did implement a SERIAL PRIMARY KEY in a table. So this patch triggers once per top level command, just with information about the set of nested events? I'm happy if we make sweeping initial points like "don't generate events for sequences and indexes" in the first version. We can always add more later. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Excerpts from Dimitri Fontaine's message of vie jun 15 16:27:50 -0400 2012: > The attached patch contains all the infrastructure for event triggers > and also a first implementation of them for the event "command_start", > implemented in a single place in utility.c. > > The infrastructure is about: > > - new catalog > - grammar for new commands > - documentation skeleton > - pg_dump support > - psql support > - ability to actually run user triggers > - written in "core languages" > (pl/c, pl/pgsql, pl/python, pl/perl, pl/tcl) > - limited subcommand handling Did you try REASSIGN OWNED and DROP OWNED with a role that has defined some event triggers? > Look, it's an easy little skinny patch to review, right: > > git --no-pager diff --shortstat master > 62 files changed, 4546 insertions(+), 108 deletions(-) Skinny ... right. I started to give it a look -- I may have something useful to comment later. > This patch includes regression tests that we worked on with Thom last > rounds, remember that they only run in the serial schedule, that means > with `make installcheck` only. Adding noisy output at random while the > parallel schedule run is a good way to break all the regression testing, > so I've been avoiding that. Hmm, I don't like the idea of a test that only runs in serial mode. Maybe we can find some way to make it work in parallel mode as well. I don't have anything useful to comment right now. -- Álvaro Herrera <alvherre@commandprompt.com> The PostgreSQL Company - Command Prompt, Inc. PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Fri, Jun 15, 2012 at 4:27 PM, Dimitri Fontaine <dfontaine@hi-media.com> wrote: > Allow me to open the new season of the DML trigger series, named > pg_event_trigger. This first episode is all about setting up the drama, > so that next ones make perfect sense. Comments: 1. I still think we ought to get rid of the notion of BEFORE or AFTER (i.e. pg_event_trigger.evttype) and just make that detail part of the event name (e.g. pg_event_trigger.evtevent). Many easily forseeable event types will be more like "during" rather than "before" or "after", and for those that do have a notion of before and after, we can have two different event names and include the word "before" or "after" there. I am otherwise satisfied with the schema you've chosen. 2. I think it's important to be able to add new types of event triggers without creating excessive parser bloat. I think it's important to use some kind of generic syntax here which will be able to apply to all types of triggers we may want to add, now or in the future. The easiest way to do that is to use literal syntax for the list of command tags, rather than writing them out as key words: e.g. 'ALTER TABLE' rather than ALTER TABLE. It's not quite as pretty, but the savings in parser bloat and future code change seem well worth it.Or, alternatively, we could use identifier style, e.g.alter_table, as I previously suggested. 3. The event trigger cache seems to be a few bricks shy of a load. First, event_trigger_cache_is_stalled is mis-named; I think you mean "stale", not "stalled". Second, instead of setting that flag and then rebuilding the cache when you see the flag set, how about just blowing away the cache contents whenever you would have set the flag? That seems a whole lot simpler and cleaner, and removes the need for a force_rebuild flag on BuildEventTriggerCache(). Third, ISTM that this isn't going to work correctly if backend A performs an event after backend B has built its cache. To fix this, I think you need to rip out all the places where you force a rebuild locally and instead use something like CacheRegisterSyscacheCallback() to blow away the cache whenever something changes; you might find it helpful to look at attoptcache.c. 4. The documentation doesn't build. openjade:reference.sgml:44:4:W: cannot generate system identifier for general entity "alterEventTrigger" openjade:reference.sgml:44:4:E: general entity "alterEventTrigger" not defined and no default entity openjade:reference.sgml:44:21:E: reference to entity "alterEventTrigger" for which no system identifier could be generated openjade:reference.sgml:44:3: entity was defined here openjade:reference.sgml:86:4:W: cannot generate system identifier for general entity "createEventTrigger" openjade:reference.sgml:86:4:E: general entity "createEventTrigger" not defined and no default entity openjade:reference.sgml:86:22:E: reference to entity "createEventTrigger" for which no system identifier could be generated openjade:reference.sgml:86:3: entity was defined here openjade:reference.sgml:125:4:W: cannot generate system identifier for general entity "dropEventTrigger" openjade:reference.sgml:125:4:E: general entity "dropEventTrigger" not defined and no default entity openjade:reference.sgml:125:20:E: reference to entity "dropEventTrigger" for which no system identifier could be generated openjade:reference.sgml:125:3: entity was defined here openjade:catalogs.sgml:1868:35:E: character "_" is not allowed in the value of attribute "ZONE" openjade:catalogs.sgml:1868:19:X: reference to non-existent ID "CATALOG-PG-EVENT_TRIGGER" openjade:trigger.sgml:43:47:X: reference to non-existent ID "SQL-CREATECOMMANDTRIGGER" 5. In terms of a psql command, I think that \dev is both not very mnemonic and, as you mentioned in the comment, possibly confusable with SQL/MED. If we're going to have a \d command for this, I suggest something like \dy, which is not very mnemonic either but at least seems unlikely to be confused with anything else. Other things that are unused include \dh, \dj, \dk, \dq, \dw, and \dz, if any of those grab you, or a somewhat broader range of things (but still nothing very memorable) if we include capital letters. Or we could branch out into punctuation, like \d& -- or things that don't begin with the letter d, but that doesn't seem like a particularly good idea. 6. In objectaddress.c, I think that get_object_address_event_trigger should be eliminated in favor of an additional case in get_object_address_unqualified. 7. There are many references to command triggers that still need to be cleaned up. trigger.sgml still makes reference to the name command triggers. plpgsql.sgml also contains vestiges of the command trigger notation, and references to some TG_* variables that I don't think exist in this version of the patch. event_trigger.c is identified (twice) as cmdtrigger.c in the file header comment. The header comment for InsertEventTriggerTuple makes reference to pg_cmdtrigger, as does a line comment in RemoveEventTriggerById. The regression output mentions cmdtrigger in a few places as well. In the documentation, event triggers are mentioned as having return type command_trigger, but it's now called event_trigger. 8. The re-addition of node tags like T_RemoveFuncStmt seems to be a merging error. Changing \dc so that it rejects \dcrap appears to be a leftover from when the command was \dcT. In one place in the docs you have 'avent' for 'event'. In event_trigger.c, you have #ifdef UNDEFINED; project standard is #ifdef NOT_USED (or else you can remove the code). 9. The regression tests seem to now be testing some features that don't exist any more, and might need some rethinking to make what they do match the scope of this patch. 10. I suggest separating out the support for other PLs (Python, Tcl) and submitting that as a later patch, since I'm unqualified to commit it (and I'm hoping to get the rest of this committed). The PL/TCL stuff also contains residual references to the command-trigger notation which should be cleaned up before resubmitting. There's probably more, but I'm all reviewed out for right now. Hopefully that's enough to get you started. I think this is heading in a good direction, even though there's still a good bit of work left to do. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Alvaro Herrera <alvherre@commandprompt.com> writes: > Did you try REASSIGN OWNED and DROP OWNED with a role that has defined > some event triggers? Not yet. Will add to regression tests, good catch. > Hmm, I don't like the idea of a test that only runs in serial mode. > Maybe we can find some way to make it work in parallel mode as well. I don't see how we can manage to do that, as adding an event trigger to some command will impact tests running in parallel. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Wednesday, June 20, 2012 09:33:26 PM Dimitri Fontaine wrote: > > Hmm, I don't like the idea of a test that only runs in serial mode. > > Maybe we can find some way to make it work in parallel mode as well. > > I don't see how we can manage to do that, as adding an event trigger to > some command will impact tests running in parallel. Cant you just put it alone in a test group in the parallel_schedule? Several other tests do that? Andres -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Jun 20, 2012 at 3:39 PM, Andres Freund <andres@2ndquadrant.com> wrote: > On Wednesday, June 20, 2012 09:33:26 PM Dimitri Fontaine wrote: >> > Hmm, I don't like the idea of a test that only runs in serial mode. >> > Maybe we can find some way to make it work in parallel mode as well. >> >> I don't see how we can manage to do that, as adding an event trigger to >> some command will impact tests running in parallel. > Cant you just put it alone in a test group in the parallel_schedule? Several > other tests do that? Yeah, that seems like it should work. If not, I'd say the tests themselves are broken. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi, Robert Haas <robertmhaas@gmail.com> writes: > 1. I still think we ought to get rid of the notion of BEFORE or AFTER > (i.e. pg_event_trigger.evttype) and just make that detail part of the > event name (e.g. pg_event_trigger.evtevent). Many easily forseeable > event types will be more like "during" rather than "before" or > "after", and for those that do have a notion of before and after, we > can have two different event names and include the word "before" or > "after" there. I am otherwise satisfied with the schema you've > chosen. It's not before/after anymore, but rather addon/replace if you will. I kept the INSTEAD OF keyword for the replace semantics, that you've been asking me to keep IIRC, with security policy plugins as a use case. Now we can of course keep those semantics and embed them in the event name we provide users, I though that maybe a documentation matrix of which event support which "mode" would be cleaner to document. We might as well find a clean way to implement both modes for most of the commands, I don't know yet. So, are you sure you want to embed that part of the event trigger semantics in the event name itself? > 2. I think it's important to be able to add new types of event > triggers without creating excessive parser bloat. I think it's I've been trying to do that yes, as you can see with event_name and event_trigger_variable rules. I've been re-using as much existing keywords as I could because I believe that's not causing any measurable bloat, I'll kindly reconsider if necessary, even if sadly. > important to use some kind of generic syntax here which will be able > to apply to all types of triggers we may want to add, now or in the > future. The easiest way to do that is to use literal syntax for the > list of command tags, rather than writing them out as key words: e.g. > 'ALTER TABLE' rather than ALTER TABLE. It's not quite as pretty, but > the savings in parser bloat and future code change seem well worth it. > Or, alternatively, we could use identifier style, e.g. alter_table, > as I previously suggested. Whatever the solution here, we still need to be able to ERROR out very early if the user entered a non supported command here, such as GRANT for the time being. I'm not sure a manual list of strcmp() would be more effective than having bison basically produce the same thing for us. I think using the grammar here makes for easier maintenance. > 3. The event trigger cache seems to be a few bricks shy of a load. I wouldn't be that surprised, mind you. I didn't have nearly as much time I wanted to working on that project. > First, event_trigger_cache_is_stalled is mis-named; I think you mean > "stale", not "stalled". Second, instead of setting that flag and then Stale. Right. Edited. > rebuilding the cache when you see the flag set, how about just blowing > away the cache contents whenever you would have set the flag? That I've been doing that at first, but that meant several full rebuilds in a row in the regression tests, which are adding new event triggers then using them. I though lazily maintaining the cache would be better. > seems a whole lot simpler and cleaner, and removes the need for a > force_rebuild flag on BuildEventTriggerCache(). Third, ISTM that this > isn't going to work correctly if backend A performs an event after > backend B has built its cache. To fix this, I think you need to rip > out all the places where you force a rebuild locally and instead use > something like CacheRegisterSyscacheCallback() to blow away the cache > whenever something changes; you might find it helpful to look at > attoptcache.c. Ok, looking at that for next revision of the patch, which I should be able to post early next week. > 4. The documentation doesn't build. Sorry about that, will get fixed too. > 5. In terms of a psql command, I think that \dev is both not very > mnemonic and, as you mentioned in the comment, possibly confusable > with SQL/MED. If we're going to have a \d command for this, I suggest > something like \dy, which is not very mnemonic either but at least > seems unlikely to be confused with anything else. Other things that > are unused include \dh, \dj, \dk, \dq, \dw, and \dz, if any of those > grab you, or a somewhat broader range of things (but still nothing > very memorable) if we include capital letters. Or we could branch out > into punctuation, like \d& -- or things that don't begin with the > letter d, but that doesn't seem like a particularly good idea. I'm not that fond of psql commands, but I don't think it's going to fly not to have one for event triggers. I could buy \dy. > 6. In objectaddress.c, I think that get_object_address_event_trigger > should be eliminated in favor of an additional case in > get_object_address_unqualified. Sure. It used to be more complex than that when the identifier was a composite with the command name, it makes no sense to separate it away now. Done. > 7. There are many references to command triggers that still need to be > cleaned up. trigger.sgml still makes reference to the name command > triggers. plpgsql.sgml also contains vestiges of the command trigger > notation, and references to some TG_* variables that I don't think > exist in this version of the patch. event_trigger.c is identified > (twice) as cmdtrigger.c in the file header comment. The header > comment for InsertEventTriggerTuple makes reference to pg_cmdtrigger, > as does a line comment in RemoveEventTriggerById. The regression > output mentions cmdtrigger in a few places as well. In the > documentation, event triggers are mentioned as having return type > command_trigger, but it's now called event_trigger. All fixed, will grep for cmd and command in the patch and fix any other one that's still there before sending v2 of the patch. Sorry about that. > 8. The re-addition of node tags like T_RemoveFuncStmt seems to be a > merging error. Changing \dc so that it rejects \dcrap appears to be a > leftover from when the command was \dcT. In one place in the docs you > have 'avent' for 'event'. In event_trigger.c, you have #ifdef > UNDEFINED; project standard is #ifdef NOT_USED (or else you can remove > the code). Will see about node tags and psql clean merge. Docs fixed. I meant to remove the code. Done now. Thanks. > 9. The regression tests seem to now be testing some features that > don't exist any more, and might need some rethinking to make what they > do match the scope of this patch. The current implementation must be kicking for all supported commands and we have a authoritative list of them in the grammar, so I wanted to maintain a regression test suite where they are all exercised, even if we're exercising the same code path each time. That's meant to change with later patch, I'm not sure how much gain we have to remove test covering now knowing that we certainly won't release with only that first patch. > 10. I suggest separating out the support for other PLs (Python, Tcl) > and submitting that as a later patch, since I'm unqualified to commit > it (and I'm hoping to get the rest of this committed). The PL/TCL > stuff also contains residual references to the command-trigger > notation which should be cleaned up before resubmitting. Fixed pltcl internal references. Will produce separate patches for next revision. > There's probably more, but I'm all reviewed out for right now. > Hopefully that's enough to get you started. I think this is heading > in a good direction, even though there's still a good bit of work left > to do. Thanks for your review, it's clearly enough to get started chewing on the patch! Early followers can see the progress, when it happens, in the github repository, if waiting for the next patch is unbearably long :) https://github.com/dimitri/postgres https://github.com/dimitri/postgres/tree/evt_trig_v1 Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Robert Haas <robertmhaas@gmail.com> writes: >> Cant you just put it alone in a test group in the parallel_schedule? Several >> other tests do that? > > Yeah, that seems like it should work. If not, I'd say the tests > themselves are broken. I completely missed that we could do that. I don't feel bright. Of course it just works. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Wed, Jun 20, 2012 at 4:36 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> 1. I still think we ought to get rid of the notion of BEFORE or AFTER >> (i.e. pg_event_trigger.evttype) and just make that detail part of the >> event name (e.g. pg_event_trigger.evtevent). Many easily forseeable >> event types will be more like "during" rather than "before" or >> "after", and for those that do have a notion of before and after, we >> can have two different event names and include the word "before" or >> "after" there. I am otherwise satisfied with the schema you've >> chosen. > > It's not before/after anymore, but rather addon/replace if you will. I > kept the INSTEAD OF keyword for the replace semantics, that you've been > asking me to keep IIRC, with security policy plugins as a use case. > > Now we can of course keep those semantics and embed them in the event > name we provide users, I though that maybe a documentation matrix of > which event support which "mode" would be cleaner to document. We might > as well find a clean way to implement both modes for most of the > commands, I don't know yet. > > So, are you sure you want to embed that part of the event trigger > semantics in the event name itself? Yeah, pretty sure. I think that for regular triggers, BEFORE, AFTER, and INSTEAD-OF are the firing-point specification. But even triggers will have more than three firing points, probably eventually quite a lot more. So we need something more flexible. But we don't need that more flexible thing AND ALSO the before/after/instead-of specification, which I think in most cases won't be meaningful anyway.It happens to be somewhat sensible for this initialfiring point, but I think for most of them there will be just one place, and in many cases it will be neither before, nor after, nor instead-of. >> 2. I think it's important to be able to add new types of event >> triggers without creating excessive parser bloat. I think it's > > I've been trying to do that yes, as you can see with event_name and > event_trigger_variable rules. I've been re-using as much existing > keywords as I could because I believe that's not causing any measurable > bloat, I'll kindly reconsider if necessary, even if sadly. The issue is that the size of the parser tables grow with the square of the number of states. This will introduce lots of new states that we don't really need; and every new kind of event trigger that we want to add will introduce more. >> 3. The event trigger cache seems to be a few bricks shy of a load. > > I wouldn't be that surprised, mind you. I didn't have nearly as much > time I wanted to working on that project. > >> First, event_trigger_cache_is_stalled is mis-named; I think you mean >> "stale", not "stalled". Second, instead of setting that flag and then > > Stale. Right. Edited. > >> rebuilding the cache when you see the flag set, how about just blowing >> away the cache contents whenever you would have set the flag? That > > I've been doing that at first, but that meant several full rebuilds in a > row in the regression tests, which are adding new event triggers then > using them. I though lazily maintaining the cache would be better. Well, AFAICS, you're still doing full rebuilds whenever something changes; you're just keeping the (useless, dead) cache around until you decide to rebuild it. Might as well free the memory once you know that the next access will rebuild it anyway, and for a bonus it saves you a flag. > I'm not that fond of psql commands, but I don't think it's going to fly > not to have one for event triggers. I could buy \dy. Yeah, I think people are going to want to have one. I really despise the \d<whatever> syntax, but it's not 100% clear what a better one would look like. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: >> It's not before/after anymore, but rather addon/replace if you will. I >> kept the INSTEAD OF keyword for the replace semantics, that you've been >> asking me to keep IIRC, with security policy plugins as a use case. >> >> Now we can of course keep those semantics and embed them in the event >> name we provide users, I though that maybe a documentation matrix of >> which event support which "mode" would be cleaner to document. We might >> as well find a clean way to implement both modes for most of the >> commands, I don't know yet. >> >> So, are you sure you want to embed that part of the event trigger >> semantics in the event name itself? > > Yeah, pretty sure. I think that for regular triggers, BEFORE, AFTER, > and INSTEAD-OF are the firing-point specification. But even triggers > will have more than three firing points, probably eventually quite a > lot more. So we need something more flexible. But we don't need that > more flexible thing AND ALSO the before/after/instead-of > specification, which I think in most cases won't be meaningful anyway. > It happens to be somewhat sensible for this initial firing point, but > I think for most of them there will be just one place, and in many > cases it will be neither before, nor after, nor instead-of. I agree with using the event name as a the specification for the firing point, and that we should prefer documenting the ordering of those rather than offering a fuzzy idea of BEFORE and AFTER steps in there. The AFTER step is better expressed as BEFORE the next one. Now, I still think there's an important discrepancy between adding a new behaviour that adds-up to whatever the backend currently implements and providing a replacement behaviour with a user defined function that gets called instead of the backend code. And I still don't think that the event name should be carrying alone that semantic discrepancy. Now, I also want the patch to get in, so I won't insist very much if I'm alone in that position. Anyone else interested enough to chime in? The user visible difference would be between those variants: create event trigger foo at 'before_security_check' ... create event trigger foo at 'replace_security_check' ... create event trigger foo before 'security_check' ... create event trigger foo instead of 'security_check' ... Note that in this version the INSTEAD OF variant is not supported, we only intend to offer it in some very narrow cases, or at least that is my understanding. > The issue is that the size of the parser tables grow with the square > of the number of states. This will introduce lots of new states that > we don't really need; and every new kind of event trigger that we want > to add will introduce more. It's a little sad not being able to reuse command tag keywords, but it's even more sad to impact the rest of the query parsing. IIRC you had some performance test patch with a split of the main parser into queries and dml on the one hand, and utility commands on the other hand. Would that help here? (I mean more as a general solution against that bloat problem than for this very patch here). I prefer the solution of using 'ALTER TABLE' rather than ALTER TABLE, even if code wise we're not gaining anything in complexity: the parser bloat gets replaced by a big series of if branches. Of course you only exercise it when you need to. I will change that for next patch. >>> 3. The event trigger cache seems to be a few bricks shy of a load. > > Well, AFAICS, you're still doing full rebuilds whenever something > changes; you're just keeping the (useless, dead) cache around until > you decide to rebuild it. Might as well free the memory once you know > that the next access will rebuild it anyway, and for a bonus it saves > you a flag. I'm just done rewriting the cache management with a catalog cache for event triggers and a Syscache Callback that calls into a new module called src/backend/utils/cache/evtcache.c that mimics attoptcache.c. No more cache stale variable. And a proper composite hash key. I still have some more work here before being able to send a new release of the patch, as I said I won't have enough time to make that happen until within next week. The git repository is updated, though. https://github.com/dimitri/postgres/tree/evt_trig_v1 https://github.com/dimitri/postgres/compare/913091de51...861eb038d0 Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Hi, Here's an early revision 2 of the patch, not yet ready for commit, so including the PL stuff still. What's missing is mainly a cache reference leak to fix at DROP EVENT TRIGGER, but I failed to spot where it comes from. As I fixed about all the other comments I though I might as well send in the new version of the patch to get to another round of review. Robert Haas <robertmhaas@gmail.com> writes: > 1. I still think we ought to get rid of the notion of BEFORE or AFTER > (i.e. pg_event_trigger.evttype) and just make that detail part of the > event name (e.g. pg_event_trigger.evtevent). Many easily forseeable So, agreed on before/after, not on INSTEAD OF. No change in the patch, still discussing that point. > 2. I think it's important to be able to add new types of event > triggers without creating excessive parser bloat. I think it's Fixed in the attached, I believe. > 3. The event trigger cache seems to be a few bricks shy of a load. Fixed in the attached, including cache invalidation registered as a system cache callback. > 4. The documentation doesn't build. Fixed, I mainly managed to forget adding the new files. > 5. In terms of a psql command, I think that \dev is both not very Switched to \dy and cleaned up. > 6. In objectaddress.c, I think that get_object_address_event_trigger > should be eliminated in favor of an additional case in > get_object_address_unqualified. Fixed in the attached. > 7. There are many references to command triggers that still need to be > cleaned up. All fixed. > 8. The re-addition of node tags like T_RemoveFuncStmt seems to be a > merging error. Changing \dc so that it rejects \dcrap appears to be a > leftover from when the command was \dcT. In one place in the docs you > have 'avent' for 'event'. In event_trigger.c, you have #ifdef > UNDEFINED; project standard is #ifdef NOT_USED (or else you can remove > the code). All fixed. > 9. The regression tests seem to now be testing some features that > don't exist any more, and might need some rethinking to make what they > do match the scope of this patch. Actually those tests helped me spot some strange things when cleaning up the cache key, and only some commands would fail. So I'm in favor of keeping it all for now. > 10. I suggest separating out the support for other PLs (Python, Tcl) > and submitting that as a later patch, since I'm unqualified to commit > it (and I'm hoping to get the rest of this committed). The PL/TCL > stuff also contains residual references to the command-trigger > notation which should be cleaned up before resubmitting. That's for next turn. > There's probably more, but I'm all reviewed out for right now. > Hopefully that's enough to get you started. I think this is heading > in a good direction, even though there's still a good bit of work left > to do. So, let's see about that remaining bit of work :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Attachment
On Sun, Jun 24, 2012 at 5:46 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Here's an early revision 2 of the patch, not yet ready for commit, so > including the PL stuff still. What's missing is mainly a cache reference > leak to fix at DROP EVENT TRIGGER, but I failed to spot where it comes > from. > > As I fixed about all the other comments I though I might as well send in > the new version of the patch to get to another round of review. Some of the pg_dump hunks fail to apply for me; I guess this needs to be remerged. Spurious hunk: - query_hosts + query_hosts Spurious hunk: - * need deep copies, so they should be copied via list_copy() + * need deep copies, so they should be copied via list_copy(const ) There are a few remaining references to EVTG_FIRED_BEFORE / EVTG_FIRED_INSTEAD_OF which should be cleaned up. I suggest writing the grammar as CREATE EVENT TRIGGER name ON event_name EXECUTE... On a related note, evttype is still mentioned in the documentation, also.And CreateEventTrigStmt has a char timing field thatcan go. Incidentally, why do we not support an argument list here, as with ordinary triggers? I think the below hunk should get removed. Or else it should be surrounded by #ifdef rather than commented out. + /* + * Useful to raise WARNINGs for any DDL command not yet supported. + * + elog(WARNING, "Command Tag: %s", tag); + elog(WARNING, "Note to String: %s", nodeToString(parsetree)); + */ Typo: + * where we probide object name and namespace separately and still want nice Typo: + * the easier code makes up fot it big time. psql is now very confused about what the slash command is. It's actually implemented as \dy, but the comments say \dev in several places, and slashUsage() documents it as \dct. InitializeEvtTriggerCommandCache still needs work. First, it ensures that CacheMemoryContext is non-NULL... after it's already stored the value of CacheMemoryContext into the HASHCTL. Obviously, the order there needs to be fixed. Also, I think you need to split this into two functions. There should be one function that gets called just once at startup time to CacheRegisterSyscacheCallback(), because we don't want to redo that every time the cache gets blown away. Then there should be another function that gets called when, and only when, someone needs to use the cache. That should create and populate the hash table. I think that all event triggers should fire in exactly alphabetical order by name. Now that we have the concept of multiple firing points, there's really no reason to distinguish between any triggers and triggers on specific commands. Eliminating that distinction will eliminate a bunch of complexity while making things *more* flexible for the user, who will be able to get a command trigger for a specific command to fire either before or after an ANY command trigger he has also defined rather than having the system enforce policy on him. plperl_validator still makes reference to CMDTRIGGER. Let's simplify this down to an enum with just one element, since that's all we're going to support for starters, and remove the related parser support for everything but command_start: +typedef enum TrigEvent +{ + E_CommandStart = 1, + E_SecurityCheck = 10, + E_ConsistencyCheck = 15, + E_NameLookup = 20, + E_CommandEnd = 51 +} TrigEvent; I think we should similarly rip out the vestigial support for supplying schema name, object name, and object ID. That's not going to be possible at command_start anyway; we can include that stuff in the patch that adds a later firing point (command end, or consistency check, perhaps). I think standard_ProcessUtility should have a shortcut that bypasses setting up the event context if there are no event triggers at all in the entire system, so that we do no extra work unless there's some potential benefit. It seems to me that we probably need a CommandCounterIncrement() after firing each event trigger, unless that's happening under the covers somewhere and I'm missing it. A good test case would be to have two event triggers. Have the first one insert a row into the table and check that the second one can see the row there. I suggest adding something like this to the regression tests. Instead of having giant switch statements match E_WhatEver tags to strings and visca versa, I think it would be much better to create an array someplace that contains all the mappings. Then you can write a convenience function that scans the array for a string and returns the E_WhatEver tag, and another convenience function that scans the array for an E_WhatEver tag and returns the corresponding string. Then all the knowledge of how those things map onto each other is in ONE place, which should make things a whole lot easier in terms of future code maintenance, not to mention minimizing the chances of bugs of oversight in the patch as it stands. :-) The comment changes in type_sanity.sql seem unnecessary. I think you can revert that part. There's also a one-line whitespace change in triggers.sql which can also be removed. With respect to this hunk, it seems to me that the verbiage is inaccurate. It will forbid the execution of any command for which event triggers are supported, whether they happen to be DDL commands or not. Also, a hypothetical DDL command that can't tolerate event triggers wouldn't be forbidden. I'm not sure exactly what the wording should look like here, but we should strive to be as exact as possible. + <para> + Forbids the execution of any DDL command: + +<programlisting> +CREATE OR REPLACE FUNCTION abort_any_command() + RETURNS event_trigger + LANGUAGE plpgsql + AS $$ +BEGIN + RAISE EXCEPTION 'command % is disabled', tg_tag; +END; +$$; format_type_be_without_namespace is unused in this patch; please remove it for now. get_event_trigger_oid looks like it might be the source of your syscache reference leak. It would be a good idea to change the coding pattern of this function to match, say, get_foreign_server_oid. That would fix the leak and be more consistent. I'm all reviewed out; hope that's enough for now. I hope you can get this cleaned up some more soon, as we are starting to run out of CommitFest and I would really like to get this in. Of course if we miss the CommitFest deadline I am happy to work on it in July and August but the sooner we get it done the better. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Jun 28, 2012 at 9:46 AM, Robert Haas <robertmhaas@gmail.com> wrote: > [ review ] Also: ../../../src/include/catalog/pg_event_trigger.h:34: error: expected specifier-qualifier-list before ‘int2’ This needs to be changed to int16 as a result of commit b8b2e3b2deeaab19715af063fc009b7c230b2336. alter.c:73: warning: implicit declaration of function ‘RenameEventTrigger’ That file needs to include commands/event_trigger.h. Please spell out DO_EVTTRIGGER as DO_EVENT_TRIGGER. Another random warning: plpy_main.c:195: warning: ‘retval’ may be used uninitialized in this function -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi, Here's an answer to your review (thanks!), with no patch attached yet even if I've been cleanup up most of what you reported. Incremental diff is available for browsing here: https://github.com/dimitri/postgres/compare/f99e8d93b7...8da156dc70 Robert Haas <robertmhaas@gmail.com> writes: > Some of the pg_dump hunks fail to apply for me; I guess this needs to > be remerged. Done. > There are a few remaining references to EVTG_FIRED_BEFORE / > EVTG_FIRED_INSTEAD_OF which should be cleaned up. I suggest writing > the grammar as CREATE EVENT TRIGGER name ON event_name EXECUTE... On > a related note, evttype is still mentioned in the documentation, also. > And CreateEventTrigStmt has a char timing field that can go. I didn't get the memo that we had made a decision here :) That said it will be possible to change our mind and introduce that instead of syntax if that's necessary later in the cycle, so I'll go clean up for the first commit. > Incidentally, why do we not support an argument list here, as with > ordinary triggers? Left for a follow-up patch. Do you want it already in this one? > I think the below hunk should get removed. Or else it should be > surrounded by #ifdef rather than commented out. Removed. > Typo: Fixed. > psql is now very confused about what the slash command is. It's > actually implemented as \dy, but the comments say \dev in several > places, and slashUsage() documents it as \dct. Fixed. > InitializeEvtTriggerCommandCache still needs work. First, it ensures > that CacheMemoryContext is non-NULL... after it's already stored the > value of CacheMemoryContext into the HASHCTL. Obviously, the order > there needs to be fixed. Also, I think you need to split this into > two functions. There should be one function that gets called just > once at startup time to CacheRegisterSyscacheCallback(), because we > don't want to redo that every time the cache gets blown away. Then > there should be another function that gets called when, and only when, > someone needs to use the cache. That should create and populate the > hash table. CacheMemoryContext ordering fixed, I wrongly copied from attoptcache here even when the memory management is not really done the same. The only place I can see where to init the Event Trigger Cache is in InitPostgres() itself (in src/backend/utils/init/postinit.c), done now. > I think that all event triggers should fire in exactly alphabetical > order by name. Now that we have the concept of multiple firing > points, there's really no reason to distinguish between any triggers > and triggers on specific commands. Eliminating that distinction will > eliminate a bunch of complexity while making things *more* flexible > for the user, who will be able to get a command trigger for a specific > command to fire either before or after an ANY command trigger he has > also defined rather than having the system enforce policy on him. Internally we still need to keep the distinction to be able to fire ANY triggers on otherwise non supported commands. I agree that we should not concern our users with that, though. Done. > plperl_validator still makes reference to CMDTRIGGER. In a comment, fixed. > Let's simplify this down to an enum with just one element, since > that's all we're going to support for starters, and remove the related > parser support for everything but command_start: > > +typedef enum TrigEvent > +{ > + E_CommandStart = 1, > + E_SecurityCheck = 10, > + E_ConsistencyCheck = 15, > + E_NameLookup = 20, > + E_CommandEnd = 51 > +} TrigEvent; I wanted to see where the current choice would lead us, I agree that we can remove that level of details for now, that also allows not to pick next event integration point names already :) Done. > I think we should similarly rip out the vestigial support for > supplying schema name, object name, and object ID. That's not going > to be possible at command_start anyway; we can include that stuff in > the patch that adds a later firing point (command end, or consistency > check, perhaps). We got this part of the API fixed last round, so I would prefer not to dumb it down in this first patch. We know that we want to add exactly that specification later, don't we? > I think standard_ProcessUtility should have a shortcut that bypasses > setting up the event context if there are no event triggers at all in > the entire system, so that we do no extra work unless there's some > potential benefit. Setting up the event context is a very lightweight operation, and there's no way to know if the command is going to fire an event trigger without having done exactly what the InitEventContext() is doing. Maybe what we need to do here is rename it? Another problem with short cutting it aggressively is what happens if a new event trigger is created while the command is in flight. We have yet to discuss about that (as we only support a single timing point), but doing it the way you propose forecloses any other choice than "repeatable read" equivalent where we might want to have some "read commited" behaviour, that is fire the new triggers if they appear while the command is being run. > It seems to me that we probably need a CommandCounterIncrement() after > firing each event trigger, unless that's happening under the covers > somewhere and I'm missing it. A good test case would be to have two > event triggers. Have the first one insert a row into the table and > check that the second one can see the row there. I suggest adding > something like this to the regression tests. Added CommandCounterIncrement() and a new regression test. That fails for now, I'll have to get back to that later. > Instead of having giant switch statements match E_WhatEver tags to > strings and visca versa, I think it would be much better to create an > array someplace that contains all the mappings. Then you can write a > convenience function that scans the array for a string and returns the > E_WhatEver tag, and another convenience function that scans the array > for an E_WhatEver tag and returns the corresponding string. Then all > the knowledge of how those things map onto each other is in ONE place, > which should make things a whole lot easier in terms of future code > maintenance, not to mention minimizing the chances of bugs of > oversight in the patch as it stands. :-) That means that the Enum definition can not jump from a number to another non consecutive one, or that we have a very sparse array and some way to fill it unknown to me. As those numbers are going to end up on disk, we can not ever change them. I though it would be better to mimic what we do with the NodeTag definition here. > The comment changes in type_sanity.sql seem unnecessary. I think you > can revert that part. There's also a one-line whitespace change in > triggers.sql which can also be removed. Done. > With respect to this hunk, it seems to me that the verbiage is > inaccurate. It will forbid the execution of any command for which > event triggers are supported, whether they happen to be DDL commands > or not. Also, a hypothetical DDL command that can't tolerate event > triggers wouldn't be forbidden. I'm not sure exactly what the wording > should look like here, but we should strive to be as exact as > possible. > > + <para> > + Forbids the execution of any DDL command: > + > +<programlisting> > +CREATE OR REPLACE FUNCTION abort_any_command() > + RETURNS event_trigger > + LANGUAGE plpgsql > + AS $$ > +BEGIN > + RAISE EXCEPTION 'command % is disabled', tg_tag; > +END; > +$$; Yeah, will rework that text. Not this late though, needs more brainpower. > format_type_be_without_namespace is unused in this patch; please > remove it for now. Done. > get_event_trigger_oid looks like it might be the source of your > syscache reference leak. It would be a good idea to change the coding > pattern of this function to match, say, get_foreign_server_oid. That > would fix the leak and be more consistent. Fixed meanwhile, that was it, thanks. > I'm all reviewed out; hope that's enough for now. I hope you can get > this cleaned up some more soon, as we are starting to run out of > CommitFest and I would really like to get this in. Of course if we > miss the CommitFest deadline I am happy to work on it in July and > August but the sooner we get it done the better. So, I've begun working on this already, and I intend to spend more time on PostgreSQL development from next week on. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Robert Haas <robertmhaas@gmail.com> writes: > ../../../src/include/catalog/pg_event_trigger.h:34: error: expected > specifier-qualifier-list before ‘int2’ > > This needs to be changed to int16 as a result of commit > b8b2e3b2deeaab19715af063fc009b7c230b2336. Done as part of the previous work. > alter.c:73: warning: implicit declaration of function ‘RenameEventTrigger’ Done now. > Please spell out DO_EVTTRIGGER as DO_EVENT_TRIGGER. Also done as part of the previous email. > Another random warning: > plpy_main.c:195: warning: ‘retval’ may be used uninitialized in this function Will do a whole warning check pass later. Can you give me your local Makefile trick to turn them into hard errors again please? :) Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes: > https://github.com/dimitri/postgres/compare/f99e8d93b7...8da156dc70 The revised incremental diff is here: https://github.com/dimitri/postgres/compare/f99e8d93b7...74bbeda8 And a new revision of the patch is attached to this email. We have some pending questions, depending on the answers it could be ready for commit. > Robert Haas <robertmhaas@gmail.com> writes: >> There are a few remaining references to EVTG_FIRED_BEFORE / >> EVTG_FIRED_INSTEAD_OF which should be cleaned up. I suggest writing >> the grammar as CREATE EVENT TRIGGER name ON event_name EXECUTE... On >> a related note, evttype is still mentioned in the documentation, also. >> And CreateEventTrigStmt has a char timing field that can go. > > I didn't get the memo that we had made a decision here :) That said it > will be possible to change our mind and introduce that instead of syntax > if that's necessary later in the cycle, so I'll go clean up for the > first commit. Done now. >> Incidentally, why do we not support an argument list here, as with >> ordinary triggers? > > Left for a follow-up patch. Do you want it already in this one? Didn't do that, I though cleaning up all the points here would go first, please tell me if you want that in the first commit. >> I think we should similarly rip out the vestigial support for >> supplying schema name, object name, and object ID. That's not going >> to be possible at command_start anyway; we can include that stuff in >> the patch that adds a later firing point (command end, or consistency >> check, perhaps). > > We got this part of the API fixed last round, so I would prefer not to > dumb it down in this first patch. We know that we want to add exactly > that specification later, don't we? Didn't change anything here. >> I think standard_ProcessUtility should have a shortcut that bypasses >> setting up the event context if there are no event triggers at all in >> the entire system, so that we do no extra work unless there's some >> potential benefit. > > Setting up the event context is a very lightweight operation, and > there's no way to know if the command is going to fire an event trigger > without having done exactly what the InitEventContext() is doing. Maybe > what we need to do here is rename it? > > Another problem with short cutting it aggressively is what happens if a > new event trigger is created while the command is in flight. We have yet > to discuss about that (as we only support a single timing point), but > doing it the way you propose forecloses any other choice than > "repeatable read" equivalent where we might want to have some "read > commited" behaviour, that is fire the new triggers if they appear while > the command is being run. Same, don't see a way to shortcut. > Added CommandCounterIncrement() and a new regression test. That fails > for now, I'll have to get back to that later. Of course I just needed to pay attention to the new ordering rules :) >> Instead of having giant switch statements match E_WhatEver tags to >> strings and visca versa, I think it would be much better to create an >> array someplace that contains all the mappings. Then you can write a >> convenience function that scans the array for a string and returns the >> E_WhatEver tag, and another convenience function that scans the array >> for an E_WhatEver tag and returns the corresponding string. Then all >> the knowledge of how those things map onto each other is in ONE place, >> which should make things a whole lot easier in terms of future code >> maintenance, not to mention minimizing the chances of bugs of >> oversight in the patch as it stands. :-) > > That means that the Enum definition can not jump from a number to > another non consecutive one, or that we have a very sparse array and > some way to fill it unknown to me. As those numbers are going to end up > on disk, we can not ever change them. I though it would be better to > mimic what we do with the NodeTag definition here. I'd like some more input here. >> + Forbids the execution of any DDL command: Worked out something. Might need some more input. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Attachment
On 2 July 2012 15:11, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes: >> https://github.com/dimitri/postgres/compare/f99e8d93b7...8da156dc70 > > The revised incremental diff is here: > > https://github.com/dimitri/postgres/compare/f99e8d93b7...74bbeda8 > > And a new revision of the patch is attached to this email. We have some > pending questions, depending on the answers it could be ready for > commit. > >> Robert Haas <robertmhaas@gmail.com> writes: >>> There are a few remaining references to EVTG_FIRED_BEFORE / >>> EVTG_FIRED_INSTEAD_OF which should be cleaned up. I suggest writing >>> the grammar as CREATE EVENT TRIGGER name ON event_name EXECUTE... On >>> a related note, evttype is still mentioned in the documentation, also. >>> And CreateEventTrigStmt has a char timing field that can go. >> >> I didn't get the memo that we had made a decision here :) That said it >> will be possible to change our mind and introduce that instead of syntax >> if that's necessary later in the cycle, so I'll go clean up for the >> first commit. > > Done now. > >>> Incidentally, why do we not support an argument list here, as with >>> ordinary triggers? >> >> Left for a follow-up patch. Do you want it already in this one? > > Didn't do that, I though cleaning up all the points here would go first, > please tell me if you want that in the first commit. > >>> I think we should similarly rip out the vestigial support for >>> supplying schema name, object name, and object ID. That's not going >>> to be possible at command_start anyway; we can include that stuff in >>> the patch that adds a later firing point (command end, or consistency >>> check, perhaps). >> >> We got this part of the API fixed last round, so I would prefer not to >> dumb it down in this first patch. We know that we want to add exactly >> that specification later, don't we? > > Didn't change anything here. > >>> I think standard_ProcessUtility should have a shortcut that bypasses >>> setting up the event context if there are no event triggers at all in >>> the entire system, so that we do no extra work unless there's some >>> potential benefit. >> >> Setting up the event context is a very lightweight operation, and >> there's no way to know if the command is going to fire an event trigger >> without having done exactly what the InitEventContext() is doing. Maybe >> what we need to do here is rename it? >> >> Another problem with short cutting it aggressively is what happens if a >> new event trigger is created while the command is in flight. We have yet >> to discuss about that (as we only support a single timing point), but >> doing it the way you propose forecloses any other choice than >> "repeatable read" equivalent where we might want to have some "read >> commited" behaviour, that is fire the new triggers if they appear while >> the command is being run. > > Same, don't see a way to shortcut. > >> Added CommandCounterIncrement() and a new regression test. That fails >> for now, I'll have to get back to that later. > > Of course I just needed to pay attention to the new ordering rules :) > >>> Instead of having giant switch statements match E_WhatEver tags to >>> strings and visca versa, I think it would be much better to create an >>> array someplace that contains all the mappings. Then you can write a >>> convenience function that scans the array for a string and returns the >>> E_WhatEver tag, and another convenience function that scans the array >>> for an E_WhatEver tag and returns the corresponding string. Then all >>> the knowledge of how those things map onto each other is in ONE place, >>> which should make things a whole lot easier in terms of future code >>> maintenance, not to mention minimizing the chances of bugs of >>> oversight in the patch as it stands. :-) >> >> That means that the Enum definition can not jump from a number to >> another non consecutive one, or that we have a very sparse array and >> some way to fill it unknown to me. As those numbers are going to end up >> on disk, we can not ever change them. I though it would be better to >> mimic what we do with the NodeTag definition here. > > I'd like some more input here. > >>> + Forbids the execution of any DDL command: > > Worked out something. Might need some more input. I'm not clear on this paragraph: Triggers on <literal>ANY</literal> command support more commands than just this list, and will only provide the <literal>command tag</literal> argument as <literal>NOT NULL</literal>. Do you actually mean it will return NULL? I also attach various typo/grammar fixes. -- Thom
Attachment
On Fri, Jun 29, 2012 at 5:38 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Will do a whole warning check pass later. Can you give me your local > Makefile trick to turn them into hard errors again please? :) echo COPT=-Werror > src/Makefile.custom Your latest patch contains a warning about using a variable uninitialized that seems to indicate that you didn't test this very carefully: in get_event_triggers, current_any_name is set but not used. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Jul 2, 2012 at 1:56 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Jun 29, 2012 at 5:38 PM, Dimitri Fontaine > <dimitri@2ndquadrant.fr> wrote: >> Will do a whole warning check pass later. Can you give me your local >> Makefile trick to turn them into hard errors again please? :) > > echo COPT=-Werror > src/Makefile.custom > > Your latest patch contains a warning about using a variable > uninitialized that seems to indicate that you didn't test this very > carefully: in get_event_triggers, current_any_name is set but not > used. Make that used but not set. Anyhow, it's broken. :-( -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Jul 2, 2012 at 10:11 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > [ new patch ] I would really like to start committing parts of this, but there are still a lot of unfinished loose ends in this code. The one that is most immediately bothering me is related to the syntax you've implemented, which is: CREATE EVENT TRIGGER name ON event_name WHEN event_trigger_variable IN (trigger_command_list) EXECUTE PROCEDURE func_name () I've been beating on the issue of generic syntax for a while now, and that production looks sort of generic, but there are problems when you dig into it. trigger_command_list is defined in a way that means that it can never be anything other than a list of tags, which means that event_trigger_variable can never really be anything other than TAG. Oops. I think you need to remove the validation of trigger_command_list from the parser and do that afterwards. In other words, the parser should be happy if event_trigger_variable is an ColId and trigger_command_list is a bunch of comma-separated SConst items. Then, after parsing is complete, the code that actually implements CREATE EVENT TRIGGER should look at event_trigger_variable and decide whether it has a legal value and, if so, whether the associated trigger_command_list is compatible with it. IOW, the validation should be happening in CreateEventTrigger rather than gram.y. There is a related problem in terms of the system catalog representation which I should have noted previously, but did not. The system catalog representation has no place to store event_trigger_variable, and the column that will store trigger_command_list is called evttags, again implying rather strongly that these are command tags we're dealing with rather than anything else. Obviously this could be revised later, but it will be much easier to add new functionality in subsequent patches if we get the catalog infrastructure right - or close to right - on the first try, so it seems worth spending a little bit of time on it. The two questions that occur to me are: 1. Do we imagine a situation where a given event_name would allow more than one choice of event_trigger_variable? If so, then pg_event_trigger needs a place to store the event_trigger_variable. If not, then it's a noise word, and we should consider simplifying the syntax to something like: CREATE EVENT TRIGGER name ON event_name (trigger_command_list) EXECUTE PROCEDURE func_name () or maybe CREATE EVENT TRIGGER name ON event_name FOR (trigger_command_list) EXECUTE PROCEDURE func_name () or perhaps CREATE EVENT TRIGGER name ON event_name USING (trigger_command_list) EXECUTE PROCEDURE func_name () 2. On a related point, do we anticipate that we might eventually want to allow filtering by more than one event_trigger_variable in the same trigger? That is, might we want to do something like this: CREATE EVENT TRIGGER something ON somevent WHEN thingy IN ('item1', 'item2') AND otherthingy IN ('foo', 'bar') EXECUTE PROCEDURE func_name () If so, then how do we imagine that getting stored in the catalog? Please let me know your thoughts. Thanks, -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > CREATE EVENT TRIGGER name ON event_name WHEN event_trigger_variable IN > (trigger_command_list) EXECUTE PROCEDURE func_name () [...] > 1. Do we imagine a situation where a given event_name would allow more > than one choice of event_trigger_variable? If so, then The only use I think is really interesting here is matching command tags for either the main event or a sub-event of some sort. We decided not to include any sub-commands reasoning in that first patch, that's why the syntax is way more generic than the implementation. You could then install an event trigger for a CREATE SEQUENCE that happens as part of a CREATE TABLE, or a DROP COLUMN that happens as part of a DROP TYPE, for examples. So yes the event_trigger_variable is made to only host a tag and the associate catalog entry make that very clear. > pg_event_trigger needs a place to store the event_trigger_variable. My thinking was to add another hard-coded list of tags for the other variable, that is have a flexible syntax (no WHEN clause, a WHEN clause with only main tag matching, a WHEN clause with both parent/main tag matching, or only parent) that only uses two hard coded variables both being tags matched against a static list. The reason why it's not all in the current patch is that we decided together that we want to revisit the sub-command semantics once we have a basic framework in place. It's already leaky though. > If not, then it's a noise word, and we should consider simplifying the > syntax to something like: > CREATE EVENT TRIGGER name ON event_name FOR (trigger_command_list) > EXECUTE PROCEDURE func_name () That would have my preference, in case you would rather have a simplified first patch, that is without any provision to expand on the tag matching facility. Of course we won't be able to keep that syntax as soon as we decide how to handle sub commands, which makes that decision a lot less useful that it seems to be, in my mind. > 2. On a related point, do we anticipate that we might eventually want > to allow filtering by more than one event_trigger_variable in the same > trigger? That is, might we want to do something like this: > > CREATE EVENT TRIGGER something ON somevent WHEN thingy IN ('item1', > 'item2') AND otherthingy IN ('foo', 'bar') EXECUTE PROCEDURE func_name > () That's what I want to be able to do yes, in another step. The only two variables I think about would be named "tag" and "toplevel", or something equivalent ("main", "host", "user", etc). > If so, then how do we imagine that getting stored in the catalog? We will have to extend the catalog when we attack sub command handling, at least I believe we will. Hence the current proposed catalog is not yet finished. We could also already decide about sub command handling if you think that's the only remaining thing to do in this patch; so that it's easier down the road. At the expense of taking some more time right now. Your call, I have the round tuits :) Regards, -- Dimitri Fontaine 06 63 07 10 78 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, Jul 2, 2012 at 4:10 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >> 2. On a related point, do we anticipate that we might eventually want >> to allow filtering by more than one event_trigger_variable in the same >> trigger? That is, might we want to do something like this: >> >> CREATE EVENT TRIGGER something ON somevent WHEN thingy IN ('item1', >> 'item2') AND otherthingy IN ('foo', 'bar') EXECUTE PROCEDURE func_name >> () > > That's what I want to be able to do yes, in another step. The only two > variables I think about would be named "tag" and "toplevel", or > something equivalent ("main", "host", "user", etc). > >> If so, then how do we imagine that getting stored in the catalog? > > We will have to extend the catalog when we attack sub command handling, > at least I believe we will. Hence the current proposed catalog is not > yet finished. We could also already decide about sub command handling if > you think that's the only remaining thing to do in this patch; so that > it's easier down the road. At the expense of taking some more time right > now. Your call, I have the round tuits :) I'd really like to not have to change the catalog again in every patch, because if we do that then we are just saying we're going to rewrite this patch completely every time we want to add a new feature, which kind of defeats the purpose IMHO. So let's try to hammer something out now. The obvious thing that occurs to me is to have a column in the catalog that is a 2-D array of text, with the first element of each array being something like "tag" or "subtag" (i.e. event_trigger_variable) and the remaining array elements being a list of legal values. That is: WHEN thingy IN thingy IN ('item1', 'item2') AND otherthingy IN ('foo', 'bar') would be represented as this array: {{thingy,item1,item2},{otherthingy,foo,bar}} This would allow us to support 0 or more WHERE clauses, each of the form "thing IN (value1, value2, ...)". Is that general enough for every use case that you can foresee, or do we need more? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > I'd really like to not have to change the catalog again in every > patch, because if we do that then we are just saying we're going to > rewrite this patch completely every time we want to add a new feature, > which kind of defeats the purpose IMHO. Fair enough. > So let's try to hammer something out now. The obvious thing that > occurs to me is to have a column in the catalog that is a 2-D array of > text, with the first element of each array being something like "tag" > or "subtag" (i.e. event_trigger_variable) and the remaining array > elements being a list of legal values. That is: > > WHEN thingy IN thingy IN ('item1', 'item2') AND otherthingy IN ('foo', 'bar') > > would be represented as this array: > > {{thingy,item1,item2},{otherthingy,foo,bar}} > > This would allow us to support 0 or more WHERE clauses, each of the > form "thing IN (value1, value2, ...)". Is that general enough for > every use case that you can foresee, or do we need more? Given what I foresee, simply having another columns in there named evtstags with the exact same content as evttags would be the simplest and most natural implementation, really. I don't foresee more generic needs here, unless you can convince me that we need both a. a full stack of arbitrarily nested commands and b. a way to match and target any level of that stack. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, Jul 2, 2012 at 4:53 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> So let's try to hammer something out now. The obvious thing that >> occurs to me is to have a column in the catalog that is a 2-D array of >> text, with the first element of each array being something like "tag" >> or "subtag" (i.e. event_trigger_variable) and the remaining array >> elements being a list of legal values. That is: >> >> WHEN thingy IN thingy IN ('item1', 'item2') AND otherthingy IN ('foo', 'bar') >> >> would be represented as this array: >> >> {{thingy,item1,item2},{otherthingy,foo,bar}} >> >> This would allow us to support 0 or more WHERE clauses, each of the >> form "thing IN (value1, value2, ...)". Is that general enough for >> every use case that you can foresee, or do we need more? > > Given what I foresee, simply having another columns in there named > evtstags with the exact same content as evttags would be the simplest > and most natural implementation, really. That seems a lot less general for no particular gain. > I don't foresee more generic needs here, unless you can convince me that > we need both a. a full stack of arbitrarily nested commands and b. a way > to match and target any level of that stack. Well, let's take the example of an ALTER TABLE command. You could want to match on: - the type of object the user mentioned in the command (did he write ALTER TABLE or ALTER VIEW?) - the type of object actually being modified (could differ if he used ALTER TABLE on a view, or visca versa) - the particular ALTER TABLE subcommand in use (e.g. SET STATISTICS) I suspect there are other examples as well. If we pick the 2-D list representation I suggested, or something like it, we can easily accommodate these kinds of filters without having to whack the catalog representation around any further. That seems pretty appealing. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > So let's try to hammer something out now. The obvious thing that > occurs to me is to have a column in the catalog that is a 2-D array of > text, with the first element of each array being something like "tag" > or "subtag" (i.e. event_trigger_variable) and the remaining array > elements being a list of legal values. That is: > WHEN thingy IN thingy IN ('item1', 'item2') AND otherthingy IN ('foo', 'bar') > would be represented as this array: > {{thingy,item1,item2},{otherthingy,foo,bar}} Um, doesn't that require nonrectangular arrays? Or is there some non-obvious reason why the lists of legal values will always be all the same length? regards, tom lane
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes: > I don't foresee more generic needs here, unless you can convince me that > we need both a. a full stack of arbitrarily nested commands and b. a way > to match and target any level of that stack. Um ... isn't the burden of proof the other way around here? That is, what's the argument that we *don't* need that? regards, tom lane
On Mon, Jul 2, 2012 at 6:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> So let's try to hammer something out now. The obvious thing that >> occurs to me is to have a column in the catalog that is a 2-D array of >> text, with the first element of each array being something like "tag" >> or "subtag" (i.e. event_trigger_variable) and the remaining array >> elements being a list of legal values. That is: > >> WHEN thingy IN thingy IN ('item1', 'item2') AND otherthingy IN ('foo', 'bar') > >> would be represented as this array: > >> {{thingy,item1,item2},{otherthingy,foo,bar}} > > Um, doesn't that require nonrectangular arrays? Or is there some > non-obvious reason why the lists of legal values will always be all the > same length? Doh. You're right: I keep forgetting that arrays have to be rectangular. Any suggestions on a sensible way to represent this? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Mon, Jul 2, 2012 at 6:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Um, doesn't that require nonrectangular arrays? > Doh. You're right: I keep forgetting that arrays have to be rectangular. > Any suggestions on a sensible way to represent this? Are there likely to be enough entries that storage efficiency actually matters? If not, we could use a 2xN array of {key,allowed_value} pairs, that is {{thingy,item1},{thingy,item2},{otherthingy,foo},{otherthingy,bar}} Or perhaps push these out into a separate table, along the lines ofoid key allowed_value and use an oidvector to list the selected values in a trigger entry? regards, tom lane
Robert Haas <robertmhaas@gmail.com> writes: >> Given what I foresee, simply having another columns in there named >> evtstags with the exact same content as evttags would be the simplest >> and most natural implementation, really. > > That seems a lot less general for no particular gain. The gain is code, docs and usage simplification. I think going general here is going to confuse every one involved and that we should have two targets in mind: classic use cases that we want to address easily enough in SQL and with some PLpgSQL help, and advanced use cases that are possible to implement in PL/C using the parse tree and the soon to come back rewritten command string. IOW, let's make the simple things simple and the complex one possible. The following is quite long an email where I try to give plenty of examples and to detail the logic I'm working with so that you can easily stab at whichever part you're thinking is not going to fly. >> I don't foresee more generic needs here, unless you can convince me that >> we need both a. a full stack of arbitrarily nested commands and b. a way >> to match and target any level of that stack. > > Well, let's take the example of an ALTER TABLE command. You could > want to match on: > > - the type of object the user mentioned in the command (did he write > ALTER TABLE or ALTER VIEW?) > - the type of object actually being modified (could differ if he used I don't think it's possible to implement that without shaking all the system, after having a look at the following lines from gram.y: ALTER VIEW qualified_name alter_table_cmds{ AlterTableStmt *n = makeNode(AlterTableStmt); So, the way to implement that need from an event trigger is to use the parse tree, and hopefully soon enough the rewritten command string. > ALTER TABLE on a view, or visca versa) > - the particular ALTER TABLE subcommand in use (e.g. SET STATISTICS) Now we can publish that, we would have some events with tag = 'ALTER TABLE' then some others with toplevel = 'ALTER TABLE'tag = 'SET STATISTICS' The same idea would need to get implemented for serial, where the tag is 'CREATE SEQUENCE' and the toplevel tag is 'CREATE TABLE'. That allows to easily install an event trigger that gets called every time a sequence is created, you can then have a look at the toplevel command tag if you need to. CREATE EVENT TRIGGER snitch_seqs ON command_start WHEN tag IN ('CREATE SEQUENCE') EXECUTEPROCEDURE snitch_seqs(); The idea is that the function snitch_seqs has a "magic" variable TG_TOPLEVEL that can be tested and will be set to 'CREATE TABLE' when we're dealing with a SERIAL, in that example. If you want your event trigger to only ever deal with SERIAL, you could install it this way: CREATE EVENT TRIGGER my_serial_trigger ON command_start WHEN toplevel IN ('CREATE TABLE') AND tag IN ('CREATE SEQUENCE') EXECUTE PROCEDURE handle_serial_trigger(); Now let's see about getting more generic than that. We also can get tag = 'CREATE INDEX' and toplevel = 'ALTER TABLE' when adding a primary key for example. That's an example that can lead us to more than 2 levels of nested tags, which I would want to avoid. The stack here would look like: 1. ALTER TABLE2. ADD PRIMARY KEY3. CREATE INDEX I think only having 1 and 3 is enough, for more details the command string and the parse tree are available. In the main use case of replication, you mostly just want to replicate the command string. You might want to apply some transformation rules to it (table names, cope with a different target schema, etc) but typically those rules are to be run in the subscriber system, not on the provider (picture a federating system where each provider uses the same schema, that gets mapped to a schema per provider on the subscriber). The other problem with the stack of tags is matching them. I don't see that it helps writing event triggers much. In the previous example, if you want an event trigger that fires on each ALTER TABLE, you don't know which level of the stack to target. Either you have to target the current tag or the toplevel tag or something in between. We could easily have the following tag stack: 1. CREATE SCHEMA2. ALTER TABLE3. ADD PRIMARY KEY4. CREATE INDEX So now we need a way to target any entry in the stack and a way to represent the stack in every PL language we have, and an easy way to analyze the stack. For PLpgSQL I guess that means we want to expose this tag stack as a TABLE, and the complexity just went off the table. My view point is that for any practical case I can think about what we care about is the current command being run, and given how PostgreSQL is made today that means handling one level of sub commands. That addresses ALTER TABLE and also DROP CASCADE. I don't think adding-in an ALTER TABLE that never happened in the middle of those two elements is going to make life easier for anybody involved, quite the contrary: 1. DROP TYPE2. DROP COLUMN Users that need that level of detail for their processing are welcome to code their Event Trigger in PL/C and analyze the parse tree. We can call that advanced analysis. > I suspect there are other examples as well. If we pick the 2-D list > representation I suggested, or something like it, we can easily > accommodate these kinds of filters without having to whack the catalog > representation around any further. That seems pretty appealing. The generic approach leads us to invent a stack of tags and (I suspect) a DSL for tag matching where you can express at least those different things: - this tag is found in the stack (tag <@ stack)- this other tag is found higher in the stack- this other tag is found justone level higher in the stack- this other tag is found at least 2 levels higher in the stack- this third tag is foundlower in the stack- this third tag is found just one level lower in the stack- and maybe some more Those semantics are going to be needed, either in the event trigger definition itself, or in the event trigger code. We could expose a stack of tags and ditch the WHEN clause here, but we still have to be able to implement the filtering in PLpgSQL for simple cases. If we're not able to express such detailed semantics I don't think we're servicing users by making things way more complex for them to use. The drawback is that we will have to make choices as to which tag we expose exactly, remembering that all the details are to be found in the parse tree and the rewritten command string. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Mon, Jul 2, 2012 at 11:25 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Mon, Jul 2, 2012 at 6:59 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Um, doesn't that require nonrectangular arrays? > >> Doh. You're right: I keep forgetting that arrays have to be rectangular. > >> Any suggestions on a sensible way to represent this? > > Are there likely to be enough entries that storage efficiency actually > matters? If not, we could use a 2xN array of {key,allowed_value} pairs, > that is > > {{thingy,item1},{thingy,item2},{otherthingy,foo},{otherthingy,bar}} > > Or perhaps push these out into a separate table, along the lines > of > oid key allowed_value > and use an oidvector to list the selected values in a trigger entry? It seems likely that there will fairly commonly be many allowed values per key. However, the universe of legal keys will be quite small, probably a total of 2-4. So maybe the best representation is to have an a separate column for each key and store the array of legal values in it. That's more or less what Dimitri already has in his latest patch, except that after looking it over I'm inclined to think that we'd be better off storing the keys as text and translating to internal ID numbers when we read and cache the table, rather than storing the ID numbers in the table. That will make the catalog contents easier to read, and more importantly will mean that a change to the internal ID numbers need not be initdb-forcing. Thoughts? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > in it. That's more or less what Dimitri already has in his latest > patch, except that after looking it over I'm inclined to think that > we'd be better off storing the keys as text and translating to > internal ID numbers when we read and cache the table, rather than > storing the ID numbers in the table. That will make the catalog > contents easier to read, and more importantly will mean that a change > to the internal ID numbers need not be initdb-forcing. Well that's what I had in previous versions of the patch, where the code was dealing directly with command tags. If we store text in catalogs and given that we already have the command tag as text, I suppose we would get back to only managing tags as text in the cache key and the rest of the code. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Tue, Jul 3, 2012 at 11:52 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> in it. That's more or less what Dimitri already has in his latest >> patch, except that after looking it over I'm inclined to think that >> we'd be better off storing the keys as text and translating to >> internal ID numbers when we read and cache the table, rather than >> storing the ID numbers in the table. That will make the catalog >> contents easier to read, and more importantly will mean that a change >> to the internal ID numbers need not be initdb-forcing. > > Well that's what I had in previous versions of the patch, where the code > was dealing directly with command tags. If we store text in catalogs and > given that we already have the command tag as text, I suppose we would > get back to only managing tags as text in the cache key and the rest of > the code. Yeah, I'm of two minds on that. I thought that it made sense to use integer identifiers internally for speed, but now I'm worried that the effort to translate back and forth between strings and integers is going to end up being more than any speed we might save. But I'm still not certain which way is best: maybe your original idea of using strings everywhere will prove to be the winner, but on the other hand I'm not convinced that the code as you have it written today is as efficient as it could be. At any rate, whether or not we end up using strings or integers inside the backend-local caches, I'm inclined to think that it's better to store strings in the catalog, so we'd end up with something like this: CATALOG(pg_event_trigger,3466) { NameData evtname; /* trigger's name */ int16 evtevent; /* trigger's event*/ Oid evtfoid; /* OID of function to be char evtenabled; /* trigger's firing configuratio *session_repli #ifdef CATALOG_VARLEN text evttags[1]; /* command TAGs this event trigger targe #endif } If there's no filter on tags, then I think we should let evttags = NULL. If in the future we have filtering that's not based on tags, we'd add, e.g. "text evtshushiness[1]" if we were going to filter based on smushiness level. We'd set evtsmushiness = NULL if there's no filtering by smushiness, or else an array of the smushiness levels that fire that trigger. Does that work for you? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > Yeah, I'm of two minds on that. I thought that it made sense to use > integer identifiers internally for speed, but now I'm worried that the > effort to translate back and forth between strings and integers is > going to end up being more than any speed we might save. We do that for nodetags in stored query/expression trees, and I've never seen any indication that it costs enough to notice. It's definitely a huge win for anything that changes regularly, which seems like it'll be a pretty good description of event tags for at least the next few years. regards, tom lane
On Tue, Jul 3, 2012 at 12:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Yeah, I'm of two minds on that. I thought that it made sense to use >> integer identifiers internally for speed, but now I'm worried that the >> effort to translate back and forth between strings and integers is >> going to end up being more than any speed we might save. > > We do that for nodetags in stored query/expression trees, and I've never > seen any indication that it costs enough to notice. It's definitely a > huge win for anything that changes regularly, which seems like it'll be > a pretty good description of event tags for at least the next few years. Good analogy. In that case, as in what I'm proposing here, we use integers in memory and text on disk. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > On Tue, Jul 3, 2012 at 12:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> Yeah, I'm of two minds on that. I thought that it made sense to use >>> integer identifiers internally for speed, but now I'm worried that the >>> effort to translate back and forth between strings and integers is >>> going to end up being more than any speed we might save. >> >> We do that for nodetags in stored query/expression trees, and I've never >> seen any indication that it costs enough to notice. It's definitely a >> huge win for anything that changes regularly, which seems like it'll be >> a pretty good description of event tags for at least the next few years. > > Good analogy. In that case, as in what I'm proposing here, we use > integers in memory and text on disk. New patch for that tomorrow. I assume we agree on having a column per extra variable, I'm not sure about already including full support for the toplevel one. With some luck I'll have news from you about that before sending the next revision of the patch, which then would include the int16 evttoplevel[1] column. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Tue, Jul 3, 2012 at 1:31 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> On Tue, Jul 3, 2012 at 12:18 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >>> Robert Haas <robertmhaas@gmail.com> writes: >>>> Yeah, I'm of two minds on that. I thought that it made sense to use >>>> integer identifiers internally for speed, but now I'm worried that the >>>> effort to translate back and forth between strings and integers is >>>> going to end up being more than any speed we might save. >>> >>> We do that for nodetags in stored query/expression trees, and I've never >>> seen any indication that it costs enough to notice. It's definitely a >>> huge win for anything that changes regularly, which seems like it'll be >>> a pretty good description of event tags for at least the next few years. >> >> Good analogy. In that case, as in what I'm proposing here, we use >> integers in memory and text on disk. > > New patch for that tomorrow. I assume we agree on having a column per > extra variable, Yes. > I'm not sure about already including full support for > the toplevel one. With some luck I'll have news from you about that > before sending the next revision of the patch, which then would include > the int16 evttoplevel[1] column. No, I'm not asking you to add any more columns right now (in fact, please do not). But the type of the existing column should change to text[]. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > No, I'm not asking you to add any more columns right now (in fact, > please do not). But the type of the existing column should change to > text[]. Ok, done in the attached. We now read text from either the user input in the grammar or from the catalogs (in a 1-D array there), and convert to our enum structure for internal code use. In fact I attached both the new patch version and the incremental patch on top of the v1.3 you had before, I suspect that might make life easier for you. It's of course browsable online too at this URL: https://github.com/dimitri/postgres/commit/a8cf89116ae7b6e51c8a580510612b85caae2d38 Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Attachment
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes: > Robert Haas <robertmhaas@gmail.com> writes: >> No, I'm not asking you to add any more columns right now (in fact, >> please do not). But the type of the existing column should change to >> text[]. > > Ok, done in the attached. We now read text from either the user input in > the grammar or from the catalogs (in a 1-D array there), and convert to > our enum structure for internal code use. In the previous one I missed adapting pg_dump and psql, here's the updated set of patches. Sorry about that. This unexpected electrical shut down in the middle of the day was not cool. Nor was the second one. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Attachment
On Wed, Jul 4, 2012 at 1:56 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes: >> Robert Haas <robertmhaas@gmail.com> writes: >>> No, I'm not asking you to add any more columns right now (in fact, >>> please do not). But the type of the existing column should change to >>> text[]. >> >> Ok, done in the attached. We now read text from either the user input in >> the grammar or from the catalogs (in a 1-D array there), and convert to >> our enum structure for internal code use. > > In the previous one I missed adapting pg_dump and psql, here's the > updated set of patches. Sorry about that. This unexpected electrical > shut down in the middle of the day was not cool. Nor was the second one. Thanks. There are some review comments from previous rounds that don't seem to have been fully incorporated to this version: - You only changed the definition of pg_event_triggers.evttags, not the documentation. I don't think we need pg_evttag_to_string aka pg_event_trigger_command_to_string any more either. - When you removed format_type_be_without_namespace, you didn't remove either the function prototype or the related changes in format_type.c. - I'm pretty sure that a previous review mentioned the compiler warning in evtcache.c, which seems not to be fixed. It doesn't look harmless, either. - The definitions of EVTG_FIRED_* are still present in pg_event_trigger.h, even though they're longer used anywhere. - The comments in parsenodes.h still refer to command triggers rather than event triggers. event_trigger.h contains a reference to CommandTriggerData that should say EventTriggerData. plperl.sgml has vestiges of the old terminology as well. In terms of the schema itself, I think we are almost there, but: - I noticed while playing around with this that pg_dump emits a completely empty owner field when dumping an event trigger. At first I thought that was just an oversight, but then I realized there's a deeper reason for it: pg_event_trigger doesn't have an owner field. I think it should. The only other objects in the system that don't have owners are text search parsers and text search templates (and casts, sort of). It might seem redundant to have an owner even when event-triggers are superuser-only, but we might want to try to relax that restriction later. Note that foreign data wrappers, which are also superuser-create-only, do have an owner. (Note that if we give event triggers an owner, then we also need ALTER .. OWNER TO support for them.) - It seems pretty clear at this point that the cache you've implemented in src/backend/utils/cache/evtcache.c is going to do all the heavy lifting of converting the stored catalog representation to a form that is suitable for quick access. Given that, I wonder whether we should go whole hog and change evtevent to a text field. This would simplify things for pg_dump and we could get rid of pg_evtevent_to_string, at a cost of doing marginally more work when we rebuild the event cache (but who really cares, given that you're reading the entire table every time you rebuild the cache anyway?). Thoughts? Some other minor comments: - In the \dy output, command_start is referred to as the "condition", but the documentation calls it an "event" and the grammar calls it "event_name". "event" or "event_name" seems better, so I think this is just a matter of changing psql to match. - AlterEventTrigStmt defines tgenbled as char *; I think it should just be char. In gram.y, you can declare the enable_trigger production as <chr> (or <ival>?) rather than <str>. At any rate pstrdup(stmt->tgenabled)[0] doesn't look right; you don't need to copy a string to fetch the first byte. - Why did you create a separate file pg_event_trigger_fn.h instead of just including that single prototype in pg_event_trigger.h? - The EVENTTRIGGEROID syscache is used only in RemoveEventTriggerById.I don't think that's a sufficient justification foreating up more memory for another system cache. I think you should just remove that cache and recode RemoveEventTriggerById to find the correct tuple via an index scan. See RemoveEventTriggerById for an example. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi, Please find attached v6 of the patch, fixing all your comments here. Sorry about missing some obvious things in the making of this patch. Given that I had to merge master again, I can't easily produce an incremental patch on top of last version, so you only have the full patch attached. Robert Haas <robertmhaas@gmail.com> writes: > - You only changed the definition of pg_event_triggers.evttags, not > the documentation. I don't think we need pg_evttag_to_string aka > pg_event_trigger_command_to_string any more either. Done. Removed now useless exposed functions. > - When you removed format_type_be_without_namespace, you didn't remove > either the function prototype or the related changes in format_type.c. Fixed. > - I'm pretty sure that a previous review mentioned the compiler > warning in evtcache.c, which seems not to be fixed. It doesn't look > harmless, either. I'm failing to reproduce it here, in -O0. […Trying with the -O2 default…] I got the error in -O2, fixed in the attached. > - The definitions of EVTG_FIRED_* are still present in > pg_event_trigger.h, even though they're longer used anywhere. Fixed. > - The comments in parsenodes.h still refer to command triggers rather > than event triggers. event_trigger.h contains a reference to > CommandTriggerData that should say EventTriggerData. plperl.sgml has > vestiges of the old terminology as well. Fixed. I only found a single such vestige in plperl.sgml though. > In terms of the schema itself, I think we are almost there, but: > > - I noticed while playing around with this that pg_dump emits a > completely empty owner field when dumping an event trigger. At first > I thought that was just an oversight, but then I realized there's a > deeper reason for it: pg_event_trigger doesn't have an owner field. I > think it should. The only other objects in the system that don't have > owners are text search parsers and text search templates (and casts, > sort of). It might seem redundant to have an owner even when > event-triggers are superuser-only, but we might want to try to relax > that restriction later. Note that foreign data wrappers, which are > also superuser-create-only, do have an owner. (Note that if we give > event triggers an owner, then we also need ALTER .. OWNER TO support > for them.) Damn, I had it on my TODO and Álvaro hinted me already, and I kept forgetting about it nonetheless. Fixed now. > - It seems pretty clear at this point that the cache you've > implemented in src/backend/utils/cache/evtcache.c is going to do all > the heavy lifting of converting the stored catalog representation to a > form that is suitable for quick access. Given that, I wonder whether > we should go whole hog and change evtevent to a text field. This > would simplify things for pg_dump and we could get rid of > pg_evtevent_to_string, at a cost of doing marginally more work when we > rebuild the event cache (but who really cares, given that you're > reading the entire table every time you rebuild the cache anyway?). Agreed, done that way (using a NameData fixed width field). > - In the \dy output, command_start is referred to as the "condition", > but the documentation calls it an "event" and the grammar calls it > "event_name". "event" or "event_name" seems better, so I think this > is just a matter of changing psql to match. Fixed. > - AlterEventTrigStmt defines tgenbled as char *; I think it should > just be char. In gram.y, you can declare the enable_trigger > production as <chr> (or <ival>?) rather than <str>. At any rate > pstrdup(stmt->tgenabled)[0] doesn't look right; you don't need to copy > a string to fetch the first byte. D'oh. Sure. Done. > - Why did you create a separate file pg_event_trigger_fn.h instead of > just including that single prototype in pg_event_trigger.h? To mimic some existing files, fixed. > - The EVENTTRIGGEROID syscache is used only in RemoveEventTriggerById. > I don't think that's a sufficient justification for eating up more > memory for another system cache. I think you should just remove that > cache and recode RemoveEventTriggerById to find the correct tuple via > an index scan. See RemoveEventTriggerById for an example. It's now used in more places (dependencies, alter owner), so thinking that it's pulling its weight now with 3 call sites I've left it alone. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Attachment
On Thu, Jul 5, 2012 at 5:31 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > [ new patch ] Attached is a incremental patch with a bunch of minor cleanups, including reverts of a few spurious white space changes. Could you merge this into your version? I have some concerns about pg_dump: 1. Can we spell out EvtTriggerInfo, getEvtTriggers, and dumpEvtTrigger as EventTriggerInfo, getEventTriggers, and dumpEventTrigger? 2. I don't think that this code properly handles older server versions. First, the version test in getEvtTriggers checks against 90200 instead of 90300. Second, when running against a too-old server version, this is going to try to execute an empty query and then parse the results. I think you want to restructure this code so that it just plain old returns if the server is too old. See for example getForeignDataWrappers. 3. The way you're generating evtfname is unsafe if either the schema or the function name contains SQL characters. I think this should probably be casting the function OID to regproc in lieu of rolling its own formatting code. Also, the tags should probably be escaped using quote_literal, just to be future-proof. 4. I think we should aim to generate all the SQL in upper case. Right now "CREATE EVENT TRIGGER" and "EXECUTE PROCEDURE" are in upper case but "when tag in" is in lower case. In psql, I think we should similarly have listEventTriggers() rather than listEvtTriggers(); here as in pg_dump I think you should cast the evtfoid to regproc to get schema-qualification and escaping, in lieu of the explicit join. >> In terms of the schema itself, I think we are almost there, but: >> >> - I noticed while playing around with this that pg_dump emits a >> completely empty owner field when dumping an event trigger. At first >> I thought that was just an oversight, but then I realized there's a >> deeper reason for it: pg_event_trigger doesn't have an owner field. I >> think it should. The only other objects in the system that don't have >> owners are text search parsers and text search templates (and casts, >> sort of). It might seem redundant to have an owner even when >> event-triggers are superuser-only, but we might want to try to relax >> that restriction later. Note that foreign data wrappers, which are >> also superuser-create-only, do have an owner. (Note that if we give >> event triggers an owner, then we also need ALTER .. OWNER TO support >> for them.) > > Damn, I had it on my TODO and Álvaro hinted me already, and I kept > forgetting about it nonetheless. Fixed now. evtowner seems to have missed the documentation bus. With respect to the documentation, keep in mind that the synopsis is going to show up in the command line help for \h. I'm thinking that the full list of command tags is too long for that, and that we should instead rearrange the page so that the list of supported commands is outside the synopsis. The synposis is also somewhat misleading, I think, in that "variable in (tag, ...)" conveys the idea that no matter what the variable is, the items in parentheses will surely be tags. I suggest that we say something like "filter_variable in (filter_value, ...)" and then document in the text that filter_variable must currently always be TAG, and that the legal values for filter_value are dependent on the choice of filter_variable, and the legal choices for TAG are those listed in the following table: <splat>. The documentation contains the following claim with which I'm extremely uncomfortable: + <para> + Triggers on <literal>ANY</literal> command support more commands than + just this list, and will only provide the <literal>command + tag</literal> argument as <literal>NOT NULL</literal>. Supporting more + commands is made so that you can actually block <xref linkend="ddl"> + commands in one go. + </para> A minor issue is that there's no notion of ANY any more; it's just a consequence of leaving out the WHEN clause. The bigger issue is that I can't see any reason to do it that way. Surely if we're firing the trigger at all, we can arrange to have the command tag properly filled in so that we can filter by it. This might be a crazy idea, but... it seems like it would be awfully sweet if we could find a way to avoid having to translate between node tags (i.e. constants beginning with T_* that are used to identify the type of statement that is executing) and event tags (i.e. constants beginning with E_* that are used to identify the type of statement that is executing). Now obviously this is not quite possible because in some cases the choice of E_* constant depends on not only the node tag but also the type of object being operated on. However, it seems like this is a surmountable obstacle: since we no longer need to store the E_* constants in a system catalog, they don't really need to be integers. For example, we could define something like this: typedef struct { NodeTag nodetag; ObjectType objecttype; } NodeTagWithObjectType; ...and set the objecttype to a new OBJECT_DUMMY value or somesuch when the NodeTag is such that the ObjectType isn't relevant. If we did that, then all the E_* constants could go away, and InitEventContext() would become a lot simpler and, presumably, faster. It would just need to check whether it's got one of the node tags that needs the object-type filled in. If so, it does that; if not, it just sets the nodetag field to the statement's node-tag and the objecttype to OBJECT_DUMMY, and it's done. Well, OK, it's probably not quite that simple... but it still seems like we'd need explicit handling of a smaller number of cases than presently. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
Robert Haas <robertmhaas@gmail.com> writes: > Attached is a incremental patch with a bunch of minor cleanups, > including reverts of a few spurious white space changes. Could you > merge this into your version? Thank you very much for that, yes it's included now. So you have 3 attachments here, the whole new patch revision (v1.7), the incremental patch to go from 1.6 to 1.7 and the incremental patch that should apply cleanly on top of your cleanups. > 1. Can we spell out EvtTriggerInfo, getEvtTriggers, and dumpEvtTrigger > as EventTriggerInfo, getEventTriggers, and dumpEventTrigger? Done. > 2. I don't think that this code properly handles older server > versions. Fixed. > 3. The way you're generating evtfname is unsafe if either the schema Fixed using regproc cast, both in pg_dump and in psql. > 4. I think we should aim to generate all the SQL in upper case. Right > now "CREATE EVENT TRIGGER" and "EXECUTE PROCEDURE" are in upper case > but "when tag in" is in lower case. Oh, sure, done. > In psql, I think we should similarly have listEventTriggers() rather > than listEvtTriggers(); here as in pg_dump I think you should cast the > evtfoid to regproc to get schema-qualification and escaping, in lieu > of the explicit join. Done. > evtowner seems to have missed the documentation bus. I'm told it finally did catch it. > With respect to the documentation, keep in mind that the synopsis is > going to show up in the command line help for \h. I'm thinking that > the full list of command tags is too long for that, and that we should > instead rearrange the page so that the list of supported commands is > outside the synopsis. The synposis is also somewhat misleading, I > think, in that "variable in (tag, ...)" conveys the idea that no > matter what the variable is, the items in parentheses will surely be > tags. I suggest that we say something like "filter_variable in > (filter_value, ...)" and then document in the text that > filter_variable must currently always be TAG, and that the legal > values for filter_value are dependent on the choice of > filter_variable, and the legal choices for TAG are those listed in the > following table: <splat>. I tried to arrange something here, I'm not quite sure about the current result but it's already much better than the previous version. Articulating ideas that way also allows to begin that command / events support matrix, as you can see in the attached. > The documentation contains the following claim with which I'm > extremely uncomfortable: […ANY command set…] It's a vestige from a long time ago, I removed that and some other verbiage now. > I can't see any reason to do it that way. Surely if we're firing the > trigger at all, we can arrange to have the command tag properly filled > in so that we can filter by it. Indeed, command tag is always available. The magic trigger variables might not all be, though, that was what this text was alluding to, but that case is already covered in the specific PL docs. The part that we will certainly have to re-install later is when we add new event "integration points" that we can only implement in some of the supported commands, not all of them. Think about command_end and CLUSTER or other commands managing the transaction themselves (concurrently). But that's not at all relevant to what's in this reduced v1 patch. > This might be a crazy idea, but... it seems like it would be awfully > sweet if we could find a way to avoid having to translate between node > tags (i.e. constants beginning with T_* that are used to identify the > type of statement that is executing) and event tags (i.e. constants > beginning with E_* that are used to identify the type of statement > that is executing). Now obviously this is not quite possible because > in some cases the choice of E_* constant depends on not only the node > tag but also the type of object being operated on. However, it seems > like this is a surmountable obstacle: since we no longer need to store > the E_* constants in a system catalog, they don't really need to be > integers. For example, we could define something like this: All of that happens in InitEventContext(). We can see in there that we wouldn't gain much, the great majority of this code is dealing with cases where we have no symmetry at all. Also I don't think that the nodeTag macro is expensive, nor is a 2-levels nested switch statement. So while I would enjoy seeing that part simplified, I don't think your angle of attack would provide us much progress here… Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Attachment
On Fri, Jul 6, 2012 at 7:21 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> Attached is a incremental patch with a bunch of minor cleanups, >> including reverts of a few spurious white space changes. Could you >> merge this into your version? > > Thank you very much for that, yes it's included now. So you have 3 > attachments here, the whole new patch revision (v1.7), the incremental > patch to go from 1.6 to 1.7 and the incremental patch that should apply > cleanly on top of your cleanups. Here is an incremental documentation patch which I hope you will like. I made the event triggers stuff its own chapter rather than trying to fold it in under triggers, and added some more detail. It's already quite a bit of extra stuff, and it's only going to become more as we expand this feature, so I think a separate chapter is appropriate. I moved a bunch of the details that were under CREATE EVENT TRIGGER into this new chapter, which I think is a better location, and reformatted the matrix somewhat. I think as we add more firing points it will be clearer and easier to read if we have all the commands arranged in columns rather than listing a bunch of firing points on each line. I also made a bunch of minor edits to improve readability and improve the English (which wasn't bad, but I touched it up a bit); and I tried to add some extra detail here and there (some of it recycled from previous patch versions). Assuming this all seems reasonably agreeable, can you merge it on your side? This took the last several hours, so I haven't looked at your latest code changes yet. However, in the course of editing the documentation, it occurred to me that we seem to be fairly arbitrarily excluding a large number of commands from the event trigger mechanism. For example, GRANT and REVOKE. In earlier patches, we needed specific changes for every command, so there was some reason not to try to support everything right out of the gate. But ISTM that the argument for this is much less now; presumably it's just a few extra lines of code per command, so maybe we ought to go ahead and try to make this as complete as possible. I attempt to explain in the attached patch the reasons why we don't support certain classes of commands, but I can't come up with any explanation for supporting GRANT and REVOKE that doesn't fall flat. I can't even really see a reason not to support things like LISTEN and NOTIFY, and it would certainly be more consistent with the notion of a command_start trigger to support as many commands as we can. I had an interesting experience while testing this patch. I accidentally redefined my event trigger function to something which errored out. That of course precluded me from using CREATE OR REPLACE FUNCTION to fix it. This makes me feel rather glad that we decided to exclude CREATE/ALTER/DROP EVENT TRIGGER from the event trigger mechanism, else recovery would have had to involve system catalog hackery. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
Robert Haas <robertmhaas@gmail.com> writes: > Here is an incremental documentation patch which I hope you will like. Definitely, it's better this way. I'm not thrilled with separating it into its own top level chapter, but I can see how it makes sense indeed. This part is strange though: + A trigger definition can also specify a <literal>WHEN</literal> + condition so that, for example, a <literal>command_start</literal> + tag can be fired only for particular commands which the user wishes + to intercept. A common use of such triggers is to restrict the range of + DDL operations which users may perform. I don't think of that as firing a command tag, so it's hard for me to parse that sentence. > the matrix somewhat. I think as we add more firing points it will be > clearer and easier to read if we have all the commands arranged in > columns rather than listing a bunch of firing points on each line. I +1 > also made a bunch of minor edits to improve readability and improve > the English (which wasn't bad, but I touched it up a bit); and I tried > to add some extra detail here and there (some of it recycled from > previous patch versions). Assuming this all seems reasonably > agreeable, can you merge it on your side? Done, thanks ! > This took the last several hours, so I haven't looked at your latest > code changes yet. However, in the course of editing the > documentation, it occurred to me that we seem to be fairly arbitrarily > excluding a large number of commands from the event trigger mechanism. As many as that? I'm surprised about the quantity. Yes I did not add all and any command we have, on purpose, and I agree that the new turn of things allow us to add a new set. > For example, GRANT and REVOKE. In earlier patches, we needed > specific changes for every command, so there was some reason not to > try to support everything right out of the gate. But ISTM that the > argument for this is much less now; presumably it's just a few extra > lines of code per command, so maybe we ought to go ahead and try to > make this as complete as possible. I attempt to explain in the Will do soon™. > attached patch the reasons why we don't support certain classes of > commands, but I can't come up with any explanation for supporting > GRANT and REVOKE that doesn't fall flat. I can't even really see a > reason not to support things like LISTEN and NOTIFY, and it would > certainly be more consistent with the notion of a command_start > trigger to support as many commands as we can. I would think that NOTIFY is on a fast track not to be disturbed by calling into used defined code, and that would explain why we don't support event triggers here. > I had an interesting experience while testing this patch. I > accidentally redefined my event trigger function to something which > errored out. That of course precluded me from using CREATE OR REPLACE > FUNCTION to fix it. This makes me feel rather glad that we decided to > exclude CREATE/ALTER/DROP EVENT TRIGGER from the event trigger > mechanism, else recovery would have had to involve system catalog > hackery. Yeah, we have some places were it's not very hard to shoot oneself in the foot, here the resulting hole is a little too big and offers no real benefits. Event triggers on create|alter|drop event triggers, really? Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Fri, Jul 6, 2012 at 12:00 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Jul 6, 2012 at 7:21 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> Attached is a incremental patch with a bunch of minor cleanups, >>> including reverts of a few spurious white space changes. Could you >>> merge this into your version? >> >> Thank you very much for that, yes it's included now. So you have 3 >> attachments here, the whole new patch revision (v1.7), the incremental >> patch to go from 1.6 to 1.7 and the incremental patch that should apply >> cleanly on top of your cleanups. > > Here is an incremental documentation patch which I hope you will like. And here is another incremental patch, this one doing some more cleanup. Some of this is cosmetic, but it also: - Fixes the new event_trigger type so that it passes the type sanity test, instead of adding the failure as expected output. - Fixes DROP EVENT TRIGGER IF EXISTS on a non-existent trigger. - Fleshes out the ownership handling so that it's more similar to what we do for other types of objects. I'm feeling pretty good about this at this point, although I think there is still some more work to do before we call it done and go home. I have a large remaining maintainability concern about the way we're mapping back and forth between node tags, event tags, and command tags. Right now we've got parse_event_tag, which parses something like 'ALTER AGGREGATE' into E_AlterAggregate; and then we've got command_to_string, which turns E_AlterAggregate back into 'ALTER AGGREGATE', and then we've got InitEventContext(), which turns T_RenameStmt or T_AlterObjectSchemaStmt with OBJECT_AGGREGATE into E_AlterAggregate. I can't easily verify that all three of these things are consistent with each other, and even if they are right now I estimate the chances of that remaining true as other people patch the code as near-zero. You didn't like my last proposal for dealing with this, which is fine: it might not have been the best way of dealing with it. But I think we have to figure out something better than what we've got now, or this is almost guaranteed to get broken. If you don't have a brilliant idea I'll hack on it and see what I can come up with. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Attachment
Robert Haas <robertmhaas@gmail.com> writes: > And here is another incremental patch, this one doing some more > cleanup. Some of this is cosmetic, but it also: Thanks, applied in my github repository! > I'm feeling pretty good about this at this point, although I think > there is still some more work to do before we call it done and go > home. Nice reading that :) > I have a large remaining maintainability concern about the way we're > mapping back and forth between node tags, event tags, and command > tags. Right now we've got parse_event_tag, which parses something […valid concern…] > If you don't have a brilliant idea I'll hack on it and see what I can > come up with. I think we might be able to install a static array for the setup where we would find the different elements, and then code up some procedures doing different kind of look ups in that array. > like 'ALTER AGGREGATE' into E_AlterAggregate; and then we've got > command_to_string, which turns E_AlterAggregate back into 'ALTER > AGGREGATE', and then we've got InitEventContext(), which turns > T_RenameStmt or T_AlterObjectSchemaStmt with OBJECT_AGGREGATE into > E_AlterAggregate. I can't easily verify that all three of these { E_AlterAggregate, // TrigEventCommand "ALTER AGGREGATE", // command tag T_RenameStmt, // nodeTag -1 // object type }, { E_AlterAggregate, "ALTER AGGREGATE", T_AlterObjectSchemaStmt, OBJECT_AGGREGATE } The problem is coming up with a way of writing the code that does not incur a full array scan for each step of parsing or rewriting. And I don't see that it merits yet another cache. Given the existing event trigger cache it might be that we don't care about having a full scan of this table twice per event trigger related commands, as I don't think it would happen when executing other DDLs. Scratch that we need to parse command tags when we build the event cache, so scanning the full array each time would make that O(n²) and we want to avoid that. So we could install the contents of the array in another hash table in BuildEventTriggerCache() then use that to lookup the TrigEventCommand from the command tag… Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On Fri, Jul 6, 2012 at 3:29 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >> Here is an incremental documentation patch which I hope you will like. > > Definitely, it's better this way. I'm not thrilled with separating it > into its own top level chapter, but I can see how it makes sense indeed. Oh, really? I thought that was a huge readability improvement. There are some things that are the same between triggers and event triggers, but there's an awful lotta stuff that is completely different. > This part is strange though: > > + A trigger definition can also specify a <literal>WHEN</literal> > + condition so that, for example, a <literal>command_start</literal> > + tag can be fired only for particular commands which the user wishes > + to intercept. A common use of such triggers is to restrict the range of > + DDL operations which users may perform. > > I don't think of that as firing a command tag, so it's hard for me to > parse that sentence. Oh, that should say "a command_start trigger" rather than "a command_start tag". Good catch. >> This took the last several hours, so I haven't looked at your latest >> code changes yet. However, in the course of editing the >> documentation, it occurred to me that we seem to be fairly arbitrarily >> excluding a large number of commands from the event trigger mechanism. > > As many as that? I'm surprised about the quantity. Yes I did not add all > and any command we have, on purpose, and I agree that the new turn of > things allow us to add a new set. I admit I didn't count them up. :-) Maybe there aren't that many. I think we might want to extend the support matrix to include every command that appears in sql-commands.html and have either an X if it's supported or blank if it's not. That would make it easier to judge how many commands are not supported, not just for us but for users who may be trying to answer the same questions we are. > I would think that NOTIFY is on a fast track not to be disturbed by > calling into used defined code, and that would explain why we don't > support event triggers here. If the DBA is allowed to restrict CREATE FUNCTION, why not NOTIFY? I guess I don't see why that one's deserving of special treatment. Mind you, if I ran the world, this would probably be broken up differently: I'd have ddl_command_start covering all the CREATE/ALTER/DROP commands and nothing else; and separate firing points for anything else I wanted to support. It's not too late to make that change, hint, hint. But if we're not gonna do that then I think that we'd better try to cast the net as broadly as reasonably possible. It seems to me that our excuse for not including things like UPDATE and DELETE is a bit thin; surely there are people who would like a sort of universal trigger that applies to every relation in the system. Of course there are recursion problems there that need to be thought long hard about, and no I don't really want to go there right now, but I'll bet you a nickle that someone is going to ask why it doesn't work that way. Another advantage to recasting this as ddl_command_start is that we quite easily pass the operation (CREATE, ALTER, DROP) and the named object type (TABLE, FUNCTION, CAST) as separate arguments. I think that would be a usability improvement, since it would then be dead easy to write an event trigger that prohibits DROP (and only DROP) of any sort. Of course it's not that hard to do it right now, but you have to parse the command tag. It would likely simplify the code for mapping between node and command tags, too. One other thought: if we're NOT going to do what I suggested above, then how about renaming TG_WHEN to TG_EVENT? Seems like that would fit better. Also: now that the E_WhatEver constants don't get stored on disk, I don't think they should live in pg_event_trigger.h any more; can we move them to someplace more appropriate? Possibly make them private to event_trigger.c? And, on a related note, I don't think it's a good idea to use E_ as a prefix for both the event types and the command tags. It's too short, and hard to grep for, and we don't want the same one for both, I think. How above things like EVT_CommandStart for the events and ECT_CreateAggregate for the command tags? >> I had an interesting experience while testing this patch. I >> accidentally redefined my event trigger function to something which >> errored out. That of course precluded me from using CREATE OR REPLACE >> FUNCTION to fix it. This makes me feel rather glad that we decided to >> exclude CREATE/ALTER/DROP EVENT TRIGGER from the event trigger >> mechanism, else recovery would have had to involve system catalog >> hackery. > > Yeah, we have some places were it's not very hard to shoot oneself in > the foot, here the resulting hole is a little too big and offers no real > benefits. Event triggers on create|alter|drop event triggers, really? Indeed. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Jul 6, 2012 at 4:00 PM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> And here is another incremental patch, this one doing some more >> cleanup. Some of this is cosmetic, but it also: > > Thanks, applied in my github repository! Thanks. >> I have a large remaining maintainability concern about the way we're >> mapping back and forth between node tags, event tags, and command >> tags. Right now we've got parse_event_tag, which parses something > […valid concern…] >> If you don't have a brilliant idea I'll hack on it and see what I can >> come up with. > > I think we might be able to install a static array for the setup where > we would find the different elements, and then code up some procedures > doing different kind of look ups in that array. +1. >> like 'ALTER AGGREGATE' into E_AlterAggregate; and then we've got >> command_to_string, which turns E_AlterAggregate back into 'ALTER >> AGGREGATE', and then we've got InitEventContext(), which turns >> T_RenameStmt or T_AlterObjectSchemaStmt with OBJECT_AGGREGATE into >> E_AlterAggregate. I can't easily verify that all three of these > > { > E_AlterAggregate, // TrigEventCommand > "ALTER AGGREGATE", // command tag > T_RenameStmt, // nodeTag > -1 // object type > }, > { > E_AlterAggregate, > "ALTER AGGREGATE", > T_AlterObjectSchemaStmt, > OBJECT_AGGREGATE > } > > The problem is coming up with a way of writing the code that does not > incur a full array scan for each step of parsing or rewriting. And I > don't see that it merits yet another cache. Given the existing event > trigger cache it might be that we don't care about having a full scan of > this table twice per event trigger related commands, as I don't think it > would happen when executing other DDLs. > > Scratch that we need to parse command tags when we build the event > cache, so scanning the full array each time would make that O(n²) and we > want to avoid that. So we could install the contents of the array in > another hash table in BuildEventTriggerCache() then use that to lookup > the TrigEventCommand from the command tag… Ugh. Yeah, obviously the most important thing I think is that InitEventContext() needs to be lightning-fast, but we don't want BuildEventTriggerCache() to be pathologically slow either. I think the best thing to do with InitEventContext() might be to get rid of it. It's just a big switch over node tags, and we've already got one of those in standard_ProcessUtility. Maybe every case that already exists in that function should either (a) get a comment of the form /* does not support event triggers */ or (b) get a call of the form EventTriggerStartup(&evt, parsetree, E_WhateverCommandThisIs). EventTriggerStartup() could call InitEventContext() and then if CommandFiresTriggersForEvent(..., E_CommandStart) it could also call ExecEventTriggers(). This might seem like it's just moving the wood around, but if someone adds a new case in standard_ProcessUtility, they're going to model it on one of the existing cases, which greatly decreases the likelihood that they're going to screw it up. And if they do screw it up it will be obviously non-parallel to the rest of what's there, so somebody can notice and fix it. As a side benefit, this would probably be faster than having two separate switches that are executed more or less consecutively. Now that leaves the question of how to translate between E_AlterAggregate and "ALTER AGGREGATE"; I think your idea of a hash table (or two?) might be the most practical option. We'd only need to build the hash table(s) if an index-scan of pg_event_trigger finds it non-empty, and then only once per session. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Thom Brown <thom@linux.com> writes: > I also attach various typo/grammar fixes. In fact Robert's cleanup of the docs make that patch of yours not apply anymore, and I think a part of it is maybe already fixed. Do you have time to look at this with the new v1.8 patch that you will receive in a minute, or with the github branch if you're tracking that? Sorry about that. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Hi, Please find attached a new revision of the patch, v1.8, and an incremental diff from the previous one, that includes the patches you sent me. Robert Haas <robertmhaas@gmail.com> writes: > Oh, really? I thought that was a huge readability improvement. There > are some things that are the same between triggers and event triggers, > but there's an awful lotta stuff that is completely different. True. > Oh, that should say "a command_start trigger" rather than "a > command_start tag". Good catch. Fixed in the attached. > I think we might want to extend the support matrix to include every > command that appears in sql-commands.html and have either an X if it's > supported or blank if it's not. That would make it easier to judge > how many commands are not supported, not just for us but for users who > may be trying to answer the same questions we are. Yes, not done yet, see below. > Mind you, if I ran the world, this would probably be broken up > differently: I'd have ddl_command_start covering all the > CREATE/ALTER/DROP commands and nothing else; and separate firing > points for anything else I wanted to support. It's not too late to > make that change, hint, hint. But if we're not gonna do that then I Let's see about doing that. I guess we would have ddl_command_start and command_start, and I would think that the later is the most generic. So we would certainly want DDL commands to run first the command_start event triggers then the ddl_command_start event triggers, whereas a NOTIFY would only run command_start triggers, and a GRANT command would run maybe command_start then dcl_command_start triggers? If that's where we're going to, we can commit as-is and expand later. > think that we'd better try to cast the net as broadly as reasonably > possible. It seems to me that our excuse for not including things > like UPDATE and DELETE is a bit thin; surely there are people who > would like a sort of universal trigger that applies to every relation > in the system. Of course there are recursion problems there that need > to be thought long hard about, and no I don't really want to go there > right now, but I'll bet you a nickle that someone is going to ask why > it doesn't work that way. The current reason why we only support 149 SQL commands and variations is because we want a patch that's easy enough to review and agree on. So I think we will in the future be able to add new firing point at places where maybe some discussion is needed. Such places, in my mind, include the NOTIFY mechanism, DCLs, and global objects such as databases and tablespaces and roles. I'd be happy to see event triggers embrace support for those. Maybe in v2 though? > Another advantage to recasting this as ddl_command_start is that we > quite easily pass the operation (CREATE, ALTER, DROP) and the named > object type (TABLE, FUNCTION, CAST) as separate arguments. I think That's a good idea. I don't think we should replace the current tag support with that though, because some commands are harder to stow into the operation and type model (in supported commands, mainly LOAD). So I've included partial support for that in the attached patch, in the simplest way possible, just so that we can see where it leads in term of using the feature. The next step here is to actually go in each branch of the process utility switch and manually decorate the command context with the current operation and objecttype when relevant. > One other thought: if we're NOT going to do what I suggested above, > then how about renaming TG_WHEN to TG_EVENT? Seems like that would > fit better. That seems like the right thing to do in either case, done. > Also: now that the E_WhatEver constants don't get stored on disk, I > don't think they should live in pg_event_trigger.h any more; can we > move them to someplace more appropriate? Possibly make them private > to event_trigger.c? And, on a related note, I don't think it's a good Done, they now live in evtcache.h, where they belong. > idea to use E_ as a prefix for both the event types and the command > tags. It's too short, and hard to grep for, and we don't want the > same one for both, I think. How above things like EVT_CommandStart > for the events and ECT_CreateAggregate for the command tags? Done this way. Robert Haas <robertmhaas@gmail.com> writes: >> I think we might be able to install a static array for the setup where >> we would find the different elements, and then code up some procedures >> doing different kind of look ups in that array. > > +1. Done in the attached. Filling that array was… an interesting use case for Emacs Keyboard Macros spanning 3 different buffers, maintaining it should be easy enough now. > Ugh. Yeah, obviously the most important thing I think is that > InitEventContext() needs to be lightning-fast, but we don't want > BuildEventTriggerCache() to be pathologically slow either. So I've built two new hash tables so that we can either search internal command enum number by command tag or by parse tree nodeTag and object type. The hash table is only built when first needed, I didn't add any trick to only build it when the pg_event_trigger table is empty. > I think the best thing to do with InitEventContext() might be to get > rid of it. It's just a big switch over node tags, and we've already > got one of those in standard_ProcessUtility. Maybe every case that Given that second hash table (EventTriggerCommandNodeCache) we can now move that look-up to later in the course of events. I kept InitEventContext() as the place where to stuff default values in the EventContext structure, though. Also, as we're optimizing things, there's no longer any call sites to the function CommandFiresTriggersForEvent() in the attached patch. That's because this function main use case is allowing the specific command support code to avoid some possibly costly setup when there's in fact no trigger to fire. That's never going to be the case with command_start triggers as they have no specific variable support. A typical integration now looks like this: case T_CreateCastStmt: ExecEventTriggers(&evt, EVT_CommandStart); CreateCast((CreateCastStmt *) parsetree); break; > Now that leaves the question of how to translate between > E_AlterAggregate and "ALTER AGGREGATE"; I think your idea of a hash > table (or two?) might be the most practical option. We'd only need to > build the hash table(s) if an index-scan of pg_event_trigger finds it > non-empty, and then only once per session. As said earlier, I implemented that without the non-empty trick. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Attachment
On Tue, Jul 10, 2012 at 10:38 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: >> Mind you, if I ran the world, this would probably be broken up >> differently: I'd have ddl_command_start covering all the >> CREATE/ALTER/DROP commands and nothing else; and separate firing >> points for anything else I wanted to support. It's not too late to >> make that change, hint, hint. But if we're not gonna do that then I > > Let's see about doing that. I guess we would have ddl_command_start and > command_start, and I would think that the later is the most generic. So > we would certainly want DDL commands to run first the command_start > event triggers then the ddl_command_start event triggers, whereas a > NOTIFY would only run command_start triggers, and a GRANT command would > run maybe command_start then dcl_command_start triggers? > > If that's where we're going to, we can commit as-is and expand later. That's not quite what I was thinking. I actually can't imagine any situation where you want an event trigger that gets fired on EVERY command for which we can support command_start. If you're trying to prevent or replicate DDL, that's too much. If you're trying to do logging or auditing, it's not enough, since there will still be commands that aren't supported, and it'll be grossly inefficient to boot. You really want something like log_min_duration_statement=0 for those cases. So it seems to me that the use case for a command_start trigger, conceived in the broadest possible way so that every single command we can support is included, is razor-thin. So my proposal for the present patch would be: 1. Rename command_start to ddl_command_start. 2. Remove support for everything other than CREATE, ALTER, and DROP. 3. Pass the operation and the SQL object type as separate magic variables. Then we can add dcl_command_start, etc. in follow-on patches. >> think that we'd better try to cast the net as broadly as reasonably >> possible. It seems to me that our excuse for not including things >> like UPDATE and DELETE is a bit thin; surely there are people who >> would like a sort of universal trigger that applies to every relation >> in the system. Of course there are recursion problems there that need >> to be thought long hard about, and no I don't really want to go there >> right now, but I'll bet you a nickle that someone is going to ask why >> it doesn't work that way. > > The current reason why we only support 149 SQL commands and variations > is because we want a patch that's easy enough to review and agree on. So > I think we will in the future be able to add new firing point at places > where maybe some discussion is needed. Agreed. > Such places, in my mind, include the NOTIFY mechanism, DCLs, and global > objects such as databases and tablespaces and roles. I'd be happy to see > event triggers embrace support for those. Maybe in v2 though? Yep, sure. Note that the proposal above constrains the list of commands we support in v1 in a very principled way: CREATE, ALTER, DROP. Everything else can be added later under a different (but similarly situated) firing point name. If we stick with command_start then I think we're going to be forever justifying our decisions as to what got included or excluded; which might be worth it if it seemed likely that there'd be much use for such a command trigger, but it doesn't (to me, anyway). >> Another advantage to recasting this as ddl_command_start is that we >> quite easily pass the operation (CREATE, ALTER, DROP) and the named >> object type (TABLE, FUNCTION, CAST) as separate arguments. I think > > That's a good idea. I don't think we should replace the current tag > support with that though, because some commands are harder to stow into > the operation and type model (in supported commands, mainly LOAD). I'm imagining that ddl_command_start triggers would get the information this way, but LOAD might be covered by something like admin_command_start that just gets the command tag. > So I've included partial support for that in the attached patch, in the > simplest way possible, just so that we can see where it leads in term of > using the feature. The next step here is to actually go in each branch > of the process utility switch and manually decorate the command context > with the current operation and objecttype when relevant. > [...] > Done in the attached. Filling that array was… an interesting use case > for Emacs Keyboard Macros spanning 3 different buffers, maintaining it > should be easy enough now. Yep, looks better. It looks like you've got EventTriggerCommandTagsEntry mapping the command tag to an ETC_* constant; I think the need for that hash goes away entirely if you just pass this information down from the ProcessUtility() switch. At any rate having NameData involved seems like it's probably not too good an idea; if for some reason we need to keep that hash, use a NUL-terminated string and initialize the hash table with string_hash instead of tag_hash. That'll be simpler and also allows the length of the buffer to vary independently of NAMEDATALEN, which can only be to the good. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi, Robert Haas <robertmhaas@gmail.com> writes: > That's not quite what I was thinking. I actually can't imagine any > situation where you want an event trigger that gets fired on EVERY > command for which we can support command_start. If you're trying to I don't have any solid use case in mind, just though it would make the feature rather complete. Withdrawn. > So my proposal for the present patch would be: > > 1. Rename command_start to ddl_command_start. > 2. Remove support for everything other than CREATE, ALTER, and DROP. > 3. Pass the operation and the SQL object type as separate magic variables. All done in the attached. As usual, you get an incremental patch and a complete one. It's also published on github for easy browsing: https://github.com/dimitri/postgres/compare/33f285e67f...ba42279924 > Yep, sure. Note that the proposal above constrains the list of > commands we support in v1 in a very principled way: CREATE, ALTER, > DROP. Everything else can be added later under a different (but > similarly situated) firing point name. If we stick with command_start > then I think we're going to be forever justifying our decisions as to > what got included or excluded; which might be worth it if it seemed > likely that there'd be much use for such a command trigger, but it > doesn't (to me, anyway). Yeah, done that way. I had to remove support for only 4 commands… > I'm imagining that ddl_command_start triggers would get the > information this way, but LOAD might be covered by something like > admin_command_start that just gets the command tag. My point was that I didn't want to replace the command tag by the object type and operation combo, happy to see we're on the same line. > EventTriggerCommandTagsEntry mapping the command tag to an ETC_* > constant; I think the need for that hash goes away entirely if you > just pass this information down from the ProcessUtility() switch. At Well, no, because we use that hash mapping in BuildEventTriggerCache(), when reading from the catalogs. I don't see a way to cut down on the number of per-session hash-table here without involving a penalty. > any rate having NameData involved seems like it's probably not too > good an idea; if for some reason we need to keep that hash, use a > NUL-terminated string and initialize the hash table with string_hash > instead of tag_hash. That'll be simpler and also allows the length of > the buffer to vary independently of NAMEDATALEN, which can only be to > the good. Oh, I just failed to realize that the hash table key mustn't be a static length field. I'm done for tonight though, it's still something to fix. Maybe you will beat me to it? :) (if not, I'll be happy to, with some luck even tomorrow). Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Attachment
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes: > All done in the attached. As usual, you get an incremental patch and a > complete one. It's also published on github for easy browsing: > > https://github.com/dimitri/postgres/compare/33f285e67f...ba42279924 Same as usual, even if that's a very little adjustment this time: https://github.com/dimitri/postgres/commit/5ec1ddb41489a89eea09e350f6ee39e726f9fb03 >> any rate having NameData involved seems like it's probably not too >> good an idea; if for some reason we need to keep that hash, use a >> NUL-terminated string and initialize the hash table with string_hash >> instead of tag_hash. That'll be simpler and also allows the length of >> the buffer to vary independently of NAMEDATALEN, which can only be to >> the good. Fixed now, using a char[] as with PortalHashTable for example. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Attachment
On Thu, Jul 12, 2012 at 8:50 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > [ new patch ] Well, I think it's about time to start getting some of this code into our tree. However, I'm still not confident that the code that actually runs the triggers is entirely solid, so I decided to rip that stuff out and commit only the syntax support and documentation portions of this patch. We can add the actual trigger firing stuff and PL language support in a subsequent commit or commits. I made a significant number of modifications. First, I changed the representation of CreateEventTriggerStmt considerably, so that we can more easily accommodate multiple filter variables in future patches without having to whack the code around too much. I also disentangled the CreateEventTriggerStmt processing from the event-cache machinery, because it doesn't seem like a good idea to have evtcache.c and event_trigger.c be deeply intertwined, from a modularity point of view. I think we might still need to make a bit more adjustment to the organization of this code - perhaps the code that is needed by both commands/event_triggers.c and utils/cache/evtcache.c ought to be moved to something like catalog/pg_event_trigger.c. However, I haven't done that in this commit. Second, I made this work with COMMENT, SECURITY LABEL, and with the extension mechanism, including updating the documentation. Third, I also some other documentation updates to match recent changes and also added the missing documentation for \dy. Fourth, I rewrote the regression tests fairly extensively to make sure we're adequately testing all of the syntax that this commit adds: not only that it works when it's supposed to work, but also that it fails when it's supposed to fail and emits a hopefully-good error message in such cases. And finally there were a bunch of miscellaneous code cleanups (some of them fixing things that I muffed in earlier incremental patches that I sent you to merge), and others that touch code you wrote. I suspect there are probably still a few oversights here, but hopefully not too many. The next step here is obviously to complete the work necessary to actually be able to fire these triggers, as opposed to just saying that we fire them. I'll write up my thoughts on that topic in a separate email. I don't think there's a ton of work left to be done there, but more than zero. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Wed, Jul 18, 2012 at 10:22 AM, Robert Haas <robertmhaas@gmail.com> wrote: > The next step here is obviously to complete the work necessary to > actually be able to fire these triggers, as opposed to just saying > that we fire them. I'll write up my thoughts on that topic in a > separate email. I don't think there's a ton of work left to be done > there, but more than zero. I have committed code to do this. It's basically similar to what you had before, but I reworked the event cache machinery heavily. I think that the new organization will be easier to extent to meet future needs, and it also gets rid of a lot of boilerplate code whose job was to translate between different representations. I also committed the PL/pgsql support code, but not the code for the other PLs. It should be easy to rebase that work and resubmit it as a separate patch, though, or maybe one patch per PL would be better. Obviously, there's quite a bit more work that could be done here -- passing more context, add more firing points, etc. -- but now we've at least got the basics. As previously threatened, I amended this code so that triggers fire once per DDL command. So internally generated command nodes that are used as an argument-passing mechanism do not fire triggers, for example. I believe this is the right decision because I think, as I mentioned in another email to Tom yesterday, that generating and passing around command tags is a really bad practice that we should be looking to eliminate, not institutionalize. It posed a major obstacle to my 9.2-era efforts to clean up the behavior of our DDL under concurrency, for example. I think, however, that it would be useful to have an event trigger that is defined to fire "every time a certain type of SQL object gets created" rather than "every time a certain command gets executed". That could complement, not replace, this mechanism. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 20 July 2012 16:50, Robert Haas <robertmhaas@gmail.com> wrote: > On Wed, Jul 18, 2012 at 10:22 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> The next step here is obviously to complete the work necessary to >> actually be able to fire these triggers, as opposed to just saying >> that we fire them. I'll write up my thoughts on that topic in a >> separate email. I don't think there's a ton of work left to be done >> there, but more than zero. > > I have committed code to do this. It's basically similar to what you > had before, but I reworked the event cache machinery heavily. I think > that the new organization will be easier to extent to meet future > needs, and it also gets rid of a lot of boilerplate code whose job was > to translate between different representations. I also committed the > PL/pgsql support code, but not the code for the other PLs. It should > be easy to rebase that work and resubmit it as a separate patch, > though, or maybe one patch per PL would be better. > > Obviously, there's quite a bit more work that could be done here -- > passing more context, add more firing points, etc. -- but now we've at > least got the basics. > > As previously threatened, I amended this code so that triggers fire > once per DDL command. So internally generated command nodes that are > used as an argument-passing mechanism do not fire triggers, for > example. I believe this is the right decision because I think, as I > mentioned in another email to Tom yesterday, that generating and > passing around command tags is a really bad practice that we should be > looking to eliminate, not institutionalize. It posed a major obstacle > to my 9.2-era efforts to clean up the behavior of our DDL under > concurrency, for example. > > I think, however, that it would be useful to have an event trigger > that is defined to fire "every time a certain type of SQL object gets > created" rather than "every time a certain command gets executed". > That could complement, not replace, this mechanism. I've just run my own set of tests against these changes, which tests every supported DDL command (with the exception of "CREATE USER MAPPING", "ALTER USER MAPPING", "DROP USER MAPPING", "CREATE LANGUAGE" and "DROP LANGUAGE"), and many variants of each command, and everything behaves exactly as expected. :) -- Thom
The Windows buildfarm members don't seem too happy with the latest patch. regards, tom lane
On Fri, Jul 20, 2012 at 2:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > The Windows buildfarm members don't seem too happy with the latest > patch. I'm looking at this now, but am so far mystified. Something's obviously broken as regards how the trigger flags get set up, but if that were broken in a trivial way it would be broken everywhere, and it's not. Will keep searching... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Jul 20, 2012 at 1:22 PM, Thom Brown <thom@linux.com> wrote: > I've just run my own set of tests against these changes, which tests > every supported DDL command (with the exception of "CREATE USER > MAPPING", "ALTER USER MAPPING", "DROP USER MAPPING", "CREATE LANGUAGE" > and "DROP LANGUAGE"), and many variants of each command, and > everything behaves exactly as expected. :) Nice! But all of those commands you just mentioned ought to work, too. The documentation probably needs some updating on that point, come to think of it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Jul 20, 2012 at 3:51 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Fri, Jul 20, 2012 at 2:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> The Windows buildfarm members don't seem too happy with the latest >> patch. > > I'm looking at this now, but am so far mystified. Something's > obviously broken as regards how the trigger flags get set up, but if > that were broken in a trivial way it would be broken everywhere, and > it's not. Will keep searching... I think this is fixed now. Let me know if anyone sees evidence of remaining breakage. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Hi, I'm back to PostgreSQL development concerns after some distraction here. First, thanks for pushing the patch to commit! I've been reviewing your changes and here's a very small patch with some details I would have spelled out differently. See what you think, I mostly needed to edit some code to get back in shape :) Coming next, catch-up with things I've missed and extending the included support for event triggers in term of function parameters (rewritten command string, object kind, etc), and maybe PL support too. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
Attachment
On Mon, Aug 27, 2012 at 7:52 AM, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote: > I've been reviewing your changes and here's a very small patch with some > details I would have spelled out differently. See what you think, I > mostly needed to edit some code to get back in shape :) I guess I don't particularly like either of these changes. The first one is mostly harmless, but I don't really see why it's any better, and it does have the downside of traversing the string twice (once for strlen and a second time in str_toupper) instead of just once. It also makes a line wider than 80 columns, which is a bit ugly. In the second hunk, the point is that we never have to do CreateCommandTag() here at all unless either casserts are enabled or EventCacheLookup returns a non-empty list. That means that in a non-assert-enabled build, we get to skip that work altogether in the presumably-common case where there are no relevant event triggers. Your proposed change would avoid doing it twice when asserts are disabled, but the cost would be that we'd have to do it once when asserts were disabled even if no event triggers exist. I don't think that's a good trade-off. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas <robertmhaas@gmail.com> writes: > I guess I don't particularly like either of these changes. The first Fair enough. > one is mostly harmless, but I don't really see why it's any better, > and it does have the downside of traversing the string twice (once for > strlen and a second time in str_toupper) instead of just once. It > also makes a line wider than 80 columns, which is a bit ugly. In the It's much easier to read, I think. The line is 79 columns here given the project Emacs setup wrt tabs, see src/tools/editors/emacs.samples. > second hunk, the point is that we never have to do CreateCommandTag() > here at all unless either casserts are enabled or EventCacheLookup > returns a non-empty list. That means that in a non-assert-enabled > build, we get to skip that work altogether in the presumably-common > case where there are no relevant event triggers. Your proposed change > would avoid doing it twice when asserts are disabled, but the cost > would be that we'd have to do it once when asserts were disabled even > if no event triggers exist. I don't think that's a good trade-off. Well I needed more exercise before sending a patch then, I just missed that. Regards, -- Dimitri Fontaine http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support