Thread: Deparsing DDL command strings
Hi,

Working on the Event Trigger next patch series, one of the challenges to
address is deparsing the DDL commands so that the User Defined Function
used by the trigger definition has that information. I'm making good
progress on that; it's some amount of code but pretty straightforward.

The only roadblock for now is best summarized by the following comment in
src/backend/commands/tablecmds.c:

     * Now add any newly specified column default values and CHECK
     * constraints to the new relation.  These are passed to us in the
     * form of raw parsetrees; we need to transform them to executable
     * expression trees before they can be added.  The most convenient
     * way to do that is to apply the parser's transformExpr routine,
     * but transformExpr doesn't work unless we have a pre-existing
     * relation.  So, the transformation has to be postponed to this
     * final step of CREATE TABLE.

So I have a Node *parsetree containing some CHECK and DEFAULT raw
expressions to work with. Those can reference non-existing tables, either
to-be-created or already-dropped.

Should I work on transformExpr() so that it knows how to work with a
non-existing relation, or should I write a new transformRawExpr() that
knows how to handle this case?

Or do we want to limit the deparsing feature and not accept some CHECK
and DEFAULT expressions (though not being able to cope with T_A_Const is
a bummer)? (I don't mean to go that way myself, but the option has to be
mentioned.)

Regards,
-- 
Dimitri Fontaine                     http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support
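[To make the problem concrete, here is a minimal illustration of the case
the tablecmds.c comment describes; the table and column names are
invented for the example:]

    -- When this statement is parsed, "foo" does not exist yet, so the
    -- CHECK and DEFAULT raw expressions below cannot be run through
    -- transformExpr(), which needs a pre-existing relation; their
    -- transformation is postponed to the final step of CREATE TABLE.
    CREATE TABLE foo (
        id integer DEFAULT 0 CHECK (id >= 0),
        f2 text    DEFAULT 'none'
    );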
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> So I have a Node *parsetree containing some CHECK and DEFAULT raw
> expressions to work with. Those can reference non-existing tables,
> either to-be-created or already-dropped.

Why don't you just pass the original query string, instead of writing a
mass of maintenance-requiring new code to reproduce it?

This would require

(1) making sure the query string is still available where needed.
I think we are 99% of the way there but maybe not 100%.

(2) being able to identify the substring corresponding to the current
command, when we're processing a multi-command string.  The parser could
easily provide that, I think --- we've just never insisted that it do so
before.

			regards, tom lane
On Friday, October 05, 2012 04:03:03 PM Tom Lane wrote:
> Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
>> So I have a Node *parsetree containing some CHECK and DEFAULT raw
>> expressions to work with. Those can reference non-existing tables,
>> either to-be-created or already-dropped.
>
> Why don't you just pass the original query string, instead of writing
> a mass of maintenance-requiring new code to reproduce it?

It's not easy to know which tables are referenced in e.g. an ALTER TABLE
statement if the original statement didn't schema-qualify everything.
It's also not really possible to trigger cascading effects that way,
like the creation of a sequence from a serial column...

Greetings,

Andres
-- 
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
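[For readers following along, the serial case Andres mentions is the
documented expansion sketched below; replaying only the original command
string on another node hides these cascaded commands. The sequence name
follows the usual <table>_<column>_seq convention:]

    -- A serial column:
    CREATE TABLE foo (id serial);
    -- is roughly equivalent to:
    CREATE SEQUENCE foo_id_seq;
    CREATE TABLE foo (id integer NOT NULL DEFAULT nextval('foo_id_seq'));
    ALTER SEQUENCE foo_id_seq OWNED BY foo.id;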
Andres Freund <andres@2ndquadrant.com> writes:
> On Friday, October 05, 2012 04:03:03 PM Tom Lane wrote:
>> Why don't you just pass the original query string, instead of writing
>> a mass of maintenance-requiring new code to reproduce it?
>
> It's not easy to know which tables are referenced in e.g. an ALTER TABLE
> statement if the original statement didn't schema-qualify everything.

What he's talking about is deparsing the raw grammar output, which by
definition contains no more information than is in the query string.
Deparsing post-parse-analysis trees is a different problem (for which
code already exists, unlike the raw-tree case).

			regards, tom lane
On Friday, October 05, 2012 04:24:55 PM Tom Lane wrote:
> Andres Freund <andres@2ndquadrant.com> writes:
>> On Friday, October 05, 2012 04:03:03 PM Tom Lane wrote:
>>> Why don't you just pass the original query string, instead of writing
>>> a mass of maintenance-requiring new code to reproduce it?
>>
>> It's not easy to know which tables are referenced in e.g. an ALTER
>> TABLE statement if the original statement didn't schema-qualify
>> everything.
>
> What he's talking about is deparsing the raw grammar output, which by
> definition contains no more information than is in the query string.
> Deparsing post-parse-analysis trees is a different problem (for which
> code already exists, unlike the raw-tree case).

Sure. I am not saying Dimitri's approach is perfect. I am just listing
some of the reasons why I think just using the raw input string isn't
sufficient...

Andres
-- 
Andres Freund                      http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Why don't you just pass the original query string, instead of writing
> a mass of maintenance-requiring new code to reproduce it?

Do we have that original query string in all cases, including
EXECUTE-like SPI calls from any PL? What about commands that internally
build a parsetree to feed ProcessUtility() directly? Do we want to
refactor them all just now as a prerequisite?

Also, we need to normalize that command string. Tools needing to look at
it won't want to depend on random white spacing and other oddities.
Those tools could also use the Node *parsetree and be written only in C,
but then what about giving them a head start by having a parsetree
walker in our code base?

Then we want to qualify object names. Some type names have already been
taken care of by the parser here, apparently; relation names have not,
and we need to cope with non-existing relation names. My freshly grown,
limited understanding is that we currently only know how to produce a
"cooked" parse tree from the raw one if all referenced objects exist in
the catalogs, so that we postpone some "cooking" (transform*) until the
main object of a CREATE command is defined, right? Is that something we
want to revisit?

Another option would be to capture search_path and other parse-time
impacting GUCs, call that the query environment, and have a way to
serialize the environment, pass it in, and restore it either on the same
host or on another (replication is an important use case here).

Yet another option would be to output both the original query string and
something that's meant for easy machine parsing yet is not the internal
representation of the query, so that we're free to hack the parser at
will in between releases, even minor ones. Building that new
code-friendly document will require about the same amount of code as
spitting out normalized SQL, I believe.

Yet another option would be to go the "sax" way rather than the "dom"
one: instead of spitting out a new command string, have the user
register callbacks and only implement walking down the parsetree and
calling those. I'm not sure how much maintenance work we would save
here, and I'm not seeing any other reason to go that way.

Yet another option would be to only provide a hook and some space in the
EventTriggerData structure for extensions to register themselves and
provide whatever deparsing they need. But then we need to figure out a
way for the user defined function to use the resulting opaque data, from
any PL language, if only to be able to call some extension's API to
process it. Looks like a very messy way to punt the work outside of
core.

Regards,
-- 
Dimitri Fontaine                     http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support
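[A concrete illustration of the query-environment idea above, with an
invented schema name and assuming a schema "app" exists: the very same
command string creates a different object depending on search_path,
because CREATE puts new objects in the first existing schema on the
path.]

    SET search_path TO app, public;
    CREATE TABLE t (id int);    -- creates app.t
    SET search_path TO public;
    CREATE TABLE t (id int);    -- creates public.t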
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> Why don't you just pass the original query string, instead of writing
>> a mass of maintenance-requiring new code to reproduce it?
>
> Do we have that original query string in all cases, including
> EXECUTE-like SPI calls from any PL?

As I said, I believe we are pretty close to that if not there already.
There are a number of utility statements that *require* the source
string to be provided, because they do parse analysis, and since 8.4 or
so the original string has been required for that. So we certainly have
the string available at ProcessUtility. I've not looked at the event
trigger patch at all, so I don't know what amount of refactoring is
going to be required below that ... but the point I'm trying to make is
that it would be a one-and-done task. Adding a requirement to be able
to decompile raw parse trees will be a permanent drag on every type of
SQL feature addition.

> Also, we need to normalize that command string. Tools needing to look
> at it won't want to depend on random white spacing and other oddities.

Instead, they'll get to depend on the oddities of parse transformations
(ie, what's done in the raw grammar vs. what's done in parse_analyze),
which is something we whack around regularly. Besides which, you've
merely asserted this requirement without providing any evidence that
it's important at all, let alone important enough to justify the kind of
maintenance burden that you want to saddle us with.

> Those tools could also use the Node *parsetree and be written only in
> C, but then what about giving them a head start by having a parsetree
> walker in our code base?

Well, as far as a raw parsetree *walker* goes, there already is one in
nodeFuncs.c. It does not follow that we need something that tries to
reconstruct SQL from that. It's not clear to me that there is any
production-grade use-case for which reconstructed SQL is more useful
than examining the parse tree. Now, if you're talking about half-baked
code that gets only some cases right, then yeah, grepping reconstructed
SQL might serve. But I'm not excited about doing a lot of work to
support partial solutions.

> Then we want to qualify object names. Some type names have already been
> taken care of by the parser here, apparently; relation names have not,
> and we need to cope with non-existing relation names.

Which is exactly what you *won't* be able to do anything about when
looking at a raw parse tree. It's just a different presentation of the
query string.

> My freshly grown, limited understanding is that we currently only know
> how to produce a "cooked" parse tree from the raw one if all referenced
> objects exist in the catalogs,

Well, yeah. Anything else is magic, not code.

			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
> going to be required below that ... but the point I'm trying to make is
> that it would be a one-and-done task. Adding a requirement to be able
> to decompile raw parse trees will be a permanent drag on every type of
> SQL feature addition.

I'll show some examples of a very involved command (CREATE and ALTER
TABLE are the most complex we have, I think) and of some very simple
commands (DROP TABLE is one of the simplest), so that we can make up our
minds on that angle.

>> Then we want to qualify object names. Some type names have already
>> been taken care of by the parser here, apparently; relation names have
>> not, and we need to cope with non-existing relation names.
>
> Which is exactly what you *won't* be able to do anything about when
> looking at a raw parse tree. It's just a different presentation of the
> query string.

So, I'm currently adding the deparsing to the only existing event we
have, which is ddl_command_start. That's maybe not the best place to do
it; I now even wonder if we can do it there at all. Doing the same thing
at ddl_command_end would allow us to have all the information we need
and leave nothing to magic guesses: full schema qualification of all
objects involved, main object(s) OIDs available, all the jazz.

> Well, yeah. Anything else is magic, not code.

Well, prepending an object name with the first entry of the current
search_path as its schema is not that far a stretch when the object is
being created, as far as I see it. It's more reasonable, though, to
document that the rewritten no-ambiguities command string is only
available for ddl_command_end events.

Regards,
-- 
Dimitri Fontaine                     http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support
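[A sketch of why ddl_command_end makes qualification trivial: once the
object exists, its schema can be read from the catalogs instead of being
guessed from search_path. The table name "foo" here is invented:]

    -- At ddl_command_end the created table is in the catalogs, so its
    -- actual schema is a simple lookup rather than a search_path guess:
    SELECT n.nspname, c.relname
      FROM pg_class c
      JOIN pg_namespace n ON n.oid = c.relnamespace
     WHERE c.oid = 'foo'::regclass;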
On 10/5/12 11:15 AM, Tom Lane wrote:
>> Also, we need to normalize that command string. Tools needing to look
>> at it won't want to depend on random white spacing and other oddities.
>
> Instead, they'll get to depend on the oddities of parse transformations
> (ie, what's done in the raw grammar vs. what's done in parse_analyze),
> which is something we whack around regularly. Besides which, you've
> merely asserted this requirement without providing any evidence that
> it's important at all, let alone important enough to justify the kind
> of maintenance burden that you want to saddle us with.

I definitely want to be able to parse DDL commands to be able to either
enforce things or to drive other parts of the system based on what's
changing. Without the ability to capture (and parse) DDL commands I'm
stuck creating wrapper functions around anything I want to capture and
then trying to ensure that everyone uses the wrappers and not the raw
DDL commands.

Event triggers that just spit out raw SQL give me the first part of
this, but not the second part: I'm still stuck trying to parse raw SQL
on my own. Having normalized SQL to parse should make that a bit easier,
but ideally I'd like to be able to pull specific elements out of a
command. I'd want to be able to do things like:

    IF command is ALTER TABLE THEN
        FOR EACH subcommand
            IF subcommand IS DROP COLUMN THEN
                do something that needs to know what column is being
                dropped
            ELSE IF subcommand IS ADD COLUMN THEN
                do something that needs to know the definition of the
                column being added

I don't think every bit of that has to be dealt with by the event
trigger code itself. For example, if you're adding a column to a table
and the entries have already been made in the catalog, you could query
to get the details of the column definition if you were given an OID
into pg_attribute.

Having said all that, an event system that spits back the raw SQL would
certainly be better than nothing. But realize that people would still
need to do parsing on it (ie: replication solutions will need to know
what table just got ALTER'd).
-- 
Jim C. Nasby, Database Architect   jim@nasby.net
512.569.9461 (cell)                http://jim.nasby.net
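[The catalog lookup Jim describes is already possible today; a minimal
sketch, assuming the trigger has been handed the altered table's OID.
The table name is made up:]

    -- Fetch the definitions of the table's columns from the catalogs,
    -- e.g. right after an ADD COLUMN has been applied:
    SELECT attname, format_type(atttypid, atttypmod) AS coltype
      FROM pg_attribute
     WHERE attrelid = 'public.foo'::regclass  -- the OID handed over
       AND attnum > 0
       AND NOT attisdropped;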
Jim Nasby <jim@nasby.net> writes:
> I definitely want to be able to parse DDL commands to be able to
> either enforce things or to drive other parts of the system based on
> what's changing. Without the ability to capture (and parse) DDL
> commands I'm stuck creating wrapper functions around anything I want
> to capture and then trying to ensure that everyone uses the wrappers
> and not the raw DDL commands.

Are you mainly working on some auditing system?

> Event triggers that just spit out raw SQL give me the first part of
> this, but not the second part: I'm still stuck trying to parse raw SQL
> on my own. Having normalized SQL to parse should make that a bit
> easier, but ideally I'd like to be able to pull specific elements out
> of a command. I'd want to be able to do things like:

The current design for event triggers is to spit out several things:

 - command tag (support for this is already committed)
 - object id, can be null
 - schema name, can be null
 - object name
 - operation: either ALTER, CREATE or DROP, …
 - object type: TABLE, VIEW, FUNCTION, …
 - normalized command string

After some more thinking, it appears that in several cases you want to
have all that information filled in, and you don't want to care whether
that means your trigger needs to run at ddl_command_start or
ddl_command_end.

The proposal I want to make here is to introduce a generic event (or an
event alias) named ddl_command_trace that the system provides at the
right spot where you have the information. That's useful when you don't
intend to divert the execution of the DDL and need to know all about it.

For a DROP operation, ddl_command_trace would be ddl_command_start, and
for a CREATE operation, it would be ddl_command_end, so that the target
object (still|already) exists when the trigger is fired.

> IF command is ALTER TABLE THEN

That's called TG_TAG, see
http://www.postgresql.org/docs/devel/static/plpgsql-trigger.html#PLPGSQL-EVENT-TRIGGER

> FOR EACH subcommand
>     IF subcommand IS DROP COLUMN THEN
>         do something that needs to know what column is being dropped
>     ELSE IF subcommand IS ADD COLUMN THEN
>         do something that needs to know the definition of the column
>         being added

We decided not to publish any notion of subcommands at this stage.

> I don't think every bit of that has to be dealt with by the event
> trigger code itself. For example, if you're adding a column to a table
> and the entries have already been made in the catalog, you could query
> to get the details of the column definition if you were given an OID
> into pg_attribute.

It's easy enough to provide the OID of the newly created main command
target object; it's harder to provide, in a generic way, all the OIDs of
the objects you might be interested in, because each command has its own
set of them. DROP can target multiple objects, and they all are the main
target. ALTER targets only a single object, but can link to dependent
objects. CREATE an operator class or a cast and you're talking about a
bunch of operators and functions to tie together. It's not that easy.

> Having said all that, an event system that spits back the raw SQL
> would certainly be better than nothing. But realize that people would
> still need to do parsing on it (ie: replication solutions will need to
> know what table just got ALTER'd).

You would have most of what you're asking for. I think that looking into
the normalized string to get the information you need, when you already
know you're looking at an ALTER TABLE statement and you already have the
object references (schema, name, oid), is going to make things easier.

Regards,
-- 
Dimitri Fontaine                     http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support
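[To illustrate the TG_TAG dispatch Dimitri refers to above, here is a
minimal sketch of a PL/pgSQL event trigger; the function and trigger
names are made up, and only the documented TG_EVENT and TG_TAG variables
are used:]

    CREATE OR REPLACE FUNCTION trace_ddl() RETURNS event_trigger
    LANGUAGE plpgsql AS $$
    BEGIN
        -- TG_EVENT is e.g. 'ddl_command_start', TG_TAG e.g. 'ALTER TABLE'
        IF TG_TAG = 'ALTER TABLE' THEN
            RAISE NOTICE 'caught % at %', TG_TAG, TG_EVENT;
        END IF;
    END;
    $$;

    CREATE EVENT TRIGGER trace_alter_table
        ON ddl_command_start
        EXECUTE PROCEDURE trace_ddl();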
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> I'll show some examples of a very involved command (CREATE and ALTER
> TABLE are the most complex we have, I think) and of some very simple
> commands (DROP TABLE is one of the simplest), so that we can make up
> our minds on that angle.

So please find attached a demo patch showing what it takes to deparse
complex command strings, and here inline are some examples of why it's a
good idea to actually deparse them rather than hand out whatever the
user typed in:

    \dy
    List of event triggers
    -[ RECORD 1 ]--------------------------
    Name      | regress_event_trigger_trace
    Event     | ddl_command_trace
    Owner     | dim
    Enabled   | enabled
    Procedure | test_event_trigger
    Tags      |

    foo=# drop table foo;
    NOTICE:  test_event_trigger: ddl_command_start DROP TABLE
    NOTICE:  test_event_trigger: DROP, TABLE
    NOTICE:  test_event_trigger: DROP TABLE public.foo RESTRICT;
    DROP TABLE

    foo=# create table foo(id serial primary key,
                           f2 text default 'plop' check (f2 != ''));
    NOTICE:  test_event_trigger: ddl_command_end CREATE TABLE
    NOTICE:  test_event_trigger: CREATE, TABLE
    NOTICE:  test_event_trigger: CREATE TABLE public.foo (id integer
             PRIMARY KEY DEFAULT nextval('foo_id_seq'::regclass) NOT NULL,
             f2 text DEFAULT 'plop' CHECK ((f2 <> ''::text)),
             CHECK ((f2 <> ''::text)));
    CREATE TABLE

The user of that command string still has to know what to look for and
maybe should include a proper SQL parser, but at least it doesn't need
to do much guesswork about how the sequence attached to the serial
column will get named by the system and suchlike oddities.

The attached patch also includes support for the complete ALTER TABLE
command and some more (CREATE SEQUENCE, CREATE EXTENSION).

> Doing the same thing at ddl_command_end would allow us to have all the
> information we need and leave nothing to magic guesses: full schema
> qualification of all objects involved, main object(s) OIDs available,
> all the jazz.

That's what is happening now in the attached patch, also with a new
event called 'ddl_command_trace' which will map to either _start or _end
depending on the operation (we want _start when doing DROP TABLE, and we
want the operation to be complete before tracing it when talking about a
CREATE or an ALTER TABLE).

And here's the scope we're talking about, including new command types,
new information passed down to user triggers, and the rewrite support
itself, isolated away:

    git diff --stat postgres/master..
     src/backend/catalog/heap.c                  |    5 +-
     src/backend/commands/event_trigger.c        |  241 ++++-
     src/backend/tcop/utility.c                  |  187 ++--
     src/backend/utils/adt/Makefile              |    2 +-
     src/backend/utils/adt/ddl_rewrite.c         | 1415 +++++++++++++++++++++++
     src/backend/utils/adt/ruleutils.c           |    9 +-
     src/backend/utils/cache/evtcache.c          |    4 +
     src/include/catalog/heap.h                  |    4 +
     src/include/commands/event_trigger.h        |   43 +-
     src/include/utils/builtins.h                |   14 +
     src/include/utils/evtcache.h                |    4 +-
     src/pl/plpgsql/src/pl_comp.c                |   40 +
     src/pl/plpgsql/src/pl_exec.c                |   53 +-
     src/pl/plpgsql/src/plpgsql.h                |    5 +
     src/test/regress/expected/event_trigger.out |   40 +-
     src/test/regress/sql/event_trigger.sql      |   36 +-
     16 files changed, 1938 insertions(+), 164 deletions(-)

Regards,
-- 
Dimitri Fontaine                     http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support
On Fri, Oct 12, 2012 at 12:55 PM, Dimitri Fontaine
<dimitri@2ndquadrant.fr> wrote:
> The user of that command string still has to know what to look for and
> maybe should include a proper SQL parser, but at least it doesn't need
> to do much guesswork about how the sequence attached to the serial
> column will get named by the system and suchlike oddities.

IMHO, it should be our explicit goal for clients not to need to parse
any SQL at all. I think that the communication between the server and
event triggers should be accomplished using magic variables. If the data
is too complex to be structured that way, then I think we should use an
established serialization format such as XML or (my personal preference)
JSON, for which many parsers already exist. If we pass sanitized SQL,
then everyone who wants to do anything complicated will need a parser
for sanitized-SQL-as-generated-by-the-PostgreSQL-event-trigger-mechanism,
and I think that's likely to be strictly worse than using one of the
serialization methods that already exists and is well understood.

Consider a hypothetical new feature where a table can be located in
either Paris or Pittsburgh. We add two new keywords to the grammar:
PARIS and PITTSBURGH. The syntax is extended to CREATE { PARIS |
PITTSBURGH } {schemaname}.{tablename} etc. Now, if we use sanitized SQL
as a way of passing data to triggers, they all immediately break,
because the new keyword is in exactly the location where the table name
used to be. If we use magic variables or JSON or XML, we'll just add in
another magic variable, or another field in the JSON or XML object, and
well-written triggers will ignore it and keep working. Now, it may seem
like I've chosen this example somewhat unfairly, since most syntax
changes are a bit less obtrusive than that, but we did do something not
dissimilar to the above in 9.1, with UNLOGGED.

There's a careful line to be walked here, because in order for event
triggers to be useful, we're going to have to pass them information -
and the way we do that becomes part of the API, and might get broken by
future changes. That sucks, but there's nothing we can do to completely
prevent it except not have event triggers at all, and I want event
triggers, and so do many other people. What we can and I think must do
is minimize the difficulty of writing and maintaining event triggers,
and that means making the API as clean as possible. Like Tom, I'm very
skeptical that this is the right horse to bet on.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
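[To spell out the robustness argument: a purely hypothetical JSON
payload for Robert's PARIS/PITTSBURGH example might look like the
snippet below — none of these field names exist in any patch. An old
trigger parsing this simply ignores the unknown "location" field,
whereas a trigger scraping sanitized SQL would misparse the table name:]

    {
        "command":  "CREATE TABLE",
        "schema":   "myschema",
        "table":    "mytable",
        "location": "PITTSBURGH"
    }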
Robert Haas <robertmhaas@gmail.com> writes:
> IMHO, it should be our explicit goal for clients not to need to parse
> any SQL at all. I think that the communication between the server and
> event triggers should be accomplished using magic variables. If the

+1 on that. There's a but.

> data is too complex to be structured that way, then I think we should
> use an established serialization format such as XML or (my personal
> preference) JSON, for which many parsers already exist. If we pass
> sanitized SQL, then everyone who wants to do anything complicated will
> need a parser for
> sanitized-SQL-as-generated-by-the-PostgreSQL-event-trigger-mechanism,
> and I think that's likely to be strictly worse than using one of the
> serialization methods that already exists and is well understood.

That's not an easy task by any means: we're talking about inventing a
stable alternative to the parse tree format (nodeToString) so that we
can whack the parsenodes around as much as we need and still produce
something compatible for our users. Well, at least we've already heard
about more than one use case, e.g. exposing the parser for syntax
highlighting…

Now, if all you want to do is replay the exact same DDL on another
PostgreSQL instance, and if you happen to trust the master's server,
then the command string I'm currently spitting out is exactly what you
need. And having the more generalized representation of the parser data
would only mean that the trigger now has to rewrite the command string
itself.

> There's a careful line to be walked here, because in order for event
> triggers to be useful, we're going to have to pass them information -
> and the way we do that becomes part of the API, and might get broken
> by future changes. That sucks, but there's nothing we can do to

+1 here; again, we're on the same line, just having different use cases
in mind, it seems.

> completely prevent it except not have event triggers at all, and I
> want event triggers, and so do many other people. What we can and I
> think must do is minimize the difficulty of writing and maintaining
> event triggers, and that means making the API as clean as possible.
> Like Tom, I'm very skeptical that this is the right horse to bet on.

I would think that having one doesn't preclude having the other, and
while I hear Tom and you saying that it's a lot of maintenance work down
the road to have the command string, I'm highly skeptical of being able
to produce that external, stable (enough) parser format in a way that
imposes less code maintenance down the road.

Also, I'm thinking that publishing the normalized command string is
something that must be maintained in the core code, whereas the external
stable format might be done as a contrib extension, coded in C, working
with the Node *parsetree. Because it needs to output a compatible format
when applied to different major versions of PostgreSQL, I think it suits
quite well the model of C-coded extensions.

Regards,
-- 
Dimitri Fontaine                     http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support
On 10/9/12 2:57 AM, Dimitri Fontaine wrote:
> Jim Nasby <jim@nasby.net> writes:
>> I definitely want to be able to parse DDL commands to be able to
>> either enforce things or to drive other parts of the system based on
>> what's changing. Without the ability to capture (and parse) DDL
>> commands I'm stuck creating wrapper functions around anything I want
>> to capture and then trying to ensure that everyone uses the wrappers
>> and not the raw DDL commands.
>
> Are you mainly working on some auditing system?

Definitely not. We need to deal with questions like "If we rename a
table, what do we have to do in londiste to accommodate that?" Except
we're dealing with more than just londiste. We have internal code that
does things like track "seed tables", which are tables that we need to
dump data for when we dump schema. We have other systems that care about
the schema that's defined in a database, and changes that happen to that
schema.

>> Event triggers that just spit out raw SQL give me the first part of
>> this, but not the second part: I'm still stuck trying to parse raw SQL
>> on my own. Having normalized SQL to parse should make that a bit
>> easier, but ideally I'd like to be able to pull specific elements out
>> of a command. I'd want to be able to do things like:
>
> The current design for event triggers is to spit out several things:
>
>  - command tag (support for this is already committed)
>  - object id, can be null
>  - schema name, can be null
>  - object name
>  - operation: either ALTER, CREATE or DROP, …
>  - object type: TABLE, VIEW, FUNCTION, …
>  - normalized command string
>
> After some more thinking, it appears that in several cases you want to
> have all that information filled in, and you don't want to care whether
> that means your trigger needs to run at ddl_command_start or
> ddl_command_end.
>
> The proposal I want to make here is to introduce a generic event (or an
> event alias) named ddl_command_trace that the system provides at the
> right spot where you have the information. That's useful when you don't
> intend to divert the execution of the DDL and need to know all about
> it.
>
> For a DROP operation, ddl_command_trace would be ddl_command_start, and
> for a CREATE operation, it would be ddl_command_end, so that the target
> object (still|already) exists when the trigger is fired.

In some cases we may need to divert or reject DDL, but that's a
secondary concern.

>> Having said all that, an event system that spits back the raw SQL
>> would certainly be better than nothing. But realize that people would
>> still need to do parsing on it (ie: replication solutions will need to
>> know what table just got ALTER'd).
>
> You would have most of what you're asking for. I think that looking
> into the normalized string to get the information you need, when you
> already know you're looking at an ALTER TABLE statement and you already
> have the object references (schema, name, oid), is going to make things
> easier.

Possibly. We certainly have cases where we need to know what's happening
*inside* the DDL.
-- 
Jim C. Nasby, Database Architect   jim@nasby.net
512.569.9461 (cell)                http://jim.nasby.net
Jim Nasby <jim@nasby.net> writes:
> We need to deal with questions like "If we rename a table, what do we
> have to do in londiste to accommodate that?" Except we're dealing with
> more than just londiste. We have internal code that does things like
> track "seed tables", which are tables that we need to dump data for
> when we dump schema. We have other systems that care about the schema
> that's defined in a database, and changes that happen to that schema.

Interesting use case, I hope we're offering you good progress.

>> - command tag (support for this is already committed)
>> - object id, can be null
>> - schema name, can be null
>> - object name
>> - operation: either ALTER, CREATE or DROP, …
>> - object type: TABLE, VIEW, FUNCTION, …
>> - normalized command string
>>
>> The proposal I want to make here is to introduce a generic event (or
>> an event alias) named ddl_command_trace that the system provides at
>> the right spot where you have the information. That's useful when you
>> don't intend to divert the execution of the DDL and need to know all
>> about it.
>
> In some cases we may need to divert or reject DDL, but that's a
> secondary concern.

Reject is already in: just RAISE ERROR from the trigger code. Divert is
another sell entirely; we currently miss that capability. I'm interested
in it for some DDLs. Which commands do you want to divert, and at
exactly what point in their execution? (Think about privilege checks,
lock acquisition, etc.)

>> You would have most of what you're asking for. I think that looking
>> into the normalized string to get the information you need when you
>> already know you're looking at an ALTER TABLE statement and you
>> already have the object references (schema, name, oid) is going to
>> make things easier.
>
> Possibly. We certainly have cases where we need to know what's
> happening *inside* the DDL.

In those cases you would probably need to resort to coding the trigger
in C so that you can abuse the parsetree. At least the fact that you're
doing funny things with some commands is easy to get at when doing \dy
from a psql prompt, information that's completely lost when using the
standard process utility hook directly.

I don't see a way to pass down the parse tree in a format that's easy to
use in PL/pgSQL anyway, but maybe we'll get there at some point. I want
to say that having to resort to C in some complex cases is good enough
for a first version of the feature.

Regards,
-- 
Dimitri Fontaine                     06 63 07 10 78
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support
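[A minimal sketch of the rejection path Dimitri mentions — raising an
error from an event trigger aborts the DDL command. The function and
trigger names are made up:]

    CREATE OR REPLACE FUNCTION forbid_drops() RETURNS event_trigger
    LANGUAGE plpgsql AS $$
    BEGIN
        IF TG_TAG = 'DROP TABLE' THEN
            -- the error propagates and cancels the DROP
            RAISE EXCEPTION 'DROP TABLE is not allowed on this database';
        END IF;
    END;
    $$;

    CREATE EVENT TRIGGER no_table_drops
        ON ddl_command_start
        EXECUTE PROCEDURE forbid_drops();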
On 10/29/12 4:30 PM, Dimitri Fontaine wrote:
>> In some cases we may need to divert or reject DDL, but that's a
>> secondary concern.
>
> Reject is already in: just RAISE ERROR from the trigger code. Divert is
> another sell entirely; we currently miss that capability. I'm
> interested in it for some DDLs. Which commands do you want to divert,
> and at exactly what point in their execution? (Think about privilege
> checks, lock acquisition, etc.)

After further thought... I can't think of any case (right now) where
we'd need to divert. We only care about firing up secondary effects from
DDL.

>>> You would have most of what you're asking for. I think that looking
>>> into the normalized string to get the information you need when you
>>> already know you're looking at an ALTER TABLE statement and you
>>> already have the object references (schema, name, oid) is going to
>>> make things easier.
>>
>> Possibly. We certainly have cases where we need to know what's
>> happening *inside* the DDL.
>
> In those cases you would probably need to resort to coding the trigger
> in C so that you can abuse the parsetree. At least the fact that you're
> doing funny things with some commands is easy to get at when doing \dy
> from a psql prompt, information that's completely lost when using the
> standard process utility hook directly.
>
> I don't see a way to pass down the parse tree in a format that's easy
> to use in PL/pgSQL anyway, but maybe we'll get there at some point. I
> want to say that having to resort to C in some complex cases is good
> enough for a first version of the feature.

+1
-- 
Jim C. Nasby, Database Architect   jim@nasby.net
512.569.9461 (cell)                http://jim.nasby.net
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> The current design for event triggers is to spit out several things:
>
>  - command tag (support for this is already committed)
>  - object id, can be null
>  - schema name, can be null
>  - object name
>  - operation: either ALTER, CREATE or DROP, …
>  - object type: TABLE, VIEW, FUNCTION, …
>  - normalized command string
>
> After some more thinking, it appears that in several cases you want to
> have all that information filled in, and you don't want to care whether
> that means your trigger needs to run at ddl_command_start or
> ddl_command_end.
>
> The proposal I want to make here is to introduce a generic event (or an
> event alias) named ddl_command_trace that the system provides at the
> right spot where you have the information. That's useful when you don't
> intend to divert the execution of the DDL and need to know all about
> it.

Please find attached a patch that includes support for such a design.

> For a DROP operation, ddl_command_trace would be ddl_command_start, and
> for a CREATE operation, it would be ddl_command_end, so that the target
> object (still|already) exists when the trigger is fired.

Regards,
-- 
Dimitri Fontaine                     http://2ndQuadrant.fr
PostgreSQL : Expertise, Formation et Support