Thread: dealing with extension dependencies that aren't quite 'e'
Hi. I'm looking at an extension that creates some triggers (on user tables) dynamically (i.e., not during CREATE EXTENSION, but afterwards). The author has two problems with it: * «DROP EXTENSION ext» won't work without adding CASCADE, which is an (admittedly relatively minor) inconvenience to users. * More importantly, pg_dump will dump all those trigger definitions, which is inappropriate in this case because the extensionwill try to create them. As an experiment, I implemented "ALTER EXTENSION … ADD TRIGGER … ON …" (a few-line patch to gram.y) and taught pg_dump.c's getTriggers() to skip triggers with any 'e' dependency. That works, in the sense that DROP EXTENSION will drop the triggers (but the triggers can't be dropped on their own any more), and pg_dump will ignore them. I'm trying to find a more generally useful mechanism that addresses this problem and has a chance of being accepted upstream. Rather than overloading 'e', we could introduce a new dependency type that references an extension, but means that the dependent object should be dropped when the extension is, but it can also be dropped on its own, and pg_dump should ignore it. That would work for this case, and I can imagine it *may* be useful for other types of objects (e.g., sequences that depend on a seqam function provided by an extension, indexes that depend on index functions provided by an extension). But that leaves open the question of how exactly to record the dependency: ALTER EXTENSION … ADD … is the easiest way, but it doesn't feel right to introduce object-type-specific logic there to determine the dependency type to use. Besides, I'm not entirely comfortable with tying pg_dump behaviour so closely with the dependency, though there's some sort of precedent there with deptype = 'i'. In short, I can see some scope for improvement, but not clearly enough to make a concrete proposal. I'm hoping for advice or suggestions with a view towards submitting something to the next commitfest (2016-03). Thanks. -- Abhijit
Abhijit Menon-Sen <ams@2ndQuadrant.com> writes: > I'm looking at an extension that creates some triggers (on user tables) > dynamically (i.e., not during CREATE EXTENSION, but afterwards). The > author has two problems with it: > * «DROP EXTENSION ext» won't work without adding CASCADE, which is an > (admittedly relatively minor) inconvenience to users. I am not sure why that's a bad thing. > * More importantly, pg_dump will dump all those trigger definitions, > which is inappropriate in this case because the extension will try > to create them. Or that. Shouldn't pg_dump be expected to restore the same state that was there before? IOW, what is the argument that this doesn't just represent poorly-thought-through extension behavior? regards, tom lane
Abhijit Menon-Sen <ams@2ndQuadrant.com> writes:
> I'm looking at an extension that creates some triggers (on user tables)
> dynamically (i.e., not during CREATE EXTENSION, but afterwards). The
> author has two problems with it:
How do these triggers come to be?
> * «DROP EXTENSION ext» won't work without adding CASCADE, which is an
> (admittedly relatively minor) inconvenience to users.
I am not sure why that's a bad thing.
Agreed. The triggers the extension creates are not part of the extension itself and thus should not be removed even if the extension itself is removed.
> * More importantly, pg_dump will dump all those trigger definitions,
> which is inappropriate in this case because the extension will try
> to create them.
Or that. Shouldn't pg_dump be expected to restore the same state
that was there before? IOW, what is the argument that this doesn't
just represent poorly-thought-through extension behavior?
Also agreed - pending an answer to my question. Restoration involves recreating the state of the database without "executing" things. It is assumed that those things not directly created as part of executing "CREATE EXTENSION" are instead created by "executing" things located in the extension (e.g., functions) and thus direct re-creation has to occur since there is no mechanism to execute during restoration.
If there is some sort of catalog-driven user-space population going on the selection criteria should omit from selection any objects already affected.
This is a bunch of hand-waving, though. It would help to have a concrete use-case to discuss explicitly rather than espouse theory.
I am not familiar enough with the dependency and extension internals to comment on the merit of a new kind of dependency type behaving as described. It sounds like it would allow for a more accurate description of the internal dependencies of the database - which is good absent any kind of cost consideration.
David J.
I'm sorry, it wasn't clear from my earlier post that the triggers are dependent on a function provided by the extension. So when you do CREATE EXTENSION foo, it creates foo_somefunc() that RETURNS TRIGGER. Later, a trigger is created (somehow; in this case it is by some other function in the extension, but it could be by the user directly as well) that executes this function. But that's only a partial answer to the questions raised here, and I'll return to the drawing board and draw up a more complete explanation. Thanks for reading. -- Abhijit
Right, here's another try. The extension does trigger-based DML auditing. You install it using CREATE EXTENSION and then call one of its functions to enable auditing for a particular table. That function will create a customised trigger function based on the table's columns and a trigger that uses it: CREATE FUNCTION fn_audit_$table_name() RETURNS TRIGGER … CREATE TRIGGER … ON $table_name … EXECUTE fn_audit_$table_name; All that works fine (with pg_dump too). But if you drop the extension, the triggers stop working because the trigger function calls functions in the extension that are now gone. To mitigate this problem, the extension actually does: CREATE FUNCTION fn_audit… ALTER EXTENSION … ADD FUNCTION fn_audit… Now the trigger depends on the trigger function (as before), and the trigger function depends on the extension, so you can't inadvertently break the system by dropping the extension. But now pg_dump has a problem: it'll dump the trigger definitions, but not the trigger functions (because of their new 'e' dependency on the extension). So if you restore, you get the extension and the triggers, but the trigger functions are gone, and things break. *This* is the problem I'm trying to solve. Sorry, my earlier explanation was not clear, because I didn't fully understand the problem and what the extension was doing. One possible solution is to make the trigger function depend on the extension with a dependency type that isn't 'e', and therefore doesn't prevent pg_dump from including the function in its output. We would need some way to record the dependency, but no changes to pg_dump would be needed. Thoughts? -- Abhijit
On Jan 16, 2016, at 9:48 AM, Abhijit Menon-Sen <ams@2ndQuadrant.com> wrote: > Right, here's another try. > > The extension does trigger-based DML auditing. You install it using > CREATE EXTENSION and then call one of its functions to enable auditing > for a particular table. That function will create a customised trigger > function based on the table's columns and a trigger that uses it: > > CREATE FUNCTION fn_audit_$table_name() RETURNS TRIGGER … > CREATE TRIGGER … ON $table_name … EXECUTE fn_audit_$table_name; > > All that works fine (with pg_dump too). But if you drop the extension, > the triggers stop working because the trigger function calls functions > in the extension that are now gone. This seems like one manifestation of the more general problem that we don't have any real idea what objects a function definitiondepends on. ...Robert
At 2016-01-16 12:18:53 -0500, robertmhaas@gmail.com wrote: > > This seems like one manifestation of the more general problem that we > don't have any real idea what objects a function definition depends > on. Yes. I'm proposing to address a part of that problem by allowing extension dependencies to be explicitly declared for functions and objects created either by a user or dynamically by the extension itself—things that need the extension to function, but aren't a part of it. Put that way, ALTER EXTENSION doesn't sound like the way to do it. Maybe ALTER FUNCTION … DEPENDS ON EXTENSION …? I don't particularly care how the dependency is recorded, it's the dependency type that's important. I'll post a patch along those lines in a bit, just so we have something concrete to discuss; meanwhile, suggestions for another syntax to record the dependency are welcome. -- Abhijit
On 15 January 2016 at 14:26, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote:
-- * «DROP EXTENSION ext» won't work without adding CASCADE, which is an
(admittedly relatively minor) inconvenience to users.
* More importantly, pg_dump will dump all those trigger definitions,
which is inappropriate in this case because the extension will try
to create them.
I dealt with both of those in BDR (and pglogical), where we create TRUNCATE triggers to capture and replicate table truncation. The triggers are created either during node creation/join by a SQL function that calls into C code, or via an event trigger on CREATE TABLE for subsequent creations.
Creating them tgisinternal gets you both properties you need IIRC. Certainly it hides them from pg_dump, which was the requirement for me.
You can't easily create a tgisinternal trigger from SQL. You can hack it but it's ugly. It's simple enough to just create from C. See:
https://github.com/2ndQuadrant/bdr/blob/5567302d8112c5422efc80fc43d79cd347afe09b/bdr_executor.c#L393
Other people are doing it the hacky way already, see e.g.:
Rather than overloading 'e', we could introduce a new dependency type
that references an extension, but means that the dependent object should
be dropped when the extension is, but it can also be dropped on its own,
and pg_dump should ignore it.
So ... kind of like tgisinternal and 'i' dependencies, but without the restriction on the object being dropped directly?
Is there any particular reason the user needs to be able to drop the created trigger directly?
Is it reasonable to endorse the use of 'i' dependencies by extensions instead?
At 2016-01-18 11:08:19 +0530, ams@2ndQuadrant.com wrote: > > I'm proposing to address a part of that problem by allowing extension > dependencies to be explicitly declared for functions and objects > created either by a user or dynamically by the extension itself—things > that need the extension to function, but aren't a part of it. I didn't hear any further suggestions, so here's a patch for discussion. 1. This adds the 'x'/DEPENDENCY_AUTO_EXTENSION type. 2. This adds an 'ALTER FUNCTION … ADD DEPENDENT FUNCTION …' command. I split up the two because we may want the new dependency type without going to the trouble of adding a new command. Maybe extension authors should just insert an 'x' row into pg_depend directly? I was inclined to implement it using ALTER FUNCTION, but AlterFunction() is focused on altering the pg_proc entry for a function, so the new code didn't fit. Ultimately, ExecAlterExtensionContentsStmt() was the closest match, so that's where I did it. Comments welcome. I'll add this patch to the CF. -- Abhijit
Attachment
On 2/29/16 7:27 PM, Abhijit Menon-Sen wrote: > 1. This adds the 'x'/DEPENDENCY_AUTO_EXTENSION type. > 2. This adds an 'ALTER FUNCTION … ADD DEPENDENT FUNCTION …' command. > > I split up the two because we may want the new dependency type without > going to the trouble of adding a new command. Maybe extension authors > should just insert an 'x' row into pg_depend directly? I don't see why this would be limited to just functions. I could certainly see an extension that creates ease-of-use views that depend on the extension, or tables that have triggers that .... Am I missing something? > I was inclined to implement it using ALTER FUNCTION, but AlterFunction() > is focused on altering the pg_proc entry for a function, so the new code > didn't fit. Ultimately, ExecAlterExtensionContentsStmt() was the closest > match, so that's where I did it. Maybe the better way to handle this would be through ALTER EXTENSION? Given the audience for this, I think it'd probably be OK to just provide a function that does this, instead of DDL. I'd be concerned about asking users to do raw inserts though. pg_depends isn't the easiest thing to grok so I suspect there'd be a lot of problems with that, resulting in more raw DML to try and fix things, resulting in pg_depend getting completely screwed up... -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
At 2016-02-29 19:56:07 -0600, Jim.Nasby@BlueTreble.com wrote: > > I don't see why this would be limited to just functions. […] Am I > missing something? No, you are not missing anything. The specific problem I was trying to solve involved a function, so I sketched out a solution for functions. Once we have some consensus on whether that's an acceptable approach, I'll extend the patch in whatever way we agree seems appropriate. > Maybe the better way to handle this would be through ALTER EXTENSION? That's what this (second) patch does. > Given the audience for this, I think it'd probably be OK to just > provide a function that does this, instead of DDL. That seems like a promising idea. Can you suggest some possible usage? Thanks. -- Abhijit
On 2/29/16 10:33 PM, Abhijit Menon-Sen wrote: >> >Given the audience for this, I think it'd probably be OK to just >> >provide a function that does this, instead of DDL. > That seems like a promising idea. Can you suggest some possible usage? pg_extension_dependency( regextension, any ) where "any" would be all the other reg* types. That should be a lot less work to code up than messing with the grammar. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
Hi Abhijit, On 3/1/16 8:36 AM, Jim Nasby wrote: > On 2/29/16 10:33 PM, Abhijit Menon-Sen wrote: >>> >Given the audience for this, I think it'd probably be OK to just >>> >provide a function that does this, instead of DDL. >> That seems like a promising idea. Can you suggest some possible usage? > > pg_extension_dependency( regextension, any ) > > where "any" would be all the other reg* types. That should be a lot less > work to code up than messing with the grammar. So where are we on this now? Were you going to implement this as a function the way Jim suggested? Alexander, you are signed up to review. Any opinion on which course is best? Thanks, -- -David david@pgmasters.net
Abhijit Menon-Sen wrote: > At 2016-01-18 11:08:19 +0530, ams@2ndQuadrant.com wrote: > > > > I'm proposing to address a part of that problem by allowing extension > > dependencies to be explicitly declared for functions and objects > > created either by a user or dynamically by the extension itself—things > > that need the extension to function, but aren't a part of it. > > I didn't hear any further suggestions, so here's a patch for discussion. > > 1. This adds the 'x'/DEPENDENCY_AUTO_EXTENSION type. > 2. This adds an 'ALTER FUNCTION … ADD DEPENDENT FUNCTION …' command. > > I split up the two because we may want the new dependency type without > going to the trouble of adding a new command. Maybe extension authors > should just insert an 'x' row into pg_depend directly? Surely not. I don't think the first patch is acceptable standalone -- we need both things together. > I was inclined to implement it using ALTER FUNCTION, but AlterFunction() > is focused on altering the pg_proc entry for a function, so the new code > didn't fit. Ultimately, ExecAlterExtensionContentsStmt() was the closest > match, so that's where I did it. Right, but see AlterObjectOwner() or ExecAlterObjectSchemaStmt() whereby an arbitrary object has some property altered. I think that's a closer model for this. It's still not quite the same, because those two functions are still about modifying an object's catalog row rather than messing with things outside of its own catalog. But in reality, pg_depend handling is mixed up with other changes all over the place. Anyway I think this should be something along the lines of ALTER FUNCTION foo() DEPENDS ON EXTENSION bar; because it's really that object's behavior that you're modifying, not the extension's. Perhaps we could use the precedent that columns "own" sequences when they use them in their default value, which would lead to ALTER FUNCTION foo() OWNED BY EXTENSION bar; (which might cause a problem when you want to mark sequences as dependant on extensions, because we already have OWNED BY for them. But since EXTENSION is already a reserved word, maybe it's fine.) I wondered whether it's right to be focusing solely on extensions as being possible targets of such dependencies. It's true that extensions are the only "object containers" we have, but perhaps you want to mark a function as dependant on some view, type, or another function, for instance. Another argument to focus only on extensions is that pg_dump knows specifically about extensions for supressing objects to dump, and we don't have any other object type doing the same kind of thing; so perhaps extensions-only is fine. I'm undecided on this. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
At 2016-03-19 17:46:25 -0300, alvherre@2ndquadrant.com wrote: > > I don't think the first patch is acceptable standalone -- we need both > things together. OK. > But in reality, pg_depend handling is mixed up with other changes all > over the place. Yes, I noticed that. :-/ > Anyway I think this should be something along the lines of > ALTER FUNCTION foo() DEPENDS ON EXTENSION bar; OK. That's reasonable. > ALTER FUNCTION foo() OWNED BY EXTENSION bar; If the function is really OWNED BY EXTENSION, then the right way to declare it would seem to be ALTER EXTENSION … ADD FUNCTION. I prefer DEPENDS ON EXTENSION for this reason, there's no ambiguity about what we're declaring. > Another argument to focus only on extensions is that pg_dump knows > specifically about extensions for supressing objects to dump, and we > don't have any other object type doing the same kind of thing; so > perhaps extensions-only is fine. That's the argument that motivates this particular patch. I think if we have a DEPENDS ON EXTENSION framework, it (a) addresses the immediate need, and (b) gives us a straightforward way to add DEPENDS ON <x> in future when we find some need for it. I'll write up a patch for this. Thanks for the suggestions. -- Abhijit
On Mon, Mar 21, 2016 at 9:34 AM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote:
At 2016-03-19 17:46:25 -0300, alvherre@2ndquadrant.com wrote:
>
> I don't think the first patch is acceptable standalone -- we need both
> things together.
OK.
> But in reality, pg_depend handling is mixed up with other changes all
> over the place.
Yes, I noticed that. :-/
> Anyway I think this should be something along the lines of
> ALTER FUNCTION foo() DEPENDS ON EXTENSION bar;
OK. That's reasonable.
> ALTER FUNCTION foo() OWNED BY EXTENSION bar;
If the function is really OWNED BY EXTENSION, then the right way to
declare it would seem to be ALTER EXTENSION … ADD FUNCTION. I prefer
DEPENDS ON EXTENSION for this reason, there's no ambiguity about what
we're declaring.
I'm not sure why we want to make new dependency type by ALTER FUNCTION command, not ALTER EXTENSION?
Since, we already add 'e' dependencies by ALTER EXTENSION command, why it should be different for 'x' dependencies.
The argument could be that 'x' dependency type would be used for other objects not extensions. But this is much more general problem and it's unclear, that we would end up with this behaviour and this dependency type.
So, I would prefer this patch to extend ALTER EXTENSION command while it's aimed to this particular problem.
I even think we could extend existent grammar rule
| ALTER EXTENSION name add_drop FUNCTION function_with_argtypes
*************** AlterExtensionContentsStmt:
*** 3982,3987 ****
--- 3987,3993 ----
n->objtype = OBJECT_FUNCTION;
n->objname = $6->funcname;
n->objargs = $6->funcargs;
+ n->deptype = 'e';
$$ = (Node *)n;
}
instead of adding another
+ | ALTER EXTENSION name add_drop DEPENDENT FUNCTION function_with_argtypes
+ {
+ AlterExtensionContentsStmt *n = makeNode(AlterExtensionContentsStmt);
+ n->extname = $3;
+ n->action = $4;
+ n->objtype = OBJECT_FUNCTION;
+ n->objname = $7->funcname;
+ n->objargs = $7->funcargs;
+ n->deptype = 'x';
$$ = (Node *)n;
}
by introducing separate rule extension_dependency_type.
In the same way we could dependency type parameter to each ALTER EXTENSION grammar rule. Therefore, existent functionality would be extended in natural way with not large changes in the code.
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
At 2016-03-21 13:04:33 +0300, a.korotkov@postgrespro.ru wrote: > > I'm not sure why we want to make new dependency type by ALTER FUNCTION > command, not ALTER EXTENSION? It's a matter of semantics. It means something very different than what an 'e' dependency means. The extension doesn't own the function (and so pg_dump shouldn't ignore it), but the function depends on the extension (and so dropping the extension should drop it). > The argument could be that 'x' dependency type would be used for other > objects not extensions. I suppose this is possible, but yes, I agree with you that it's not clear how or why this would be useful. > So, I would prefer this patch to extend ALTER EXTENSION command while > it's aimed to this particular problem. OK, so that's what the patch does, and it's certainly the simplest approach for reasons discussed earlier (though perhaps not as clear semantically as the ALTER FUNCTION approach). But: > I even think we could extend existent grammar rule > > | ALTER EXTENSION name add_drop FUNCTION function_with_argtypes > > *************** AlterExtensionContentsStmt: > > *** 3982,3987 **** > > --- 3987,3993 ---- > > n->objtype = OBJECT_FUNCTION; > > n->objname = $6->funcname; > > n->objargs = $6->funcargs; > > + n->deptype = 'e'; > > $$ = (Node *)n; > > } How exactly do you propose to do this, i.e., what would the final command to declare an 'x' dependency look like? -- Abhijit
On Mon, Mar 21, 2016 at 2:19 PM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote:
At 2016-03-21 13:04:33 +0300, a.korotkov@postgrespro.ru wrote:
>
> I'm not sure why we want to make new dependency type by ALTER FUNCTION
> command, not ALTER EXTENSION?
It's a matter of semantics. It means something very different than what
an 'e' dependency means. The extension doesn't own the function (and so
pg_dump shouldn't ignore it), but the function depends on the extension
(and so dropping the extension should drop it).
> The argument could be that 'x' dependency type would be used for other
> objects not extensions.
I suppose this is possible, but yes, I agree with you that it's not
clear how or why this would be useful.
> So, I would prefer this patch to extend ALTER EXTENSION command while
> it's aimed to this particular problem.
OK, so that's what the patch does, and it's certainly the simplest
approach for reasons discussed earlier (though perhaps not as clear
semantically as the ALTER FUNCTION approach). But:
> I even think we could extend existent grammar rule
>
> | ALTER EXTENSION name add_drop FUNCTION function_with_argtypes
> > *************** AlterExtensionContentsStmt:
> > *** 3982,3987 ****
> > --- 3987,3993 ----
> > n->objtype = OBJECT_FUNCTION;
> > n->objname = $6->funcname;
> > n->objargs = $6->funcargs;
> > + n->deptype = 'e';
> > $$ = (Node *)n;
> > }
How exactly do you propose to do this, i.e., what would the final
command to declare an 'x' dependency look like?
I'm proposed something like this.
extension_dependency_type:
DEPENDENT { $$ = 'x'; }
| /*EMPTY*/ { $$ = 'e'; }
;
...
| ALTER EXTENSION name add_drop extension_dependency_type FUNCTION function_with_argtypes
{
AlterExtensionContentsStmt *n = makeNode(AlterExtensionContentsStmt);
n->extname = $3;
n->action = $4;
n->objtype = OBJECT_FUNCTION;
n->objname = $7->funcname;
n->objargs = $7->funcargs;
n->deptype = $5;
$$ = (Node *)n;
}
I didn't try it. Probably it causes a grammar conflicts. In this case I don't insist on it.
------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
At 2016-03-21 12:04:40 +0530, ams@2ndQuadrant.com wrote: > > I'll write up a patch for this. Thanks for the suggestions. Here's a patch to implement ALTER FUNCTION x DEPENDS ON EXTENSION y. The changes to functioncmds.c:AlterFunction were less intrusive than I had originally feared. -- Abhijit
Attachment
Abhijit Menon-Sen wrote: > + else if (strcmp(defel->defname, "extdepend") == 0) > + { > + if (*extdepend_item) > + goto duplicate_error; > + > + *extdepend_item = defel; > + } > else > return false; > I'm not sure I agree with this implementation. I mentioned ALTER .. SET SCHEMA and ALTER .. OWNER TO as examples because, since other object types were mentioned as possible targets for this command, then this should presumably object-type-agnostic, like those ALTER forms are. So IMO we shouldn't shoehorn this into AlterFunctionStmt but rather have its own node AlterObjectDepends or similar. The other point is that if we're doing it in ALTER FUNCTION which allows multiple subcommands in one go, why do we not allow to run this command for multiple extensions? After all, it's not completely stupid to think that one function could depend on multiple extensions, and so if you agree with that then it's not completely stupid that it should be possible to declare such in one command, i.e. ALTER FUNCTION .. DEPENDS ON EXTENSION one, two; or perhaps ALTER FUNCTION .. DEPENDS ON EXTENSION one, DEPENDS ON EXTENSION two; -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Mon, Mar 21, 2016 at 7:19 AM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote: > At 2016-03-21 13:04:33 +0300, a.korotkov@postgrespro.ru wrote: >> >> I'm not sure why we want to make new dependency type by ALTER FUNCTION >> command, not ALTER EXTENSION? > > It's a matter of semantics. It means something very different than what > an 'e' dependency means. The extension doesn't own the function (and so > pg_dump shouldn't ignore it), but the function depends on the extension > (and so dropping the extension should drop it). Yeah, I think this is definitely an ALTER FUNCTION kind of thing, not an ALTER EXTENSION kind of thing. I also think we should allow a function to depend on multiple extensions, as Alvaro mentions downthread. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
At 2016-03-21 15:43:09 -0400, robertmhaas@gmail.com wrote: > > I also think we should allow a function to depend on multiple > extensions, as Alvaro mentions downthread. I'm working on an updated patch, will post shortly. -- Abhijit
Hi. I implemented "ALTER FUNCTION … DEPENDS ON EXTENSION" using a new node (AlterObjectDependsStmt), and tried to add "ALTER TRIGGER … DEPENDS ON EXTENSION" (partly because I wanted to make sure the code could support multiple object types, partly because it's convenient in this particular use case). Then I ran into a problem, which I could use some help with. The main ExecAlterObjectDependsStmt() function is very simple: +ObjectAddress +ExecAlterObjectDependsStmt(AlterObjectDependsStmt *stmt) +{ + ObjectAddress address; + ObjectAddress extAddr; + + address = get_object_address(stmt->objectType, stmt->objname, stmt->objargs, + NULL, AccessExclusiveLock, false); + extAddr = get_object_address(OBJECT_EXTENSION, stmt->extname, NULL, + NULL, AccessExclusiveLock, false); + + recordDependencyOn(&address, &extAddr, DEPENDENCY_AUTO_EXTENSION); + + return address; +} All that's needed is to construct the node based on variants of the ALTER command: +AlterObjectDependsStmt: + ALTER FUNCTION function_with_argtypes DEPENDS ON EXTENSION name + { + AlterObjectDependsStmt *n = makeNode(AlterObjectDependsStmt); + n->objectType = OBJECT_FUNCTION; + n->objname = $3->funcname; + n->objargs = $3->funcargs; + n->extname = list_make1(makeString($7)); + $$ = (Node *)n; + } + | ALTER TRIGGER name ON any_name DEPENDS ON EXTENSION name + { + AlterObjectDependsStmt *n = makeNode(AlterObjectDependsStmt); + n->objectType = OBJECT_TRIGGER; + n->objname = list_make1(lappend($5, makeString($3))); + n->objargs = NIL; + n->extname = list_make1(makeString($9)); + $$ = (Node *)n; + } + ; Now, the first part of this works fine. But with the second part, I get a reduce/reduce conflict if I use any_name. Here's an excerpt from the verbose bison output: State 2920 1181 AlterObjectDependsStmt: ALTER TRIGGER name ON any_name DEPENDS . ON EXTENSION name ON shift, and go to state 3506 State 2921 1165 RenameStmt: ALTER TRIGGER name ON qualified_name RENAME . TO name TO shift, and go to state 3507 State 2922 829 attrs: '.' . attr_name 2006 indirection_el: '.' . attr_name 2007 | '.' . '*' … attr_name go to state 3508 ColLabel go to state 1937 unreserved_keyword go to state 1938 col_name_keyword go to state 1939 type_func_name_keyword go to state 1940 reserved_keyword go to state 1941 State 3506 1181 AlterObjectDependsStmt: ALTER TRIGGER name ON any_name DEPENDS ON . EXTENSION name EXTENSION shift, and go to state 4000 State 3507 1165 RenameStmt: ALTER TRIGGER name ON qualified_name RENAME TO . name … name go to state 4001 ColId go to state 543 unreserved_keyword go to state 544 col_name_keyword go to state 545 State 3508 829 attrs: '.' attr_name . 2006 indirection_el: '.' attr_name . RENAME reduce using rule 2006 (indirection_el) '[' reduce using rule 2006 (indirection_el) '.' reduce using rule 829 (attrs) '.' [reduce using rule 2006 (indirection_el)] $default reduce using rule 829 (attrs) I can see that the problem is that it can reduce '.' in two ways, but I'm afraid I don't remember enough about yacc to know how to fix it. If anyone has suggestions, I would be grateful. Meanwhile, I tried using qualified_name instead of any_name, which does work without conflicts, but I end up with a RangeVar instead of the sort of List* that I can feed to get_object_address. I could change ExecAlterObjectDependsStmt() to check if stmt->relation is set, and if so, convert the RangeVar back to a List* in the right format before passing it to get_object_address. That would work fine, but it doesn't seem like a good solution. I could write some get_object_address_rv() wrapper that does essentially the same conversion, but that's only slightly better. Are there any other options? (Apart from the above, the rest of the patch is really just boilerplate for the new node type, but I've attached it anyway for reference.) -- Abhijit
Attachment
On Wed, Mar 23, 2016 at 1:00 PM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote: > Now, the first part of this works fine. But with the second part, I get > a reduce/reduce conflict if I use any_name. Here's an excerpt from the > verbose bison output: > > State 2920 > > 1181 AlterObjectDependsStmt: ALTER TRIGGER name ON any_name DEPENDS . ON EXTENSION name > > ON shift, and go to state 3506 > > > State 2921 > > 1165 RenameStmt: ALTER TRIGGER name ON qualified_name RENAME . TO name > > TO shift, and go to state 3507 Yeah, that ain't gonna work. If the existing ALTER TRIGGER production uses "name ON qualified_name", switching to "name ON any_name" for the new production is not going to work. You're going to have to make it "name ON qualified_name" and deal with the fallout of that some other way. > I could change ExecAlterObjectDependsStmt() to check if stmt->relation > is set, and if so, convert the RangeVar back to a List* in the right > format before passing it to get_object_address. That would work fine, > but it doesn't seem like a good solution. > > I could write some get_object_address_rv() wrapper that does essentially > the same conversion, but that's only slightly better. I might have a view on which of these alternatives is for the best at some point in time, but I do not have one now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Robert Haas wrote: > On Wed, Mar 23, 2016 at 1:00 PM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote: > > Now, the first part of this works fine. But with the second part, I get > > a reduce/reduce conflict if I use any_name. Here's an excerpt from the > > verbose bison output: > > > > State 2920 > > > > 1181 AlterObjectDependsStmt: ALTER TRIGGER name ON any_name DEPENDS . ON EXTENSION name > > > > ON shift, and go to state 3506 > > > > > > State 2921 > > > > 1165 RenameStmt: ALTER TRIGGER name ON qualified_name RENAME . TO name > > > > TO shift, and go to state 3507 > > Yeah, that ain't gonna work. If the existing ALTER TRIGGER production > uses "name ON qualified_name", switching to "name ON any_name" for the > new production is not going to work. You're going to have to make it > "name ON qualified_name" and deal with the fallout of that some other > way. I helped Abhijit study this problem yesterday. I found it a bit annoying that RenameStmt uses RangeVars here (qualified_name) instead of plain lists like the other productions that use generic objects do. I thought one idea to get from under this problem is to change these productions into using lists too (any_name), but the problem with that is that qualified_name allows inheritance markers (a * and the ONLY keyword), which any_name doesn't. So it'd probably be okay for some cases, but others are definitely going to need the inheritance markers in some other way, which would be annoying. The other problem is that the downstream code uses the RangeVar lookup callback mechanism to lookup objects and perform permissions checking before locking. I imagine the only way to make that work with lists-in-parser would be to turn the lists into rangevars after the parser is done ... but that doesn't sound any better. In other words I think the conclusion here is that we must use qualified_name in the new production rather than switching the old production to any_name. > > I could change ExecAlterObjectDependsStmt() to check if stmt->relation > > is set, and if so, convert the RangeVar back to a List* in the right > > format before passing it to get_object_address. That would work fine, > > but it doesn't seem like a good solution. > > > > I could write some get_object_address_rv() wrapper that does essentially > > the same conversion, but that's only slightly better. > > I might have a view on which of these alternatives is for the best at > some point in time, but I do not have one now. I think I would like to see code implement both alternatives to see which one is least ugly. Maybe a third idea will manifest itself upon seeing those. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
At 2016-03-24 12:31:16 -0300, alvherre@2ndquadrant.com wrote: > > In other words I think the conclusion here is that we must use > qualified_name in the new production rather than switching the old > production to any_name. Makes sense. > I think I would like to see code implement both alternatives to see > which one is least ugly. Maybe a third idea will manifest itself upon > seeing those. Here's the first one. ExecAlterObjectDependsStmt() looks like this: +ObjectAddress +ExecAlterObjectDependsStmt(AlterObjectDependsStmt *stmt) +{ + ObjectAddress address; + ObjectAddress extAddr; + Relation rel = NULL; + + /* + * If the parser handed us a RangeVar, we add the relation's name to + * stmt->objname so that we can pass it to get_object_address(). + */ + if (stmt->relation) + { + stmt->objname = lcons(makeString(stmt->relation->relname), stmt->objname); + if (stmt->relation->schemaname) + stmt->objname = lcons(makeString(stmt->relation->schemaname), stmt->objname); + if (stmt->relation->catalogname) + stmt->objname = lcons(makeString(stmt->relation->catalogname), stmt->objname); + } + + address = get_object_address(stmt->objectType, stmt->objname, stmt->objargs, + &rel, AccessExclusiveLock, false); + + if (rel) + heap_close(rel, NoLock); + + extAddr = get_object_address(OBJECT_EXTENSION, stmt->extname, NULL, + &rel, AccessExclusiveLock, false); + + recordDependencyOn(&address, &extAddr, DEPENDENCY_AUTO_EXTENSION); + + return address; +} (This works fine for both functions and triggers, I tested it.) Complete patch attached for reference. I'll post the get_object_address_rv() variant tomorrow, but comments are welcome in the meantime. -- Abhijit
Attachment
At 2016-03-24 22:48:51 +0530, ams@2ndQuadrant.com wrote: > > > I think I would like to see code implement both alternatives to see > > which one is least ugly. Maybe a third idea will manifest itself > > upon seeing those. > > Here's the first one. ExecAlterObjectDependsStmt() looks like this: Here's the second one, which is only slightly different from the first. ExecAlterObjectDependsStmt() now looks like this: +ObjectAddress +ExecAlterObjectDependsStmt(AlterObjectDependsStmt *stmt) +{ + ObjectAddress address; + ObjectAddress extAddr; + Relation rel = NULL; + + address = get_object_address_rv(stmt->objectType, stmt->relation, stmt->objname, + stmt->objargs, &rel, AccessExclusiveLock, false); + + if (rel) + heap_close(rel, NoLock); + + extAddr = get_object_address(OBJECT_EXTENSION, stmt->extname, NULL, + &rel, AccessExclusiveLock, false); + + recordDependencyOn(&address, &extAddr, DEPENDENCY_AUTO_EXTENSION); + + return address; +} And the new get_object_address_rv() looks like this: +ObjectAddress +get_object_address_rv(ObjectType objtype, RangeVar *rel, List *objname, + List *objargs, Relation *relp, LOCKMODE lockmode, + bool missing_ok) +{ + if (rel) + { + objname = lcons(makeString(rel->relname), objname); + if (rel->schemaname) + objname = lcons(makeString(rel->schemaname), objname); + if (rel->catalogname) + objname = lcons(makeString(rel->catalogname), objname); + } + + return get_object_address(objtype, objname, objargs, + relp, lockmode, missing_ok); +} Complete patch attached for reference, as before. (I know I haven't documented the function. I will go through the code to see if there are any other potential callers, but I wanted to share what I had already.) -- Abhijit
Attachment
Hi Abhijit, On 3/25/16 1:57 PM, Abhijit Menon-Sen wrote: > Complete patch attached for reference, as before. (I know I haven't > documented the function. I will go through the code to see if there are > any other potential callers, but I wanted to share what I had already.) I'm not entirely sure whether this patch should be marked "ready for review" or not. Either way it looks like you need to post a patch with more documentation - do you know when you'll have that ready? Thanks, -- -David david@pgmasters.net
At 2016-03-29 10:15:51 -0400, david@pgmasters.net wrote: > > Either way it looks like you need to post a patch with more > documentation - do you know when you'll have that ready? Here it is. (I was actually looking for other potential callers, but I couldn't find any. There are some places that take a RangeVar and make a list from it, but they are creating new nodes, and don't quite fit. So the only change in this patch is to add a comment above the get_object_address_rv function.) Álvaro, do you like this one any better? -- Abhijit
Attachment
Abhijit Menon-Sen wrote: > At 2016-03-29 10:15:51 -0400, david@pgmasters.net wrote: > > > > Either way it looks like you need to post a patch with more > > documentation - do you know when you'll have that ready? > > Here it is. > > (I was actually looking for other potential callers, but I couldn't find > any. There are some places that take a RangeVar and make a list from it, > but they are creating new nodes, and don't quite fit. So the only change > in this patch is to add a comment above the get_object_address_rv > function.) > > Álvaro, do you like this one any better? Well, yes and no. It's certainly much cleaner than the other approach IMO. But this patch makes me consider the following question: could I use this to implement ExecRenameStmt, instead of the current code? It's not a trivial question, because the current rename code goes through RangeVarGetRelidExtended with custom callbacks, to ensure that they have the correct object before locking. get_object_address also has some protections against this, but I'm not really clear on whether they offer the same guarantees. If they do, we could replace the rangevar lookup with the new get_object_address_rv and the end result would probably turn out simpler. At this point I hate myself for introducing the SQL-accessible code for get_object_address and friends; we could change the representation of what comes out of the parser, but that functionality would be broken if we do it now. I think it's okay to change it at some point in the future, given sufficient reason, but I'm not sure that this patch is that. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Abhijit Menon-Sen wrote: > At 2016-03-29 10:15:51 -0400, david@pgmasters.net wrote: > > > > Either way it looks like you need to post a patch with more > > documentation - do you know when you'll have that ready? > > Here it is. > > (I was actually looking for other potential callers, but I couldn't find > any. There are some places that take a RangeVar and make a list from it, > but they are creating new nodes, and don't quite fit. So the only change > in this patch is to add a comment above the get_object_address_rv > function.) I gave this another look. To test whether the grammar is good, I tried a few additional object cases. They all seem to work fine, provided that we use the same production for the object name as in the corresponding ALTER <foo> case for the object. The attached is simply an extension with those new grammar rules -- I didn't go beyond ensuring the new productions didn't cause any conflicts. (I also removed the new includes in alter.c, which are no longer necessary AFAICS.) At this point I think we're missing user-level docs for the additional clause in each ALTER command. I think it's a fiddly business, because each individual ALTER page is going to need to be touched for the new clause, but that can't be avoided. Do you have any ideas on how this would appear in regression tests? -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
At 2016-04-04 18:55:03 -0300, alvherre@2ndquadrant.com wrote: > > At this point I think we're missing user-level docs for the additional > clause in each ALTER command. Thanks for having a look. Now that you're happy with the grammar, I'll write the remaining docs and resubmit the patch later today. > Do you have any ideas on how this would appear in regression tests? No specific ideas. At a high level, I think we should install an empty extension, create one of each kind of object, alter them to depend on the extension, check that pg_depend has the right 'x' rows, then drop the extension and make sure the objects have gone away. Does that sound reasonable? Suggestions welcome. -- Abhijit
Hi. Here's the updated patch. It fixes a couple of small problems, and includes documentation and tests (which I placed in a new file in src/test/modules/test_extensions, on Petr's advice). I wanted to post this before I went on to attempt any more grammar cleanups. Please let me know if there's anything else you'd like to see here. Thanks again for your review and suggestions. -- Abhijit
Attachment
Álvaro: I did document and test the extra types you added, but now that I think about it a bit more, it's hard to argue that it's useful to have a table, for example, depend on an extension. There's really nothing about a table that "doesn't work without" an extension. -- Abhijit
At 2016-04-05 12:33:56 +0530, ams@2ndQuadrant.com wrote: > > Álvaro: I did document and test the extra types you added, but now > that I think about it a bit more, it's hard to argue that it's useful > to have a table, for example, depend on an extension. There's really > nothing about a table that "doesn't work without" an extension. I think it makes sense to implement this for triggers and functions. It may also be useful for indexes and materialised views, which can refer to functions in an extension (and in future, sequences as well). It's certainly good to know the grammar would work if we wanted to add other object types in future, but I think we should leave it at that. Thoughts? -- Abhijit
Abhijit Menon-Sen wrote: > At 2016-04-05 12:33:56 +0530, ams@2ndQuadrant.com wrote: > > > > Álvaro: I did document and test the extra types you added, but now > > that I think about it a bit more, it's hard to argue that it's useful > > to have a table, for example, depend on an extension. There's really > > nothing about a table that "doesn't work without" an extension. > > I think it makes sense to implement this for triggers and functions. It > may also be useful for indexes and materialised views, which can refer > to functions in an extension (and in future, sequences as well). > > It's certainly good to know the grammar would work if we wanted to add > other object types in future, but I think we should leave it at that. Yes, agreed. What I implemented weren't cases that were supposed to be useful to users, only those for which I thought it was good to test bison with. Sorry I wasn't clear about this. Feel free the strip out (some of?) them, if they aren't useful. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
OK, thanks for the clarification. Here's the earlier patch, but with the relevant added docs and tests retained. -- Abhijit
Attachment
Abhijit Menon-Sen wrote: > OK, thanks for the clarification. Here's the earlier patch, but with > the relevant added docs and tests retained. I'd like to add indexes and materialized views to the set of objects covered (functions and triggers). I'm already doing that, so no need to resubmit; it should be a pretty easy addition. I've done a few minor additional changes too. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Alvaro Herrera wrote: > Abhijit Menon-Sen wrote: > > OK, thanks for the clarification. Here's the earlier patch, but with > > the relevant added docs and tests retained. > > I'd like to add indexes and materialized views to the set of objects > covered (functions and triggers). I'm already doing that, so no need to > resubmit; it should be a pretty easy addition. I've done a few minor > additional changes too. ... and pushed. I changed the regression test a bit more, so please recheck. -- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
At 2016-04-05 18:45:58 -0300, alvherre@2ndquadrant.com wrote: > > I changed the regression test a bit more, so please recheck. Looks good, thank you. -- Abhijit