Thread: search_path vs extensions
Hi, Preliminary note: I'm using the term "extension" as if it's what we already agree to call them, feel free to ignore thisand use whatever term you see fit. We'll have the naming issue tackled, please not now though. Following-up to discussions we had at the Developer Meeting and subsequent pub events, I'd like us to agree upon the relations of extensions and search_path. We basically have to choose one of those: Proposal: do nothing What's good about it: it's already there, folks! What's not good about it: Users are alone on deciding where to put what, and the system won't help them: either public isa complete mess, or they have to manually care about search_path for their extensions and their own application needs.Installations where DBA and application folks are separate teams will suffer, ones where the application is heavilyusing schemas will suffer too. Proposal: pg_extension, a new dedicated system schema for extensions Good: It's easy to see SQL objects (\df) of extensions (think contribs) you installed, and as extension developpers are requiredto use it, you don't have to care about it any more. As you have only one namespace for everyone, the collisions are detected early. Not good: As you have only one namespace for everyone, collisions prevent users from installing several extensions usingthe same SQL object name, so we'd need a way for extension authors to share a catalog of free names, like internallywe do for systems OIDs in the bootstrap, IIUC. But in a distributed fashion. We would have to add ways for the user to see which extension which object belongs to, so you'd have extension | schema| object_name columns in all \dX things, e.g. Proposal: allow user schema to behave the same as pg_catalog Good: Tell the system your schema is implicit and be done with it, object searching won't need users to manage search_pathexplicitly. Not good: Breaking existing application code by adding an implicit schema in an existing database is damn too easy. And howto choose if the implicit schemas are to be searched in before or after the search_path? Proposal: Separate search_path into components: pre_search_path, search_path, post_search_path Good: This allows to easily separate who changes what: typically DBAs will edit pre and post search_path components whileapplication will care about search_path the same way as now. Not good: 2 new GUCs (but no new semantics, and defaults to empty) My vote is to go with the pre/post search_path components proposal as it's the one allowing the more flexibility, and we tend to value this a lot around here. Regards, -- dim
On May 25, 2009, at 2:16 AM, Dimitri Fontaine wrote: > Proposal: pg_extension, a new dedicated system schema for extensions > Good: > It's easy to see SQL objects (\df) of extensions (think contribs) you > installed, and as extension developpers are required to use it, you > don't have to care about it any more. > > As you have only one namespace for everyone, the collisions are > detected early. > Not good: > As you have only one namespace for everyone, collisions prevent users > from installing several extensions using the same SQL object name, so > we'd need a way for extension authors to share a catalog of free > names, like internally we do for systems OIDs in the bootstrap, > IIUC. But in a distributed fashion. > > We would have to add ways for the user to see which extension which > object belongs to, so you'd have extension | schema | object_name > columns in all \dX things, e.g. I like this, although I'd want to be able, as a user, to override that default and tell an extension to install in some other schema. That would allow me to immediately overcome conflicts, and to organize my extensions if I want, rather than throw them all in one place. > Proposal: Separate search_path into components: pre_search_path, > search_path, post_search_path > Good: > This allows to easily separate who changes what: typically DBAs will > edit pre and post search_path components while application will care > about search_path the same way as now. > Not good: > 2 new GUCs (but no new semantics, and defaults to empty) I don't follow this at all. How to the three components effect behavior? And what does this mean for where extensions are installed in schemas? Best, David
Hi, "David E. Wheeler" <david@kineticode.com> writes: > On May 25, 2009, at 2:16 AM, Dimitri Fontaine wrote: > >> Proposal: pg_extension, a new dedicated system schema for extensions > > I like this, although I'd want to be able, as a user, to override that > default and tell an extension to install in some other schema. That would > allow me to immediately overcome conflicts, and to organize my extensions > if I want, rather than throw them all in one place. The moment you're adding specific schemas where to put extensions into, you have to adapt your search_path. Some applications already have to manage search_path for their own needs, so we're trying to avoid having those people to care about extensions schemas and application schema at the same time. It could even not be the same people caring about those search_path parts. >> Proposal: Separate search_path into components: pre_search_path, >> search_path, post_search_path > > I don't follow this at all. How to the three components effect behavior? And > what does this mean for where extensions are installed in schemas? This proposal tries to solve previous one limitations. It's very good in the typical case when you want each extension to be installed in one (or more) schemas but don't want to have the application to care about it. Then you add your extensions schemas into pre_search_path and application schemas into search_path, so that the application doesn't have to manage pre_search_path. Now it could be that your application is historically using the same function names as some extension you're now adding to the server, and you want to control which function is called when not schema qualified. So you have the post_search_path to play with too. The idea being that application developpers will maintain search_path for the application schemas (and this search_path can vary depending on the application role which connects to the database, of course), and the DBA team will make extensions available transparently to the application by adding the extension's schemas in either pre_search_path or post_search_path. I hope I've added clarity to the point, rather than only some extra verbosity... :) Regards, -- dim
On May 27, 2009, at 1:50 AM, Dimitri Fontaine wrote: > The moment you're adding specific schemas where to put extensions > into, > you have to adapt your search_path. Some applications already have to > manage search_path for their own needs, so we're trying to avoid > having > those people to care about extensions schemas and application schema > at > the same time. That doesn't seem like much of a problem to me. I already do this for extensions. I agree that what you suggest should be the default, but I should be able to optionally install extensions in whatever schema I deem appropriate, especially if I want to avoid conflicts. > This proposal tries to solve previous one limitations. It's very > good in > the typical case when you want each extension to be installed in one > (or > more) schemas but don't want to have the application to care about it. > Then you add your extensions schemas into pre_search_path and > application schemas into search_path, so that the application doesn't > have to manage pre_search_path. So are pre_search_path and search_path and post_search_path basically just concatenated into that order? That doesn't seem to buy you much. > Now it could be that your application is historically using the same > function names as some extension you're now adding to the server, and > you want to control which function is called when not schema > qualified. So you have the post_search_path to play with too. It seems to me you'd just schema-qualify in this case. I mean, that's kind of the point of schemas. > The idea being that application developpers will maintain search_path > for the application schemas (and this search_path can vary depending > on > the application role which connects to the database, of course), and > the > DBA team will make extensions available transparently to the > application > by adding the extension's schemas in either pre_search_path or > post_search_path. I think more useful would be a way to append or prepend schemas to the search path within a given context (in a transaction or a connection). That way, instead of doing stuff like this: BEGIN; SET search_path = foo,bar,public; -- ... COMMIT; RESET search_path; …which suffers from an inability to easily modify an existing path (yes, I know I can look it up and parse it, but please), I could just do something like this: BEGIN; prepend_search_path('foo,bar'); COMMIT; And then it would be reverted at the end of the transaction. Or it could be for the duration of a connection; that probably makes more sense. > I hope I've added clarity to the point, rather than only some extra > verbosity... :) Yes, but it just seems like unnecessary complexity to me. We don't want to learn the lessons of Java's CLASSPATH by making things *more* complicated. Best, David
>>>>> "David" == "David E Wheeler" <david@kineticode.com> writes: >> The moment you're adding specific schemas where to put extensions>> into, you have to adapt your search_path. Some applications>>already have to manage search_path for their own needs, so we're>> trying to avoid having those people to careabout extensions>> schemas and application schema at the same time. David> That doesn't seem like much of a problem to me. Unfortunately, the fact that something doesn't seem like much of a problem to you doesn't actually make it less of a problem. Splitting up search_path is something I've been thinking about for a while (and threw out on IRC as a suggestion, which is where Dimitri got it); it was based on actual experience running an app that set the search path in the connection parameters in order to select which of several different schemas to use for part (not all) of the data. When setting search_path this way, there is no way to set only part of it; the client-supplied value overrides everything. Obviously there are other possible solutions, but pretending there isn't a problem will get nowhere. (Setting the search path using a function or sql statement _after_ connecting was not an option; it would have confused the connection persistance layer, which needed different parameters to tell the connections apart.) -- Andrew (irc:RhodiumToad)
Andrew Gierth wrote: > Splitting up search_path is something I've been thinking about for a > while (and threw out on IRC as a suggestion, which is where Dimitri > got it); it was based on actual experience running an app that set the > search path in the connection parameters in order to select which of > several different schemas to use for part (not all) of the data. When > setting search_path this way, there is no way to set only part of it; > the client-supplied value overrides everything. > > Obviously there are other possible solutions, but pretending there > isn't a problem will get nowhere. > > (Setting the search path using a function or sql statement _after_ > connecting was not an option; it would have confused the connection > persistance layer, which needed different parameters to tell the > connections apart.) > Another way of handling this might be to provide for prepending or appending to the search path (or even for removing items from it). examples - something like: alter database foo set search_path = '+bar, baz'; -- append alter database foo set search_path = 'bar, baz+'; -- prepend cheers andrew
Andrew Gierth <andrew@tao11.riddles.org.uk> writes: > Splitting up search_path is something I've been thinking about for a > while (and threw out on IRC as a suggestion, which is where Dimitri > got it); it was based on actual experience running an app that set the > search path in the connection parameters in order to select which of > several different schemas to use for part (not all) of the data. When > setting search_path this way, there is no way to set only part of it; > the client-supplied value overrides everything. > Obviously there are other possible solutions, but pretending there > isn't a problem will get nowhere. I agree that some more flexibility in search_path seems reasonable, but what we've got at the moment is pretty handwavy. Dimitri didn't suggest what the uses of the different parts of a three-part path would be, and also failed to say what the implications for the default creation namespace would be, as well as the existing special handling of pg_temp and pg_catalog. That stuff all works together pretty closely; it'd be easy to end up making it less usable not more so. regards, tom lane
Andrew Dunstan <andrew@dunslane.net> writes: > Another way of handling this might be to provide for prepending or > appending to the search path (or even for removing items from it). I was just about to raise that as a requirement. Some folks on this list might recognize the following coding pattern: create schema rhn_channel; --make rhn_channel be the default creation schemaupdate pg_settings set setting = 'rhn_channel,' || setting where name ='search_path'; ... create a bunch of objects in schema rhn_channel ... -- restore the original settingupdate pg_settings set setting = overlay( setting placing '' from 1 for (length('rhn_channel')+1)) where name = 'search_path'; I agree that a nicer way to do that would be good. > alter database foo set search_path = '+bar, baz'; -- append > alter database foo set search_path = 'bar, baz+'; -- prepend ... but that ain't it :-(. SET should mean SET, not "do something magic". Particularly in ALTER DATABASE/ALTER USER, whose execution order relative to other stuff isn't especially well defined. regards, tom lane
All, Wait, I thought we'd given up on the search path model and wanted to track extensions via dependencies. No? -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: > Wait, I thought we'd given up on the search path model and wanted to > track extensions via dependencies. No? I think what this discussion is about is trying to gauge just what amount of support we could give someone who insisted on dropping each extension into a different schema. It's not really related to how we track which objects belong to which extension. regards, tom lane
On May 27, 2009, at 1:49 PM, Andrew Gierth wrote: > Splitting up search_path is something I've been thinking about for a > while (and threw out on IRC as a suggestion, which is where Dimitri > got it); it was based on actual experience running an app that set the > search path in the connection parameters in order to select which of > several different schemas to use for part (not all) of the data. When > setting search_path this way, there is no way to set only part of it; > the client-supplied value overrides everything. Right, which is why I was thinking about an interface to push schemas onto the front of the path. Or the end. > Obviously there are other possible solutions, but pretending there > isn't a problem will get nowhere. Yeah, it was just the splitting bit that seemed a bit much to me. > (Setting the search path using a function or sql statement _after_ > connecting was not an option; it would have confused the connection > persistance layer, which needed different parameters to tell the > connections apart.) Okay, then maybe it's the names of the paths in Dimitri's suggestion that were confusing me. prepend_search_path and append_search_path, or something like that, might be better. Best, David
On Mon, May 25, 2009 at 11:16 AM, Dimitri Fontaine <dfontaine@hi-media.com> wrote: > Hi, > > Preliminary note: I'm using the term "extension" as if it's what we > already agree to call them, feel free to ignore this and use whatever > term you see fit. We'll have the naming issue tackled, please not now > though. > [...] Few thoughts about the ideas: Basically I sort of don't like the idea of playing with search_path. In past I have set up a system where each extension had a separate schema. Maintaining per user search_path wasn't a very nice experience. And trying to alter it later on for whatever reason, especially from command line was even worse. :) I tend to avoid such designs now. :) I think it is much better to store objects in one schema (like public) and maintain access rights via roles. Like GRANT ltree_pkg TO userfoo; ...and build upon this idea. One of advantages of roles here is that you can DROP OWNED BY ltree_pkg; just as well as you did DROP SCHEMA ltree_pkg; And they take effect immediately, not requiring all sessions to restart to take up new search_path. Furthermore, I think it would be nice to have a cluster-wide pg_extension table which would list all the available (installed) packages available in the system (much like pg_database lists all databases present). This pg_extension should be used to "rewrite" extension objects into given schema using given role (which would be either fixed or user defined). The idea is that whenever user installs a RPM, DEB or whatever package the system registers the extension. Or she compiles from source and registers extension. Or we get a CPAN style utility which installs source, compiles and register the extension. Then administrator can copy over given extension into specific database, into specific schema. Simplest implementation would be that the pg_extension would contain a package name, package version (we can have multiple versions of the same package installed), install script (series of CREATE FUNCTION or whatever), uninstall script (may not be present) and some upgrade path would be needed as well. The installation would CREATE ROLE <packagename>_pkg and execute all CREATE FUNCTION inside schema PUBLIC. Then GRANT access. If administrator instructs so it might CREATE ROLE <packagename>_<schema>_pkg and execute all CREATE FUNCTION in schema <schema>. Uninstall would mean DROP OWNED BY <packagename>_pkg; OK, enough of my proposal. :-) Coming back to the pre_search_path -- it sounds somewhat like Oracle's PACKAGEs, only different (completely parallel hierarchy, but similar to schemas). I like the Oracle approach better though -- no messing with search_paths please... Best regards, Dawid -- .................. ``The essence of real creativity is a certain: *Dawid Kuroczko* : playfulness, a flittingfrom idea to idea: qnex42@gmail.com : without getting bogged down by fixated demands.''`..................' Sherkaner Underhill, A Deepness in the Sky, V. Vinge
On May 27, 2009, at 2:14 PM, Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> Another way of handling this might be to provide for prepending or >> appending to the search path (or even for removing items from it). > > I was just about to raise that as a requirement. Yeah, I likes. > Some folks on this > list might recognize the following coding pattern: > > create schema rhn_channel; > > --make rhn_channel be the default creation schema > update pg_settings set setting = 'rhn_channel,' || setting where > name = 'search_path'; > > ... create a bunch of objects in schema rhn_channel ... > > -- restore the original setting > update pg_settings set setting = overlay( setting placing '' from 1 > for (length('rhn_channel')+1) ) where name = 'search_path'; > > I agree that a nicer way to do that would be good. Oh, yes please. >> alter database foo set search_path = '+bar, baz'; -- append >> alter database foo set search_path = 'bar, baz+'; -- prepend > > ... but that ain't it :-(. SET should mean SET, not "do something > magic". > Particularly in ALTER DATABASE/ALTER USER, whose execution order > relative to other stuff isn't especially well defined. Perhaps a MODIFY keyword? Best, David
Tom, > I think what this discussion is about is trying to gauge just what > amount of support we could give someone who insisted on dropping each > extension into a different schema. It's not really related to how > we track which objects belong to which extension. Really, they're on their own. Either we drop everything into a standard pg_extensions schema (which is then programmatically part of the search path, like pg_catalog is) or we don't install them to any particular schema and leave it up to the DBA to work out any search_path issues on their own. Personally, if we're tracking stuff through special dependancies which pg_dump will be aware of anyway, I don't see why extension objects should go into a special schema. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: > Personally, if we're tracking stuff through special dependancies which > pg_dump will be aware of anyway, I don't see why extension objects > should go into a special schema. Well, we could certainly take that attitude and eliminate all this hassle ;-). However, I think that more-flexible search path handling might have other uses, so I don't see any reason not to think about it. regards, tom lane
Tom Lane wrote: > Josh Berkus <josh@agliodbs.com> writes: > >> Personally, if we're tracking stuff through special dependancies which >> pg_dump will be aware of anyway, I don't see why extension objects >> should go into a special schema. >> > > Well, we could certainly take that attitude and eliminate all this > hassle ;-). However, I think that more-flexible search path handling > might have other uses, so I don't see any reason not to think about it. > > > +1 I think Josh is right about extensions, but we certainly do need more powerful tools to manipulate the search path. cheers andrew
Tom, > Well, we could certainly take that attitude and eliminate all this > hassle ;-). However, I think that more-flexible search path handling > might have other uses, so I don't see any reason not to think about it. Sure. I think that having better search path management would be a wonderful thing; it would encourage people to use schema more in general. However, that doesn't mean that I think it should be part of the extensions design, or even a gating factor. For example, I could see these kinds of settings: search_path_override (suset) would set all users to a specific search path and raise an error at any set search_path attempts. This would be mainly for secure applications. search_path_suffix (suset) would append a certain set of schema to the end of the search path regardless of what else the user put in, e.g.: search_path_suffix = 'pg_modules, information_schema' search_path = 'main,web,accounts' ... would mean that any object named would search in main,web,accounts,pg_modules,information_schema. This would be one way to solve the issue of having extra schema for extensions or other "utilities" in applications. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
On Wed, May 27, 2009 at 9:01 PM, Josh Berkus <josh@agliodbs.com> wrote: > Sure. I think that having better search path management would be a > wonderful thing; it would encourage people to use schema more in general. > > However, that doesn't mean that I think it should be part of the extensions > design, or even a gating factor. Agreed, I think this is largely a tangent. However, since we're on that tangent, I'm not completely convinced that additional lists of search paths that get prepended or appended to the main search path are the right way to go. It seems like that's just chopping up the problem into smaller bits without really fixing anything. I wonder if the right solution might be to associate with each schema a list of other schemas to be searched if the object isn't found in that schema. This means that the contents of search_path would really become the roots of the trees of schemas to be searched. Then we could provide DDL commands to do things like: ALTER SCHEMA pg_extensions INHERIT SCHEMA my_new_extension; ALTER SCHEMA pg_extensions NO INHERIT SCHEMA extension_i_want_to_remove; </handwaving> ...Robert
Robert, > However, since we're on that tangent, I'm not completely convinced > that additional lists of search paths that get prepended or appended > to the main search path are the right way to go. It seems like that's > just chopping up the problem into smaller bits without really fixing > anything. I wonder if the right solution might be to associate with > each schema a list of other schemas to be searched if the object isn't > found in that schema. This means that the contents of search_path > would really become the roots of the trees of schemas to be searched. See, that strikes me a completely unmanageable and likely to give rise to application security holes. But you're a smart guy ... so, *why* would that be a better idea than some superuser settings? What am I not thinking of? -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
On Wed, May 27, 2009 at 10:02 PM, Josh Berkus <josh@agliodbs.com> wrote: > Robert, > >> However, since we're on that tangent, I'm not completely convinced >> that additional lists of search paths that get prepended or appended >> to the main search path are the right way to go. It seems like that's >> just chopping up the problem into smaller bits without really fixing >> anything. I wonder if the right solution might be to associate with >> each schema a list of other schemas to be searched if the object isn't >> found in that schema. This means that the contents of search_path >> would really become the roots of the trees of schemas to be searched. > > See, that strikes me a completely unmanageable and likely to give rise to > application security holes. But you're a smart guy ... so, *why* would that > be a better idea than some superuser settings? What am I not thinking of? Hey, you're a smart guy too, plus you've been around here longer than I have and have more experience. If my idea sounds like it sucks, there's a real possibility that it does. With that caveat, my thought process was approximately as follows. The contents of a particular schema are more or less analagous to an application. In most programming languages, an application informs the system of the libraries that it needs and the system goes off and loads the symbols in those libraries into the application's namespace.Using search path basically requires the user to tellthe application where to find those symbols, which ISTM is exactly backwards. In other words, suppose we have an application in schema S that is designed to use make use of extensions installed in scheams E1, E4, and E7. With the pre- and post- search path approach, it's not sufficient for the user to set his or her search_path to S and then use the application. Instead, the user has to know that the search_path must be set to S, E1, E4, E7, or else E1, E4, and E7 have to be present in the system default post-search-path. And what happens if there is another application in schema S2 that uses E1, E2, and E4, where E2 is an older version of E7 with an overlapping set of names? There's no possible way of configuring your search paths that will make this work, short of explicitly setting the full search path to exactly the right thing for each application when using that application. Also, it seems to me that we could create a system schema called something like pg_extension and make it empty. Every extension could install in its own schema and then tell pg_extension to inherit it that schema. Then if you want to just get all the extensions, you can just set your search path to include pg_extension, and as new extensions are added or old ones are removed, you'll still have all the extensions without changing anything. I don't see how this could be made to work with the pre- and post- search_path idea; you'll be manually fiddling those settings in postgresql.conf, or on a per-user basis, or wherever you set them up, every time you add or remove an extension. I Just Work Here, You Want To Talk To The Boss. ...Robert
Hi all, Seems the night has been providing lots of thoughs :) Josh Berkus <josh@agliodbs.com> writes: > Sure. I think that having better search path management would be a > wonderful thing; it would encourage people to use schema more in general. > > However, that doesn't mean that I think it should be part of the extensions > design, or even a gating factor. First, this thread allowed us to go from: "we don't know where to install extensions" to: "we all agree that a specific pg_extension schema is a good idea, as soon as user is free not to use it at extensioninstall time". So you see, search_path and extensions are related and thinking about their relationship will help design the latter. > search_path_suffix = 'pg_modules, information_schema' > search_path = 'main,web,accounts' > > ... would mean that any object named would search in > main,web,accounts,pg_modules,information_schema. This would be one way to > solve the issue of having extra schema for extensions or other "utilities" > in applications. That really seems exactly to be what we're proposing with pre_ and post_ search_path components: don't change current meaning of search_path, just give DBAs better ways to manage it. And now that you're leaning towards a search_path suffix, don't you want a prefix too? Regards, -- dim
Robert Haas <robertmhaas@gmail.com> writes: > The contents of a particular schema are more or less analagous to an > application. In most programming languages, an application informs > the system of the libraries that it needs and the system goes off and > loads the symbols in those libraries into the application's namespace. > Using search path basically requires the user to tell the application > where to find those symbols, which ISTM is exactly backwards. Well, in fact, not so much, because the application is using SET to tell the system where to search for needed objects. That's about the same as your loading lib into the application namespace analogy. Now, using PostgreSQL, you can pre-set the setting at the database and role levels in order not to have to manage it explicitly in the application code. That's only a DBA convenience though, for places where the code and the database are not managed by the same teams (or at least it's the way I think about it --- this and database upgrades without costly application rewrites). > Also, it seems to me that we could create a system schema called > something like pg_extension and make it empty. Every extension could > install in its own schema and then tell pg_extension to inherit it > that schema. Then if you want to just get all the extensions, you can > just set your search path to include pg_extension, and as new > extensions are added or old ones are removed, you'll still have all > the extensions without changing anything. Then you do the exact same thing with the public schema in the first place, inheriting pg_extension if needed, and you deprecate search_path entirely. Don't forget the schemas are not there to solve extension managing problems, but a separate tool that have a great overlay with extensions, because we tend to like to have a schema (or more) per extension. Your proposal doesn't include any notion of search order within the tree of available schemas, which means we're loosing half of the search_path features (the other half is restricting the searches, which you address). I think I'm failing to understand where your proposal leads us the same way you seem to be failing to follow mine... Regards, -- dim
Andrew Dunstan <andrew@dunslane.net> writes: > Dimitri Fontaine wrote: >> "we all agree that a specific pg_extension schema is a good idea, as >> soon as user is free not to use it at extension install time". > > I don't think we all agree on that at all. ;-) Ooops, my mistake, as few people where taking that as implicit and as a reasoning basepoint in their mails, I assumed we were past the question already. Sorry to see that's too quick a conclusion... and thanks for pointing out the absence of consensus! Regards, -- dim
Hi, Tom Lane <tgl@sss.pgh.pa.us> writes: > Andrew Gierth <andrew@tao11.riddles.org.uk> writes: >> Splitting up search_path is something I've been thinking about for a >> while (and threw out on IRC as a suggestion, which is where Dimitri >> got it); it was based on actual experience running an app that set the >> search path in the connection parameters in order to select which of >> several different schemas to use for part (not all) of the data. When >> setting search_path this way, there is no way to set only part of it; >> the client-supplied value overrides everything. > >> Obviously there are other possible solutions, but pretending there >> isn't a problem will get nowhere. > > I agree that some more flexibility in search_path seems reasonable, > but what we've got at the moment is pretty handwavy. Dimitri didn't > suggest what the uses of the different parts of a three-part path > would be, and also failed to say what the implications for the default > creation namespace would be, as well as the existing special handling > of pg_temp and pg_catalog. That stuff all works together pretty > closely; it'd be easy to end up making it less usable not more so. What I have in mind is not to change current semantics, but allow users to have easier ways to manage things. Some other place in this thread we see syntax sugar propositions or tools to allow adding schemas in first or last place of search_path. It could be that some other ideas or better tools would be a much better way to solve the problem at hand, but as you asked, here's a rough sketch of how I'd use what I'm proposing: The mydb database is used from several applications and roles, and host 10 application schemas and 3 extensions (ip4r, prefix, pgq, say). Depending on the role, not all 10 schemas are in the search_path, and we're using non qualified objects names when the application developer think they're part of the database system (that includes extensions). What this currently means is that all role specific schemas must embed the extensions schemas at the right place. When prefix extension is added, all of them are to get reviewed. A better way to solve this is to have the database post_search_path (or call it search_path_suffix) contain the extensions schemas. Now the roles are set up without search_path_suffix, and it's easy to add an extension living in its own schema. (we'll have to choose whether defining a role specific search_path_suffix overrides the database specific one, too). Having all extensions live in pg_extension schema also solves the problem in a much easier way, except for people who care about not messing it all within a single schema (fourre-tout is the french for a place where you put anything and everything). As Josh is saying too, as soon as we have SQL level extension object with dependancies, we'll be able to list all of a particular extension's objects without needing to have them live in separate schemas.\df pgq. -- list all functions in schema pgq\dt pgq. -- listall tables in schema pgq\de pgq. -- list all objects provided by extension pgq Still, for extension upgrading or name collisions between extensions, or some more cases I'm not thinking about now, pg_extension will not be all what you need. We already have schemas and search_path, and it's not always pretty nor fun to play with. Would prefix/suffix components help? Regards, -- dim
* Dimitri Fontaine (dfontaine@hi-media.com) wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > > Dimitri Fontaine wrote: > >> "we all agree that a specific pg_extension schema is a good idea, as > >> soon as user is free not to use it at extension install time". > > > > I don't think we all agree on that at all. ;-) > > Ooops, my mistake, as few people where taking that as implicit and as a > reasoning basepoint in their mails, I assumed we were past the question > already. Sorry to see that's too quick a conclusion... and thanks for > pointing out the absence of consensus! I'm not real happy with it either. Sure, we can track module dependencies seperately, but if we go down this route then we have to come up with some concept of an extension namespace that different extension use and prefix their functions/tables/etc with to avoid overlap with each other. Gee, doesn't that sound familiar. Not to mention that it's nice to be able to control access to an extension in one place rather than having to figure out all the pieces of a particular extension (sure, through the dependencies, but are we really going to have a "GRANT USAGE ON EXT x TO role1;" ? and what happens if someone changes the permissions on an individual item afterwards? etc..). Almost unrelated, I fail to see the value in continuing to keep the "magic" part of the search_path (eg: pg_catalog) to ourselves and not giving our users some ability to manipulate it. Thanks, Stephen
* Dimitri Fontaine (dfontaine@hi-media.com) wrote: > A better way to solve this is to have the database post_search_path (or > call it search_path_suffix) contain the extensions schemas. Now the > roles are set up without search_path_suffix, and it's easy to add an > extension living in its own schema. (we'll have to choose whether > defining a role specific search_path_suffix overrides the database > specific one, too). > > Having all extensions live in pg_extension schema also solves the > problem in a much easier way, except for people who care about not > messing it all within a single schema (fourre-tout is the french for a > place where you put anything and everything). I certainly agree with this approach, naming aside (I'd probably rather have 'system_search_path' that's added on as a suffix, or something similar). Thanks, Stephen
On May 28, 2009, at 1:34 AM, Dimitri Fontaine wrote: > Andrew Dunstan <andrew@dunslane.net> writes: >> Dimitri Fontaine wrote: >>> "we all agree that a specific pg_extension schema is a good idea, >>> as >>> soon as user is free not to use it at extension install time". >> >> I don't think we all agree on that at all. ;-) > > Ooops, my mistake, as few people where taking that as implicit and > as a > reasoning basepoint in their mails, I assumed we were past the > question > already. Sorry to see that's too quick a conclusion... and thanks for > pointing out the absence of consensus! I somehow missed Andrew's mail, but I agree that we don't all agree on that point. I'm fine with having a standard schema for extensions, just as long as I can tell the installer to actually install it in a different schema if I want/need to do so. Best, David
On May 28, 2009, at 1:13 AM, Dimitri Fontaine wrote: > Having all extensions live in pg_extension schema also solves the > problem in a much easier way, except for people who care about not > messing it all within a single schema (fourre-tout is the french for a > place where you put anything and everything). Yes, just as long as your extensions schema doesn't turn into a bricolage of stuff. I mean, if I use a lot of extensions, it means that I end up with a giant collection of functions and types and whatnot in this one namespace. PHP programmers might be happy with it, but not I. ;-P > As Josh is saying too, as soon as we have SQL level extension object > with dependancies, we'll be able to list all of a particular > extension's > objects without needing to have them live in separate schemas. > \df pgq. -- list all functions in schema pgq > \dt pgq. -- list all tables in schema pgq > \de pgq. -- list all objects provided by extension pgq > > Still, for extension upgrading or name collisions between > extensions, or > some more cases I'm not thinking about now, pg_extension will not be > all > what you need. We already have schemas and search_path, and it's not > always pretty nor fun to play with. Would prefix/suffix components > help? Yes, but I'm not sure that's the best interface for that functionality. Think I'll do some thinking on it myself… Best, David
On Thu, May 28, 2009 at 5:30 PM, David E. Wheeler <david@kineticode.com> wrote: > Yes, just as long as your extensions schema doesn't turn into a bricolage of > stuff. I mean, if I use a lot of extensions, it means that I end up with a > giant collection of functions and types and whatnot in this one namespace. > PHP programmers might be happy with it, but not I. ;-P I don't understand what storing them in different namespaces and then putting them all in your search_path accomplishes. You end up with the same mishmash of things in your namespace. The only way that mode of operation makes any sense to me is if you explicitly prefix every invocation. Ie, you want the stuff installed but not available in your namespace at all unless you explicitly request it. Actually there is another reason separate schemas does make some sense to me. Private objects that the extension will use internally but doesn't intend to make part of its public interface. It might be nice if extensions could mark objects with a token like _private and have that be created in a private schema separate from other extensions and not in the default search path. -- greg
Dimitri Fontaine wrote: > "we all agree that a specific pg_extension schema is a good idea, as > soon as user is free not to use it at extension install time". > > I don't think we all agree on that at all. ;-) cheers andrew
On 5/28/09 12:36 AM, Dimitri Fontaine wrote: > That really seems exactly to be what we're proposing with pre_ and post_ > search_path components: don't change current meaning of search_path, > just give DBAs better ways to manage it. And now that you're leaning > towards a search_path suffix, don't you want a prefix too? Yeah, I thought about a prefix, but I couldn't come up with a way it would be useful, and I could come up with a lot of scenarios where it would be a big foot-gun. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Greg Stark <stark@enterprisedb.com> writes: > I don't understand what storing them in different namespaces and then > putting them all in your search_path accomplishes. You end up with the > same mishmash of things in your namespace. +1 ... naming conflicts between different extensions are going to be a problem for people no matter what. Sticking them in different schemas doesn't really fix anything, it just means that you'll hit the problems later instead of sooner. I suppose there might be some use-case involving concurrent installation of multiple versions of the *same* extension, but I'm not sure we should be designing around that as a key case. > Actually there is another reason separate schemas does make some sense > to me. Private objects that the extension will use internally but > doesn't intend to make part of its public interface. It might be nice > if extensions could mark objects with a token like _private and have > that be created in a private schema separate from other extensions and > not in the default search path. Well, an extension can certainly do that today, so why would it be a factor in this discussion? It's just an extra schema. And I guess the extension author has to explicitly qualify all those names, but if he doesn't want those names in the search path I don't see much choice. regards, tom lane
Josh Berkus <josh@agliodbs.com> writes: > On 5/28/09 12:36 AM, Dimitri Fontaine wrote: >> That really seems exactly to be what we're proposing with pre_ and post_ >> search_path components: don't change current meaning of search_path, >> just give DBAs better ways to manage it. And now that you're leaning >> towards a search_path suffix, don't you want a prefix too? > Yeah, I thought about a prefix, but I couldn't come up with a way it > would be useful, and I could come up with a lot of scenarios where it > would be a big foot-gun. Also, a search path prefix is going to create curious interactions with the default creation schema. A suffix seems much less dangerous in that respect. regards, tom lane
On Thu, May 28, 2009 at 2:27 PM, Greg Stark <stark@enterprisedb.com> wrote: > On Thu, May 28, 2009 at 5:30 PM, David E. Wheeler <david@kineticode.com> wrote: >> Yes, just as long as your extensions schema doesn't turn into a bricolage of >> stuff. I mean, if I use a lot of extensions, it means that I end up with a >> giant collection of functions and types and whatnot in this one namespace. >> PHP programmers might be happy with it, but not I. ;-P > > I don't understand what storing them in different namespaces and then > putting them all in your search_path accomplishes. You end up with the > same mishmash of things in your namespace. +1! That's the thing that's really mystifying me about this whole conversation. It seems this compounds the work of managing extension by requiring every extension to require an extra post-installation step where we update everyone's search path (and that step can't be automated because there's no way for the extension installation process to update all of the places search_paths might be stored, even if it could tell which ones ought to be updated). Having a global search_path_suffix will help with this a little bit, but I think there are corner cases (such as the ones I mentioned upthread) where that's not really going to be enough either. It feels like a Java CLASSPATH, or installing every application into /usr/local/<application-name> so that your path has 50 bin directories in it. It also seems to me that we're getting seriously sidetracked from the dependency-tracking part of this project which seems to me to be a much deeper and more fundamental issue. ...Robert
Robert Haas <robertmhaas@gmail.com> writes: > It also seems to me that we're getting seriously sidetracked from the > dependency-tracking part of this project which seems to me to be a > much deeper and more fundamental issue. I thought that part was a pretty simple problem, actually. Have an object representing the module, make sure each component object in the module has an AUTO dependency link to that object. Where's the difficulty? regards, tom lane
On Thu, May 28, 2009 at 8:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Robert Haas <robertmhaas@gmail.com> writes: >> It also seems to me that we're getting seriously sidetracked from the >> dependency-tracking part of this project which seems to me to be a >> much deeper and more fundamental issue. > > I thought that part was a pretty simple problem, actually. Have an > object representing the module, make sure each component object in the > module has an AUTO dependency link to that object. Where's the > difficulty? Is that really a complete answer? How do we deal with upgrading an extension to a more recent version? What happens to objects in the database which depend on objects from the extension? Can we suspend the normal rules for dependency tracking while uninstalling the old version, install the new version, then check that all the dependencies which were left hanging from the old one can be satisfied by similarly named objects in the new one and redirect them? -- greg
Tom Lane wrote: > Robert Haas <robertmhaas@gmail.com> writes: > >> It also seems to me that we're getting seriously sidetracked from the >> dependency-tracking part of this project which seems to me to be a >> much deeper and more fundamental issue. >> > > I thought that part was a pretty simple problem, actually. Have an > object representing the module, make sure each component object in the > module has an AUTO dependency link to that object. Where's the > difficulty? > > > Well, yes. Honestly, I think all this search_path stuff is a red herring. We are once again in danger of over-designing this instead of doing the simple thing first (namely, don't worry about the search_path). cheers andrew
Greg Stark <stark@enterprisedb.com> writes: > Is that really a complete answer? How do we deal with upgrading an > extension to a more recent version? What happens to objects in the > database which depend on objects from the extension? Well, if it's only a code change then you install a newer version of the .so and you're done. If the extension upgrade requires altering any SQL-level properties of the module's objects then I'd expect the extension author to provide a SQL script to do that. Obviously there is value in being able to do things like add new objects to an existing module, but we hashed out the mechanisms for that long ago. IIRC the proposed syntax was along the lines of CREATE EXTENSION foo; BEGIN EXTENSION foo; ... anything created here is automatically tagged as belonging to foo ... END EXTENSION foo; where you can repeat the BEGIN/END later to add more objects. An alternative was to not have BEGIN/END but instead a GUC variable that you can SET to the name of the extension currently being added to. (The main advantage of that is that the state isn't hidden, but easily checkable via existing commands.) > Can we suspend the normal rules for dependency tracking while > uninstalling the old version, install the new version, then check that > all the dependencies which were left hanging from the old one can be > satisfied by similarly named objects in the new one and redirect them? Sounds like a solution in search of a problem. Why would that be a good idea? regards, tom lane
On May 28, 2009, at 11:27 AM, Greg Stark wrote: > On Thu, May 28, 2009 at 5:30 PM, David E. Wheeler <david@kineticode.com > > wrote: >> Yes, just as long as your extensions schema doesn't turn into a >> bricolage of >> stuff. I mean, if I use a lot of extensions, it means that I end up >> with a >> giant collection of functions and types and whatnot in this one >> namespace. >> PHP programmers might be happy with it, but not I. ;-P > > I don't understand what storing them in different namespaces and then > putting them all in your search_path accomplishes. You end up with the > same mishmash of things in your namespace. > > The only way that mode of operation makes any sense to me is if you > explicitly prefix every invocation. Ie, you want the stuff installed > but not available in your namespace at all unless you explicitly > request it. Yes, it allows me to work around a conflict in my application by deciding to schema-qualify use of a one of the two conflicting extensions. It's a way I can quickly work around the issue. Not ideal, I grant you, but I don't see us getting into the business of setting up a registry requiring uniqueness. Besides, some extensions, like pgTAP, pretty much scream for a schema of their own completely independent of everything else. I want the option to be able to do that when appropriate. I don't think I'd ever put each module in its own schema, FWIW. Best, David
On May 28, 2009, at 11:38 AM, Tom Lane wrote: > I suppose there might be some use-case involving concurrent > installation > of multiple versions of the *same* extension, but I'm not sure we > should > be designing around that as a key case. Agreed. One thing at a time. Best, David
On May 28, 2009, at 12:10 PM, Robert Haas wrote: > It feels like a Java CLASSPATH, > or installing every application into /usr/local/<application-name> so > that your path has 50 bin directories in it. +1 Yeah, that was my trouble with it. Best, David
On May 28, 2009, at 12:33 PM, Tom Lane wrote: > Greg Stark <stark@enterprisedb.com> writes: >> Is that really a complete answer? How do we deal with upgrading an >> extension to a more recent version? What happens to objects in the >> database which depend on objects from the extension? > > Well, if it's only a code change then you install a newer version of > the > .so and you're done. If the extension upgrade requires altering any > SQL-level properties of the module's objects then I'd expect the > extension author to provide a SQL script to do that. It would be convenient for me a module/extension author not to have to write upgrade scripts for every version of my module/extension. > Obviously there is value in being able to do things like add new > objects > to an existing module, but we hashed out the mechanisms for that long > ago. IIRC the proposed syntax was along the lines of > > CREATE EXTENSION foo; > > BEGIN EXTENSION foo; > > ... anything created here is automatically tagged as belonging > to foo ... > > END EXTENSION foo; I like it. Best, David
On Thu, May 28, 2009 at 3:32 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > Tom Lane wrote: >> Robert Haas <robertmhaas@gmail.com> writes: >>> It also seems to me that we're getting seriously sidetracked from the >>> dependency-tracking part of this project which seems to me to be a >>> much deeper and more fundamental issue. >> I thought that part was a pretty simple problem, actually. Have an >> object representing the module, make sure each component object in the >> module has an AUTO dependency link to that object. Where's the >> difficulty? I think it's a simple problem too... except for the not-so-small detail of who is going to implement it. > Well, yes. Honestly, I think all this search_path stuff is a red herring. We > are once again in danger of over-designing this instead of doing the simple > thing first (namely, don't worry about the search_path). Right. ...Robert
Hi, Le 29 mai 09 à 02:32, Robert Haas a écrit : > On Thu, May 28, 2009 at 3:32 PM, Andrew Dunstan > <andrew@dunslane.net> wrote: >> Tom Lane wrote: >>> Robert Haas <robertmhaas@gmail.com> writes: >>>> It also seems to me that we're getting seriously sidetracked from >>>> the >>>> dependency-tracking part of this project which seems to me to be a >>>> much deeper and more fundamental issue. >>> I thought that part was a pretty simple problem, actually. Have an >>> object representing the module, make sure each component object in >>> the >>> module has an AUTO dependency link to that object. Where's the >>> difficulty? > > I think it's a simple problem too... except for the not-so-small > detail of who is going to implement it. I kind of said I'd do it, but it's going to be my first attempt to patch backend code. Fortunately, Tom Dunstan did already a big chunk of the work, but without user design approval first. I'm trying to have user design voted, then I hope to reuse as much as Tom Dunstan's work as possible :) And Stephen Frost proposed to be helping too. Maybe we could also open the road for a new way of contributing: have someone discuss the user design on hackers until a consensus raises, then have a developer happily code it without having to care about the "politics" of it. :) >> Well, yes. Honestly, I think all this search_path stuff is a red >> herring. We >> are once again in danger of over-designing this instead of doing >> the simple >> thing first (namely, don't worry about the search_path). > > Right. My feeling is that current way of using extensions is tightly coupled with search_path, and I'm not sure providing a SQL visible extension object with dependancies will make this problem any easier. Now I agree that we certainly can complete the extension support project without having a single thought about schemas and search_path, this problem can be postponed. I figured out it could guide some extension user API design, but let's pretend all of this is orthogonal. Still, extension users will want to have a default schema where the extension is installed, and a way to override it, right? Moving to extension user design per-se on Tuesday, trying to avoid schema discussions while doing so. Regards, -- dim
On Thursday 28 May 2009 02:57:00 Josh Berkus wrote: > Personally, if we're tracking stuff through special dependancies which > pg_dump will be aware of anyway, I don't see why extension objects > should go into a special schema. But they clearly have to go into *some* schema, and it would add some clarity to the world if we made a recommendation which one that is. Which is what some of the subproposals really come down to.
On Thursday 28 May 2009 15:24:21 Stephen Frost wrote: > I'm not real happy with it either. Sure, we can track module > dependencies seperately, but if we go down this route then we have to > come up with some concept of an extension namespace that different > extension use and prefix their functions/tables/etc with to avoid > overlap with each other. Gee, doesn't that sound familiar I think what this comes down to is that you need nested schemas and a global namespace rule. Then you can install things into pg_extensions.postgis.submodule.special_type, etc. Makes sense on paper. Note, however, that historically all the schemes that advocated the use of something like /usr/local/$packagename/ for each package separately have failed. And this is in spite of the fact that search path handling and managing facilities for file systems are somewhat more powerful than PostgreSQL's schema search path handling. So unless we have any new insights in this problem that OS developers haven't dealt with over the last few decades, I would feel more comfortable with an all-in-one directory/schema approach, accompanied by a "package management" system. (One such new insight might be the Python/Java way of deeply nested package naming systems where you have to manually pick out and import the pieces that you want. But that might significantly change the whole schema search path and name resolution system.)
On Thursday 28 May 2009 21:38:29 Tom Lane wrote: > Greg Stark <stark@enterprisedb.com> writes: > > I don't understand what storing them in different namespaces and then > > putting them all in your search_path accomplishes. You end up with the > > same mishmash of things in your namespace. > > +1 ... naming conflicts between different extensions are going to be a > problem for people no matter what. Sticking them in different schemas > doesn't really fix anything, it just means that you'll hit the problems > later instead of sooner. Yeah, to reiterate what I posted elsewhere, perhaps it'd be a good idea to give up on the search path idea altogether and think more in terms of an import facility like Python, Java, and sometimes Perl have. In practice, I find a search path is just a tool to cause you to find the wrong stuff at the wrong time, and it continues to be a cause of confusion and security issues both in PostgreSQL and in Unix operating systems to this day.
Hi, Le 29 mai 09 à 12:18, Peter Eisentraut a écrit : > I think what this comes down to is that you need nested schemas and > a global > namespace rule. Then you can install things into > pg_extensions.postgis.submodule.special_type, etc. Makes sense on > paper. [...] > (One such new insight might be the Python/Java way of deeply nested > package > naming systems where you have to manually pick out and import the > pieces that > you want. But that might significantly change the whole schema > search path > and name resolution system.) We'd still need search_path in there, as Python's still using a path. With 'default' search_path you'd have to qualify your type as pg_extensions.postgis.submodule.special_type, with pg_extensions in search_path the following notation would find it too: postgis.submodule.special_type. And if you have pg_extensions.postgis.submodule in the search_path, then you can use special_type without having to (nest-) schema qualify it. I like this idea, which sounds compatible with what we already have now (meaning current semantics of search_path still apply). Regards, -- dim PS: we still have to provide users with easy tools to (dynamically) manage search_path, don't we? (I prefer not to start the search_path management tool ideas right here). PPS: http://www.gobolinux.org/ doesn't look like it's failing. (yet?) "In GoboLinux you don't need a package database because the filesystem is the database: each program resides in its own directory, such as / Programs/Xorg-Lib/7.4 and /Programs/KDE-Libs/4.2.0."
Peter Eisentraut wrote: > On Thursday 28 May 2009 02:57:00 Josh Berkus wrote: > >> Personally, if we're tracking stuff through special dependancies which >> pg_dump will be aware of anyway, I don't see why extension objects >> should go into a special schema. >> > > But they clearly have to go into *some* schema, and it would add some clarity > to the world if we made a recommendation which one that is. Which is what > some of the subproposals really come down to. > Even that's going to be hard, frankly. The usage pattern is likely to be too varied for any one-size-fits-all recommendation. Proposals to allow a choice of schema at install time sound nice but in practice they are a recipe for massive headaches and maintenance nightmares, I think. It means no extension author will be able to hardcode the schema name in any view, function etc. Yuck. I think almost all these difficulties could be overcome if we had some sort of aliasing support, so that arbitrary objects in schema a could be aliased in schema b. If that were in place, best practice would undoubtedly be for each module to install in its own schema, and for the DBA to alias what is appropriate to their usage scenario. But unless someone wants to tackle that I think we should leave schema management entirely alone, and leave it up to the extension author / DBA between them. cheers andrew
Hi, Le 29 mai 09 à 16:11, Andrew Dunstan a écrit : > I think almost all these difficulties could be overcome if we had > some sort of aliasing support, so that arbitrary objects in schema a > could be aliased in schema b. If that were in place, best practice > would undoubtedly be for each module to install in its own schema, > and for the DBA to alias what is appropriate to their usage scenario. This coupled with Peter's idea of nested namespace seems a killer feature for me. That means the pg_extension namespace isn't a flat mess but a organized one, with private (internal) objects deeper into the hierarchy. It makes it easy to have a top-level schema per extension without rendering search_path impracticable. Then you slice atop of if aliasing so that you can refer to pg_extension.a.part1.obj_x from say utils.a.x or even utils.x, DBA choice. Or simply alias schema pg_extension.a.part1 as a. Whatever. It seems to offer the best of both worlds: we know where extensions are meant to end up getting installed (private (nested) sub schema(s) in pg_extension), and DBA has the option to mask this implementation detail by aliasing it all wherever needed, with a choice of granularity. > But unless someone wants to tackle that I think we should leave > schema management entirely alone, and leave it up to the extension > author / DBA between them. Well, we all know that proper extension/module/plugin packaging system, flexible for both authors and users, reliable and built for next 20 years... it won't be made in a breathe. If some expected it to be easy, I think it's time to revise the plans. Want to have it all in 8.5? Still time to join ;) If the way to have good extension packaging support in PostgreSQL means we need aliasing first (which I think is called synonym in the standard), let's work on this as a first step patch? In theory, it's even possible to begin work on extensions without synonyms/alias, using the pg_extension forced place (but with nested namespace support) and have the alias be done in parallel or after, as soon as we know what we want the big picture to look like when finished. Incremental work, etc. Regards, -- dim
Dimitri Fontaine <dfontaine@hi-media.com> writes: > Le 29 mai 09 � 16:11, Andrew Dunstan a �crit : >> I think almost all these difficulties could be overcome if we had >> some sort of aliasing support, so that arbitrary objects in schema a >> could be aliased in schema b. If that were in place, best practice >> would undoubtedly be for each module to install in its own schema, >> and for the DBA to alias what is appropriate to their usage scenario. > This coupled with Peter's idea of nested namespace seems a killer > feature for me. What it sounds like to me is an amazingly complicated gadget with absolutely no precedent of successful use anywhere. We'll spend a year fooling with the details of this and be no closer to actually solving the problem at hand, namely getting a simple workable extension packaging facility. regards, tom lane
Tom Lane wrote: > Dimitri Fontaine <dfontaine@hi-media.com> writes: > >> Le 29 mai 09 à 16:11, Andrew Dunstan a écrit : >> >>> I think almost all these difficulties could be overcome if we had >>> some sort of aliasing support, so that arbitrary objects in schema a >>> could be aliased in schema b. If that were in place, best practice >>> would undoubtedly be for each module to install in its own schema, >>> and for the DBA to alias what is appropriate to their usage scenario. >>> > > >> This coupled with Peter's idea of nested namespace seems a killer >> feature for me. >> > > What it sounds like to me is an amazingly complicated gadget with > absolutely no precedent of successful use anywhere. We'll spend a year > fooling with the details of this and be no closer to actually solving > the problem at hand, namely getting a simple workable extension > packaging facility. > Well, the part about no precedent is not true. See <http://publib.boulder.ibm.com/infocenter/db2luw/v8/index.jsp?topic=/com.ibm.db2.udb.doc/admin/r0000910.htm> for example. I didn't dream up the idea out of thin air ;-) (I pretty much started my computing career over 20 years ago working on DB2). However, the part about it being complex is true. And that is why I agree completely that we should not hold up the extension work waiting for it. cheers andrew
Le 29 mai 09 à 17:12, Tom Lane a écrit : > What it sounds like to me is an amazingly complicated gadget with > absolutely no precedent of successful use anywhere. We'll spend a > year > fooling with the details of this and be no closer to actually solving > the problem at hand, namely getting a simple workable extension > packaging facility. What it sounds like to me is a way to all agree what the finished feature would look like, allowing us to commit incremental patches. Coarse(?) grained plan: A. nested namespaces B. packaging facility, each module have its own schema in pg_extension sub schemas in pg_extension.myext are possibleand welcomed to organize things C. synonyms, allowing DBA to organise the visibility as they see fit, and to overcome search_path limitations The ordering of those points would still need to be talked about, I'd see A as necessary to get through before B implementation begins, but at least this would solve the search_path and "default" schema destination points while designing the extension packaging facility. Then when B is done, or parallel to development of B, we can have C, so that everyone is happy: it works and is not a PITA to maintain. All in all, agreeing about those steps now would open up the "real" matters of extension packaging to begin. Regards, -- dim PS: I realize that my line of thoughts is tied to imagining that the more visible (and complex, as in agreeing on bikesched color) part of the packaging facility user design is its relationship with schemas and search_path. Even the SQL syntax of creating (altering/droping/ granting) the new SQL object seems like it'll be easier. That done, the rest of it is mainly self-constrained, I don't foresee another such controversial part related to the existing system... Now, that's in the archive and I'll soon really look like a fool :)
On May 29, 2009, at 3:24 AM, Peter Eisentraut wrote: > Yeah, to reiterate what I posted elsewhere, perhaps it'd be a good > idea to > give up on the search path idea altogether and think more in terms > of an > import facility like Python, Java, and sometimes Perl have. +1 Actually, Perl's is usually a file path. `use Foo::Bar::Baz;` triggers a search for Foo/Bar/Baz.pm unless Foo::Bar::Baz is already loaded. Best, David
On May 29, 2009, at 3:38 AM, Dimitri Fontaine wrote: > PS: we still have to provide users with easy tools to (dynamically) > manage search_path, don't we? > (I prefer not to start the search_path management tool ideas right > here). Yes, we do, and that's what at least half this thread is about. Whether or not such tools are put to use for extensions support is a separate issue, but both need addressing, I think. Best, Davdi
Dimitri, > We'd still need search_path in there, as Python's still using a path. > With 'default' search_path you'd have to qualify your type as > pg_extensions.postgis.submodule.special_type, with pg_extensions in > search_path the following notation would find it too: > postgis.submodule.special_type. > And if you have pg_extensions.postgis.submodule in the search_path, then > you can use special_type without having to (nest-) schema qualify it. But *incompatible* with SQL truncation of qualified names. Remember that you can refer to something by any portion of its qualified name, such as: pg_extensions.postgis.submodule.special_type postgis.submodule.special_type submodule.special_type special_type ... are all valid. Which is fine until you think that we could have a: pg_extensions.pg_tap.submodule.special_type or even a: schema submodule.special_type which would confuse both the search path and the user. What this means is that all schema names would have to be unique, whether they are nested or not. Which makes subschema *within* an extension rather useless. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
On Fri, May 29, 2009 at 5:23 PM, David E. Wheeler <david@kineticode.com> wrote: >> PS: we still have to provide users with easy tools to (dynamically) manage >> search_path, don't we? >> (I prefer not to start the search_path management tool ideas right here). > > Yes, we do, and that's what at least half this thread is about. Whether or > not such tools are put to use for extensions support is a separate issue, > but both need addressing, I think. Do we really? The only reason people are having trouble managing their search_path is because they're not using it as intended and putting things in lots of different schemas that they intend to all be visible. If they put everything they intend to be visible to users in one schema they wouldn't have this problem. That said, I don't mind the idea of having a way to push things onto search path like you often do in sh using PATH=/foo/bar:$PATH. But I think the only reason to install something into a separate schema is precisely if you *want* that schema to not be visible to users automatically. So having more and more complex ways to include schemas in the search path automatically is fixing a problem created by setting things up wrong in the first place. I'm actually not sure if we should allow extensions to be installed into separate schemas. If you do then it means we can't detect conflicts. A module might refer to an object intending to get its local object but end up getting some object from some other module depending on how the user set up his search_path. To make installing into separate schemas work we would have to have each extension have some magic way to refer to its own schema and enforce that all objects are referred to this way. We don't have any way to do that currently and I think that would also limit our ability to have extensions which depend on other extensions. In short I think people who want to put things in different schemas are being misled by their intuition. By installing everything into one schema you end up with a *more* organized system where everything has a well defined meaning. If you install everything in different schemas then that's where you end up with a mishmash where you're not sure what objects are being used when depending on a global run-time parameter which might need to be set differently for each module. -- greg
Greg Stark <stark@enterprisedb.com> writes: > I'm actually not sure if we should allow extensions to be installed > into separate schemas. It's starting to seem that best practice is to install "public" functions/etc into a common schema and "private" objects into an extension-specific schema. The main problem with that from an extension author's point of view is the need to explicitly qualify all references to private objects, since they won't be in the search path. Which is tedious, but doable. Another issue is that doing that pretty much hard-wires what the extension's private schema name is. Dunno how much we care, though. You could certainly do this without any new search-path-related features, but I wonder whether the system could provide any extra support for it. regards, tom lane
Tom Lane wrote: > Greg Stark <stark@enterprisedb.com> writes: > >> I'm actually not sure if we should allow extensions to be installed >> into separate schemas. >> > > It's starting to seem that best practice is to install "public" > functions/etc into a common schema and "private" objects into an > extension-specific schema. The main problem with that from an extension > author's point of view is the need to explicitly qualify all references > to private objects, since they won't be in the search path. Which is > tedious, but doable. > The main problem as I see it is that you are abandoning one of the two uses of schemas, namely namespace separation. With this pattern an extension author has no guarantee that there won't be a name collision with some other extension. Pace Greg, schemas are not just about privacy. cheers andrew
On Fri, May 29, 2009 at 4:49 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Greg Stark <stark@enterprisedb.com> writes: >> I'm actually not sure if we should allow extensions to be installed >> into separate schemas. > > It's starting to seem that best practice is to install "public" > functions/etc into a common schema and "private" objects into an > extension-specific schema. The main problem with that from an extension > author's point of view is the need to explicitly qualify all references > to private objects, since they won't be in the search path. Which is > tedious, but doable. This sounds quite horrid to me. The way programming languages solve this problem is they have a flag that either makes certain names not visible from other namespaces, or they provide explicit control over which names get exported. Requiring the extension author to split up the objects between two different hard-coded namespaces with schema qualifications on every reference sounds like an unmanageable mess. Of course we have no notion of exporting or importing names at all. Maybe we should. But I'm still of the opinion that this entire discussion is a tangent. ...Robert
On Fri, May 29, 2009 at 10:26 PM, Robert Haas <robertmhaas@gmail.com> wrote: > This sounds quite horrid to me. The way programming languages solve > this problem is they have a flag that either makes certain names not > visible from other namespaces, or they provide explicit control over > which names get exported. There are two factors which distinguish this situation from most programming languages: 1) Hopefully these languages you're thinking of are lexically scoped. So the search path in effect when the objects are defined decide which other objects they reference. In Postgres in many cases we're effectively dynamically scoped. If someone calls us with another search path we'll pick up other objects we weren't expecting. 2) Normally programming languages do early binding so as soon as the code is parsed references are resolved. You can't later define a new function earlier in the search path and have it take over references that have were previously referring to some other function. > Requiring the extension author to split up > the objects between two different hard-coded namespaces with schema > qualifications on every reference sounds like an unmanageable mess. Well I think the thinking is that if the extension author wants to hide some objects from the public he creates a schema for them and references them explicitly. If he pushes that private schema onto the search path he'll find any functions he calls -- admittedly not that common since we don't have any way to do callbacks, i suppose triggers on tables his code modifies counts though -- will have this private schema in its search path... If we do want special handling it does seem to me that it would make sense to have some token like _private_ which the extension loading mechanism would automatically substitute for a unique schema name. Otherwise we're relying on extension authors to come up with unique names. -- greg
Greg, > Do we really? The only reason people are having trouble managing their > search_path is because they're not using it as intended and putting > things in lots of different schemas that they intend to all be > visible. Apparently you've never adminned a database with hundreds (or thousands) of stored procedures. Sometimes one needs to use schemas just for namespacing (they are called "namespaces" after all), and not for security or visibility. In fact, I'd argue that that is one of the problems with the whole schema concept: it's three things at once. > I'm actually not sure if we should allow extensions to be installed > into separate schemas. If you do then it means we can't detect > conflicts. A module might refer to an object intending to get its > local object but end up getting some object from some other module > depending on how the user set up his search_path. I agree with this. Eliminating module naming conflicts is a good in itself. From a DBA and database designer perspective, the missing functionality from being able to do everything with schema that I want are listed below. It's been my experience that the awkwardness of managing search_path has caused a *lot* of our users to ignore schema as a feature and not use schema when they otherwise should. a) the ability to "push" a schema onto the current search path b) the ability to "pull" a schema off the current search path c) the ability as superuser to have my own "special schema" which are always in the search path, as pg_catalog and $user_temp are.* d) the ability as superuser to "lock" specific role so that they can't change their search path** e) having roles somehow inherit search_path on a SET ROLE*** * if you're not sure why someone would want this, consider information_schema. If your application depends on I_S to work, how do you make sure it's always in every user's search_path? ** think about the number of security exploits around search_path we could protect against if we had this. *** this is the same issue as it is with resource management (i.e. work_mem). However, it's particularly apt for search_path; imagine a database with an "accounting" schema and a user who belongs to both the "accounting" and the "HR" roles. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Robert, > Of course we have no notion of exporting or importing names at all. > Maybe we should. But I'm still of the opinion that this entire > discussion is a tangent. As far as Extensions are concerned? Yes, it is. Dimitri: I vote for you to get on with assuming everything goes into pg_extensions. We can always change that later if there's any kind of consensus. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
On May 29, 2009, at 12:41 PM, Greg Stark wrote: > That said, I don't mind the idea of having a way to push things onto > search path like you often do in sh using PATH=/foo/bar:$PATH. Yes, +1. > But I think the only reason to install something into a separate > schema is precisely if you *want* that schema to not be visible to > users automatically. So having more and more complex ways to include > schemas in the search path automatically is fixing a problem created > by setting things up wrong in the first place. A reason I've run into is to make database maintenance and migration easier. For a recent client, all contrib modules were installed in a single, separate schema, named contrib. This makes it easy to dump all of the database code but not dump the contrib stuff, and that's useful for two reasons: 1. The client was dumping the schema into svn every night, and the contrib stuff just cluttered it up (I'm not saying checking a schema in like this is a good idea, just that I've seen it). 2. Migrating to a new version of PostgreSQL, the server can be pre- build with the contrib schema, with new versions with the new release, and then the dump from the old server doesn't have the contrib crap in it to cause conflicts. So, yeah, there may be collisions that a given DBA has to deal with, and then will want more than one schema. But for the vast majority of uses, I think that a pg_extensions schema will serve nicely to keep third-party extensions separate from in-house database objects. Best, David
On May 29, 2009, at 2:45 PM, Greg Stark wrote: > 2) Normally programming languages do early binding so as soon as the > code is parsed references are resolved. You can't later define a new > function earlier in the search path and have it take over references > that have were previously referring to some other function. Not functions, but see method dispatch. > Well I think the thinking is that if the extension author wants to > hide some objects from the public he creates a schema for them and > references them explicitly. Agreed. > If he pushes that private schema onto the search path he'll find any > functions he calls -- admittedly not that common since we don't have > any way to do callbacks, i suppose triggers on tables his code > modifies counts though -- will have this private schema in its search > path... Yeah, it'd be nice to lexically scope such search_path modifications, such as for the duration of a function call. > If we do want special handling it does seem to me that it would make > sense to have some token like _private_ which the extension loading > mechanism would automatically substitute for a unique schema name. > Otherwise we're relying on extension authors to come up with unique > names. Agreed. Best, David
On May 29, 2009, at 2:52 PM, Josh Berkus wrote: > a) the ability to "push" a schema onto the current search path > b) the ability to "pull" a schema off the current search path push, pop, shift, unshift. :-) Come to think of it, I want these for arrays, too. ;-) Best, David
On Fri, May 29, 2009 at 5:45 PM, Greg Stark <stark@enterprisedb.com> wrote: > On Fri, May 29, 2009 at 10:26 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> This sounds quite horrid to me. The way programming languages solve >> this problem is they have a flag that either makes certain names not >> visible from other namespaces, or they provide explicit control over >> which names get exported. > > There are two factors which distinguish this situation from most > programming languages: > > 1) Hopefully these languages you're thinking of are lexically scoped. > So the search path in effect when the objects are defined decide which > other objects they reference. In Postgres in many cases we're > effectively dynamically scoped. If someone calls us with another > search path we'll pick up other objects we weren't expecting. > > 2) Normally programming languages do early binding so as soon as the > code is parsed references are resolved. You can't later define a new > function earlier in the search path and have it take over references > that have were previously referring to some other function. Good point. But maybe there's some way of getting some kind of behavior that is closer to lexical scoping/early binding? Because the way it works right now has lousy security implications, beyond being difficult for search_path management. Assign a search path to a schema, that applies to views and functions defined therein? *brainstorming* ...Robert
On Fri, 29 May 2009, Greg Stark wrote: > The only reason people are having trouble managing their search_path is > because they're not using it as intended and putting things in lots of > different schemas that they intend to all be visible. If they put > everything they intend to be visible to users in one schema they > wouldn't have this problem. Every PostgreSQL installation I've ever seen that heavily uses schemas aggressively uses them to partition up the various applications into components that can easily be reinstalled, the goal being to make deploying new versions easier. Put component A into schema A, component B into schema B, and then if you need to make a change just to the workings of B you can easily dump the data from B, "DROP SCHEMA s CASCADE",. apply new DDL change, and then reinstall things associated with that component without touching anything in A. The nice thing about this approach, compared with applying DDL deltas, is that afterwards you know you've got a complete chunk of code each time that will also install somewhere else identically into that schema. That I run into all the time, usually with every schema in the default search_path. Using schemas primarly as a security mechanism isn't nearly as popular as far as I've seen. Anyway, I think the answer to all the extension related questions should be to pick whatever lets a prototype that handles the dependency and dump/reload problems get solved most easily. You really need to use the simplest possible schema standard that works for extensions and decouple the problems from one another if any progress is going to get made here. -- * Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Fri, May 29, 2009 at 11:18 PM, Robert Haas <robertmhaas@gmail.com> wrote: > > Good point. But maybe there's some way of getting some kind of > behavior that is closer to lexical scoping/early binding? Because the > way it works right now has lousy security implications, beyond being > difficult for search_path management. Assign a search path to a > schema, that applies to views and functions defined therein? > *brainstorming* Well we already set search_path locally in SECURITY DEFINER functions. Normal functions run with the credentials of the caller so that's not an issue. But if a SECURITY DEFINER function calls another function that other function will inherit the credentials of the caller so it must inherit the search path of the caller as well. So that has to be dynamically scoped. I'm beginning to understand why Oracle programmers are accustomed to setting SECURITY DEFINER everywhere. I think Oracle also knows to treat such code as lexically scoped and can bind references when loading such code. -- greg
On Fri, May 29, 2009 at 11:03 PM, David E. Wheeler <david@kineticode.com> wrote: > On May 29, 2009, at 2:52 PM, Josh Berkus wrote: > >> a) the ability to "push" a schema onto the current search path >> b) the ability to "pull" a schema off the current search path > > push, pop, shift, unshift. :-) > > Come to think of it, I want these for arrays, too. ;-) push and unshift sure -- and you do have those for arrays, it's spelled ||. I'm not so sure about pop/shift though. How would you know the element you want is at the beginning/end unless you just put it there? I think what you really want is to use SET LOCAL or RESET to restore it to whatever it was before you started futzing with it. We might need a more flexible way to do that that isn't tied to transactions though. -- greg
On Fri, May 29, 2009 at 10:52 PM, Josh Berkus <josh@agliodbs.com> wrote: > Sometimes one needs to use schemas just for namespacing (they are called > "namespaces" after all), and not for security or visibility. What's the point of "namespaces" if not to implement visibility? The interesting thing to do would be to hide all the internal foo functions in a foo.* schema and only put the external api in public. That way you can't accidentally call an internal foo function or have a name conflict between two internal functions. The external api could even just be a bunch of thin wrappers around the implementation functions in foo.* (what Oracle calls public synonyms). If you just put them all in search path you haven't bought anything, all your functions are in the same namespace and one module can override another's objects. Actually it's worse than just putting them all in one schema since you won't even be warned when a conflict happens. It will just silently start doing something different. > c) the ability as superuser to have my own "special schema" which are always > in the search path, as pg_catalog and $user_temp are.* > * if you're not sure why someone would want this, consider > information_schema. If your application depends on I_S to work, how do you > make sure it's always in every user's search_path? Uhm, wouldn't you just refer to information_schema.foo? What if some other part of your application depends on information_schema *not* being in your path? Using global state for this seems destined to leave you with something broken that can't be fixed without breaking something else. > d) the ability as superuser to "lock" specific role so that they can't > change their search path** > ** think about the number of security exploits around search_path we could > protect against if we had this. Actually I'm thinking of how many security exploits this would *create*. So if I call a security_definer function which has a search_path set on it which search_path would it use? This seems like it would make it impossible to code any extension to work reliably. You would never know when some object in your extension was being hidden by some public object which the locked search_path overrode. Hm, I'm beginning to think extensions need to have search_path set on every function or have every object reference everywhere be explicitly pg_extension.* (and/or _private_.* like my earlier suggestion). > e) having roles somehow inherit search_path on a SET ROLE*** Grr. I'm still bitter about "su" doing that on some systems without "su -". I think I've lost that battle though and I'm forever doomed to never know what "su" will do on a new system. -- greg
On Fri, May 29, 2009 at 7:53 PM, Greg Stark <stark@enterprisedb.com> wrote: > On Fri, May 29, 2009 at 11:18 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> >> Good point. But maybe there's some way of getting some kind of >> behavior that is closer to lexical scoping/early binding? Because the >> way it works right now has lousy security implications, beyond being >> difficult for search_path management. Assign a search path to a >> schema, that applies to views and functions defined therein? >> *brainstorming* > > Well we already set search_path locally in SECURITY DEFINER functions. > Normal functions run with the credentials of the caller so that's not > an issue. Maybe not for security, but certainly it is for correctness. > But if a SECURITY DEFINER function calls another function that other > function will inherit the credentials of the caller so it must inherit > the search path of the caller as well. So that has to be dynamically > scoped. > > I'm beginning to understand why Oracle programmers are accustomed to > setting SECURITY DEFINER everywhere. I think Oracle also knows to > treat such code as lexically scoped and can bind references when > loading such code. Uh... if I'm understanding you correctly, then I'm really hoping we engineer a better solution for PostgreSQL. ...Robert
On May 29, 2009, at 5:16 PM, Greg Stark wrote: > On Fri, May 29, 2009 at 11:03 PM, David E. Wheeler <david@kineticode.com > > wrote: >> On May 29, 2009, at 2:52 PM, Josh Berkus wrote: >> >>> a) the ability to "push" a schema onto the current search path >>> b) the ability to "pull" a schema off the current search path >> >> push, pop, shift, unshift. :-) >> >> Come to think of it, I want these for arrays, too. ;-) > > push and unshift sure -- and you do have those for arrays, it's > spelled ||. Well, no, not quite, as push, pop, shift, and unshift change arrays in place, whereas || (and array_concat() and array_append() and array_prepend()) create and return a new array. > I'm not so sure about pop/shift though. How would you know the element > you want is at the beginning/end unless you just put it there? Well, for arrays, I might use them in a function just as I currently use them for various things in Perl. For search_path, yeah, I'd use them for lexical scoping: unshift a path onto the search path at the beginning of the function and shift it off at the end. Or push it onto the end of the search path and pop it off at the end. > I think what you really want is to use SET LOCAL or RESET to restore > it to whatever it was before you started futzing with it. We might > need a more flexible way to do that that isn't tied to transactions > though. Well, lexical scoping for the changes would help, for sure. And I think that RESET is fine, but it would be more useful to have ways to push and unshift, or even to splice (sometimes I might want to remove a schema from the search path no matter where it appears in the path). Best, David
Greg, > What's the point of "namespaces" if not to implement visibility? The > interesting thing to do would be to hide all the internal foo > functions in a foo.* schema and only put the external api in public. That is an interesting idea. However, what our real users are really doing in the field is more diverse. (see below) > That way you can't accidentally call an internal foo function or have > a name conflict between two internal functions. The external api could > even just be a bunch of thin wrappers around the implementation > functions in foo.* (what Oracle calls public synonyms). This assumes that all users should have access to the same public APIs as all other users. Real applications are more complex. In my experience of PostgreSQL applications, people use schema for three different reasons: Organization/Maintainability: when you have hundreds or thousands of database objects, you want "folders" to put them in just so that you can keep track of them and view them in easy-to-digest groups, just as you deal with files in a filesystem. DBAs no more want to put everything in one big flat namespace, even if the individual names are unique and the permissions are the same, than we want to have all of the PostgreSQL source code in one big directory.Further, these schema names generally indicate something about the purpose of the objects in them: "cms","matviews", "reports","calendar". When accurate, these schema names aid the DBA inmaintaining and troubleshooting the application, and are more convenient than hungarian notation schemes. Visibility: some applications use schema to hide objects from roles which shouldn't see them: "inner","cronjobs","acl", whether for data hiding or just to keep "private" functions and tables separate from what the application accesses directly. However, this approach is not very common *because of* the awkwardness and overhead of search_path; DBAs are constantly troubleshooting search_path omissions and errors and eventually give up on visibility rules, making all schema visible to all users.This gets even more difficult when you consider that in a large complex application with multiple ROLEs, not all ROLEs should see all schema, but what an individual user can access might be a list of schema which represent some-but-not-all schema. The lack of a convenient "search_path_add" or "SET ROLE ... WITH DEFAULTS" makes this an unmanageable mess; DBAs find themselves constantly ALTERing each user's search_path individually. Security: schema provide convenient containers to lock up groups of objects by role. "admin", "permissions" etc. schemas combine visibility and USE restrictions to make sql injection much harder, and administrative tasks are supported by objects in schema not accessible to the "webuser". As I said before, schema conbine 3 purposes: organization, visibility and security, into one structure. Which is why it's difficult to make them work perfectly for all 3 purposes. We could, however, make them work better. >> c) the ability as superuser to have my own "special schema" which are always >> in the search path, as pg_catalog and $user_temp are.* > >> * if you're not sure why someone would want this, consider >> information_schema. If your application depends on I_S to work, how do you >> make sure it's always in every user's search_path? > > Uhm, wouldn't you just refer to information_schema.foo? What if some > other part of your application depends on information_schema *not* > being in your path? Using global state for this seems destined to > leave you with something broken that can't be fixed without breaking > something else. Easily said for someone who doesn't have to adapt a 3rd-party vendor application or support real users on the phone. Insisting that all of your application developers remember to type "information_schema." all of the time really makes them love the DBA. Mostly, this simply results in people not using information_schema, and instead using their own home-grown system view scripts, which are often wrong. However, if we had push/pop/shift/unshift for search_path, the need for search_path_suffix would be considerably diminished, since application code (& DBAs) would use push/pop instead of replacing the entire search_path. > Hm, I'm beginning to think extensions need to have search_path set on > every function or have every object reference everywhere be explicitly > pg_extension.* (and/or _private_.* like my earlier suggestion). Again, I'm not talking about Extensions. I think that Extensions are completely orthagonal to search_path, hence the change of subject line. I'm talking about making search_path (and schema)more useful to DBAs and application designers. >> e) having roles somehow inherit search_path on a SET ROLE*** > > Grr. I'm still bitter about "su" doing that on some systems without > "su -". I think I've lost that battle though and I'm forever doomed to > never know what "su" will do on a new system. As previously discussed, this would work via something like SET ROLE ... WITH DEFAULTS, rather than with just SET ROLE. We don't want to break backwards compatibility. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
On Sun, May 31, 2009 at 9:12 PM, Josh Berkus <josh@agliodbs.com> wrote: > This assumes that all users should have access to the same public APIs as > all other users. Real applications are more complex. Well the goal is to make them simpler. I don't know any language that has implemented what you describe. Either you have access to the internal methods of a class or you don't and you only have access to the public api. That seems to work for much more sophisticated languages than ours just fine. > > In my experience of PostgreSQL applications, people use schema for three > different reasons: > > Organization/Maintainability: when you have hundreds or thousands of > database objects, you want "folders" to put them in just so that you can > keep track of them and view them in easy-to-digest groups, just as you deal > with files in a filesystem. Well when you ls a directory or perform some operation on a file you only work with what's in that directory, not everything in the hierarchy. > Visibility: some applications use schema to hide objects from roles which > shouldn't see them: "inner","cronjobs","acl", whether for data hiding or > just to keep "private" functions and tables separate from what the > application accesses directly. However, this approach is not very common > *because of* the awkwardness and overhead of search_path; DBAs are > constantly troubleshooting search_path omissions and errors and eventually > give up on visibility rules, making all schema visible to all users. I don't think that gets any easier with better tools. This is the same reason unix systems don't put every tool in a different directory and then insist you put every directory in your path based on which tools each user should have access to. What you're describing is a fundamentally painful thing to do. You have to decide for every user what objects they should have access to and which they shouldn't. It doesn't get any ideasier if you have every function hard coding inside it assumptions about what schemas it will need. > As I said before, schema conbine 3 purposes: organization, visibility and > security, into one structure. Which is why it's difficult to make them work > perfectly for all 3 purposes. We could, however, make them work better. How is this different from any other analogous system? The filesystem uses directories for all three of the above, for example? Having three different namespaces, one for organizing your code, one to control visibility, and one to control security would be 9 times more complex, i think. >> Uhm, wouldn't you just refer to information_schema.foo? What if some >> other part of your application depends on information_schema *not* >> being in your path? Using global state for this seems destined to >> leave you with something broken that can't be fixed without breaking >> something else. > > Easily said for someone who doesn't have to adapt a 3rd-party vendor > application or support real users on the phone. Insisting that all of your > application developers remember to type "information_schema." all of the > time really makes them love the DBA. Mostly, this simply results in people > not using information_schema, and instead using their own home-grown system > view scripts, which are often wrong. 3rd-party vendor code is precisely what I'm thinking of when I point out that having global state to override what the code requests is a recipe for problems. 3rd-party vendors would be left with no way to write their code such that they could guarantee it would work -- the DBA would always be able to break it by setting this variable. And some other code might require this variable to be set leaving the hapless DBA with no right option. > However, if we had push/pop/shift/unshift for search_path, the need for > search_path_suffix would be considerably diminished, since application code > (& DBAs) would use push/pop instead of replacing the entire search_path. Well I don't mind push but I still think pop is an error. What you really want to do is restore it to the value you started with. You don't want to remove the last element since that may not be the element you added. Some function you called may have added an extra element on the head. -- greg
On May 31, 2009, at 3:47 PM, Greg Stark wrote: > On Sun, May 31, 2009 at 9:12 PM, Josh Berkus <josh@agliodbs.com> > wrote: >> This assumes that all users should have access to the same public >> APIs as >> all other users. Real applications are more complex. > > Well the goal is to make them simpler. I don't know any language that > has implemented what you describe. Either you have access to the > internal methods of a class or you don't and you only have access to > the public api. That seems to work for much more sophisticated > languages than ours just fine. Right, but PostgreSQL isn't a language, it's a database. And PostgreSQL already has stuff in place to affect visibility of objects. Such is not reasonable for Python, but makes perfect sense in an RDBMS IMHO. > Well when you ls a directory or perform some operation on a file you > only work with what's in that directory, not everything in the > hierarchy. I don't see how this relates to what Josh said. He was just talking about organizing object into schemas, not about trees of schemas AFAICT. > I don't think that gets any easier with better tools. This is the same > reason unix systems don't put every tool in a different directory and > then insist you put every directory in your path based on which tools > each user should have access to. But they have tools in a few different directories (/bin, /sbin, /usr/ bin, /usr/sbin, /usr/local/bin, etc.), and it gives you the $PATH environment variable to affect visibility. > What you're describing is a fundamentally painful thing to do. You > have to decide for every user what objects they should have access to > and which they shouldn't. Well, groups of users, yes. Roles. Pretty standard security stuff. > It doesn't get any ideasier if you have > every function hard coding inside it assumptions about what schemas it > will need. It's worth avoiding that, too. Or perhaps objects are aware of their own schemas, just as a subroutine in the Foo::Bar package in Perl can access another subroutine in the same package without having to put Foo::Bar:: in front of it. > How is this different from any other analogous system? The filesystem > uses directories for all three of the above, for example? Maybe it's not, but it could still be easier to use. > Having three different namespaces, one for organizing your code, one > to control visibility, and one to control security would be 9 times > more complex, i think. But people are doing this already. We should make their jobs easier. > 3rd-party vendor code is precisely what I'm thinking of when I point > out that having global state to override what the code requests is a > recipe for problems. 3rd-party vendors would be left with no way to > write their code such that they could guarantee it would work -- the > DBA would always be able to break it by setting this variable. And > some other code might require this variable to be set leaving the > hapless DBA with no right option. But not, perhaps, if objects automatically know about other objects in their own schemas. Put another way, when a function in the "foo" schema is called, it would transparently use the search path "foo, $search_path" whenever it tried to do anything. >> However, if we had push/pop/shift/unshift for search_path, the need >> for >> search_path_suffix would be considerably diminished, since >> application code >> (& DBAs) would use push/pop instead of replacing the entire >> search_path. > > Well I don't mind push but I still think pop is an error. What you > really want to do is restore it to the value you started with. You > don't want to remove the last element since that may not be the > element you added. Some function you called may have added an extra > element on the head. I think it's worth it to be complete in the implementation, and not leave things out because we think someone might shoot themselves in the foot. Best, David
Greg, > Well the goal is to make them simpler. I don't know any language that > has implemented what you describe. Either you have access to the > internal methods of a class or you don't and you only have access to > the public api. That seems to work for much more sophisticated > languages than ours just fine. Um, PostgreSQL isn't a programming language. It's a DBMS. You're arguing what DBAs should do using some theoretical idealized DBMS. You're ignoring what DBAs *do* do currently using the real world PostgreSQL. I, for one, am not interested in theory. >> Organization/Maintainability: when you have hundreds or thousands of >> database objects, you want "folders" to put them in just so that you can >> keep track of them and view them in easy-to-digest groups, just as you deal >> with files in a filesystem. > > Well when you ls a directory or perform some operation on a file you > only work with what's in that directory, not everything in the > hierarchy. Precisely! And that's why DBAs often use lots of schema to divide up their hundreds of database objects. > I don't think that gets any easier with better tools. This is the same > reason unix systems don't put every tool in a different directory and > then insist you put every directory in your path based on which tools > each user should have access to. What, you're telling me you never had a cron job break because of $PATH issues? > It doesn't get any ideasier if you have > every function hard coding inside it assumptions about what schemas it > will need. When have I proposed that? >> As I said before, schema conbine 3 purposes: organization, visibility and >> security, into one structure. Which is why it's difficult to make them work >> perfectly for all 3 purposes. We could, however, make them work better. > > How is this different from any other analogous system? The filesystem > uses directories for all three of the above, for example? It's very similar to issues with the filesystem. Unfortunately, while very familiar, Unix filesystems aren't really a positive example; $PATHs and UMASK are a PITA an have forced many and admin (and OS) to come up with complex tools to manage them. > Having three different namespaces, one for organizing your code, one > to control visibility, and one to control security would be 9 times > more complex, i think. I didn't say I had a good answer to this problem. You just need to be aware of the three purposes of schema when proposing any improvements; your previous e-mails kept making the assumption that schema were used *only* for visibility, and never for security or organization. > 3rd-party vendor code is precisely what I'm thinking of when I point > out that having global state to override what the code requests is a > recipe for problems. 3rd-party vendors would be left with no way to > write their code such that they could guarantee it would work -- the > DBA would always be able to break it by setting this variable. And > some other code might require this variable to be set leaving the > hapless DBA with no right option. "You must set search_path_suffix='information_schema'" to use this tool is vastly simpler than what you'd deal with currently if you had to deal with a Microsoftian tool which assumed that information_schema was automatically in your search path. Again, I'm looking to improve what we actually *have* right now, rather than implement some theoretically ideal database. > Well I don't mind push but I still think pop is an error. What you > really want to do is restore it to the value you started with. You > don't want to remove the last element since that may not be the > element you added. Some function you called may have added an extra > element on the head. Yeah, "pop" is a misnomer; what I'd want is search_path_del(search_path,'admin') ... that is, a way to remove a specific schema from the list. -- Josh Berkus PostgreSQL Experts Inc. www.pgexperts.com
Josh Berkus wrote: >> Well I don't mind push but I still think pop is an error. What you >> really want to do is restore it to the value you started with. You >> don't want to remove the last element since that may not be the >> element you added. Some function you called may have added an extra >> element on the head. > > Yeah, "pop" is a misnomer; what I'd want is > search_path_del(search_path,'admin') ... that is, a way to remove a > specific schema from the list. Except that "del" shouldn't delete if your "push" didn't add it because it was already present. So you actually want some sort of refcounting there somehow. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
On Mon, Jun 1, 2009 at 7:57 PM, Alvaro Herrera <alvherre@commandprompt.com> wrote: > Josh Berkus wrote: > >>> Well I don't mind push but I still think pop is an error. What you >>> really want to do is restore it to the value you started with. You >>> don't want to remove the last element since that may not be the >>> element you added. Some function you called may have added an extra >>> element on the head. >> >> Yeah, "pop" is a misnomer; what I'd want is >> search_path_del(search_path,'admin') ... that is, a way to remove a >> specific schema from the list. > > Except that "del" shouldn't delete if your "push" didn't add it because > it was already present. So you actually want some sort of refcounting > there somehow. As I said earlier I doubt "pop" or "delete" is ever going to actually be what you want. I suspect you're far more likely to want to restore it to what it was before you started altering it. As support I'll point out this is what our C api has. There's no short cut to strip out a single element of the path but the normal calling pattern is to set aside a copy of the old path, add modify it in some way -- often adding a schema to the head -- then restore the old path. Note that you may want to make other modifications such as adding several paths -- it would suck to have to hard code those twice once to add and once to remove. Or remove a search path element and then later restore it. Or for that matter to replace the whole search path wholesale temporarily... -- greg
Tom Lane escribió: > An > alternative was to not have BEGIN/END but instead a GUC variable that > you can SET to the name of the extension currently being added to. > (The main advantage of that is that the state isn't hidden, but easily > checkable via existing commands.) With the CREATE EXTENSION you could still have a read-only GUC var "current_extension" or so. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
On Mon, Jun 01, 2009 at 08:05:33PM +0100, Greg Stark wrote: > As I said earlier I doubt "pop" or "delete" is ever going to actually > be what you want. I suspect you're far more likely to want to restore > it to what it was before you started altering it. > > As support I'll point out this is what our C api has. There's no short > cut to strip out a single element of the path but the normal calling > pattern is to set aside a copy of the old path, add modify it in some > way -- often adding a schema to the head -- then restore the old path. Without reading much of what's been said here (I've read maybe ten of the posts in this thread) I'll say it sounds a lot like lexical closures are needed. Code is free to define and use generally use whatever is in their closure, but can't affect what's outside it unless explicitly granted. I saw these mentioned in another post by David Wheeler[1] but my client says it wasn't directly responded to. He calls it "lexical scoping" but I think closing over the environment seems more suitable---mainly because it'll "go wrong" less often in the presence of functions defined as "security definer". -- Sam http://samason.me.uk/ [1] http://archives.postgresql.org/message-id/5A1FE6B1-9857-454C-A385-BA061DED346F@kineticode.com
On Mon, Jun 1, 2009 at 3:27 PM, Sam Mason <sam@samason.me.uk> wrote: > On Mon, Jun 01, 2009 at 08:05:33PM +0100, Greg Stark wrote: >> As I said earlier I doubt "pop" or "delete" is ever going to actually >> be what you want. I suspect you're far more likely to want to restore >> it to what it was before you started altering it. >> >> As support I'll point out this is what our C api has. There's no short >> cut to strip out a single element of the path but the normal calling >> pattern is to set aside a copy of the old path, add modify it in some >> way -- often adding a schema to the head -- then restore the old path. > > Without reading much of what's been said here (I've read maybe ten of > the posts in this thread) I'll say it sounds a lot like lexical closures > are needed. Code is free to define and use generally use whatever is > in their closure, but can't affect what's outside it unless explicitly > granted. > > I saw these mentioned in another post by David Wheeler[1] but my client > says it wasn't directly responded to. He calls it "lexical scoping" > but I think closing over the environment seems more suitable---mainly > because it'll "go wrong" less often in the presence of functions defined > as "security definer". +1. ...Robert