Thread: security_definer_search_path GUC
Hi,
Since writing SECURITY DEFINER functions securely requires annoying incantations[1], wouldn't it be nice if we provided a way for the superuser to override the default search path via a GUC in postgresql.conf? That way you can set search_path if you want to override the default, but if you leave it out you're not vulnerable, assuming security_definer_search_path only contains secure schemas.
.m
Glad you bring this problem up for discussion, something should be done to improve the situation.
Personally, as I really dislike search_path and consider using it an anti-pattern.
I would rather prefer a GUC to hard-code search_path to a constant default value of just ‘public’ that cannot be changed by anyone or any function. Trying to change it to a different value would raise an exception.
This would work for me since I always fully-qualify all objects except the ones in public.
/Joel
On Thu, May 27, 2021, at 13:23, Marko Tiikkaja wrote:
Hi,Since writing SECURITY DEFINER functions securely requires annoying incantations[1], wouldn't it be nice if we provided a way for the superuser to override the default search path via a GUC in postgresql.conf? That way you can set search_path if you want to override the default, but if you leave it out you're not vulnerable, assuming security_definer_search_path only contains secure schemas..m
Kind regards,
Joel
On Sat, May 29, 2021 at 11:06 PM Joel Jacobson <joel@compiler.org> wrote:
Glad you bring this problem up for discussion, something should be done to improve the situation.Personally, as I really dislike search_path and consider using it an anti-pattern.I would rather prefer a GUC to hard-code search_path to a constant default value of just ‘public’ that cannot be changed by anyone or any function. Trying to change it to a different value would raise an exception.
That would work, too! I think it's a nice idea, perhaps even better than what I proposed. I would be happy to see either one incorporated.
.m
On Sat, May 29, 2021, at 22:10, Marko Tiikkaja wrote:
On Sat, May 29, 2021 at 11:06 PM Joel Jacobson <joel@compiler.org> wrote:Glad you bring this problem up for discussion, something should be done to improve the situation.Personally, as I really dislike search_path and consider using it an anti-pattern.I would rather prefer a GUC to hard-code search_path to a constant default value of just ‘public’ that cannot be changed by anyone or any function. Trying to change it to a different value would raise an exception.That would work, too! I think it's a nice idea, perhaps even better than what I proposed. I would be happy to see either one incorporated.
Another idea would be to create an extension that removes the search_path feature entirely,
not sure though if the current hooks would allow creating such an extension.
Maybe "extensions" that only removes unwanted core features could be by convention be prefixed with "no_"?
CREATE EXTENSION no_search_path;
That way, a company with a company-wide policy against using search_path,
could add this to all their company .control extension files:
requires = 'no_search_path'
If some employee would try to `DROP EXTENSION no_search_path` they would get an error:
# DROP EXTENSION no_search_path;
ERROR: cannot drop extension no_search_path because other objects depend on it
DETAIL: extension acme_inc depends on extension no_search_path
This would be especially useful when a company has a policy to use some extension,
instead of relying on the built-in functionality provided.
I'm not using "zson" myself, but perhaps it could be a good example to illustrate my point:
Let's say a company has decided to use zson instead of json/jsonb,
the company would then ensure nothing is using json/jsonb
via the top-level .control file for the company's own extension:
requires = 'no_json, no_jsonb, zson'
Or if not shipping the company's product as an extension,
they could instead add this to the company's install script:
CREATE EXTENSION no_json;
CREATE EXTENSION no_jsonb;
CREATE EXTENSION zson;
Maybe this is out of scope for extensions, since I guess extensions are supposed to add features?
If so, how about a new separate command `CREATE REDUCTION` specifically to remove unwanted core features,
which then wouldn't need the "no_" prefix since it would be implicit and in a different namespace:
E.g.
CREATE REDUCTION search_path;
and
CREATE REDUCTION json;
CREATE REDUCTION jsonb;
CREATE EXTENSION zson;
/Joel
On Sun, May 30, 2021, at 08:51, Joel Jacobson wrote:
Maybe this is out of scope for extensions, since I guess extensions are supposed to add features?If so, how about a new separate command `CREATE REDUCTION` specifically to remove unwanted core features,which then wouldn't need the "no_" prefix since it would be implicit and in a different namespace:
Another idea would be to extract features that are considered deprecated/legacy into separate extensions,
and ship them pre-installed for compatibility reasons,
but this would allow uninstalling them using DROP EXTENSION,
similar to how e.g. "plpgsql" which is a pre-installed extension can be uninstalled.
(Except I wouldn't want to uninstall plpgsql, I think it's great! But I note it's the only pre-installed extension shipped with PostgreSQL, so it's a good example on the concept.)
/Joel
ne 30. 5. 2021 v 8:52 odesílatel Joel Jacobson <joel@compiler.org> napsal:
On Sat, May 29, 2021, at 22:10, Marko Tiikkaja wrote:On Sat, May 29, 2021 at 11:06 PM Joel Jacobson <joel@compiler.org> wrote:Glad you bring this problem up for discussion, something should be done to improve the situation.Personally, as I really dislike search_path and consider using it an anti-pattern.I would rather prefer a GUC to hard-code search_path to a constant default value of just ‘public’ that cannot be changed by anyone or any function. Trying to change it to a different value would raise an exception.That would work, too! I think it's a nice idea, perhaps even better than what I proposed. I would be happy to see either one incorporated.Another idea would be to create an extension that removes the search_path feature entirely,not sure though if the current hooks would allow creating such an extension.Maybe "extensions" that only removes unwanted core features could be by convention be prefixed with "no_"?CREATE EXTENSION no_search_path;That way, a company with a company-wide policy against using search_path,could add this to all their company .control extension files:
Maybe inverted design can work better - there can be GUC - "qualified_names_required" with a list of schemas without enabled implicit access.
The one possible value can be "all".
The advantage of this design can be the possibility of work on current extensions.
I don't think so search_path can be disabled - but there can be checks that disallow non-qualified names.
Pavel
requires = 'no_search_path'If some employee would try to `DROP EXTENSION no_search_path` they would get an error:# DROP EXTENSION no_search_path;ERROR: cannot drop extension no_search_path because other objects depend on itDETAIL: extension acme_inc depends on extension no_search_pathThis would be especially useful when a company has a policy to use some extension,instead of relying on the built-in functionality provided.I'm not using "zson" myself, but perhaps it could be a good example to illustrate my point:Let's say a company has decided to use zson instead of json/jsonb,the company would then ensure nothing is using json/jsonbvia the top-level .control file for the company's own extension:requires = 'no_json, no_jsonb, zson'Or if not shipping the company's product as an extension,they could instead add this to the company's install script:CREATE EXTENSION no_json;CREATE EXTENSION no_jsonb;CREATE EXTENSION zson;Maybe this is out of scope for extensions, since I guess extensions are supposed to add features?If so, how about a new separate command `CREATE REDUCTION` specifically to remove unwanted core features,which then wouldn't need the "no_" prefix since it would be implicit and in a different namespace:E.g.CREATE REDUCTION search_path;andCREATE REDUCTION json;CREATE REDUCTION jsonb;CREATE EXTENSION zson;/Joel
On Sun, May 30, 2021, at 09:54, Pavel Stehule wrote:
Maybe inverted design can work better - there can be GUC - "qualified_names_required" with a list of schemas without enabled implicit access.The one possible value can be "all".The advantage of this design can be the possibility of work on current extensions.I don't think so search_path can be disabled - but there can be checks that disallow non-qualified names.
I would prefer a pre-installed search_path-extension that can be uninstalled,
instead of yet another GUC, but if that's not an option, I'm happy with a GUC as well.
IMO, the current search_path default behaviour is a minefield.
For users like myself, who prefer a safer context-free name resolution behaviour, here is how I think it should work:
* The only schemas that don't require fully-qualified schemas are 'pg_catalog' and 'public'
* The $user schema feature is removed, i.e:
- $user is not part of the search_path
- objects are not created nor looked for in a $user schema if such a schema exists
- objects are always created in 'public' if a schema is not explicitly specified
* Temp objects always needs to be fully-qualified using 'pg_temp'
* 'pg_catalog' and 'public' are enforced to be completely disjoint.
That is, trying to create an object in 'public' that is in conflict with 'pg_catalog' would raise an error.
More ideas?
/Joel
út 1. 6. 2021 v 8:59 odesílatel Joel Jacobson <joel@compiler.org> napsal:
On Sun, May 30, 2021, at 09:54, Pavel Stehule wrote:Maybe inverted design can work better - there can be GUC - "qualified_names_required" with a list of schemas without enabled implicit access.The one possible value can be "all".The advantage of this design can be the possibility of work on current extensions.I don't think so search_path can be disabled - but there can be checks that disallow non-qualified names.I would prefer a pre-installed search_path-extension that can be uninstalled,instead of yet another GUC, but if that's not an option, I'm happy with a GUC as well.IMO, the current search_path default behaviour is a minefield.For users like myself, who prefer a safer context-free name resolution behaviour, here is how I think it should work:* The only schemas that don't require fully-qualified schemas are 'pg_catalog' and 'public'* The $user schema feature is removed, i.e:- $user is not part of the search_path- objects are not created nor looked for in a $user schema if such a schema exists- objects are always created in 'public' if a schema is not explicitly specified* Temp objects always needs to be fully-qualified using 'pg_temp'* 'pg_catalog' and 'public' are enforced to be completely disjoint.That is, trying to create an object in 'public' that is in conflict with 'pg_catalog' would raise an error.More ideas?
Operators use schemas too. I cannot imagine any work with operators with the necessity of explicit schemas.
Regards
Pavel
/Joel
On Tue, Jun 1, 2021, at 10:44, Pavel Stehule wrote:
Operators use schemas too. I cannot imagine any work with operators with the necessity of explicit schemas.
I thought operators are mostly installed in the public schema, in which case that wouldn't be a problem, or am I missing something here?
/Joel
út 1. 6. 2021 v 12:53 odesílatel Joel Jacobson <joel@compiler.org> napsal:
On Tue, Jun 1, 2021, at 10:44, Pavel Stehule wrote:Operators use schemas too. I cannot imagine any work with operators with the necessity of explicit schemas.I thought operators are mostly installed in the public schema, in which case that wouldn't be a problem, or am I missing something here?
/Joel
On Tue, Jun 1, 2021, at 12:55, Pavel Stehule wrote:
út 1. 6. 2021 v 12:53 odesílatel Joel Jacobson <joel@compiler.org> napsal:On Tue, Jun 1, 2021, at 10:44, Pavel Stehule wrote:Operators use schemas too. I cannot imagine any work with operators with the necessity of explicit schemas.I thought operators are mostly installed in the public schema, in which case that wouldn't be a problem, or am I missing something here?It is inconsistency - if I use schema for almost all, then can be strange to store operators just to public.
I don't agree. If an extension provides functionality that is supposed to be used by all parts of the system, then I think the 'public' schema is a good choice.
Using schemas only for the sake of separation, i.e. adding the schemas to the search_path, to make them implicitly available, is IMO an ugly hack, since if just wanting separation without fully-qualifying, then packaging the objects are separate extensions is much cleaner. That way you can easily see what objects are provided by each extension using \dx+.
/Joel
út 1. 6. 2021 v 13:13 odesílatel Joel Jacobson <joel@compiler.org> napsal:
On Tue, Jun 1, 2021, at 12:55, Pavel Stehule wrote:út 1. 6. 2021 v 12:53 odesílatel Joel Jacobson <joel@compiler.org> napsal:On Tue, Jun 1, 2021, at 10:44, Pavel Stehule wrote:Operators use schemas too. I cannot imagine any work with operators with the necessity of explicit schemas.I thought operators are mostly installed in the public schema, in which case that wouldn't be a problem, or am I missing something here?It is inconsistency - if I use schema for almost all, then can be strange to store operators just to public.I don't agree. If an extension provides functionality that is supposed to be used by all parts of the system, then I think the 'public' schema is a good choice.
I disagree
usual design of extensions (when schema is used) is
create schema ...
set schema ...
create table
create function
...
It is hard to say if it is good or it is bad. Orafce using my own schema, and some things are in public (and some in pg_catalog), and people don't tell me, so it was a good choice.
Regards
Pavel
Using schemas only for the sake of separation, i.e. adding the schemas to the search_path, to make them implicitly available, is IMO an ugly hack, since if just wanting separation without fully-qualifying, then packaging the objects are separate extensions is much cleaner. That way you can easily see what objects are provided by each extension using \dx+./Joel
On Tue, Jun 1, 2021, at 14:41, Pavel Stehule wrote:
út 1. 6. 2021 v 13:13 odesílatel Joel Jacobson <joel@compiler.org> napsal:I don't agree. If an extension provides functionality that is supposed to be used by all parts of the system, then I think the 'public' schema is a good choice.I disagreeusual design of extensions (when schema is used) iscreate schema ...set schema ...create tablecreate functionIt is hard to say if it is good or it is bad.
Yes, it's hard, because it's a matter of taste.
Some prefer convenience, others clarity/safety.
Orafce using my own schema, and some things are in public (and some in pg_catalog), and people don't tell me, so it was a good choice.
I struggle to understand this last sentence.
So you orafce extension installs objects in both public and pg_catalog, right.
But what do you mean with "people don't tell me"?
And what "was a good choice"?
Thanks for explaining.
/Joel
út 1. 6. 2021 v 17:57 odesílatel Joel Jacobson <joel@compiler.org> napsal:
On Tue, Jun 1, 2021, at 14:41, Pavel Stehule wrote:út 1. 6. 2021 v 13:13 odesílatel Joel Jacobson <joel@compiler.org> napsal:I don't agree. If an extension provides functionality that is supposed to be used by all parts of the system, then I think the 'public' schema is a good choice.I disagreeusual design of extensions (when schema is used) iscreate schema ...set schema ...create tablecreate functionIt is hard to say if it is good or it is bad.Yes, it's hard, because it's a matter of taste.Some prefer convenience, others clarity/safety.Orafce using my own schema, and some things are in public (and some in pg_catalog), and people don't tell me, so it was a good choice.I struggle to understand this last sentence.So you orafce extension installs objects in both public and pg_catalog, right.But what do you mean with "people don't tell me"?And what "was a good choice"?
I learned programming on Orafce, and I didn't expect any success, so I designed it quickly, and the placing of old Orafce's functions to schemas is messy.
I am sure, if I started again, I would never use pg_catalog or public schema. I think if somebody uses schema, then it is good to use schema for all without exceptions - but it expects usage of search_path. I am not sure if using public schema or using search_path are two sides of one thing.
Pavel
Thanks for explaining./Joel
On Tue, Jun 1, 2021, at 18:05, Pavel Stehule wrote:
I learned programming on Orafce, and I didn't expect any success, so I designed it quickly, and the placing of old Orafce's functions to schemas is messy.I am sure, if I started again, I would never use pg_catalog or public schema. I think if somebody uses schema, then it is good to use schema for all without exceptions - but it expects usage of search_path. I am not sure if using public schema or using search_path are two sides of one thing.
I think you're right they both try to provide solutions to the same problem, i.e. when wanting to avoid having to fully-qualify.
However, they are very different, and while I think the 'public' schema is a great idea, I think 'search_path' has some serious problems. I'll explain why:
'search_path' is a bit like a global variable in C, that can change the behaviour of the SQL commands executed.
It makes unqualified SQL code context-sensitive; you don't know by looking at a piece of code what objects are referred to, you also need to figure out what the active search_path is at this place in the code.
'public' schema if used (without ever changing the default 'search_path'), allows creating unqualified database objects, which I think can be useful in at least three situations:
1) when the application is a monolith inside a company, when there is only one version of the database, i.e. not having to worry about name collision with other objects in some other version, since the application is hidden in the company and the schema design is not exposed to the public
2) when installing a extension that uses schemas, when wanting the convenience of unqualified access to some functions frequently used, instead of adding its schema to the search_path for convenience, one can instead add wrapper-functions in the 'public' schema. This way, all internal functions in the extension, that are not meant to be executed by users, are still hidden in its schema and won't bother anyone (i.e. can't cause unexpected conflicts). Of course, access can also be controlled via REVOKE EXECUTE ... FROM PUBLIC for such internal functions, which is probably a good idea as well.
In a similar way, specific tables in the extension's schema can be made unqualified as well by adding simple views, installed in the public schema, if insisting on unqualified convenience.
In conclusion:
The main difference is 'public' makes it possible to make *specific* objects unqualified,
while 'search_path' makes *all* objects in such schema(s) unqualified.
/Joel
st 2. 6. 2021 v 8:45 odesílatel Joel Jacobson <joel@compiler.org> napsal:
On Tue, Jun 1, 2021, at 18:05, Pavel Stehule wrote:I learned programming on Orafce, and I didn't expect any success, so I designed it quickly, and the placing of old Orafce's functions to schemas is messy.I am sure, if I started again, I would never use pg_catalog or public schema. I think if somebody uses schema, then it is good to use schema for all without exceptions - but it expects usage of search_path. I am not sure if using public schema or using search_path are two sides of one thing.I think you're right they both try to provide solutions to the same problem, i.e. when wanting to avoid having to fully-qualify.However, they are very different, and while I think the 'public' schema is a great idea, I think 'search_path' has some serious problems. I'll explain why:'search_path' is a bit like a global variable in C, that can change the behaviour of the SQL commands executed.It makes unqualified SQL code context-sensitive; you don't know by looking at a piece of code what objects are referred to, you also need to figure out what the active search_path is at this place in the code.
sometimes this is wanted feature - some sharding is based on this
set search_path = 'custormerx'
...
'public' schema if used (without ever changing the default 'search_path'), allows creating unqualified database objects, which I think can be useful in at least three situations:1) when the application is a monolith inside a company, when there is only one version of the database, i.e. not having to worry about name collision with other objects in some other version, since the application is hidden in the company and the schema design is not exposed to the public2) when installing a extension that uses schemas, when wanting the convenience of unqualified access to some functions frequently used, instead of adding its schema to the search_path for convenience, one can instead add wrapper-functions in the 'public' schema. This way, all internal functions in the extension, that are not meant to be executed by users, are still hidden in its schema and won't bother anyone (i.e. can't cause unexpected conflicts). Of course, access can also be controlled via REVOKE EXECUTE ... FROM PUBLIC for such internal functions, which is probably a good idea as well.In a similar way, specific tables in the extension's schema can be made unqualified as well by adding simple views, installed in the public schema, if insisting on unqualified convenience.In conclusion:The main difference is 'public' makes it possible to make *specific* objects unqualified,while 'search_path' makes *all* objects in such schema(s) unqualified.
These arguments are valid, but I think so it is not all. If you remove search_path, then the "public" schema will be overused. I think we should ask - who can change the search path and how. Now, there are not any limits. I can imagine the situation when search_path can be changed by only some dedicated role - it can be implemented in a security definer function. Or another solution, we can fix the search path to one value, or only a few possibilities.
Maybe for your purpose is just enough to introduce syntax for defining all possibilities of search path:
search_path = "public" # now, just default
search_path = ["public"] # future - define vector of possible values of search path - in this case, only "public" is allowed - and if you want to change it, you should be database owner
or there can be hook for changing search_path, and it can be implemented dynamically in extension
Pavel
/Joel
On Wed, Jun 2, 2021, at 09:07, Pavel Stehule wrote:
st 2. 6. 2021 v 8:45 odesílatel Joel Jacobson <joel@compiler.org> napsal:'search_path' is a bit like a global variable in C, that can change the behaviour of the SQL commands executed.It makes unqualified SQL code context-sensitive; you don't know by looking at a piece of code what objects are referred to, you also need to figure out what the active search_path is at this place in the code.sometimes this is wanted feature - some sharding is based on thisset search_path = 'custormerx'
Oh, interesting, didn't know abou that one. Is that recommended best practise, or more of a hack?
I also think we can never get rid of search_path by default, since so much legacy depend on it.
But I think it would be good to provide a way to effectively uninstall the search_path for users who prefer to do so, in databases where it's possible, and where clarity and safety is desired.
'public' schema if used (without ever changing the default 'search_path'), allows creating unqualified database objects, which I think can be useful in at least three situations:1) when the application is a monolith inside a company, when there is only one version of the database, i.e. not having to worry about name collision with other objects in some other version, since the application is hidden in the company and the schema design is not exposed to the public2) when installing a extension that uses schemas, when wanting the convenience of unqualified access to some functions frequently used, instead of adding its schema to the search_path for convenience, one can instead add wrapper-functions in the 'public' schema. This way, all internal functions in the extension, that are not meant to be executed by users, are still hidden in its schema and won't bother anyone (i.e. can't cause unexpected conflicts). Of course, access can also be controlled via REVOKE EXECUTE ... FROM PUBLIC for such internal functions, which is probably a good idea as well.In a similar way, specific tables in the extension's schema can be made unqualified as well by adding simple views, installed in the public schema, if insisting on unqualified convenience.In conclusion:The main difference is 'public' makes it possible to make *specific* objects unqualified,while 'search_path' makes *all* objects in such schema(s) unqualified.These arguments are valid, but I think so it is not all. If you remove search_path, then the "public" schema will be overused.
What makes you think that? If a database object is to be accessed unqualified by all users, isn't the 'public' schema a perfect fit for it? How will it be helpful to create different database objects in different schemas, if also adding all such schemas to the search_path so they can be accessed unqualified? In such a scenario you risk unintentionally creating conflicting objects, and whatever schema happened to be first in the search_path will be resolved. Seems insecure and messy to me.
Much safer to install objects that you want to access unqualified in 'public', and get an error if you try to create a new object with a conflicting name of an existing one.
I think we should ask - who can change the search path and how. Now, there are not any limits. I can imagine the situation when search_path can be changed by only some dedicated role - it can be implemented in a security definer function. Or another solution, we can fix the search path to one value, or only a few possibilities.Maybe for your purpose is just enough to introduce syntax for defining all possibilities of search path:search_path = "public" # now, just defaultsearch_path = ["public"] # future - define vector of possible values of search path - in this case, only "public" is allowed - and if you want to change it, you should be database owneror there can be hook for changing search_path, and it can be implemented dynamically in extension
Not bad ideas. I think they would improve the situation. Maybe it could even be a global immutable constant value, the same for all users, that could only be set upon initdb, similar to how encoding can only be set via initdb.
initdb --search_path "pg_catalog, public, pg_temp" foobar
But perhaps the search_path as an uninstallable extension is a less invasive idea.
Looking at the code, this seems to be the commit that introduced search_path back in 2002:
I'm not sure how difficult it would be to extract search_path into an extension.
Doesn't look to be that much code. Here is the initial commit that introduced the concept.
But perhaps it's more complex today due to new dependencies.
commit 838fe25a9532ab2e9b9b7517fec94e804cf3da81
Author: Tom Lane <tgl@sss.pgh.pa.us>
Date: Mon Apr 1 03:34:27 2002 +0000
Create a new GUC variable search_path to control the namespace search
path. The default behavior if no per-user schemas are created is that
all users share a 'public' namespace, thus providing behavior backwards
compatible with 7.2 and earlier releases. Probably the semantics and
default setting will need to be fine-tuned, but this is a start.
But search_path is not the only problem. I think it's also a problem objects with the same identifies can be created in both pg_catalog and public. Can we think of a valid reason why it is a good idea to continue to allow that? In what real-life scenario is it needed?
/Joel
On Wed, Jun 2, 2021 at 3:46 PM Joel Jacobson <joel@compiler.org> wrote:
If a database object is to be accessed unqualified by all users, isn't the 'public' schema a perfect fit for it? How will it be helpful to create different database objects in different schemas, if also adding all such schemas to the search_path so they can be accessed unqualified? In such a scenario you risk unintentionally creating conflicting objects, and whatever schema happened to be first in the search_path will be resolved. Seems insecure and messy to me.
Heh. This is actually exactly what I wanted to do.
The use case is: version upgrades. I want to be able to have a search_path of something like 'pg_catalog, compat, public'. That way we can provide compatibility versions of newer functions in the "compat" schema, which get taken over by pg_catalog when running on a newer version. That way all the compatibility crap is clearly separated from the stuff that should be in "public".
.m
st 2. 6. 2021 v 14:46 odesílatel Joel Jacobson <joel@compiler.org> napsal:
On Wed, Jun 2, 2021, at 09:07, Pavel Stehule wrote:st 2. 6. 2021 v 8:45 odesílatel Joel Jacobson <joel@compiler.org> napsal:'search_path' is a bit like a global variable in C, that can change the behaviour of the SQL commands executed.It makes unqualified SQL code context-sensitive; you don't know by looking at a piece of code what objects are referred to, you also need to figure out what the active search_path is at this place in the code.sometimes this is wanted feature - some sharding is based on thisset search_path = 'custormerx'Oh, interesting, didn't know abou that one. Is that recommended best practise, or more of a hack?
I have not any statistics, but I think it was relatively common until we had good partitioning. I know two big customers from Czech Republic.
Some people use schema as a database - without overhead of system catalogue and without necessity of reconnects to other databases.
Using search_path is very common for applications ported from Oracle.
I also think we can never get rid of search_path by default, since so much legacy depend on it.But I think it would be good to provide a way to effectively uninstall the search_path for users who prefer to do so, in databases where it's possible, and where clarity and safety is desired.'public' schema if used (without ever changing the default 'search_path'), allows creating unqualified database objects, which I think can be useful in at least three situations:1) when the application is a monolith inside a company, when there is only one version of the database, i.e. not having to worry about name collision with other objects in some other version, since the application is hidden in the company and the schema design is not exposed to the public2) when installing a extension that uses schemas, when wanting the convenience of unqualified access to some functions frequently used, instead of adding its schema to the search_path for convenience, one can instead add wrapper-functions in the 'public' schema. This way, all internal functions in the extension, that are not meant to be executed by users, are still hidden in its schema and won't bother anyone (i.e. can't cause unexpected conflicts). Of course, access can also be controlled via REVOKE EXECUTE ... FROM PUBLIC for such internal functions, which is probably a good idea as well.In a similar way, specific tables in the extension's schema can be made unqualified as well by adding simple views, installed in the public schema, if insisting on unqualified convenience.In conclusion:The main difference is 'public' makes it possible to make *specific* objects unqualified,while 'search_path' makes *all* objects in such schema(s) unqualified.These arguments are valid, but I think so it is not all. If you remove search_path, then the "public" schema will be overused.What makes you think that? If a database object is to be accessed unqualified by all users, isn't the 'public' schema a perfect fit for it? How will it be helpful to create different database objects in different schemas, if also adding all such schemas to the search_path so they can be accessed unqualified? In such a scenario you risk unintentionally creating conflicting objects, and whatever schema happened to be first in the search_path will be resolved. Seems insecure and messy to me.Much safer to install objects that you want to access unqualified in 'public', and get an error if you try to create a new object with a conflicting name of an existing one.
I think people usually prefer simple solutions - like use for all public or use for all schemas.
I think we should ask - who can change the search path and how. Now, there are not any limits. I can imagine the situation when search_path can be changed by only some dedicated role - it can be implemented in a security definer function. Or another solution, we can fix the search path to one value, or only a few possibilities.Maybe for your purpose is just enough to introduce syntax for defining all possibilities of search path:search_path = "public" # now, just defaultsearch_path = ["public"] # future - define vector of possible values of search path - in this case, only "public" is allowed - and if you want to change it, you should be database owneror there can be hook for changing search_path, and it can be implemented dynamically in extensionNot bad ideas. I think they would improve the situation. Maybe it could even be a global immutable constant value, the same for all users, that could only be set upon initdb, similar to how encoding can only be set via initdb.initdb --search_path "pg_catalog, public, pg_temp" foobarBut perhaps the search_path as an uninstallable extension is a less invasive idea.Looking at the code, this seems to be the commit that introduced search_path back in 2002:I'm not sure how difficult it would be to extract search_path into an extension.Doesn't look to be that much code. Here is the initial commit that introduced the concept.But perhaps it's more complex today due to new dependencies.commit 838fe25a9532ab2e9b9b7517fec94e804cf3da81Author: Tom Lane <tgl@sss.pgh.pa.us>Date: Mon Apr 1 03:34:27 2002 +0000Create a new GUC variable search_path to control the namespace searchpath. The default behavior if no per-user schemas are created is thatall users share a 'public' namespace, thus providing behavior backwardscompatible with 7.2 and earlier releases. Probably the semantics anddefault setting will need to be fine-tuned, but this is a start.But search_path is not the only problem. I think it's also a problem objects with the same identifies can be created in both pg_catalog and public. Can we think of a valid reason why it is a good idea to continue to allow that? In what real-life scenario is it needed?
Probably it has not sense, but there is simple implementation - you can use just unique index(schema name, object name), and you don't need any other locks and checks
Pavel
/Joel
On Wed, Jun 02, 2021 at 02:46:08PM +0200, Joel Jacobson wrote: > > But perhaps the search_path as an uninstallable extension is a less invasive idea. I don't that that happening any time soon. An extension only adds SQL objects, it doesn't impact backend code. You can ship a module with your extension, but dropping an extension won't unload the module. And if it were then there's the *_preload_libraries that would totally nullify what you want. On top of that, it would also mean that the relation resolving could be changed by any other extension, which seems like a bad idea. > But search_path is not the only problem. I think it's also a problem objects > with the same identifies can be created in both pg_catalog and public. Can we > think of a valid reason why it is a good idea to continue to allow that? In > what real-life scenario is it needed? One somewhat acceptable use case is to replace catalog access with views to give access to some data e.g. some monitoring users. That's less a problem recently with the default roles, but still. There might be others.
On 2021-Jun-02, Marko Tiikkaja wrote: > The use case is: version upgrades. I want to be able to have a search_path > of something like 'pg_catalog, compat, public'. That way we can provide > compatibility versions of newer functions in the "compat" schema, which get > taken over by pg_catalog when running on a newer version. That way all the > compatibility crap is clearly separated from the stuff that should be in > "public". Can't you achieve that with "ALTER DATABASE .. SET search_path"? -- Álvaro Herrera Valdivia, Chile
On Wed, Jun 2, 2021, at 18:36, Marko Tiikkaja wrote:
On Wed, Jun 2, 2021 at 3:46 PM Joel Jacobson <joel@compiler.org> wrote:If a database object is to be accessed unqualified by all users, isn't the 'public' schema a perfect fit for it? How will it be helpful to create different database objects in different schemas, if also adding all such schemas to the search_path so they can be accessed unqualified? In such a scenario you risk unintentionally creating conflicting objects, and whatever schema happened to be first in the search_path will be resolved. Seems insecure and messy to me.Heh. This is actually exactly what I wanted to do.The use case is: version upgrades. I want to be able to have a search_path of something like 'pg_catalog, compat, public'. That way we can provide compatibility versions of newer functions in the "compat" schema, which get taken over by pg_catalog when running on a newer version. That way all the compatibility crap is clearly separated from the stuff that should be in "public".
That's a neat trick, probably the best solution in a really old PostgreSQL version, before we had extensions.
But if running a recent PostgreSQL version, with support for extensions, I think an even cleaner solution
would be to package such compatibility versions in a "compat" extension, that would just install them into the public schema.
Then, when upgrading, you would just not install the compat extension.
And if you wonder what functions in public come from the compat extension, you can use use \dx+.
/Joel
On Wed, Jun 2, 2021 at 10:20 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
On 2021-Jun-02, Marko Tiikkaja wrote:
> The use case is: version upgrades. I want to be able to have a search_path
> of something like 'pg_catalog, compat, public'. That way we can provide
> compatibility versions of newer functions in the "compat" schema, which get
> taken over by pg_catalog when running on a newer version. That way all the
> compatibility crap is clearly separated from the stuff that should be in
> "public".
Can't you achieve that with "ALTER DATABASE .. SET search_path"?
No, because I have a thousand SECURITY DEFINER functions which have to override search_path or they'd be insecure.
.m
On Wed, Jun 2, 2021 at 11:32 PM Joel Jacobson <joel@compiler.org> wrote:
On Wed, Jun 2, 2021, at 18:36, Marko Tiikkaja wrote:The use case is: version upgrades. I want to be able to have a search_path of something like 'pg_catalog, compat, public'. That way we can provide compatibility versions of newer functions in the "compat" schema, which get taken over by pg_catalog when running on a newer version. That way all the compatibility crap is clearly separated from the stuff that should be in "public".That's a neat trick, probably the best solution in a really old PostgreSQL version, before we had extensions.But if running a recent PostgreSQL version, with support for extensions, I think an even cleaner solutionwould be to package such compatibility versions in a "compat" extension, that would just install them into the public schema.
Writing, verifying and shipping extension upgrade scripts is not pleasant. I'd much prefer something that's integrated to the workflow I already have.
And if you wonder what functions in public come from the compat extension, you can use use \dx+.
They still show up everywhere when looking at "public". So this is only slightly better, and a maintenance burden.
.m
On Thu, Jun 3, 2021, at 00:55, Marko Tiikkaja wrote:
On Wed, Jun 2, 2021 at 11:32 PM Joel Jacobson <joel@compiler.org> wrote:But if running a recent PostgreSQL version, with support for extensions, I think an even cleaner solutionwould be to package such compatibility versions in a "compat" extension, that would just install them into the public schema.Writing, verifying and shipping extension upgrade scripts is not pleasant.
I agree. Thanks for acknowledging this problem.
I'm experimenting with an idea that I hope can simplify the "verifying" part of the problem.
hope to have something to show you all soon.
I'd much prefer something that's integrated to the workflow I already have.
Fair point. I guess also the initial switching cost of changing workflow is quite high and difficult to motivate. So even if extensions ergonomics are improved, many existing users will not migrate their workflows anyway due to this.
And if you wonder what functions in public come from the compat extension, you can use use \dx+.They still show up everywhere when looking at "public". So this is only slightly better, and a maintenance burden.
Good point. I find this annoying as well sometimes.
It's easy to get a list of all objects for an extension, via \dx+
But it's hard to see what objects in a schema, that are provided by different extensions, via e.g. \df public.*
What about adding a new "Extension" column next to "Schema" to the relevant commands, such as \df?
/Joel
čt 3. 6. 2021 v 8:14 odesílatel Joel Jacobson <joel@compiler.org> napsal:
On Thu, Jun 3, 2021, at 00:55, Marko Tiikkaja wrote:On Wed, Jun 2, 2021 at 11:32 PM Joel Jacobson <joel@compiler.org> wrote:But if running a recent PostgreSQL version, with support for extensions, I think an even cleaner solutionwould be to package such compatibility versions in a "compat" extension, that would just install them into the public schema.Writing, verifying and shipping extension upgrade scripts is not pleasant.I agree. Thanks for acknowledging this problem.I'm experimenting with an idea that I hope can simplify the "verifying" part of the problem.hope to have something to show you all soon.I'd much prefer something that's integrated to the workflow I already have.Fair point. I guess also the initial switching cost of changing workflow is quite high and difficult to motivate. So even if extensions ergonomics are improved, many existing users will not migrate their workflows anyway due to this.And if you wonder what functions in public come from the compat extension, you can use use \dx+.They still show up everywhere when looking at "public". So this is only slightly better, and a maintenance burden.Good point. I find this annoying as well sometimes.It's easy to get a list of all objects for an extension, via \dx+But it's hard to see what objects in a schema, that are provided by different extensions, via e.g. \df public.*What about adding a new "Extension" column next to "Schema" to the relevant commands, such as \df?
I think so for \df+ it can be very useful. I don't think it is important enough to be in short form, but it can be nice in enhanced form.
Pavel
/Joel
On Thu, Jun 3, 2021 at 9:14 AM Joel Jacobson <joel@compiler.org> wrote:
On Thu, Jun 3, 2021, at 00:55, Marko Tiikkaja wrote:They still show up everywhere when looking at "public". So this is only slightly better, and a maintenance burden.Good point. I find this annoying as well sometimes.It's easy to get a list of all objects for an extension, via \dx+But it's hard to see what objects in a schema, that are provided by different extensions, via e.g. \df public.*What about adding a new "Extension" column next to "Schema" to the relevant commands, such as \df?
That's just one part of it. The other part of my original proposal was to avoid having to SET search_path for all SECURITY DEFINER functions. I still think either being able to lock search_path or the separate prosecdef search_path is the best option here.
.m
čt 3. 6. 2021 v 17:54 odesílatel Marko Tiikkaja <marko@joh.to> napsal:
On Thu, Jun 3, 2021 at 9:14 AM Joel Jacobson <joel@compiler.org> wrote:On Thu, Jun 3, 2021, at 00:55, Marko Tiikkaja wrote:They still show up everywhere when looking at "public". So this is only slightly better, and a maintenance burden.Good point. I find this annoying as well sometimes.It's easy to get a list of all objects for an extension, via \dx+But it's hard to see what objects in a schema, that are provided by different extensions, via e.g. \df public.*What about adding a new "Extension" column next to "Schema" to the relevant commands, such as \df?That's just one part of it. The other part of my original proposal was to avoid having to SET search_path for all SECURITY DEFINER functions. I still think either being able to lock search_path or the separate prosecdef search_path is the best option here.
I agree so some possibility of locking search_path or possibility to control who and when can change it can increase security. This should be a core feature. It's maybe more generic issue - same functionality can be required for work_mem setting, maybe max_paralel_workers_per_gather, and other GUC
Regards
Pavel
.m
> On Jun 3, 2021, at 9:03 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > > I agree so some possibility of locking search_path or possibility to control who and when can change it can increase security.This should be a core feature. It's maybe more generic issue - same functionality can be required for work_mem setting,maybe max_paralel_workers_per_gather, and other GUC Chapman already suggested a mechanism in [1] to allow chaining together additional validators for GUCs. When setting search_path, the check_search_path(char **newval, void **extra, GucSource source) function is invoked. As Iunderstand Chapman's proposal, additional validators could be added to any GUC. You could implement search_path restrictionsby defining additional validators that enforce whatever restriction you like. Marko, does his idea sound workable for your needs? I understood your original proposal as only restricting the value ofsearch_path within security definer functions. This idea would allow you to restrict it everywhere, and not tailored tojust that context. [1] https://www.postgresql.org/message-id/608C9A81.3020006@anastigmatix.net — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Jun 3, 2021 at 7:30 PM Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> On Jun 3, 2021, at 9:03 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> I agree so some possibility of locking search_path or possibility to control who and when can change it can increase security. This should be a core feature. It's maybe more generic issue - same functionality can be required for work_mem setting, maybe max_paralel_workers_per_gather, and other GUC
Chapman already suggested a mechanism in [1] to allow chaining together additional validators for GUCs.
When setting search_path, the check_search_path(char **newval, void **extra, GucSource source) function is invoked. As I understand Chapman's proposal, additional validators could be added to any GUC. You could implement search_path restrictions by defining additional validators that enforce whatever restriction you like.
Marko, does his idea sound workable for your needs? I understood your original proposal as only restricting the value of search_path within security definer functions. This idea would allow you to restrict it everywhere, and not tailored to just that context.
Yeah, that would work for my use case just as well.
.m
čt 3. 6. 2021 v 18:30 odesílatel Mark Dilger <mark.dilger@enterprisedb.com> napsal:
> On Jun 3, 2021, at 9:03 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> I agree so some possibility of locking search_path or possibility to control who and when can change it can increase security. This should be a core feature. It's maybe more generic issue - same functionality can be required for work_mem setting, maybe max_paralel_workers_per_gather, and other GUC
Chapman already suggested a mechanism in [1] to allow chaining together additional validators for GUCs.
When setting search_path, the check_search_path(char **newval, void **extra, GucSource source) function is invoked. As I understand Chapman's proposal, additional validators could be added to any GUC. You could implement search_path restrictions by defining additional validators that enforce whatever restriction you like.
This design looks good for extensions, but I am not sure if it is good for users. Some declarative way without necessity to programming or install some extension can be nice.
Pavel
Marko, does his idea sound workable for your needs? I understood your original proposal as only restricting the value of search_path within security definer functions. This idea would allow you to restrict it everywhere, and not tailored to just that context.
[1] https://www.postgresql.org/message-id/608C9A81.3020006@anastigmatix.net
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
> On Jun 3, 2021, at 9:38 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > > This design looks good for extensions, but I am not sure if it is good for users. Some declarative way without necessityto programming or install some extension can be nice. I agree, though "some declarative way" is a bit vague. I've had ideas that perhaps superusers should be able to furtherrestrict the [min,max] fields of int and real GUCs. Since -1 is sometimes used to mean "disabled", syntax to allowspecifying a set might be necessary, something like [-1, 60..600]. For text and enum GUCs, perhaps a set of regexpswould work, some being required to match and others being required not to match, such as: search_path !~ '\mcustomerx\M' search_path ~ '^pg_catalog,' If we did something like this, we'd need it to play nicely with other filters provided by extensions, because I'm reasonablysure not all filters could be done merely using set notation and regular expression syntax. In fact, I find ithard to convince myself that set notation and regular expression syntax would even be useful in a large enough number ofcases to be worth implementing. What are your thought on that? — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
I thought everybody was already doing this, but maybe not. I put the following in all my function definitions:
SET search_path FROM CURRENT
(with the exception of a very few functions which explicitly need to use the caller's search path)
It seems to me that if this was the default (note: I'm totally ignoring backward compatibility issues for now), then most of these issues wouldn't exist. My schema creation scripts start with an appropriate search path setting and that value then gets built into every function they create.
Related question: how can function compilation work when the behaviour depends on the search path of the caller? In other words, the behaviour of the function can be totally different on each call. Are there any popular programming environments in which the behaviour of a called function depends on the caller's environment (actually yes: shell scripting, with $PATH especially; but besides that and stored procedures)?
I also want to mention that I consider any suggestion to eliminate the search_path concept as a complete non-starter. It would be no different from proposing that the next version of a programming language eliminate (or stop using) the module system. If I could make it happen easily, I would go in the other direction and allow schemas to be hierarchical (note: totally ignoring all sorts of very important choices which are more than just details about how this should work). I would like to be able to have an extension or subsystem exist in a single schema, with its objects broken up into schemas within the schema. Same reason as most languages have hierarchical module systems.
On Thu, 3 Jun 2021 at 14:25, Mark Dilger <mark.dilger@enterprisedb.com> wrote:
> On Jun 3, 2021, at 9:38 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> This design looks good for extensions, but I am not sure if it is good for users. Some declarative way without necessity to programming or install some extension can be nice.
I agree, though "some declarative way" is a bit vague. I've had ideas that perhaps superusers should be able to further restrict the [min,max] fields of int and real GUCs. Since -1 is sometimes used to mean "disabled", syntax to allow specifying a set might be necessary, something like [-1, 60..600]. For text and enum GUCs, perhaps a set of regexps would work, some being required to match and others being required not to match, such as:
search_path !~ '\mcustomerx\M'
search_path ~ '^pg_catalog,'
If we did something like this, we'd need it to play nicely with other filters provided by extensions, because I'm reasonably sure not all filters could be done merely using set notation and regular expression syntax. In fact, I find it hard to convince myself that set notation and regular expression syntax would even be useful in a large enough number of cases to be worth implementing. What are your thought on that?
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
čt 3. 6. 2021 v 20:25 odesílatel Mark Dilger <mark.dilger@enterprisedb.com> napsal:
> On Jun 3, 2021, at 9:38 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
> This design looks good for extensions, but I am not sure if it is good for users. Some declarative way without necessity to programming or install some extension can be nice.
I agree, though "some declarative way" is a bit vague. I've had ideas that perhaps superusers should be able to further restrict the [min,max] fields of int and real GUCs. Since -1 is sometimes used to mean "disabled", syntax to allow specifying a set might be necessary, something like [-1, 60..600]. For text and enum GUCs, perhaps a set of regexps would work, some being required to match and others being required not to match, such as:
search_path !~ '\mcustomerx\M'
search_path ~ '^pg_catalog,'
If we did something like this, we'd need it to play nicely with other filters provided by extensions, because I'm reasonably sure not all filters could be done merely using set notation and regular expression syntax. In fact, I find it hard to convince myself that set notation and regular expression syntax would even be useful in a large enough number of cases to be worth implementing. What are your thought on that?
I don't think so for immutable strings we need regular expressions. Maybe use some special keyword
search_path only "pg_catalog"
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
> On Jun 3, 2021, at 12:06 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > > > > čt 3. 6. 2021 v 20:25 odesílatel Mark Dilger <mark.dilger@enterprisedb.com> napsal: > > > > On Jun 3, 2021, at 9:38 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote: > > > > This design looks good for extensions, but I am not sure if it is good for users. Some declarative way without necessityto programming or install some extension can be nice. > > I agree, though "some declarative way" is a bit vague. I've had ideas that perhaps superusers should be able to furtherrestrict the [min,max] fields of int and real GUCs. Since -1 is sometimes used to mean "disabled", syntax to allowspecifying a set might be necessary, something like [-1, 60..600]. For text and enum GUCs, perhaps a set of regexpswould work, some being required to match and others being required not to match, such as: > > search_path !~ '\mcustomerx\M' > search_path ~ '^pg_catalog,' > > If we did something like this, we'd need it to play nicely with other filters provided by extensions, because I'm reasonablysure not all filters could be done merely using set notation and regular expression syntax. In fact, I find ithard to convince myself that set notation and regular expression syntax would even be useful in a large enough number ofcases to be worth implementing. What are your thought on that? > > I don't think so for immutable strings we need regular expressions. Maybe use some special keyword > > search_path only "pg_catalog" I think we're trying to solve different problems. I'm trying to allow non-superusers to set GUCs while putting constraintson what values they choose. You appear to be trying to revoke the ability to set a GUC by forcing it to onlyever have a single value. — Mark Dilger EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
čt 3. 6. 2021 v 21:11 odesílatel Mark Dilger <mark.dilger@enterprisedb.com> napsal:
> On Jun 3, 2021, at 12:06 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>
>
>
> čt 3. 6. 2021 v 20:25 odesílatel Mark Dilger <mark.dilger@enterprisedb.com> napsal:
>
>
> > On Jun 3, 2021, at 9:38 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> >
> > This design looks good for extensions, but I am not sure if it is good for users. Some declarative way without necessity to programming or install some extension can be nice.
>
> I agree, though "some declarative way" is a bit vague. I've had ideas that perhaps superusers should be able to further restrict the [min,max] fields of int and real GUCs. Since -1 is sometimes used to mean "disabled", syntax to allow specifying a set might be necessary, something like [-1, 60..600]. For text and enum GUCs, perhaps a set of regexps would work, some being required to match and others being required not to match, such as:
>
> search_path !~ '\mcustomerx\M'
> search_path ~ '^pg_catalog,'
>
> If we did something like this, we'd need it to play nicely with other filters provided by extensions, because I'm reasonably sure not all filters could be done merely using set notation and regular expression syntax. In fact, I find it hard to convince myself that set notation and regular expression syntax would even be useful in a large enough number of cases to be worth implementing. What are your thought on that?
>
> I don't think so for immutable strings we need regular expressions. Maybe use some special keyword
>
> search_path only "pg_catalog"
I think we're trying to solve different problems. I'm trying to allow non-superusers to set GUCs while putting constraints on what values they choose. You appear to be trying to revoke the ability to set a GUC by forcing it to only ever have a single value.
My proposal doesn't mean the search_path cannot be changed - it limits possible values like your patch. Maybe we can get inspiration from pg_hba.conf
—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Thu, Jun 3, 2021, at 20:42, Isaac Morland wrote:
I also want to mention that I consider any suggestion to eliminate the search_path concept as a complete non-starter.It would be no different from proposing that the next version of a programming language eliminate (or stop using) the module system.
I think the suggestion of making it possible (but not a default) to eliminate search_path,
is very similar to C compiler flags that turn specific language features into hard errors, such as "-Werror=vla".
If you know your C code base doesn't contain vla, you can compile with that compiler flag.
If you know your SQL code base doesn't makes use of search_path, nor any installed EXTENSIONs,
I'm suggesting it would be nice to have a way to effectively ensure that stays the case.
I realise "eliminate" is not really necessary, it would suffice to just allow setting a a sane default per database, and make that value immutable, then all data structures and code using wouldn't need to change, one would then only need to change the code that can mutate search_path, to prevent that from happening.
If I could make it happen easily, I would go in the other direction and allow schemas to be hierarchical (note: totally ignoring all sorts of very important choices which are more than just details about how this should work). I would like to be able to have an extension or subsystem exist in a single schema, with its objects broken up into schemas within the schema. Same reason as most languages have hierarchical module systems.
I note we already have a hierarchical extension system; EXTENSIONs can specify their dependencies (parents) via "requires" in the .control file. The entire hierarchical tree can then can be created/dropped using CASCADE.
I can possibly see some value in hierarchical schemas too, that is completely unrelated to my distaste for search_path.
I never felt I needed more than one namespace level, but I've only worked in companies with <1000 employees, so I can imagine it would be useful if the data needs for >100k employees needs to be organised in one and the same database. Is this how large companies organise their data? Or do they instead break up things into multiple databases?
Do we have some example of an extension that is complex enough where it would be good to organise it into multiple schema levels?
If reducing complexity by not using search_path, the complexity budget might afford hierarchical schemas, so I think the two ideas seem very compatible.
/Joel
Hi
I realise "eliminate" is not really necessary, it would suffice to just allow setting a a sane default per database, and make that value immutable, then all data structures and code using wouldn't need to change, one would then only need to change the code that can mutate search_path, to prevent that from happening.
I understand that for some specific cases the search_path can be problematic. On the other hand, the SQL database supports interactive work, and then the search_path can save a lot of monkey work.
It is the same as using the command line without the possibility to customize the PATH variable. The advantages and disadvantages are exactly the same.
Regards
Pavel
On Fri, Jun 4, 2021, at 08:58, Pavel Stehule wrote:
It is the same as using the command line without the possibility to customize the PATH variable. The advantages and disadvantages are exactly the same.
The reason why we even have PATH in the *nix world,
is not because they *wanted* to separate things (like we want with schemas or extensions),
but because they *needed* to, because /bin was overflowed:
"The UNIX shell gave up the Multics idea of a search path and looked for program names that weren’t
file names in just one place, /bin. Then in v3 /bin overflowed the small (256K), fast fixed-head drive.
Thus was /usr/bin born, and the idea of a search path reinstated." [1]
/Joel
pá 4. 6. 2021 v 11:17 odesílatel Joel Jacobson <joel@compiler.org> napsal:
On Fri, Jun 4, 2021, at 08:58, Pavel Stehule wrote:It is the same as using the command line without the possibility to customize the PATH variable. The advantages and disadvantages are exactly the same.The reason why we even have PATH in the *nix world,is not because they *wanted* to separate things (like we want with schemas or extensions),but because they *needed* to, because /bin was overflowed:"The UNIX shell gave up the Multics idea of a search path and looked for program names that weren’tfile names in just one place, /bin. Then in v3 /bin overflowed the small (256K), fast fixed-head drive.Thus was /usr/bin born, and the idea of a search path reinstated." [1]
It's funny - sometimes too restrictive limits are reason for design of longer living concepts
Pavel
/Joel
On Fri, Jun 4, 2021, at 11:45, Pavel Stehule wrote:
pá 4. 6. 2021 v 11:17 odesílatel Joel Jacobson <joel@compiler.org> napsal:On Fri, Jun 4, 2021, at 08:58, Pavel Stehule wrote:It is the same as using the command line without the possibility to customize the PATH variable. The advantages and disadvantages are exactly the same.The reason why we even have PATH in the *nix world,is not because they *wanted* to separate things (like we want with schemas or extensions),but because they *needed* to, because /bin was overflowed:"The UNIX shell gave up the Multics idea of a search path and looked for program names that weren’tfile names in just one place, /bin. Then in v3 /bin overflowed the small (256K), fast fixed-head drive.Thus was /usr/bin born, and the idea of a search path reinstated." [1]It's funny - sometimes too restrictive limits are reason for design of longer living conceptsPavel
Yes, it’s funny, I bet there is some English word for this phenomenon?
I just read an article discussing similar problems in *nix and found the extract below very interesting.
Maybe there are takeaways from this article that can inspire us, when thinking about PostgreSQL The Next 50 Years.
”Unix Shell Programming: The Next 50 Years
…
2 THE GOOD, THE BAD, AND THE UGLY
…
2.2 The Bad
…
U4: No support for contemporary deployments. The shell’s core abstractions were designed to facilitate orchestra- tion, management, and processing on a single machine. How- ever, the overabundance of non-solutions—e.g., pssh, GNU parallel, web interfaces—for these classes of computation on today’s distributed environments indicates an impedance mismatch between what the shell provides and the needs of these environments. This mismatch is caused by shell programs being pervasively side-effectful, and exacerbated by classic single-system image issues, where configuration scripts, program and library paths, and environment vari- ables are configured ad hoc. The composition primitives do not compose at scale.”
/Joel
Maybe this could work:
CREATE SCHEMA schema_name UNQUALIFIED;
Which would explicitly make all the objects created in the schema accessible unqualified, but also enforce there are no conflicts with other objects in existence in all unqualified schemas, upon the creation of new objects.
/Joel
CREATE SCHEMA schema_name UNQUALIFIED;
Which would explicitly make all the objects created in the schema accessible unqualified, but also enforce there are no conflicts with other objects in existence in all unqualified schemas, upon the creation of new objects.
/Joel
pá 4. 6. 2021 v 17:43 odesílatel Joel Jacobson <joel@compiler.org> napsal:
Maybe this could work:
CREATE SCHEMA schema_name UNQUALIFIED;
Which would explicitly make all the objects created in the schema accessible unqualified, but also enforce there are no conflicts with other objects in existence in all unqualified schemas, upon the creation of new objects.
Yes, it can work. I am not sure if "unqualified" is the best keyword, but the idea is workable.
Regards
Pavel
/Joel
On Fri, Jun 4, 2021, at 18:03, Pavel Stehule wrote:
pá 4. 6. 2021 v 17:43 odesílatel Joel Jacobson <joel@compiler.org> napsal:Maybe this could work:CREATE SCHEMA schema_name UNQUALIFIED;Which would explicitly make all the objects created in the schema accessible unqualified, but also enforce there are no conflicts with other objects in existence in all unqualified schemas, upon the creation of new objects.Yes, it can work. I am not sure if "unqualified" is the best keyword, but the idea is workable.
So maybe a combination of some kind of GUC to restrict search_path in some way,
and a settable/unsettable new boolean pg_namespace column
to control if the schema should be accessible unqualified or not?
If we don't like "UNQUALIFIED" as a keyword, maybe we could reuse "PUBLIC"?
Or will that be confusing since "PUBLIC" is also a role_specification?
I think unqualified=true should mean a schema doesn't need to be part of the search_path, to be accessible unqualified.
This means, "pg_catalog" and "public", might have unqualified=false, as their default values.
Setting unqualified=true for "pg_catalog" and "public" would enforce there are no overlapping objects between the two.
Marko, in your use-case with the "compat" schema, do you think it would work to just do
ALTER SCHEMA compat DROP UNQUALIFIED (or whatever the command should be)
when upgrading to the new major version, where compat isn't necessary,
similar to changing the GUC to not include "compat"?
IMO, the biggest disadvantage with this idea is that it undeniably increases complexity of name resolution further,
since it's then yet another thing to take into account. But maybe it's worth it, if the GUC to restrict search_path,
can effectively reduce complexity, when used in combination with this other proposed feature.
I think it's a really difficult question. I strongly feel something should be done in this area to improve the situation,
but it's far from obvious what the right thing to do is.
/Joel
On Fri, Jun 4, 2021 at 9:03 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:
pá 4. 6. 2021 v 17:43 odesílatel Joel Jacobson <joel@compiler.org> napsal:Maybe this could work:
CREATE SCHEMA schema_name UNQUALIFIED;
Which would explicitly make all the objects created in the schema accessible unqualified, but also enforce there are no conflicts with other objects in existence in all unqualified schemas, upon the creation of new objects.Yes, it can work. I am not sure if "unqualified" is the best keyword, but the idea is workable.
Sounds like a job for an event trigger listening to CREATE/ALTER schema.
David J.
On Mon, Jun 7, 2021 at 2:09 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Fri, Jun 4, 2021 at 9:03 AM Pavel Stehule <pavel.stehule@gmail.com> wrote:pá 4. 6. 2021 v 17:43 odesílatel Joel Jacobson <joel@compiler.org> napsal:Maybe this could work:
CREATE SCHEMA schema_name UNQUALIFIED;
Which would explicitly make all the objects created in the schema accessible unqualified, but also enforce there are no conflicts with other objects in existence in all unqualified schemas, upon the creation of new objects.Yes, it can work. I am not sure if "unqualified" is the best keyword, but the idea is workable.Sounds like a job for an event trigger listening to CREATE/ALTER schema.
Never mind...I got mixed up a bit on what this all is purporting to do. My intent was to try and solve the problem with existing features (event triggers) instead of inventing new ones, which is still desirable.
David J.
On Mon, Jun 7, 2021 at 1:55 PM Joel Jacobson <joel@compiler.org> wrote:
If we don't like "UNQUALIFIED" as a keyword, maybe we could reuse "PUBLIC"?Or will that be confusing since "PUBLIC" is also a role_specification?
For me the concept resembles explicitly denoting certain schemas as being simple tags, while the actual "namespace" is the GLOBAL namespace. Today there is no global namespace, all schemas generate their own individual namespace in addition to "tagging" their objects with a textual label.
Avoiding "public" is highly desirable.
To access a global object you should be able to still specify its schema tag. Unqualified means "use search_path"; and "use search_path" includes global. But there is a truth table waiting to be created to detail what combinations result in errors (including where those errors occur - runtime or creation time).
David J.
On Mon, Jun 7, 2021, at 23:26, David G. Johnston wrote:
On Mon, Jun 7, 2021 at 1:55 PM Joel Jacobson <joel@compiler.org> wrote:If we don't like "UNQUALIFIED" as a keyword, maybe we could reuse "PUBLIC"?Or will that be confusing since "PUBLIC" is also a role_specification?For me the concept resembles explicitly denoting certain schemas as being simple tags, while the actual "namespace" is the GLOBAL namespace. Today there is no global namespace, all schemas generate their own individual namespace in addition to "tagging" their objects with a textual label.Avoiding "public" is highly desirable.To access a global object you should be able to still specify its schema tag. Unqualified means "use search_path"; and "use search_path" includes global. But there is a truth table waiting to be created to detail what combinations result in errors (including where those errors occur - runtime or creation time).
+1
/Joel