Re: search_path improvements - Mailing list pgsql-hackers

From Josh Berkus
Subject Re: search_path improvements
Date
Msg-id 4A22E4B4.9060304@agliodbs.com
Whole thread Raw
In response to Re: search_path improvements WAS: search_path vs extensions  (Greg Stark <stark@enterprisedb.com>)
Responses Re: search_path improvements  (Greg Stark <stark@enterprisedb.com>)
List pgsql-hackers
Greg,

> What's the point of "namespaces" if not to implement visibility? The
> interesting thing to do would be to hide all the internal foo
> functions in a foo.* schema and only put the external api in public.

That is an interesting idea.  However, what our real users are really 
doing in the field is more diverse. (see below)

> That way you can't accidentally call an internal foo function or have
> a name conflict between two internal functions. The external api could
> even just be a bunch of thin wrappers around the implementation
> functions in foo.* (what Oracle calls public synonyms).

This assumes that all users should have access to the same public APIs 
as all other users.  Real applications are more complex.

In my experience of PostgreSQL applications, people use schema for three 
different reasons:

Organization/Maintainability: when you have hundreds or thousands of 
database objects, you want "folders" to put them in just so that you can 
keep track of them and view them in easy-to-digest groups, just as you 
deal with files in a filesystem.  DBAs no more want to put everything in 
one big flat namespace, even if the individual names are unique and the 
permissions are the same, than we want to have all of the PostgreSQL 
source code in one big directory.Further, these schema names generally indicate something about the 
purpose of the objects in them: "cms","matviews", "reports","calendar".  When accurate, these schema names aid the DBA
inmaintaining and 
 
troubleshooting the application, and are more convenient than hungarian 
notation schemes.

Visibility: some applications use schema to hide objects from roles 
which shouldn't see them: "inner","cronjobs","acl", whether for data 
hiding or just to keep "private" functions and tables separate from what 
the application accesses directly.  However, this approach is not very 
common *because of* the awkwardness and overhead of search_path; DBAs 
are constantly troubleshooting search_path omissions and errors and 
eventually give up on visibility rules, making all schema visible to all 
users.This gets even more difficult when you consider that in a large complex 
application with multiple ROLEs, not all ROLEs should see all schema, 
but what an individual user can access might be a list of schema which 
represent some-but-not-all schema.  The lack of a convenient 
"search_path_add" or "SET ROLE ... WITH DEFAULTS" makes this an 
unmanageable mess; DBAs find themselves constantly ALTERing each user's 
search_path individually.

Security: schema provide convenient containers to lock up groups of 
objects by role.  "admin", "permissions" etc. schemas combine visibility 
and USE restrictions to make sql injection much harder, and 
administrative tasks are supported by objects in schema not accessible 
to the "webuser".

As I said before, schema conbine 3 purposes: organization, visibility 
and security, into one structure.  Which is why it's difficult to make 
them work perfectly for all 3 purposes.  We could, however, make them 
work better.

>> c) the ability as superuser to have my own "special schema" which are always
>> in the search path, as pg_catalog and $user_temp are.*
>
>> * if you're not sure why someone would want this, consider
>> information_schema.  If your application depends on I_S to work, how do you
>> make sure it's always in every user's search_path?
>
> Uhm, wouldn't you just refer to information_schema.foo? What if some
> other part of your application depends on information_schema *not*
> being in your path? Using global state for this seems destined to
> leave you with something broken that can't be fixed without breaking
> something else.

Easily said for someone who doesn't have to adapt a 3rd-party vendor 
application or support real users on the phone.  Insisting that all of 
your application developers remember to type "information_schema." all 
of the time really makes them love the DBA.  Mostly, this simply results 
in people not using information_schema, and instead using their own 
home-grown system view scripts, which are often wrong.

However, if we had push/pop/shift/unshift for search_path, the need for 
search_path_suffix would be considerably diminished, since application 
code (& DBAs) would use push/pop instead of replacing the entire 
search_path.

> Hm, I'm beginning to think extensions need to have search_path set on
> every function or have every object reference everywhere be explicitly
> pg_extension.* (and/or _private_.* like my earlier suggestion).

Again, I'm not talking about Extensions.  I think that Extensions are 
completely orthagonal to search_path, hence the change of subject line.  I'm talking about making search_path (and
schema)more useful to DBAs 
 
and application designers.

>> e) having roles somehow inherit search_path on a SET ROLE***
>
> Grr. I'm still bitter about "su" doing that on some systems without
> "su -". I think I've lost that battle though and I'm forever doomed to
> never know what "su" will do on a new system.

As previously discussed, this would work via something like SET ROLE ... 
WITH DEFAULTS, rather than with just SET ROLE.  We don't want to break 
backwards compatibility.

-- 
Josh Berkus
PostgreSQL Experts Inc.
www.pgexperts.com


pgsql-hackers by date:

Previous
From: Andrew Chernow
Date:
Subject: Re: check for missing tablespaces?
Next
From: Bruce Momjian
Date:
Subject: Re: pg_migrator and an 8.3-compatible tsvector data type