RFD: schemas and different kinds of Postgres objects

From: Tom Lane
Continuing to think about implementing SQL schemas for 7.3 ...

Today's topic for discussion: which types of Postgres objects should
belong to schemas, and which ones should have other name scopes?

Relations (tables, indexes, views, sequences) clearly belong to schemas.
Since each relation has an associated datatype with the same name, it
seems that datatypes must belong to schemas as well.  (Even if that
argument doesn't convince you, SQL99 says that user-defined datatypes
belong to schemas.)  However the situation is murkier for other kinds of
objects.

Here are all the kinds of named objects that exist in Postgres today,
with some comments on whether they should belong to schemas or not:

relations               Must be in schemas
types                   Must be in schemas
databases               Databases contain schemas, not vice versa
users                   Users are cross-database, so not in schemas
groups                  User groups are cross-database, so not in schemas
languages               Probably should not be in schemas
access methods          Probably should not be in schemas
opclasses               See below
operators               See below
functions/procedures    See below
aggregates              Should treat same as regular functions
constraints             See below
rules                   See below
triggers                See below
NOTIFY conditions       See below

Languages and access methods are not trivial to add to the system, so
there's not much risk of name conflicts, and no reason to make their name
scope less than global.

The situation is a lot murkier for operators and functions.  These should
probably be treated alike, since operators are just syntactic sugar for
functions.  I think the basic argument for making them schema-local is
that different users might conceivably want to define conflicting
functions or operators of the same name.  Against that, however, there
are a number of reasons for wanting to keep these objects database-wide.
First off there are syntactic problems.  Do you really want to write
        A schemaname.+ B
to qualify an ambiguous "+" operator?  Looks way too much like a syntax
error to me.  Allowing this would probably turn a lot of simple syntax
errors into things that get past the grammar and end up producing truly
confusing error messages.  Qualified function names also pose some
problems, not so much with
        schemaname.function(args)
which seems reasonable, but with the Berkeley-derived syntax that allows
"foo.function" to mean "function(foo)" --- there's no way to squeeze a
schema-name for the function into that.  (And you'll recall from my note
of the other day that we don't want to abandon this syntax entirely,
because people would like us to support "sequencename.nextval" for Oracle
compatibility.)  Notice that we are not forced to make functions/operators
schema-local just because datatypes are, because overloading will save the
day.  func(schema1.type1) and func(schema2.type1) are distinct functions
because the types are different, even if they live in the same function
namespace.  Finally, SQL99 doesn't appear to think that operator and
function names are schema-local; though that may just be because it hasn't
got user-defined operators AFAICT.
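
To make that overloading point concrete, here is a minimal sketch; the
schema, type, and function names are invented for illustration, and the
schema-qualification syntax shown is of course only hypothetical at this
point:

    CREATE SCHEMA acct;
    CREATE SCHEMA geom;
    -- two distinct types that happen to share an unqualified name
    CREATE TABLE acct.box (lo int4, hi int4);
    CREATE TABLE geom.box (x1 float8, y1 float8, x2 float8, y2 float8);
    -- two functions in a single database-wide namespace; they do not
    -- collide because their argument types differ
    CREATE FUNCTION area(acct.box) RETURNS int4
        AS 'SELECT $1.hi - $1.lo' LANGUAGE 'sql';
    CREATE FUNCTION area(geom.box) RETURNS float8
        AS 'SELECT ($1.x2 - $1.x1) * ($1.y2 - $1.y1)' LANGUAGE 'sql';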

I am leaning towards keeping functions/operators database-wide, but would
like to hear comments.  Is there any real value in, eg, allowing different
users to define different "+" operators *on the same datatypes*?

Not sure about index opclasses.  Given that datatype names are
schema-local, one can think of scenarios where two users define similar
datatypes and then try to use the same index opclass name for both.
But it seems pretty unlikely.  I'd prefer to leave opclass names
database-wide for simplicity.  Comments?

As for constraints, currently we only support table-level constraints,
and we do not enforce any uniqueness at all on their names; multiple
constraints for the same table can have the same name (and if so, ALTER
TABLE DROP CONSTRAINT drops all matching names).  SQL92 requires named
constraints to have names that are unique within their schema, which is
okay for standalone assertions (which we haven't got) but seems quite
unnecessary for constraints attached to tables.  And what's really odd,
it appears to allow a table constraint to belong to a different schema
than the table it is on!  This is pretty bogus.  I'd prefer to ignore the
part of the spec that says that table constraint names can be qualified
names, and either keep our existing behavior or require constraint names
to be unique per-table.  Thoughts?

Rewrite rules are currently required to have a name unique within their
database.  We clearly don't want that to still be true in the schema
environment.  Personally I would like to make rules' names unique only
among rules on the same table (like we handle triggers).  That would
mean an incompatible change in the syntax of DROP RULE: it'd have to
become DROP RULE rule ON table, much like DROP TRIGGER.  Is that okay?
If not, probably we must make rulenames local to schemas and say they
implicitly belong to the schema of the associated table.
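
Concretely (rule and table names invented for illustration), instead of
today's

    DROP RULE myrule;

one would have to write

    DROP RULE myrule ON mytable;

mirroring the existing DROP TRIGGER mytrigger ON mytable syntax.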

Triggers are already handled as being named uniquely among the triggers
of the same table.  This behavior is fine with me, and doesn't need to
be changed for schema support.

I can see some advantage to considering NOTIFY condition names to be local
to a schema, but I can also see risks of breaking existing applications.
Currently, "NOTIFY foo" will signal to "LISTEN foo" even if the two
processes belong to different users.  With an implicit schema qualifier
attached to foo, very likely this would fail to work.  Since NOTIFY names
aren't officially registered anywhere, the implicit qualifier would have
to correspond to the front schema of one's schema search path, and there'd
be no way for such processes to communicate if their search paths didn't
match.  I think people would end up always qualifying NOTIFY names with
a single schema name, which means we might as well continue to consider
them global.  On the other hand, if you assume that NOTIFY names are often
the names of tables, it'd make sense to allow them to be qualified.  Any
thoughts on this?
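
For reference, the current cross-user behavior that is at risk is simply
this (the condition and user names are just examples):

    -- session 1, connected as user alice
    LISTEN foo;
    -- session 2, connected as user bob
    NOTIFY foo;
    -- session 1 receives the notification, because "foo" is a global name

Any implicit schema qualification would have to preserve that, or else
both sides would have to qualify explicitly.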
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From: Mike Mascari
Tom Lane wrote:
> 
> Continuing to think about implementing SQL schemas for 7.3 ...
> 
> Today's topic for discussion: which types of Postgres objects should
> belong to schemas, and which ones should have other name scopes?
...
> 
> I am leaning towards keeping functions/operators database-wide, but would
> like to hear comments.  Is there any real value in, eg, allowing different
> users to define different "+" operators *on the same datatypes*?

With regard to functions, I believe they should be schema-specific.
Oracle allows the creation of procedures/functions in a specific schema.
User2 may then execute user1's function as:

EXECUTE user1.myfunction();

However, as you suggest, the fully qualified naming of functions gets
messy. So Oracle allows (and I think we would need) PUBLIC SYNONYMs.
This allows user1 to do:

CREATE TABLE employees(key integer, name VARCHAR(20));

CREATE SEQUENCE s;

CREATE PROCEDURE newemployee(n IN VARCHAR)
AS 
BEGIN
INSERT INTO employees
SELECT s.nextval, n
FROM DUAL;
END;
/

GRANT INSERT ON employees TO user2;
GRANT EXECUTE ON newemployee TO user2;
CREATE PUBLIC SYNONYM newemployee FOR user1.newemployee;

Now, user2 just does:

EXECUTE newemployee('Smith');


In fact, with regard to the package discussion a while back, Oracle
allows this:

Database->Schema->Package->Procedure

and this:

Database->Schema->Procedure

and effectively this:

Database->Procedure via Database->PUBLIC Schema->Procedure

I really think that the main purpose of schemas is to prevent an
ill-informed or malicious user from engaging in unacceptable behavior.
By placing everything in schemas, it allows the Oracle DBA to have a
very fine-grained control over the ability of user1 to interfere with
user2.  Before user1 above could pollute the global namespace, the dba
must have granted the CREATE PUBLIC SYNONYM privilege:

GRANT CREATE PUBLIC SYNONYM TO user1;

or created the synonym himself.  This allows things like
pg_class to reside within their own schema, as well as all built-in
PostgreSQL functions. After the bootstrapping, PUBLIC SYNONYMs are
created for all of the system objects which should have global scope:

CREATE PUBLIC SYNONYM pg_class FOR system.pg_class;
CREATE PUBLIC SYNONYM abs(int) FOR system.abs(int);

One major benefit of Oracle is that the DBA, through the use of
STATEMENT privileges (e.g. GRANT CREATE TABLE TO user1), resource
PROFILEs, and TABLESPACEs, can easily administer a database used by 20
different departments and 1000 different users without the fear that one
might step on another's toes.  If the accounting department wants to
create an addtax() function, it shouldn't have to ask the receiving
department to do so.

Just my thoughts,

Mike Mascari
mascarm@mascari.com


Re: RFD: schemas and different kinds of Postgres objects

From: Peter Eisentraut
Tom Lane writes:

> languages        Probably should not be in schemas
> access methods        Probably should not be in schemas
> opclasses        See below
> operators        See below
> functions/procedures    See below
> aggregates        Should treat same as regular functions
> constraints        See below
> rules            See below
> triggers        See below
> NOTIFY conditions    See below

Remember that a schema is a named representation of ownership, so anything
that can be owned must be in a schema.  (Unless you want to invent a
parallel universe for a different kind of ownership, which would be
incredibly confusing.)  Also remember that we wanted to use schemas as a
way to prevent unprivileged users from creating anything by default.  So
it would be much simpler if "everything" were in a schema.

I wouldn't worry so much about the invocation syntax -- if you don't like
ugly don't make ugly.  For instance, if you add a user-defined operator
you would probably either put it in the same schema with the rest of your
project or put it in some sort of a global or default schema (to be
determined) to make it available to the whole system, my assumption being
that either way you don't need to qualify the operator.  But the important
thing is that you *could* make cross-schema operator calls, say during
development or testing.

Consequently, I opine that all of the things listed above should be in
a schema.  (Although I don't have a strong opinion about notifications,
yet.)

> namespace.  Finally, SQL99 doesn't appear to think that operator and
> function names are schema-local; though that may just be because it hasn't
> got user-defined operators AFAICT.

Check clause 10.4 <routine invocation>: routine names are (optionally)
schema qualified like everything else.

> Rewrite rules are currently required to have a name unique within their
> database.  We clearly don't want that to still be true in the schema
> environment.  Personally I would like to make rules' names unique only
> among rules on the same table (like we handle triggers).  That would
> mean an incompatible change in the syntax of DROP RULE: it'd have to
> become DROP RULE rule ON table, much like DROP TRIGGER.  Is that okay?
> If not, probably we must make rulenames local to schemas and say they
> implicitly belong to the schema of the associated table.

I'd rather make the opposite change (make trigger names schema-global)
because that aligns with SQL99 and it would make more sense for overall
consistency (e.g., you can't have indexes with the same names on different
tables).  The syntax change would also be backward compatible.  I think
either way, schema-global or table-global namespace, can be argued to be
more useful or more confusion-prone, so I side with the standard on this
one.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: RFD: schemas and different kinds of Postgres objects

From: Tom Lane
Peter Eisentraut <peter_e@gmx.net> writes:
> Remember that a schema is a named representation of ownership, so anything
> that can be owned must be in a schema.  (Unless you want to invent a
> parallel universe for a different kind of ownership, which would be
> incredibly confusing.)

I don't buy that premise.  It's true that SQL92 equates ownership of a
schema with ownership of the objects therein, but AFAICS we have no hope
of being forward-compatible with existing database setups (wherein there
can be multiple tables of different ownership all in a single namespace)
if we don't allow varying ownership within a schema.  I think we can
arrange things so that we are upward compatible with both SQL92 and
the old way.  Haven't worked out details yet though.

Have to run, more later.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From: Peter Harvey
FYI: Applications like Data Architect would benefit from a consistent and
complete interface to the schema.  For example, I found that we had to bypass
the DD views which exist (as I recall) because they did not give us all the
information we needed.  So we selected stuff from the system tables directly.
Yucky.  Sorry, I can't recall the details, but I thought I would mention it
here.  The MySQL 'SHOW' statements seem to work pretty well and shield us
from changes to the system tables.

Peter

> I don't buy that premise.  It's true that SQL92 equates ownership of a
> schema with ownership of the objects therein, but AFAICS we have no hope
> of being forward-compatible with existing database setups (wherein there
> can be multiple tables of different ownership all in a single namespace)
> if we don't allow varying ownership within a schema.  I think we can
> arrange things so that we are upward compatible with both SQL92 and
> the old way.  Haven't worked out details yet though.
>
> Have to run, more later.


Re: RFD: schemas and different kinds of Postgres objects

From: Fernando Nasser
Tom Lane wrote:
> 
> Peter Eisentraut <peter_e@gmx.net> writes:
> > Remember that a schema is a named representation of ownership, so anything
> > that can be owned must be in a schema.  (Unless you want to invent a
> > parallel universe for a different kind of ownership, which would be
> > incredibly confusing.)
> 
> I don't buy that premise.  It's true that SQL92 equates ownership of a
> schema with ownership of the objects therein, but AFAICS we have no hope
> of being forward-compatible with existing database setups (wherein there
> can be multiple tables of different ownership all in a single namespace)
> if we don't allow varying ownership within a schema.  I think we can
> arrange things so that we are upward compatible with both SQL92 and
> the old way.  Haven't worked out details yet though.
> 

Peter is right.  Schemas are just a practical way of creating things
under the same authorization-id, plus creating a namespace so that different
authorization-ids can have objects with the same (unqualified) name.

Quoting Date (pg. 221): "The schema authID for a given schema identifies
the owner of that schema (and hence the owner of everything described by
that schema also)."

It is very important that we reach a conclusion on this as it simplifies
things a lot.

Regards,
Fernando


P.S.: That is why I was telling you that, except for the namespace part,
we already have the groundwork for Entry-level SQL-Schemas (where the 
schema is always the authorization-id of the creator) -- it is just
a question of handling the "owner" appropriately.


-- 
Fernando Nasser
Red Hat Canada Ltd.                     E-Mail:  fnasser@redhat.com
2323 Yonge Street, Suite #300
Toronto, Ontario   M4P 2C9


Re: RFD: schemas and different kinds of Postgres objects

From: Tom Lane
Fernando Nasser <fnasser@redhat.com> writes:
> Tom Lane wrote:
>> I don't buy that premise.  It's true that SQL92 equates ownership of a
>> schema with ownership of the objects therein, but AFAICS we have no hope
>> of being forward-compatible with existing database setups (wherein there
>> can be multiple tables of different ownership all in a single namespace)
>> if we don't allow varying ownership within a schema.

> Quoting Date (pg. 221): "The schema authID for a given schema identifies
> the owner of that schema (and hence the owner of everything described by
> that schema also)."

Yes, I know what the spec says.  I also think we'll have a revolt on our
hands if we don't make it possible for existing Postgres applications to
continue working as they have in the past --- and that means allowing
tables of different ownerships to be accessible in a single namespace.

Although I haven't thought through the details yet, it seems to me that
a solution exists along these lines:

1. The creator of an object owns it.  (With some special cases, eg the
   superuser should be able to create a schema owned by someone else.)

2. Whether you can create an object in a schema that is owned by someone
   else depends on permissions attached to the schema.  By default only
   the owner of a schema can create anything in it.

3. SQL92-compatible behavior is achieved when everyone has their own
   schema and they don't grant each other create-in-schema rights.
   Backwards-compatible behavior is achieved when everyone uses a
   shared "public" schema.

We'd probably need GUC variable(s) to make it possible to choose which
behavior is the default.  I haven't thought much about exactly what
knobs should be provided.  I do think we will want at least these two
knobs:

1. A "search path" that is an ordered list of schemas to look in  when trying to resolve an unqualified name.

2. A "default schema" variable that identifies the schema to create  objects in, if a fully qualified name is not
given.

The default creation location shouldn't be hardwired to equal the
front of the search path, because the front item of the search path
is probably always going to be a backend-local temporary schema
(this is where we'll create temporary tables).

The most dumbed-down version of this that would work is to reduce the
search path to just a fixed list of three locations: temp schema, a
selectable default schema (which is also the default creation location),
and a system schema (where pg_class and friends live).  But a
user-settable path wouldn't be any more effort to support, and might
offer some useful capability.
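
Purely as an illustrative sketch (the variable names and SET syntax here
are invented; nothing is decided), the two knobs might eventually look
something like session settings:

    SET search_path = temp_schema, public, pg_system;  -- ordered lookup list
    SET default_schema = public;          -- where unqualified CREATEs go

with the SQL92-style behavior obtained by pointing both at the user's own
schema, and the historical behavior by pointing both at a shared schema.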
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From: Tom Lane
Fernando Nasser <fnasser@redhat.com> writes:
> But then it is not SQL-Schemas.  Call it something else, "packages"
> for instance.  The standard has lots of rules and other considerations
> all around the document that depend on schemas having the meaning
> assigned to them.

Where?  And are there any cases where it really matters?

> If someone wants to really make use of SQL-Schemas, he/she will need to 
> reorg the database anyway, which will probably mean dumping the data,
> massaging the DDL and recreating it.  I guess most users of SQL-Schemas
> will be people creating new databases.

No doubt.  That still leaves us with the problem of providing
backward-compatible behavior in an engine that is going to be designed
to support schemas.  I'm not sure what you think the implementation of
schemas is going to look like --- but I think it's not going to be
something that can be turned off or ignored.  Every table is going to
belong to some schema, and the old behavior has to be available within
that framework.

We are not working in a vacuum here, and that means that "implement
the specification and nothing but" is not a workable design approach.
We are going to end up with something that does the things SQL92 asks
for, but does other things too.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From: Fernando Nasser
Tom Lane wrote:
> 
> > Quoting Date (pg. 221): "The schema authID for a given schema identifies
> > the owner of that schema (and hence the owner of everything described by
> > that schema also)."
> 
> Yes, I know what the spec says.  I also think we'll have a revolt on our
> hands if we don't make it possible for existing Postgres applications to
> continue working as they have in the past --- and that means allowing
> tables of different ownerships to be accessible in a single namespace.
> 

But then it is not SQL-Schemas.  Call it something else, "packages"
for instance.  The standard has lots of rules and other considerations
all around the document that depend on schemas having the meaning
assigned to them.

If someone wants to really make use of SQL-Schemas, he/she will need to 
reorg the database anyway, which will probably mean dumping the data,
massaging the DDL and recreating it.  I guess most users of SQL-Schemas
will be people creating new databases.

For the current users (based on your idea below), a default behavior of
searching the current-AuthID schema, then the "default" schema, then "any"
schema will probably make things work.

Fernando

P.S.: Note that the standard has no GRANTs for SCHEMAs themselves; all
GRANTS go to the specific objects as before.


> Although I haven't thought through the details yet, it seems to me that
> a solution exists along these lines:
> 
> 1. The creator of an object owns it.  (With some special cases, eg the
>    superuser should be able to create a schema owned by someone else.)
> 
> 2. Whether you can create an object in a schema that is owned by someone
>    else depends on permissions attached to the schema.  By default only
>    the owner of a schema can create anything in it.
> 
> 3. SQL92-compatible behavior is achieved when everyone has their own
>    schema and they don't grant each other create-in-schema rights.
>    Backwards-compatible behavior is achieved when everyone uses a
>    shared "public" schema.
> 
> We'd probably need GUC variable(s) to make it possible to choose which
> behavior is the default.  I haven't thought much about exactly what
> knobs should be provided.  I do think we will want at least these two
> knobs:
> 
> 1. A "search path" that is an ordered list of schemas to look in
>    when trying to resolve an unqualified name.
> 
> 2. A "default schema" variable that identifies the schema to create
>    objects in, if a fully qualified name is not given.
> 
> The default creation location shouldn't be hardwired to equal the
> front of the search path, because the front item of the search path
> is probably always going to be a backend-local temporary schema
> (this is where we'll create temporary tables).
> 
> The most dumbed-down version of this that would work is to reduce the
> search path to just a fixed list of three locations: temp schema, a
> selectable default schema (which is also the default creation location),
> and a system schema (where pg_class and friends live).  But a
> user-settable path wouldn't be any more effort to support, and might
> offer some useful capability.
> 
>                         regards, tom lane
> 

-- 
Fernando Nasser
Red Hat Canada Ltd.                     E-Mail:  fnasser@redhat.com
2323 Yonge Street, Suite #300
Toronto, Ontario   M4P 2C9


Re: RFD: schemas and different kinds of Postgres objects

From: Fernando Nasser
OK, so the proposal is that we dissociate the ownership from the
namespace when we implement our version of SQL-Schemas, right?
This way an object will have both an owner and a schema (while in
the standard they are the same thing).

The important thing is not so much to accommodate someone who is creating
schemas (a new thing, so objects can be created the "right" way)
but rather to accommodate current code that does not use schemas
and has several owners for the objects (which would all fall
into a "default" schema).  Can you agree with that?

I was looking to see if, with the proper defaults, search paths,
and "default" schema, we could make it look to SQL-compliant code
as if it were a vanilla SQL-Schemas implementation.

To support the current database schema definitions (without SQL-Schemas
and with different owners), things would be created with the current
user authorization id and would have their names defined in the "default"
SQL-Schema namespace.  Also, when referring to an object, if the name is
not qualified with the schema name, the search would go through the schema
with the current authorization id (as required by the std) and proceed to
check the "default" schema.

The only problem in the scenario above is that the standard says that
when creating objects and not specifying the schema, the schema name
should be assumed to be the current user authorization id (or whatever
authorization id the code is running as).  In our case it would go to
the default schema.  If someone wants the SQL std behavior then, he/she
must create things inside a CREATE SCHEMA statement or explicitly qualify
the objects being created with the schema name.  Can we live with that?
Will we pass the conformance tests?  (I saw tests that test the schema
name that is assumed when referencing, but I do not recall seeing one that
tests what is assumed on creation -- things I saw were all created inside
CREATE SCHEMA statements.  Note, also, that passing the NIST tests doesn't
make us compliant if we know that we are doing something different than
what is specified -- it just means that we got away with it :-)


-- 
Fernando Nasser
Red Hat Canada Ltd.                     E-Mail:  fnasser@redhat.com
2323 Yonge Street, Suite #300
Toronto, Ontario   M4P 2C9


Re: RFD: schemas and different kinds of Postgres objects

From: Tom Lane
Fernando Nasser <fnasser@redhat.com> writes:
> The only problem in the scenario above is that the standard says that 
> when creating objects and not specifying the schema the schema name
> should be assumed to be the current user authorization id (or whatever
> authorization id the code is running as).  In our case it would go to
> the default schema.  If someone wants the SQL std behavior then, he/she
> must create things inside a CREATE SCHEMA statement or explicitly
> qualify
> with the schema name the objects being created.  Can we live with
> that?

Huh?  You seem to be assuming that we need to support both the
historical Postgres behavior and the SQL-standard behavior with exactly
the same configuration switches.  That's not how I'm seeing it at all.
The way I'm envisioning it, you could get either the historical
behavior, or the standard's behavior, depending on how you set up the
configuration variables.  I don't see any particular reason to limit the
system to just those two cases, either, if the underlying implementation
has enough flexibility to support custom namespace configurations.

I believe that we could get the historical behavior with something like

    schema search path = ("public" schema, system schema);
    default creation schema = "public" schema

and the standard's behavior with something like

    schema search path = (user's own schema, system schema);
    default creation schema = user's own schema

(ignoring the issue of a schema for temp tables for the moment).

If you prefer to think of these things as "namespaces" rather than
"schemas", that's fine with me ... what we're talking about here
is an implementation that can support SQL-style schemas, but isn't
narrowly able to do only that.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From: Fernando Nasser
Tom Lane wrote:
> 
> Huh?  You seem to be assuming that we need to support both the
> historical Postgres behavior and the SQL-standard behavior with exactly
> the same configuration switches.  That's not how I'm seeing it at all.
> The way I'm envisioning it, you could get either the historical
> behavior, or the standard's behavior, depending on how you set up the
> configuration variables.  

Then we can live just with the schema being the ownership.

Switches set to standard:
  schema search path = ("user's own schema", postgres)
  [ default creation schema = user's own schema ]  (same as below, so we
  don't need this switch)

Switches set to historical:
  schema search path = (user's own schema, "any" schema, postgres)
  [ default creation schema = user's own schema ]

The searching in "any" schema (i.e., any owner) will let will find 
things that where defined the way they are today, i.e., possibly
by several different users.


P.S.: You can even add the "default" schema in the standard case and
I believe you are still compliant and can handle things easier:

  schema search path = ("user's own schema", postgres)


Maybe you could give an example of a case where the schema meaning
ownership breaks things.  Or what kind of additional things you have
in mind that would require orthogonal schema and ownership spaces.


Regards,
Fernando




-- 
Fernando Nasser
Red Hat Canada Ltd.                     E-Mail:  fnasser@redhat.com
2323 Yonge Street, Suite #300
Toronto, Ontario   M4P 2C9


Re: RFD: schemas and different kinds of Postgres objects

From: Tom Lane
Fernando Nasser <fnasser@redhat.com> writes:
> Switches set to historical:

>   schema search path = (user's own schema, "any" schema, postgres)

>   [ default creation schema = user's own schema ]

> The searching in "any" schema (i.e., any owner) will let will find 
> things that where defined the way they are today, i.e., possibly
> by several different users.

No, it won't, because nothing will ever get put into that schema.
(At least not by existing pg_dump scripts, which are the things that
really need to see the historical behavior.)  The default-creation-schema
variable has got to point at any/public/whatever-we-call-it, or you do
not have the historical behavior.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From: Tom Lane
Fernando Nasser <fnasser@redhat.com> writes:
> In the historical mode:  look into schema B (=> not found), look into
> ANY schema (finds it in A).  Works as it is today.

No, it doesn't work the same as today, because in that implementation
both A and B can create the same tablename without complaint.  It then
becomes very unclear which instance other people will get (unless your
"any" placeholder somehow implies a search order).

The idea of being able to put an "any" placeholder into the search list
is an interesting one, though.  If we can resolve the ambiguity problem
it might be a useful feature.

I am a little troubled by the idea of placing "any" before the system
schema (what if JRandomLuser creates a table named "pg_class"?) but it
might be workable at the tail end of the path.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From: Fernando Nasser
Tom Lane wrote:
> 
> Fernando Nasser <fnasser@redhat.com> writes:
> > Switches set to historical:
> 
> >   schema search path = (user's own schema, "any" schema, postgres)
> 
> >   [ default creation schema = user's own schema ]
> 
> > The searching in "any" schema (i.e., any owner) will let will find
> > things that where defined the way they are today, i.e., possibly
> > by several different users.
> 
> No, it won't, because nothing will ever get put into that schema.
> (At least not by existing pg_dump scripts, which are the things that
> really need to see the historical behavior.)  The
> default-creation-schema variable has got to point at any/public/
> whatever-we-call it, or you do not have the historical behavior.
> 

You did not understand what I meant by "any".  It is not a schema
called "any".  It is _any_ schema.

Example:

A creates a table (does not specify the schema) so it gets into
the schema named A (as per the standard).

B refers to the table without qualifying it...

In the standard case:  look into schema B (=> not found), not in
postgres either.  "ERROR: Inv. relation" -- as the standard requires.

In the historical mode:  look into schema B (=> not found), look into
ANY schema (finds it in A).  Works as it is today.


Note that I only suggest looking in B first (in the historical case) because
this will allow for the coexistence of the current mode with a quasi-compliant
use of SQL-Schemas.  You only need to change the switch if you want strict
compliance.



-- 
Fernando Nasser
Red Hat Canada Ltd.                     E-Mail:  fnasser@redhat.com
2323 Yonge Street, Suite #300
Toronto, Ontario   M4P 2C9


Re: RFD: schemas and different kinds of Postgres objects

From: Fernando Nasser
Tom Lane wrote:
> 
> Fernando Nasser <fnasser@redhat.com> writes:
> > In the historical mode:  look into schema B (=> not found), look into
> > ANY schema (finds it in A).  Works as it is today.
> 
> No, it doesn't work the same as today, because in that implementation
> both A and B can create the same tablename without complaint.

I agree that we won't be able to catch this as an error unless we turn on
another switch that requires unique names (there goes one of the advantages
of having schemas, but there is always the option of leaving it on).

In this case it would be closer to the current behavior, but what is left
of the SQL-Schemas would be little more than syntactic sugar (although it
could still be used by the DBA to better organize the grant of privileges).

Anyway, it would be a DBA option to live with not detecting duplicate
names.  And I hope all our tools, graphical or not, will make it clear
what schema things are defined in, so it would not be difficult to
figure out what is going wrong if something goes wrong (and we can also
print the relation oid in messages).

>  It then
> becomes very unclear which instance other people will get (unless your
> "any" placeholder somehow implies a search order).
> 

If someone is just using the current mode, there shouldn't be any
ambiguity (all names are database-unique).

The only case where this situation can happen is if someone is trying
to use schemas and the historical non-schema organization in the same
database, right?  (Can we make the search order per-database?)

One possibility is to state that this is not recommended (one should 
organize things as schemas or not at all in a database) and say that 
the search order, besides the current AuthId, is unspecified (random). 

Another possibility is to allow only one object with that name in the
"any" space.  If someone means an object that was defined on a schema,
he/she can qualify the name with the schema (good practice).  The only
case where this is not possible is the legacy case, where there is 
exactly one object with that name anyway.

I prefer this second solution.

> The idea of being able to put an "any" placeholder into the search list
> is an interesting one, though.  If we can resolve the ambiguity problem
> it might be a useful feature.
> 

See above.

> I am a little troubled by the idea of placing "any" before the system
> schema (what if JRandomLuser creates a table named "pg_class"?) but it
> might be workable at the tail end of the path.
> 

Yes, I thought of that as I was typing, but it was not the important
point at that time.  You're right, it should go at the end.

-- 
Fernando Nasser
Red Hat Canada Ltd.                     E-Mail:  fnasser@redhat.com
2323 Yonge Street, Suite #300
Toronto, Ontario   M4P 2C9


Re: RFD: schemas and different kinds of Postgres objects

From: Peter Eisentraut
Tom Lane writes:

> No, it doesn't work the same as today, because in that implementation
> both A and B can create the same tablename without complaint.  It then
> becomes very unclear which instance other people will get (unless your
> "any" placeholder somehow implies a search order).

The "search any schema" switch is only intended for use with legacy
databases, where duplicate names don't occur anyway.  If someone uses it
with a new schema-using database design, then he kind of ought to know
that the switch probably doesn't make a whole lot of sense.  However, to
get reproducible behaviour anyway we can just define a search order, such
as by schema name.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: RFD: schemas and different kinds of Postgres objects

From: Peter Eisentraut
Tom Lane writes:

> I don't buy that premise.  It's true that SQL92 equates ownership of a
> schema with ownership of the objects therein, but AFAICS we have no hope
> of being forward-compatible with existing database setups (wherein there
> can be multiple tables of different ownership all in a single namespace)
> if we don't allow varying ownership within a schema.

We could have a Boolean knob that says "if you don't find the object in
the default schema, search all other schemas".  That should provide all
the backward compatibility we need.  Moreover, I figure if we do it that
way, the whole schema implementation reduces itself mostly to parser work,
no complicated system catalog changes, no complex overhaul of the
privilege system -- at least initially.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: RFD: schemas and different kinds of Postgres objects

From: Tom Lane
Peter Eisentraut <peter_e@gmx.net> writes:
> Moreover, I figure if we do it that
> way, the whole schema implementation reduces itself mostly to parser work,
> no complicated system catalog changes, no complex overhaul of the
> privilege system -- at least initially.

Why are you guys so eager to save me work?  I'm not in the least
interested in implementing a "schema" feature that can only handle
the entry-level user == schema case.  Therefore, just relabeling the
owner column as schema isn't an interesting option.

I really don't see what's wrong with building a namespace mechanism
that is orthogonal to ownership and then using that to implement what
SQL92 wants.  I think this will be cleaner, simpler, and more flexible
than trying to equate ownership with namespace.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From: Tom Lane
Peter Eisentraut <peter_e@gmx.net> writes:
> Tom Lane writes:
>> No, it doesn't work the same as today, because in that implementation
>> both A and B can create the same tablename without complaint.  It then
>> becomes very unclear which instance other people will get (unless your
>> "any" placeholder somehow implies a search order).

> The "search any schema" switch is only intended for use with legacy
> databases, where duplicate names don't occur anyway.

That's a mighty narrow view of the world.  Do you think that people had
better convert to SQL schemas before they ever again create a table?
The fact is that ordinary non-schema-aware usage will certainly lead to
the above scenario.

> that the switch probably doesn't make a whole lot of sense.  However, to
> get reproduceable behaviour anyway we can just define a search order, such
> as by schema name.

Or say that you get an "ambiguous reference" error if there is more than
one possible candidate in the "any" namespace.  (Although that opens the
door for innocent creation of a table foo by one user to break other
people's formerly-working queries that reference some other foo.)
Bottom line for me is that this is an untried concept.  I think the
concept of an "any" searchlist entry is risky enough that I don't much
want to hang the entire usability of the implementation on the
assumption that we won't find any fatal problems with "any".


However, the argument over whether SQL92's concept of ownership should
be taken as gospel is not really the argument I wanted to have in this
thread.  Is it possible to go back to the original point concerning
whether there should be different namespace boundaries for different
types of objects?  You aren't going to avoid those issues by saying that
namespace == ownership is good enough.

I'm particularly troubled by the idea of trying to apply this "any"
lookup concept to resolution of overloaded operators and functions.
Suppose I have a reference func(type1,type2) that I'm trying to resolve,
and I have an inexact match (one requiring coercion) in my own schema.
Do I look to the "any" schema to see if there are better matches?
If so, what happens if the "any" schema contains multiple possibilities
with identical signatures (presumably created by different users)?  ISTM
this will positively guarantee a resolution failure, since there's no
way for the resolver to prefer one over another.  Thus, by creating
a "func(foo,bar)" function --- quite legally --- JRandomLuser might
break other people's formerly working queries that use other functions
named func.  Although it's possible for this to happen now, it'll be
a lot more surprising if JRandomLuser thinks that his functions live
in his own private schema namespace.

I'm thinking that the overloading concept is not going to play well
at all with multiple namespaces for functions or operators, and that
we'd be best off to say that there is only one namespace (per database)
for these things.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From: "Ross J. Reedstrom"
On Tue, Jan 22, 2002 at 06:31:08PM -0500, Tom Lane wrote:
> Peter Eisentraut <peter_e@gmx.net> writes:
> > Moreover, I figure if we do it that
> > way, the whole schema implementation reduces itself mostly to parser work,
> > no complicated system catalog changes, no complex overhaul of the
> > privilege system -- at least initially.
> 
> Why are you guys so eager to save me work?  I'm not in the least
> interested in implementing a "schema" feature that can only handle
> the entry-level user == schema case.  Therefore, just relabeling the
> owner column as schema isn't an interesting option.
> 
> I really don't see what's wrong with building a namespace mechanism
> that is orthogonal to ownership and then using that to implement what
> SQL92 wants.  I think this will be cleaner, simpler, and more flexible
> than trying to equate ownership with namespace.
> 

I'm with Tom on this: the extended Schema capability he's described
combines the best of both mechanisms, IMHO.  It gives us namespace
separation a la the SQL standard, and individual object ownership,
like unix FS semantics.  Only having ownership on the 'containers'
strikes me as limiting, even if that is how the standard describes it.

Ross


Re: RFD: schemas and different kinds of Postgres objects

From: Mike Mascari
Tom Lane wrote:
> 
> I'm particularly troubled by the idea of trying to apply this "any"
> lookup concept to resolution of overloaded operators and functions.
> Suppose I have a reference func(type1,type2) that I'm trying to resolve,
> and I have an inexact match (one requiring coercion) in my own schema.
> Do I look to the "any" schema to see if there are better matches?
> If so, what happens if the "any" schema contains multiple possibilities
> with identical signatures (presumably created by different users)?  ISTM
> this will positively guarantee a resolution failure, since there's no
> way for the resolver to prefer one over another.  Thus, by creating
> a "func(foo,bar)" function --- quite legally --- JRandomLuser might
> break other people's formerly working queries that use other functions
> named func.  Although it's possible for this to happen now, it'll be
> a lot more surprising if JRandomLuser thinks that his functions live
> in his own private schema namespace.
>

So, in a nutshell, the price we pay for function overloading is the
inability to have schema-specific functions. Right? Possibly why Oracle
doesn't allow function overloading? As a user, I'd much rather have
schema-specific functions than only global. I'm not downplaying the
value of function overloading, but if I had the choice (which I guess I
can't/won't), I'd choose schema-specific functions over function
overloading...

Mike Mascari
mascarm@mascari.com


Re: RFD: schemas and different kinds of Postgres objects

From: Bill Studenmund
On Mon, 21 Jan 2002, Tom Lane wrote:

> Continuing to think about implementing SQL schemas for 7.3 ...
>
> Today's topic for discussion: which types of Postgres objects should
> belong to schemas, and which ones should have other name scopes?
>
> Relations (tables, indexes, views, sequences) clearly belong to schemas.
> Since each relation has an associated datatype with the same name, it
> seems that datatypes must belong to schemas as well.  (Even if that
> argument doesn't convince you, SQL99 says that user-defined datatypes
> belong to schemas.)  However the situation is murkier for other kinds of
> objects.
>
> Here are all the kinds of named objects that exist in Postgres today,
> with some comments on whether they should belong to schemas or not:
>
> relations        Must be in schemas
> types            Must be in schemas
> databases        Databases contain schemas, not vice versa
> users            Users are cross-database, so not in schemas
> groups            User groups are cross-database, so not in schemas
> languages        Probably should not be in schemas
> access methods        Probably should not be in schemas
> opclasses        See below
> operators        See below
> functions/procedures    See below
> aggregates        Should treat same as regular functions
> constraints        See below
> rules            See below
> triggers        See below
> NOTIFY conditions    See below
>
> Languages and access methods are not trivial to add to the system, so
> there's not much risk of name conflicts, and no reason to make their name
> scope less than global.
>
> The situation is a lot murkier for operators and functions.  These should
> probably be treated alike, since operators are just syntactic sugar for
> functions.  I think the basic argument for making them schema-local is
> that different users might conceivably want to define conflicting
> functions or operators of the same name.  Against that, however, there
> are a number of reasons for wanting to keep these objects database-wide.
> First off there are syntactic problems.  Do you really want to write
>         A schemaname.+ B
> to qualify an ambiguous "+" operator?  Looks way too much like a syntax
> error to me.  Allowing this would probably turn a lot of simple syntax
> errors into things that get past the grammar and end up producing truly
> confusing error messages.  Qualified function names also pose some
> problems, not so much with
>         schemaname.function(args)
> which seems reasonable, but with the Berkeley-derived syntax that allows
> "foo.function" to mean "function(foo)" --- there's no way to squeeze a
> schema-name for the function into that.  (And you'll recall from my note

Why not? What's wrong with either schema.foo.function (==>
function(schema.foo)) or foo.schema.function (==> schema.function(foo))?
Tables and functions can't have the same names as schemas, so we will be
able to notice the schema names in there. Oh, and I worked out how to get
parse.y to be happy with x.y & x.y.z (schema.package.function) names. :-)

> of the other day that we don't want to abandon this syntax entirely,
> because people would like us to support "sequencename.nextval" for Oracle
> compatibility.)  Notice that we are not forced to make functions/operators
> schema-local just because datatypes are, because overloading will save the
> day.  func(schema1.type1) and func(schema2.type1) are distinct functions
> because the types are different, even if they live in the same function
> namespace.  Finally, SQL99 doesn't appear to think that operator and
> function names are schema-local; though that may just be because it hasn't
> got user-defined operators AFAICT.

Actually functions do have to be schema-local.  It's in the spec (I don't
have the exact reference with me).

> I am leaning towards keeping functions/operators database-wide, but would
> like to hear comments.  Is there any real value in, eg, allowing different
> users to define different "+" operators *on the same datatypes*?

Yes. It means that third-party developers can develop routines and then
operators based on them without having to worry about conflicts. Obviously
these two different operators would have to be in different schemas. Also,
it would mean that someone could ship replacement operators for built-in
operators. Say adding a + operator which throws exceptions on overflow.
:-)

> Not sure about index opclasses.  Given that datatype names are
> schema-local, one can think of scenarios where two users define similar
> datatypes and then try to use the same index opclass name for both.
> But it seems pretty unlikely.  I'd prefer to leave opclass names
> database-wide for simplicity.  Comments?

My vote would be to make them schema-specific. As Peter pointed out,
schemas are how you own things, so put them in a schema so we can keep
track of ownership.

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From: Peter Eisentraut
Tom Lane writes:

> I really don't see what's wrong with building a namespace mechanism
> that is orthogonal to ownership and then using that to implement what
> SQL92 wants.  I think this will be cleaner, simpler, and more flexible
> than trying to equate ownership with namespace.

OK, I can accept that.  But then I want to get back at my original point,
namely that all database objects (except users and groups) should be in
schemas.  This is also cleaner, simpler, and more flexible.  There is
clearly demand for schema-local functions.  So I think that designing this
system from the premise that a schema-qualified operator call will look
strange is the wrong end to start at.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: RFD: schemas and different kinds of Postgres objects

From: Tom Lane
Bill Studenmund <wrstuden@netbsd.org> writes:
> Why not? What's wrong with either schema.foo.function (==>
> function(schema.foo)) or foo.schema.function (==> schema.function(foo))?

Neither is wrong in isolation, but how do you tell the difference?
More to the point, given input x.y.z, how do you tell which component
is what?

> Tables and functions can't have the same names as schemas,

News to me.  Where is that written on stone tablets?  Even if that's
considered an acceptable limitation from a purely functional point of
view, I don't like using it to disambiguate input.  The error messages
you'll get from incorrect input to an implementation that depends on
that to disambiguate cases will not be very helpful.

> Actually functions do have to be schema local. It's in the spec (don't
> have exactly where with me).

(A) I don't believe that; please cite chapter and verse; (B) even if
SQL92 thinks that's okay, we can't do it that way because of
backwards-compatibility issues.

> My vote would be to make them schema-specific. As Peter pointed out,
> schemas are how you own things,

Sorry, but this line of argument is trying to assume the very point in
dispute.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From: Peter Eisentraut
Tom Lane writes:

> > Actually functions do have to be schema local. It's in the spec (don't
> > have exactly where with me).
>
> (A) I don't believe that; please cite chapter and verse;

In SQL99, chapter 4 verse 23 it says

"An SQL-invoked routine is an element of an SQL-schema and is called a
schema-level routine."

> (B) even if
> SQL92 thinks that's okay, we can't do it that way because of
> backwards-compatibility issues.

I don't buy that.  If all you're looking for is preserving

foo.bar  <==>  bar(foo)

for compatibility, then you can simply say that "bar" cannot be
schema-qualified in the left form (so it needs to live in the current or
the default schema).  We currently only have one default schema, so that's
backward compatible.  I think this syntax is a mistake, so I don't feel
compelled to provide more than backwards compatibility.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: RFD: schemas and different kinds of Postgres objects

From: "Zeugswetter Andreas SB SD"
> > Switches set to historical:
> 
> >   schema search path = (user's own schema, "any" schema, postgres)
> 
> >   [ default creation schema = user's own schema ]
> 
> > The searching in "any" schema (i.e., any owner) will let will find 
> > things that where defined the way they are today, i.e., possibly
> > by several different users.
> 
> No, it won't, because nothing will ever get put into that schema.
> (At least not by existing pg_dump scripts, which are the things that
> really need to see the historical behavior.)  The
> default-creation-schema variable has got to point at any/public/
> whatever-we-call it, or you do not have the historical behavior.

When configured for historical behavior we would need to:
1. have search path: temp, any, system
2. guard against duplicate table names across all schemas (except the temp schema)

Or are you thinking about a per-session behavior?
I would rather envision a per-database behavior.

Maybe the easy way out would be a "default creation schema" property for
each user, that would default to the username.  If you want everything in one
schema, simply alter the users.

Andreas


Re: RFD: schemas and different kinds of Postgres objects

From: "Zeugswetter Andreas SB SD"
> I don't buy that.  If all you're looking for is preserving
> 
> foo.bar  <==>  bar(foo)
> 
> for compatibility, then you can simply say that "bar" cannot be
> schema-qualified in the left form (so it needs to live in the current or
> the default schema).  We currently only have one default schema, so that's
> backward compatible.  I think this syntax is a mistake, so I don't feel
> compelled to provide more than backwards compatibility.

This syntax is actually my favorite :-) I use it heavily for calculated
columns. I don't feel it is a mistake.
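
For anyone who hasn't seen that usage, a minimal sketch (the table and
function names are made up):

    CREATE TABLE emp (salary int4, bonus int4);
    CREATE FUNCTION total_pay(emp) RETURNS int4
        AS 'SELECT $1.salary + $1.bonus' LANGUAGE 'sql';
    SELECT emp.total_pay FROM emp;   -- same as SELECT total_pay(emp) FROM emp;

i.e. the function call reads exactly like a column reference.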

Andreas


Re: RFD: schemas and different kinds of Postgres objects

From: "Henshall, Stuart - WCP"

> -----Original Message-----
> From: Tom Lane [mailto:tgl@sss.pgh.pa.us]
> Sent: 23 January 2002 00:02
> To: Peter Eisentraut
> Cc: Fernando Nasser; pgsql-hackers@postgresql.org
> Subject: Re: RFD: schemas and different kinds of Postgres objects
>
> [...]
>
> I'm thinking that the overloading concept is not going to play well
> at all with multiple namespaces for functions or operators, and that
> we'd be best off to say that there is only one namespace (per database)
> for these things.
>
>             regards, tom lane
>
Could you just have a general rule of search in order of age (by OID)? This
should prevent changes to existing behaviour when new definitions come along
(unless the new definition is in its own new schema or the default).
Cheers,
- Stuart


Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
"Henshall, Stuart - WCP" <SHenshall@westcountrypublications.co.uk> writes:
> Could you just have a general rule of search in order of age (by OID)?

No, unless you plan to abandon the whole notion of resolving ambiguous
operator/function calls.  (Which'd cut down our TODO list a good bit ;-)
but I don't think users would be happy...)  OID/age ordering generally
has little to do with reasonable resolution behavior.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> OK, I can accept that.  But then I want to get back at my original point,
> namely that all database objects (except users and groups) should be in
> schemas.  This is also cleaner, simpler, and more flexible.  There is
> clearly demand for schema-local functions.  So I think that designing this
> system from the premise that a schema-qualified operator call will look
> strange is the wrong end to start at.

Okay, a fair point --- or you could have used my own argument against
me: there's nothing wrong with designing a general mechanism and then
choosing not to expose all of the functionality.  So let's assume that
functions and operators live in namespaces, and that we have some kind
of search path across multiple namespaces for use when an unqualified
name is given.

Now, how is that going to play with resolution of ambiguous calls?

The most reasonable semantics I can think of are to collect all the
potential matches (matching op/func name) across all the searchable
namespaces, discarding only those that have exactly the same signature
as one in a prior namespace.  Thus, eg, plus(int4,int4) in an earlier
namespace would hide plus(int4,int4) in a later namespace in the search
path, but it wouldn't hide plus(int8,int8).  After we've collected all
the visible alternatives, do resolution based on argument types the same
way as we do now.
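
To make that concrete (the schema-qualified syntax below is purely
illustrative, since none of this exists yet; "alice" and "public" are
made-up schema names, with a search path of (alice, public)):

    CREATE FUNCTION alice.plus(int4, int4) RETURNS int4
        AS 'SELECT $1 + $2' LANGUAGE sql;
    CREATE FUNCTION public.plus(int4, int4) RETURNS int4
        AS 'SELECT $1 + $2' LANGUAGE sql;
    CREATE FUNCTION public.plus(int8, int8) RETURNS int8
        AS 'SELECT $1 + $2' LANGUAGE sql;

    SELECT plus(1::int4, 2::int4);
        -- visible candidates: alice.plus(int4,int4), public.plus(int8,int8);
        -- public.plus(int4,int4) is hidden by the identical signature earlier
        -- in the path, so alice's version is chosen
    SELECT plus(1::int8, 2::int8);
        -- nothing earlier in the path hides public.plus(int8,int8), so it is
        -- still visible and wins as the exact match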

The only alternative semantics that seem defensible at all are to stop
at the first namespace that contains any matching-by-name op or func,
and do resolution using only the candidates available in that namespace.
That strikes me as not a good idea; for example, a user who defines a
"+" operator in his own schema for his own datatype would be quite
unhappy to find it masking all the "+" operators in the system schema.

I believe that this behavior would be fairly reasonable if our
backward-compatibility feature consists of a "public" namespace
that all users can write in.  OTOH I think it would not play at all
well if we use Fernando's idea of an "any" wildcard in the search
path.  (1) Imagine the case where we have some users who are using
the backward-compatible behavior while others have set up private
namespaces.  If Joe SmartGuy creates a "+" operator in his private
namespace, it'll be visible to people using the "any" wildcard and
possibly cause resolution-ambiguity failures for them, even though
Joe deliberately did what he should do to avoid that.  (2) "any"
creates the problem of resolving multiple functions with identical
signatures in different namespaces, with no reasonable rule for
making the choice.

So I'm still of the opinion that an "any" wildcard is too risky a
solution for our backwards-compatibility problem.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
> When configured for historical behavior would need to:
> 1. have search path: temp, any, system
> 2. guard against duplicate table names across all schemas (except temp schema)

This would be a *whole* lot simpler if we forgot the notion of "any"
and made the search order look like
(temp, private, public, system)

where the public namespace is world-writable but the private per-user
ones are (typically at least) not.

It occurs to me that we can get both backward-compatible and SQL92
semantics with this same search path; the only thing that needs to
be different in the two cases is whether the default place to create
objects is your private schema or the public one.  If you don't ever
use your private schema then it doesn't matter if it's on the search
path or not.  I would still prefer that the search path be a settable
option, since a paranoid person might well wish to not have public in
his path at all ... but the default could be as-above.
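
To illustrate (all syntax hypothetical; "alice" is a made-up private
schema and "system" stands for wherever the built-ins live):

    -- both modes use the same lookup order:
    --     search path = (temp, alice, public, system)

    CREATE TABLE t (x int);
        -- SQL92-ish mode: unqualified creation lands in alice.t
        -- backward-compatible mode: unqualified creation lands in public.t

    SELECT * FROM t;
        -- either way, an unqualified lookup walks the same path:
        -- alice.t if it exists, otherwise public.t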

> Or are you thinking about a per session behavior ?
> I would rather envision a per database behavior.
> Maybe the easy way out would be a "default creation schema" property for 
> each user, that would default to the username. If you want everything in one 
> schema simply alter the users.

I hadn't really gotten to the point of thinking about exactly what and
where the control knobs should be.  I suspect you are right that we will
want the default behavior to be selectable on a per-user or per-database
basis, which seems to eliminate the option of using GUC (at least in its
current form).  We could easily add a field to pg_shadow or pg_database
respectively to determine the default behavior.  It'd be nice though if
the behavior could be changed after connection by a SET statement, which
would be lots easier if the setting were GUC-controlled.  Peter, you see
any way to resolve that?
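
Something like the following is the sort of knob I'm imagining, though
the names and spellings here are pure invention:

    ALTER DATABASE olddb SET schema_mode = 'compatible';  -- per-database default
    ALTER USER alice SET schema_mode = 'sql92';           -- per-user default
    SET schema_mode = 'sql92';                            -- per-session override,
                                                          -- easy only if GUC-based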
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Stephan Szabo
Date:
On Wed, 23 Jan 2002, Tom Lane wrote:

> The only alternative semantics that seem defensible at all are to stop
> at the first namespace that contains any matching-by-name op or func,
> and do resolution using only the candidates available in that namespace.
> That strikes me as not a good idea; for example, a user who defines a
> "+" operator in his own schema for his own datatype would be quite
> unhappy to find it masking all the "+" operators in the system schema.
>
> I believe that this behavior would be fairly reasonable if our
> backward-compatibility feature consists of a "public" namespace
> that all users can write in.  OTOH I think it would not play at all
> well if we use Fernando's idea of an "any" wildcard in the search
> path.  (1) Imagine the case where we have some users who are using
> the backward-compatible behavior while others have set up private
> namespaces.  If Joe SmartGuy creates a "+" operator in his private
> namespace, it'll be visible to people using the "any" wildcard and
> possibly cause resolution-ambiguity failures for them, even though
> Joe deliberately did what he should do to avoid that.  (2) "any"
> creates the problem of resolving multiple functions with identical
> signatures in different namespaces, with no reasonable rule for
> making the choice.

Wouldn't it make sense to prefer operators/functions earlier in the search
path for resolving ambiguity.  So if you had plus(int4, int4) in my
schema and plus(int8, int8) in system, and they'd otherwise cause an
ambiguity failure for the query, use the plus(int4, int4) on mine. It
seems not too far from having the search path shadow later exact matches.




Re: RFD: schemas and different kinds of Postgres objects

From
"Joe Conway (wwc)"
Date:
Tom Lane wrote:

> 
> This would be a *whole* lot simpler if we forgot the notion of "any"
> and made the search order look like
> 
>     (temp, private, public, system)
> 
> where the public namespace is world-writable but the private per-user
> ones are (typically at least) not.
> 
> It occurs to me that we can get both backward-compatible and SQL92
> semantics with this same search path; the only thing that needs to
> be different in the two cases is whether the default place to create
> objects is your private schema or the public one.  If you don't ever
> use your private schema then it doesn't matter if it's on the search
> path or not.  I would still prefer that the search path be a settable
> option, since a paranoid person might well wish to not have public in
> his path at all ... but the default could be as-above.
> 


I think it would be desirable to be able to restrict users from 
"publishing" objects into the public schema. As an admin, I'd like some 
control over the objects in this namespace. Hand-in-hand with this would 
be the ability for the superuser to move (or "promote") an object from a 
private schema to the public one. This would allow a user to develop 
their own objects without interfering with others, but then make it 
public with the superuser's assistance.
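
For instance, the kind of "promotion" I have in mind might be spelled
something like this (the ALTER ... SET SCHEMA syntax is just a guess at
what such a command could look like):

    ALTER TABLE joe.widgets SET SCHEMA public;
    ALTER FUNCTION joe.widget_count() SET SCHEMA public;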

The search path you suggest above would then lead to the behavior that 
unqualified references to objects will see my own objects before the 
public ones, and other people's private objects must be explicitly 
qualified.

Joe





Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> Wouldn't it make sense to prefer operators/functions earlier in the search
> path for resolving ambiguity.  So if you had plus(int4, int4) in my
> schema and plus(int8, int8) in system, and they'd otherwise cause an
> ambiguity failure for the query, use the plus(int4, int4) on mine. It
> seems not too far from having the search path shadow later exact matches.

Given the complexity of the resolution rules (cf.
http://developer.postgresql.org/docs/postgres/typeconv.html),
it's not clear that we can determine exactly which "later" entry ought
to be blamed for causing a resolution failure.  I'd be interested to
hear Lockhart's opinion on this --- but my gut feeling is we don't
want to go there.  The resolution rules are already complicated enough,
and I think layering an additional mechanism like that onto them might
make the behavior totally unpredictable.

Another problem is that this would probably cause earlier namespace
entries to be over-preferred.  For example, suppose that the system
namespace has plus(int4,int4) and plus(int8,int8) and you choose to
define plus(int4,int8) locally.  I believe you'd suddenly find yours
being used for *any* cross-datatype addition, including cases that
had nothing obvious to do with either int4 or int8 ...
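
Spelled out (hypothetical schema syntax; the schema name "alice" is made
up, and the behavior described is what an earlier-namespace preference
would do, not what the current code does):

    CREATE FUNCTION alice.plus(int4, int8) RETURNS int8
        AS 'SELECT $1 + $2' LANGUAGE sql;

    -- the system namespace already has plus(int4,int4) and plus(int8,int8)
    SELECT plus(1::int2, 2::int4);
        -- every candidate needs some coercion here; a rule that prefers the
        -- earlier namespace whenever it holds any coercible match would hand
        -- this call to alice.plus(int4,int8), even though the system's
        -- plus(int4,int4) is the closer fit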
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
"Joe Conway (wwc)" <jconway@cox.net> writes:
> I think it would be desirable to be able to restrict users from 
> "publishing" objects into the public schema.

Sure.  I'm envisioning that namespaces will have ACLs --- that's what
will keep private namespaces private.  So, while public would by default
be world-writable (at least in the backwards-compatibility case),
there'd be nothing stopping you from marking it as read-only to some
users.
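
For example (the GRANT/REVOKE-on-schema spelling below is only a guess
at what such ACLs might look like; "webadmin" is a made-up user):

    GRANT CREATE ON SCHEMA public TO PUBLIC;      -- backwards-compatible default
    REVOKE CREATE ON SCHEMA public FROM PUBLIC;   -- or lock it down entirely ...
    GRANT CREATE ON SCHEMA public TO webadmin;    -- ... except for trusted users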

Come to think of it, that's still another reason not to have an "any"
wildcard: there's no way to put any restrictions on what appears in
such a namespace.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Stephan Szabo
Date:
On Wed, 23 Jan 2002, Tom Lane wrote:

> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> > Wouldn't it make sense to prefer operators/functions earlier in the search
> > path for resolving ambiguity.  So if you had plus(int4, int4) in my
> > schema and plus(int8, int8) in system, and they'd otherwise cause an
> > ambiguity failure for the query, use the plus(int4, int4) on mine. It
> > seems not too far from having the search path shadow later exact matches.
>
> Given the complexity of the resolution rules (cf.
> http://developer.postgresql.org/docs/postgres/typeconv.html),
> it's not clear that we can determine exactly which "later" entry ought
> to be blamed for causing a resolution failure.  I'd be interested to
> hear Lockhart's opinion on this --- but my gut feeling is we don't
> want to go there.  The resolution rules are already complicated enough,
> and I think layering an additional mechanism like that onto them might
> make the behavior totally unpredictable.

> Another problem is that this would probably cause earlier namespace
> entries to be over-preferred.  For example, suppose that the system
> namespace has plus(int4,int4) and plus(int8,int8) and you choose to
> define plus(int4,int8) locally.  I believe you'd suddenly find yours
> being used for *any* cross-datatype addition, including cases that
> had nothing obvious to do with either int4 or int8 ...

Well, what I'd been thinking of would have been similar to anywhere it
says "If only one candidate matches", becoming "If the earliest search
path entry with at least one candidate matching has only one
matching candidate ..." But that would cause the plus(int4, int8) to get
used in any cross-datatype case that could coerce and didn't have a
stronger match (ie, one of the arguments exactly matching a plus argument
per b or c) so that's probably not good enough.





Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Mon, 21 Jan 2002, Tom Lane wrote:

> Peter Eisentraut <peter_e@gmx.net> writes:
> > Remember that a schema is a named representation of ownership, so anything
> > that can be owned must be in a schema.  (Unless you want to invent a
> > parallel universe for a different kind of ownership, which would be
> > incredibly confusing.)
>
> I don't buy that premise.  It's true that SQL92 equates ownership of a
> schema with ownership of the objects therein, but AFAICS we have no hope
> of being forward-compatible with existing database setups (wherein there
> can be multiple tables of different ownership all in a single namespace)
> if we don't allow varying ownership within a schema.  I think we can
> arrange things so that we are upward compatible with both SQL92 and
> the old way.  Haven't worked out details yet though.

Yes, we most certainly can! :-)

One of the things schemas have to support is essentially a PATH specifier.
So all we need to do is have all of the schemas created in a new DB have
path specifiers pulling in all of the other schemas. Thus we can make a
schema-savvy system act as if it has only one namespace.

Back when Zembu was paying me to work on this, I envisioned a script or
tool you'd feed a DB dump, and it would do the schema fixup, including
adding PATH directives to all schemas, so they all see everything.

Since you have to pg_dump when updating, all this adds is running one tool
during an upgrade. And then existing apps would work. :-)
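
Roughly, the tool would rewrite the dump's schema creation along these
lines (this is my reading of SQL99's PATH clause; the user names are
made up):

    CREATE SCHEMA alice AUTHORIZATION alice PATH alice, bob, carol;
    CREATE SCHEMA bob   AUTHORIZATION bob   PATH bob, alice, carol;
    CREATE SCHEMA carol AUTHORIZATION carol PATH carol, alice, bob;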

Take care,

Bill




Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
Bill Studenmund <wrstuden@netbsd.org> writes:
> One of the things schemas have to support is essentially a PATH specifier.

Yes, but...

> So all we need to do is have all of the schemas created in a new DB have
> path specifiers pulling in all of the other schemas. Thus we can make a
> schema-savy system act as if it has only one namespace.

When you create a new user, do all those path specifiers for the
existing users magically update themselves?  Seems like maintenance
would be a pain.

Fernando's "any" idea is probably a cleaner way to handle it if we
wanted to do things like that.  But I still think it'll be safer and
more controllable if we provide a "public" namespace instead; see
followup discussions.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Wed, 23 Jan 2002, Tom Lane wrote:

> Bill Studenmund <wrstuden@netbsd.org> writes:
> > Why not? What's wrong with either schema.foo.function (==>
> > function(schema.foo)) or foo.schema.function (==> schema.function(foo))?
>
> Neither is wrong in isolation, but how do you tell the difference?
> More to the point, given input x.y.z, how do you tell which component
> is what?

See below.

> > Tables and functions can't have the same names as schemas,
>
> News to me.  Where is that written on stone tablets?  Even if that's

I'm still trying to find the quote, but I found it a few months ago. I'm
looking in SQL99, which is 1100+ pages for section 2. :-)

> considered an acceptable limitation from a purely functional point of
> view, I don't like using it to disambiguate input.  The error messages
> you'll get from incorrect input to an implementation that depends on
> that to disambiguate cases will not be very helpful.

?? Depends on how we do it. As I see it, we have four cases. Given
x.y.z.p.q, we have:

1) No table name, but a function name. It's a function call.

2) A table name, but no function name. It's a table reference.

3) Both a table name & function name, and the function is first. I think
this case is an error (I don't think we support function.foo ==
function(foo))

4) Both a table name & function name, and the table is first. This is
foo.function.

Ok, there is a fifth case, no function nor table names, which is an error.
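
With some made-up names, the cases look like this (deciding which case
applies does of course require knowing which names denote schemas,
tables, and functions):

    payroll.salary_total       -- case 1: schema.function, a function call
    payroll.emp.salary         -- case 2: schema.table.column, a table reference
    emp.monthly_pay            -- case 4: monthly_pay(emp), emp being a table
    payroll.emp.monthly_pay    -- case 4 again, with the table qualified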

> > Actually functions do have to be schema local. It's in the spec (don't
> > have exactly where with me).
>
> (A) I don't believe that; please cite chapter and verse; (B) even if

Peter got to that one first.

> SQL92 thinks that's okay, we can't do it that way because of
> backwards-compatibility issues.

Why do backwards-compatibility issues keep us from doing it?

Yes, I understand we have apps now with different users owning things
(tables, functions) which they all can access, just as if they were in one
unified name space. With real schemas, they are in different namespaces.
But as long as the routines, tables, triggers & such in each schema can
find things in the other schemas as if they were in one namespace, where
is the problem? We just have the app's schemas gain PATH directives that
pull in all the other schemas.

The app runs, even though there are different schemas involved. Where is
the problem?

> > My vote would be to make them schema-specific. As Peter pointed out,
> > schemas are how you own things,
>
> Sorry, but this line of argument is trying to assume the very point in
> dispute.

When you started this thread, you said you were thinking about
"implementing SQL schemas." Are these "SQL schemas" going to follow the
spec or not? SQL'99 is rather clear that ownership happens at the schema
level. Peter spent quite a lot of time last October pounding that into my
head, and after I looked at the spec, I found he was 100% correct.

If these schemas are to follow the standards, ownership happens at the
schema level. If ownership happens elsewhere, whatever we're doing is not
following the standard. Unfortunately it's that cut & dried. So why should
we call them "SQL schemas" if we aren't following the SQL spec?

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
Bill Studenmund <wrstuden@netbsd.org> writes:
> On Wed, 23 Jan 2002, Tom Lane wrote:
>> Bill Studenmund <wrstuden@netbsd.org> writes:
> Why not? What's wrong with either schema.foo.function (==>
> function(schema.foo)) or foo.schema.function (==> schema.function(foo))?
>> 
>> Neither is wrong in isolation, but how do you tell the difference?
>> More to the point, given input x.y.z, how do you tell which component
>> is what?

> ?? Depends on how we do it. As I see it, we have four cases. In the
> x.y.z.p.q, we have:

> 1) No table name, but a function name. It's a function call.

> 2) A table name, but no function name. It's a table reference.

No, you're missing the point.  Which of x,y,z,p,q is the name we
are going to test to see if it is a table or function?  And which
of these names is a schema name --- if you don't even know that,
it's hard to argue that checking to see if some name is known is
a well-defined operation.

> When you started this thread, you said you were thinking about
> "implementing SQL schemas." Are these "SQL schemas" going to follow the
> spec or not?

If you use only the SQL-defined operations, after setting up any
configuration variables we may invent in the way we will document as
necessary for SQL-compatible behavior, then you will get SQL-compatible
behavior.  I do not think that precludes having an underlying
implementation that sees the world differently than SQL does and
supports non-SQL behaviors too.  (For that matter, I'm sure there is
text somewhere in the spec that points out that the spec intends to
define user-visible behavior, not implementation.)
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Wed, 23 Jan 2002, Tom Lane wrote:

> Bill Studenmund <wrstuden@netbsd.org> writes:
> > One of the things schemas have to support is essentially a PATH specifier.
>
> Yes, but...
>
> > So all we need to do is have all of the schemas created in a new DB have
> > path specifiers pulling in all of the other schemas. Thus we can make a
> > schema-savy system act as if it has only one namespace.
>
> When you create a new user, do all those path specifiers for the
> existing users magically update themselves?  Seems like maintenance
> would be a pain.

No, they don't. But why should they? Why should they need to?

Either we're migrating an existing app, for which adding PATH directives
with a helper program before restore works, or we're making a new app. If
you're designing an app for a schema-savvy system, you need to think about
schemas.

> Fernando's "any" idea is probably a cleaner way to handle it if we
> wanted to do things like that.  But I still think it'll be safer and
> more controllable if we provide a "public" namespace instead; see
> followup discussions.

Why? Why is it needed? What would public let you do that PATH and ACLs
wouldn't?

The only reason I can see that it's needed is so that people can make new apps
for a schema-savvy PostgreSQL while ignoring the schemas. That strikes me
as bad. I agree that schemas shouldn't interfere with things, but to let
folks just blow them off seems equally bad.

Also, it wouldn't be SQL'99 schemas. It can still be done, but it's
solving a problem that the other SQL'99 databases don't seem to have.

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Thomas Swan
Date:
Tom Lane wrote:

> Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> > Wouldn't it make sense to prefer operators/functions earlier in the search
> > path for resolving ambiguity.  So if you had plus(int4, int4) in my
> > schema and plus(int8, int8) in system, and they'd otherwise cause an
> > ambiguity failure for the query, use the plus(int4, int4) on mine. It
> > seems not too far from having the search path shadow later exact matches.
> 
> Given the complexity of the resolution rules (cf.
> http://developer.postgresql.org/docs/postgres/typeconv.html),
> it's not clear that we can determine exactly which "later" entry ought
> to be blamed for causing a resolution failure.  I'd be interested to
> hear Lockhart's opinion on this --- but my gut feeling is we don't
> want to go there.  The resolution rules are already complicated enough,
> and I think layering an additional mechanism like that onto them might
> make the behavior totally unpredictable.
> 
> Another problem is that this would probably cause earlier namespace
> entries to be over-preferred.  For example, suppose that the system
> namespace has plus(int4,int4) and plus(int8,int8) and you choose to
> define plus(int4,int8) locally.  I believe you'd suddenly find yours
> being used for *any* cross-datatype addition, including cases that
> had nothing obvious to do with either int4 or int8 ...

This is a good example.  The other option is to use name, arg1, arg2...
as a hunt path for function call resolution.  This would depend on when
datatype promotion is occurring (i.e. int4 to int8, int8 to int4, etc...)

Then you could just be really strict and say that only exact and trivial
conversion matches in user space will be used.

There is no easy answer for this, but whatever rules are adopted need to
be something that someone can step through to solve without a machine.

I do think you will ultimately need a search utility that provides
'which' functionality.  (Given my namespace, which function in what
namespace is going to be called?)

Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Wed, 23 Jan 2002, Tom Lane wrote:

> Bill Studenmund <wrstuden@netbsd.org> writes:
> > On Wed, 23 Jan 2002, Tom Lane wrote:
> >> Bill Studenmund <wrstuden@netbsd.org> writes:
> > Why not? What's wrong with either schema.foo.function (==>
> > function(schema.foo)) or foo.schema.function (==> schema.function(foo))?
> >>
> >> Neither is wrong in isolation, but how do you tell the difference?
> >> More to the point, given input x.y.z, how do you tell which component
> >> is what?
>
> > ?? Depends on how we do it. As I see it, we have four cases. In the
> > x.y.z.p.q, we have:
>
> > 1) No table name, but a function name. It's a function call.
>
> > 2) A table name, but no function name. It's a table reference.
>
> No, you're missing the point.  Which of x,y,z,p,q is the name we
> are going to test to see if it is a table or function?  And which
> of these names is a schema name --- if you don't even know that,
> it's hard to argue that checking to see if some name is known is
> a well-defined operation.

No, I'm not. :-) You test enough of them to figure out what case you have.
Yes, it might be a bit of work, but it's doable.

Actually, it's not that hard. In foo.funcname, do we support anything
AFTER the funcname? I don't think so. If there were a parenthesis, then we
have a function call. If it's an operator or something else, we have
either a table reference, or a foo.funcname function call. So all we have
to do is see if p.q (last two elements) is a schema.function, or if q is a
function pathed into our current schema. If yes, we have foo.function. If
not, then we have some table reference.

> > When you started this thread, you said you were thinking about
> > "implementing SQL schemas." Are these "SQL schemas" going to follow the
> > spec or not?
>
> If you use only the SQL-defined operations, after setting up any
> configuration variables we may invent in the way we will document as
> necessary for SQL-compatible behavior, then you will get SQL-compatible
> behavior.  I do not think that precludes having an underlying
> implementation that sees the world differently than SQL does and
> supports non-SQL behaviors too.  (For that matter, I'm sure there is
> text somewhere in the spec that points out that the spec intends to
> define user-visible behavior, not implementation.)

While I agree in principle, that quote is from talking about ownership. I
don't see how we can gloss over ownership. :-)

Also, why support two different behaviors? That means 1) there's code
bloat since the backend has to support both. 2) Support is harder as major
DB behaviors will change depending on these settings. 3) I still haven't
seen anything this variant behavior would do that can't be done with
schema paths and access control lists, other than it would let people keep
thinking about things as they do now.

That latter reason doesn't strike me as a good one. Yes, it will take some
getting used to before people wrap their minds around schemas, but once done,
I don't think it will be that hard. Yes, the documentation will need work, and
the tutorial will need changing (and Bruce will need to release a new edition
of his book :-) ), but once that's done, I don't think working with real
schemas will be that hard. So why not just do things right from the
beginning?

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
Bill Studenmund <wrstuden@netbsd.org> writes:
> Either we're migrating an existing app, for which adding PATH directives
> with a helper program before restore works, or we're making a new app.

Sorry, I don't accept that either-or proposition.  People will expect to
be able to continue to use 7.3 as they have used Postgres in the past.
Among other things that will mean being able to add new users to an
existing installation.  If we say "you can't do much of anything in 7.3
until you upgrade all your applications to schema-awareness", then we're
going to have lots of unhappy users.

>> Fernando's "any" idea is probably a cleaner way to handle it if we
>> wanted to do things like that.  But I still think it'll be safer and
>> more controllable if we provide a "public" namespace instead; see
>> followup discussions.

> Why? Why is it needed? What would public let you do that PATH and ACLs
> wouldn't?

Public gives you a place to put the ACL determining what people can do
with publicly-visible names.  See, eg, comments from Joe Conway.
Without a specific public namespace to put ACLs on, a dbadmin has very
little control over interuser interactions.  Please note that the
facility we are talking about offering here is not available in existing
Postgres nor in SQL92, but that doesn't make it evil or unreasonable.

Basically my point here is that the SQL spec is not the be-all and
end-all of database development.  (Have you read C. J. Date's commentary
on it, for example?)  We have a proven useful concept of object ownership
in existing Postgres, and I see no need to remove that facility in
pursuit of slavish adherence to a specification.  If it were a project
goal to rip out everything in Postgres that is not mentioned in the
SQL spec, we could have a much smaller distribution ... with lots fewer
users.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
Bill Studenmund <wrstuden@netbsd.org> writes:
>> No, you're missing the point.  Which of x,y,z,p,q is the name we
>> are going to test to see if it is a table or function?

> No, I'm not. :-) You test enough of them to figure out what case you have.

There could be multiple valid interpretations.  When you can't even
figure out where to start, it's too squishy for me.  Code complexity
isn't really the issue here, it's whether a user can understand what's
going on.

> Actually, it's not that hard. In foo.funcname, do we support anything
> AFTER the funcname? I don't think so. If there were a parenthesis, then we
> have a function call. If it's an operator or something else, we have
> either a table reference, or a foo.funcname function call. So all we have
> to do is see if p.q (last two elements) is a schema.function, or of q is a
> function pathed into our current schema. If yes, we have foo.function. If
> not, then we have some table reference.

Now wait a sec.  What you started out with was the claim

> Why not? What's wrong with either schema.foo.function (==>
> function(schema.foo)) or foo.schema.function (==> schema.function(foo))?

The issue was not figuring out whether the last component was a function
name or not, it was to determine what the other components were, and in
particular whether the function name should be presumed to be qualified
(by the next-to-last component taken as a schema name) or unqualified.
That in turn changes your assumptions about which of the components
further left are table names, schema names, or catalog names.

> ... I don't think working with real
> schemas will be that hard. So why not just do things right from the
> begining?

If I thought that SQL's model of ownership == namespace was "right",
then we probably wouldn't be having this argument.  I think the spec
pretty much sucks in this particular department, and I don't see why
we should restrict our implementation to support only the spec's
braindead world view.  Especially not when it makes the implementation
harder, not easier, because we end up needing to add in weird frammishes
to have some semblance of backwards-compatibility too.

Please give me some good reasons (not "the spec says so") why it's
a good idea to treat ownership of a namespace as equivalent to ownership
of the objects in it, and why decoupling the concepts at the
implementation level isn't a reasonable thing to do.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Wed, 23 Jan 2002, Tom Lane wrote:

> Bill Studenmund <wrstuden@netbsd.org> writes:
> >> No, you're missing the point.  Which of x,y,z,p,q is the name we
> >> are going to test to see if it is a table or function?
>
> > No, I'm not. :-) You test enough of them to figure out what case you have.
>
> There could be multiple valid interpretations.  When you can't even
> figure out where to start, it's too squishy for me.  Code complexity
> isn't really the issue here, it's whether a user can understand what's
> going on.
>
> > Actually, it's not that hard. In foo.funcname, do we support anything
> > AFTER the funcname? I don't think so. If there were a parenthesis, then we
> > have a function call. If it's an operator or something else, we have
> > either a table reference, or a foo.funcname function call. So all we have
> > to do is see if p.q (last two elements) is a schema.function, or of q is a
> > function pathed into our current schema. If yes, we have foo.function. If
> > not, then we have some table reference.
>
> Now wait a sec.  What you started out with was the claim
>
> > Why not? What's wrong with either schema.foo.function (==>
> > function(schema.foo)) or foo.schema.function (==> schema.function(foo))?
>
> The issue was not figuring out whether the last component was a function
> name or not, it was to determine what the other components were, and in
> particular whether the function name should be presumed to be qualified
> (by the next-to-last component taken as a schema name) or unqualified.
> That in turns changes your assumptions about which of the components
> further left are table names, schema names, or catalog names.

Then you choose. Choose a search order, and document it. If you are going
to back into a corner with so much ambiguity, then document your way
out.

You can make it easier by restricting, say, catalogs and schemas to have
different names, or saying that schemas, tables, and functions can't have the
same name. For this, I'd suggest following Oracle's example. I don't think you
can go too wrong using Oracle's behavior to break ties & ambiguities.

> > ... I don't think working with real
> > schemas will be that hard. So why not just do things right from the
> > begining?
>
> If I thought that SQL's model of ownership == namespace was "right",
> then we probably wouldn't be having this argument.  I think the spec
> pretty much sucks in this particular department, and I don't see why
> we should restrict our implementation to support only the spec's
> braindead world view.  Especially not when it makes the implementation
> harder, not easier, because we end up needing to add in weird frammishes
> to have some semblance of backwards-compatibility too.

?? What weird frammishes from ownership?

It actually makes things (slightly) simpler. All of the owner fields
disappear from most system tables, because the owner is implied by the
containing schema.

Also, what is braindead about the spec? Yes, it's not what I would choose,
and it's different from what PG does now. But it's just that, different.
Given all of the ACL abilities (if the superuser wants user X to be able
to create objects in table Y of schema Z, s/he makes it so), why does it
matter who owns the object?

> Please give me some good reasons (not "the spec says so") why it's
> a good idea to treat ownership of a namespace as equivalent to ownership
> of the objects in it, and why decoupling the concepts at the
> implementation level isn't a reasonable thing to do.

Can you really give a good reason other than, "I don't like it?"

Absent the existence of a spec, it's a matter of choice. Having all of the
ownership happen at the schema level or at the individual item level, it's
six of one, half-dozen of the other.

But there is a spec. A spec that, as far as I can tell, all other
schema-claiming DBs follow. So why shouldn't we follow it? Yes, you don't
like it. But is this really supposed to be about personal desires, or
about the resulting DB?

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Wed, 23 Jan 2002, Tom Lane wrote:

> "Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
> > When configured for historical behavior would need to:
> > 1. have search path: temp, any, system
> > 2. guard against duplicate table names across all schemas (except temp schema)
>
> This would be a *whole* lot simpler if we forgot the notion of "any"
> and made the search order look like
>
>     (temp, private, public, system)
>
> where the public namespace is world-writable but the private per-user
> ones are (typically at least) not.
>
> It occurs to me that we can get both backward-compatible and SQL92
> semantics with this same search path; the only thing that needs to
> be different in the two cases is whether the default place to create
> objects is your private schema or the public one.  If you don't ever
> use your private schema then it doesn't matter if it's on the search
> path or not.  I would still prefer that the search path be a settable
> option, since a paranoid person might well wish to not have public in
> his path at all ... but the default could be as-above.

s/public/DEFAULT/ and add a way (createdb option) to make the default ACL
on DEFAULT such that anyone can create things and we are in agreement.

One of the parts of the schema system I'd envisioned (and tried to make)
was that there would be an IMPLIMENTATION_SCHEMA which owned all
built-ins, and a DEFAULT schema which was owned by the superuser. Making
the default schema path for a schema
IMPLIMENTATION_SCHEMA:<SCHEMA_NAME>:DEFAULT is fine.

Making the default schema for creation settable to DEFAULT would be fine,
and I think it would remove the objection. :-)

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Wed, 23 Jan 2002, Tom Lane wrote:

> Peter Eisentraut <peter_e@gmx.net> writes:
> > OK, I can accept that.  But then I want to get back at my original point,
> > namely that all database objects (except users and groups) should be in
> > schemas.  This is also cleaner, simpler, and more flexible.  There is
> > clearly demand for schema-local functions.  So I think that designing this
> > system from the premise that a schema-qualified operator call will look
> > strange is the wrong end to start at.
>
> Okay, a fair point --- or you could have used my own argument against
> me: there's nothing wrong with designing a general mechanism and then
> choosing not to expose all of the functionality.  So let's assume that
> functions and operators live in namespaces, and that we have some kind
> of search path across multiple namespaces for use when an unqualified
> name is given.
>
> Now, how is that going to play with resolution of ambiguous calls?
>
> The most reasonable semantics I can think of are to collect all the
> potential matches (matching op/func name) across all the searchable
> namespaces, discarding only those that have exactly the same signature
> as one in a prior namespace.  Thus, eg, plus(int4,int4) in an earlier
> namespace would hide plus(int4,int4) in a later namespace in the search
> path, but it wouldn't hide plus(int8,int8).  After we've collected all
> the visible alternatives, do resolution based on argument types the same
> way as we do now.
>
> The only alternative semantics that seem defensible at all are to stop
> at the first namespace that contains any matching-by-name op or func,
> and do resolution using only the candidates available in that namespace.
> That strikes me as not a good idea; for example, a user who defines a
> "+" operator in his own schema for his own datatype would be quite
> unhappy to find it masking all the "+" operators in the system schema.

There is a third behavior which is almost the first one. And it's the one
I use for function matching in the package diffs I made oh so long ago.
:-)

You look in the first namespace for all candidates. If one matches, you
use it. If two or more match, you throw the error we throw now. If none
match, you move on to the next namespace and repeat the search there.

It's what the code does now. It's not that hard. It's just essentially
turning part of the function lookup into a for loop over the namespaces.
:-)

It's easier than gathering everything, since you only gather one namespace's
worth of matches at a time.
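
As a concrete illustration (hypothetical schema syntax; "alice" and
"system" are placeholders):

    -- search path: (alice, system); system already has abs(int4)
    CREATE FUNCTION alice.abs(float8) RETURNS float8
        AS 'SELECT CASE WHEN $1 < 0 THEN -$1 ELSE $1 END' LANGUAGE sql;

    SELECT abs(-3::int4);
        -- per-namespace rule: alice is searched first, its abs(float8) is a
        -- coercible match, so it wins and the search stops there
        -- gather-everything rule: the system's abs(int4) is an exact match
        -- and would win instead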

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Wed, 23 Jan 2002, Tom Lane wrote:

> Bill Studenmund <wrstuden@netbsd.org> writes:
> > Either we're migrating an existing app, for which adding PATH directives
> > with a helper program before restore works, or we're making a new app.
>
> Sorry, I don't accept that either-or proposition.  People will expect to
> be able to continue to use 7.3 as they have used Postgres in the past.
> Among other things that will mean being able to add new users to an
> existing installation.  If we say "you can't do much of anything in 7.3
> until you upgrade all your applications to schema-awareness", then we're
> going to have lots of unhappy users.

"upgrad[ing] .. to schema-awareness" is adding the PATHs to the schema
creates. You then have a unified namespace (to the apps perspective). The
migration tool I mentioned should be able to do this (and will need to be
present).

As PostgreSQL changes, people are going to have to learn to deal with new
features. I really don't think that dealing with schemas is going to be
that hard. Also, at Zembu, all of the apps we looked at had maybe two or
three schemas. These are big apps. If they can live with just a few
schemas, why do we need so many that maintenance is a pain?

Also, what is this new user going to do *IN THE EXISTING APP*? Either the
user is going to change existing tables, for which adding the user to the
ACLs (or better yet to a group) will work, or the new user is going to own
new tables. If the new user is owning new tables, then you have to change
the app to refer to them. If the upgrade to 7.3 included a "quick
tutorial about the new schema support," the author should be able to adapt
easily.

> >> Fernando's "any" idea is probably a cleaner way to handle it if we
> >> wanted to do things like that.  But I still think it'll be safer and
> >> more controllable if we provide a "public" namespace instead; see
> >> followup discussions.
>
> > Why? Why is it needed? What would public let you do that PATH and ACLs
> > wouldn't?
>
> Public gives you a place to put the ACL determining what people can do
> with publicly-visible names.  See, eg, comments from Joe Conway.
> Without a specific public namespace to put ACLs on, a dbadmin has very
> little control over interuser interactions.  Please note that the
> facility we are talking about offering here is not available in existing
> Postgres nor in SQL92, but that doesn't make it evil or unreasonable.

I think the existence of a DEFAULT schema (which is a real schema, not a
special namespace) would also alleviate the above concerns.

> Basically my point here is that the SQL spec is not the be-all and
> end-all of database development.  (Have you read C. J. Date's commentary
> on it, for example?)  We have a proven useful concept of object ownership
> in existing Postgres, and I see no need to remove that facility in
> pursuit of slavish adherence to a specification.  If it were a project
> goal to rip out everything in Postgres that is not mentioned in the
> SQL spec, we could have a much smaller distribution ... with lots fewer
> users.

I haven't read the commentary; do you have a URL?

I agree that we should not limit ourselves to SQL'99. But doing more than
SQL'99 is different from wanting to support SQL schemas while using a
different ownership model, when one of the main points of SQL schemas, as I
understand it, is the ownership model.

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Stephan Szabo
Date:
On Wed, 23 Jan 2002, Bill Studenmund wrote:

> On Wed, 23 Jan 2002, Tom Lane wrote:
>
> There is a third behavior which is almost the first one. And it's the one
> I use for function matching in the package diffs I made oh so long ago.
> :-)
>
> You look in the first namespace for all candidates. If one matches, you
> use it. If two or more match, you throw the error we throw now. If none
> match, you move on to the next namespace and repeat the search there.

That's even more strongly biased towards earlier namespaces than my suggestion.
How do you define a match?  If you allow coercions, then the
plus(int8, int8) in my schema would be preferred over better (possibly
exact) matches in the system schema, which may not be what you want.





Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Wed, 23 Jan 2002, Stephan Szabo wrote:

> On Wed, 23 Jan 2002, Bill Studenmund wrote:
>
> > On Wed, 23 Jan 2002, Tom Lane wrote:
> >
> > There is a third behavior which is almost the first one. And it's the one
> > I use for function matching in the package diffs I made oh so long ago.
> > :-)
> >
> > You look in the first namespace for all candidates. If one matches, you
> > use it. If two or more match, you throw the error we throw now. If none
> > match, you move on to the next namespace and repeat the search there.
>
> That's even more strongly towards earlier namespaces than my suggestion.
> How do you define match?  If you allow coercions, then the
> plus(int8, int8) in my schema would be prefered over better (possibly
> exact) matches in the system schema which may not be what you want.

True. But:

1) How often are you going to make routines with names that duplicate
those in the system schema, when you don't want them to be used?

2) you can always explicitly refer to the system schema, so you can get
the routine you want if it's not the one you'll get by coercion.

3) We tested other (commercial) databases, and they have this behavior, so
it seems a reasonable thing to do.

4) It's simple and easy to understand.

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Stephan Szabo
Date:
On Wed, 23 Jan 2002, Bill Studenmund wrote:

> On Wed, 23 Jan 2002, Stephan Szabo wrote:
>
> > On Wed, 23 Jan 2002, Bill Studenmund wrote:
> >
> > > On Wed, 23 Jan 2002, Tom Lane wrote:
> > >
> > > There is a third behavior which is almost the first one. And it's the one
> > > I use for function matching in the package diffs I made oh so long ago.
> > > :-)
> > >
> > > You look in the first namespace for all candidates. If one matches, you
> > > use it. If two or more match, you throw the error we throw now. If none
> > > match, you move on to the next namespace and repeat the search there.
> >
> > That's even more strongly towards earlier namespaces than my suggestion.
> > How do you define match?  If you allow coercions, then the
> > plus(int8, int8) in my schema would be prefered over better (possibly
> > exact) matches in the system schema which may not be what you want.
>
> True. But:
>
> 1) How often are you going to make routines with names that duplicate
> those in the system schema, when you don't want them to be used?

Sure, you want them used when the arguments match, but what about when
they don't match exactly?
If the system schema has foo(integer), and in my schema I make a new type
and then make a type(integer) and a foo(type), when I call foo(1), do I
really mean to do a coercion to my type and call foo(type)?




Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Wed, 23 Jan 2002, Stephan Szabo wrote:

> On Wed, 23 Jan 2002, Bill Studenmund wrote:
>
> > True. But:
> >
> > 1) How often are you going to make routines with names that duplicate
> > those in the system schema, when you don't want them to be used?
>
> Sure, you want them used when the arguments match, but what about when
> they don't exactly?
> If the system schema has foo(integer) and in my schema I make a new type
> and then make a type(integer) and foo(type), when I call foo(1), do I
> really mean do a coersion to my type and call foo(type)?

Yes, you did. The documentation said that that would happen, so since you
made the call ambiguous, you wanted the coercion to happen. Or at least
you weren't concerned that it might.

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Stephan Szabo
Date:
On Wed, 23 Jan 2002, Bill Studenmund wrote:

> On Wed, 23 Jan 2002, Stephan Szabo wrote:
>
> > On Wed, 23 Jan 2002, Bill Studenmund wrote:
> >
> > > True. But:
> > >
> > > 1) How often are you going to make routines with names that duplicate
> > > those in the system schema, when you don't want them to be used?
> >
> > Sure, you want them used when the arguments match, but what about when
> > they don't exactly?
> > If the system schema has foo(integer) and in my schema I make a new type
> > and then make a type(integer) and foo(type), when I call foo(1), do I
> > really mean do a coersion to my type and call foo(type)?
>
> Yes, you did. The documentation said that that would happen, so since you

It doesn't currently say anything of the sort. If we made the above
behavior the standard, it would, but that's sort of circular. ;) Unless
I'm misreading the page Tom sent me to earlier, it seems to say it
prefers matches with exact types over coercions which would no longer be
true.

> made the call ambiguous, you wanted the coercion to happen. Or at least
> you weren't concerned that it might.

I still disagree.  If I make a complex number type in my schema,
I don't really intend integer+integer to convert to complex and give me a
complex answer even if I want to be able to cast integers into complex.
AFAIK there's no way to specify that I want to make the function
complex(integer) such that I can do CAST(1 as complex) but not as an
implicit cast.



Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Wed, 23 Jan 2002, Stephan Szabo wrote:

> On Wed, 23 Jan 2002, Bill Studenmund wrote:
>
> > On Wed, 23 Jan 2002, Stephan Szabo wrote:
> >
> > Yes, you did. The documentation said that that would happen, so since you
>
> It doesn't currently say anything of the sort. If we made the above
> behavior the standard, it would, but that's sort of circular. ;) Unless
> I'm misreading the page Tom sent me to earlier, it seems to say it
> prefers matches with exact types over coercions which would no longer be
> true.

The documentation says nothing about schemas at all now, so obviously it
has to change. :-)

> > made the call ambiguous, you wanted the coercion to happen. Or at least
> > you weren't concerned that it might.
>
> I still disagree.  If I make a complex number type in my schema,
> I don't really intend integer+integer to convert to complex and give me a
> complex answer even if I want to be able to cast integers into complex.
> AFAIK there's no way to specify that I want to make the function
> complex(integer) such that I can do CAST(1 as complex) but not as an
> implicit cast.

Note: I've been talking about functions, and you're talking about
operators. While operators are syntactic sugar for functions, one big
difference is that you can't specify explicit schemas for operators (nor
do I think you should be able to). I think exact matches for operators
anywhere in the path would be better than local coercible ones.

Does SQL'99 say anything about this?

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Stephan Szabo
Date:
On Wed, 23 Jan 2002, Bill Studenmund wrote:

What I was getting at was that Tom's behavior (or even mine) is more
similar to the currently described behavior than the suggested one.

> > > made the call ambiguous, you wanted the coercion to happen. Or at least
> > > you weren't concerned that it might.
> >
> > I still disagree.  If I make a complex number type in my schema,
> > I don't really intend integer+integer to convert to complex and give me a
> > complex answer even if I want to be able to cast integers into complex.
> > AFAIK there's no way to specify that I want to make the function
> > complex(integer) such that I can do CAST(1 as complex) but not as an
> > implicit cast.
>
> Note: I've been talking about functions, and you're talking about
> operators. While operators are syntactic sugar for functions, one big
> difference is that you can't specify explicit schemas for operators (nor
> do I think you should be able to). I think exact matches for operators
> anywhere in the path would be better than local coercable ones.

I'd say the same thing for a random math function as well.  For example
if there was a square(int) that returned $1*$1 and I made a square for my
complex type, I'd still expect that square(5) is an integer rather than a
complex using the square(complex).  For example, I'd expect square(5) to
be a valid length argument to substr.

> Does SQL'99 say anything about this?
That I don't know about (don't have a draft around to look at).  I'm not
sure that it'd have these problems though unless it's got the same sort of
coercion system.



Re: RFD: schemas and different kinds of Postgres objects

From
Peter Eisentraut
Date:
Bill Studenmund writes:

> Does SQL'99 say anything about this?

Yes, though, as usual, you have to twist your brain a little to understand
it.  My understanding is that for a function call of the form "foo(a, b)"
it goes like this:

1. Find all functions named "foo" in the current database.  This is the
set of "possibly candidate routines".

2. Drop all routines that you do not have EXECUTE privilege for.  This is
the set of "executable routines".

3. Drop all routines that do not have compatible parameter lists.  This is
the set of "invocable routines".

4. Drop all routines whose schema is not in the path.  This is the set of
"candidate routines".

5. If you have more than one routine left, eliminate some routines
according to type precedence rules.  (We do some form of this, SQL99
specifies something different.)  This yields the set of "candidate subject
routines".

6. Choose the routine whose schema is earliest in the path as the "subject
routine".

Execute the subject routine.  Phew!


This doesn't look glaringly wrong to me, so maybe you want to consider it.
Please note step 2.
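
A worked example may help (the schema names, functions, and privileges
below are all made up):

    -- public.foo(int4)        : EXECUTE revoked from bob
    -- pg_system.foo(numeric)  : EXECUTE granted to everyone
    -- bob's path is (bob, public, pg_system); bob runs:
    SELECT foo(1);
    -- 1. possibly candidate routines: both foo's in the database
    -- 2. executable routines: public.foo(int4) drops out, bob can't execute it
    -- 3. invocable routines: pg_system.foo(numeric) can accept the argument
    -- 4. candidate routines: its schema is in bob's path, so it survives
    -- 5./6. it is the only routine left, so it becomes the subject routine
    -- Had step 2 come later, the exact match public.foo(int4) would have won
    -- the resolution and the call would then simply fail on permissions.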

-- 
Peter Eisentraut   peter_e@gmx.net



Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Wed, 23 Jan 2002, Stephan Szabo wrote:

> On Wed, 23 Jan 2002, Bill Studenmund wrote:
>
> What I was getting at was that Tom's behavior (or even mine) is more
> similar to the currently described behavior than the suggested one.

I understand. As part of developing the package changes, though, I found
that Oracle used the method I described for finding routines in packages.

From Peter's description, it sounds like Oracle's not following the spec.

> I'd say the same thing for a random math function as well.  For example
> if there was a square(int) that returned $1*$1 and I made a square for my
> complex type, I'd still expect that square(5) is an integer rather than a
> complex using the square(complex).  For example, I'd expect square(5) to
> be a valid length argument to substr.

Yeah, that makes sense.

> > Does SQL'99 say anything about this?
> That I don't know about (don't have a draft around to look at).  I'm not

Do you want pdfs?

> sure that it'd have these problems though unless it's got the same sort of
> coercion system.

I don't think it has the same sort of coercion, but it has some, I'd
expect (as all of the DBs I know of have some sort of coercion :-)

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Thu, 24 Jan 2002, Peter Eisentraut wrote:

> Bill Studenmund writes:
>
> > Does SQL'99 say anything about this?
>
> Yes, though, as usual, you have to twist your brain a little to understand
> it.

Indeed. I find the spec makes the most sense after you understand it. ;-)

>      My understanding is that for a function call of the form "foo(a, b)"
> it goes like this:
>
> 1. Find all functions named "foo" in the current database.  This is the
> set of "possibly candidate routines".
>
> 2. Drop all routines that you do not have EXECUTE privilege for.  This is
> the set of "executable routines".
>
> 3. Drop all routines that do not have compatible parameter lists.  This is
> the set of "invocable routines".
>
> 4. Drop all routines whose schema is not in the path.  This is the set of
> "candidate routines".
>
> 5. If you have more than one routine left, eliminate some routines
> according to type precedence rules.  (We do some form of this, SQL99
> specifies something different.)  This yields the set of "candidate subject
> routines".
>
> 6. Choose the routine whose schema is earliest in the path as the "subject
> routine".
>
> Execute the subject routine.  Phew!

Wow. Thanks for digging this out.

> This doesn't look glaringly wrong to me, so maybe you want to consider it.
> Please note step 2.

It looks fine, and is probably what we should do. Well, I'd do things in a
different order (look only in in-path schemas first for instance), but
that's just trying to optimize the query. :-)

How different are the type coercion rules?

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Bruce Momjian
Date:
Tom Lane wrote:
> "Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:
> > When configured for historical behavior would need to:
> > 1. have search path: temp, any, system
> > 2. guard against duplicate table names across all schemas (except temp schema)
> 
> This would be a *whole* lot simpler if we forgot the notion of "any"
> and made the search order look like
> 
>     (temp, private, public, system)
> 
> where the public namespace is world-writable but the private per-user
> ones are (typically at least) not.

[ I am just reading this schema thread now.]

The above private/public idea seems much better than 'any'.  That
'any' thing had me quite confused, and the idea that you would have
duplicates that would only be found at runtime seems destined for random
failures.

I assume 'private' above means search in my personal schema/namespace.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: RFD: schemas and different kinds of Postgres objects

From
Bruce Momjian
Date:
> > Or are you thinking about a per session behavior ?
> > I would rather envision a per database behavior.
> > Maybe the easy way out would be a "default creation schema" property for 
> > each user, that would default to the username. If you want everything in one 
> > schema simply alter the users.
> 
> I hadn't really gotten to the point of thinking about exactly what and
> where the control knobs should be.  I suspect you are right that we will
> want the default behavior to be selectable on a per-user or per-database
> basis, which seems to eliminate the option of using GUC (at least in its
> current form).  We could easily add a field to pg_shadow or pg_database
> respectively to determine the default behavior.  It'd be nice though if
> the behavior could be changed after connection by a SET statement, which
> would be lots easier if the setting were GUC-controlled.  Peter, you see
> any way to resolve that?

I think we could set the database default at db creation time, then
allow SET to modify that default per session;  seems flexible enough. 
It is basically a GUC value whose default is stored in pg_database
rather than postgresql.conf.   You could use postgresql.conf to set the
default schema type at db creation time.
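
Just to sketch the idea (the variable name and all the syntax below are
made up; nothing like this exists yet):

    -- postgresql.conf supplies the installation-wide default,
    -- pg_database stores the per-database default picked up at
    -- CREATE DATABASE time, and a session can still override it:
    SET schema_search_path = 'temp, private, public, system';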

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: RFD: schemas and different kinds of Postgres objects

From
Peter Eisentraut
Date:
Tom Lane writes:

> It'd be nice though if
> the behavior could be changed after connection by a SET statement, which
> would be lots easier if the setting were GUC-controlled.  Peter, you see
> any way to resolve that?

We add a text[] field to pg_shadow and/or pg_database containing
name=value assignments which are executed just before the session starts.
Doesn't look terribly difficult, and it's something I've always wanted to
do anyway.
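
Something like this, say (the column name is invented, and presumably
some ALTER USER / ALTER DATABASE syntax would front it rather than a raw
catalog update):

    -- hypothetical text[] column holding per-database session defaults,
    -- applied just before each session starts
    UPDATE pg_database
       SET datconfig = '{"search_path=private,public"}'
     WHERE datname = 'mydb';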

-- 
Peter Eisentraut   peter_e@gmx.net



Re: RFD: schemas and different kinds of Postgres objects

From
Gavin Sherry
Date:
On Wed, 23 Jan 2002, Tom Lane wrote:

> If you use only the SQL-defined operations, after setting up any
> configuration variables we may invent in the way we will document as
> necessary for SQL-compatible behavior, then you will get SQL-compatible
> behavior.  I do not think that precludes having an underlying
> implementation that sees the world differently than SQL does and
> supports non-SQL behaviors too.  (For that matter, I'm sure there is
> text somewhere in the spec that points out that the spec intends to
> define user-visible behavior, not implementation.)

This makes a lot of sense and suggests the possibility of 'schema enabled'
databases. That is, a switch 'bool withschemas' (which defaults to
false) could be added to pg_database. If true, the parser and ownership
model reflects that of SQL'99 and/or the Postgres schema model. If false,
the existing 'schema' model is assumed.

This should allow existing users to migrate their data and applications to
7.3 without having to modify either.

It's not an ideal solution, but backward compatibility generally results
in compromise ;).

Gavin



Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> We [add] a text[] field to pg_shadow and/or pg_database containing
> name=value assignments which are executed just before the session starts.
> Doesn't look terribly difficult, and it's something I've always wanted to
> do anyway.

Seems like a fine idea, with many uses besides this one.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Tom Lane writes:
> 
> > It'd be nice though if
> > the behavior could be changed after connection by a SET statement, which
> > would be lots easier if the setting were GUC-controlled.  Peter, you see
> > any way to resolve that?
> 
> We had a text[] field to pg_shadow and/or pg_database containing
> name=value assignments which are executed just before the session starts.
> Doesn't look terribly difficult, and it's something I've always wanted to
> do anyway.

So are you thinking of "dbname=schema_type"?  Seems this is really something
that should be in pg_database.  If you create a database, who wants to
edit postgresql.conf to set its default schema type?  Why not set the
GUC value from pg_database?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: RFD: schemas and different kinds of Postgres objects

From
Peter Eisentraut
Date:
Tom Lane writes:

> There could be multiple valid interpretations.  When you can't even
> figure out where to start, it's too squishy for me.  Code complexity
> isn't really the issue here, it's whether a user can understand what's
> going on.

Here's a tricky question:  In what situations is a.b valid to mean b(a)?
Because in a general object-like system you could write a.b.c.d to mean
d(c(b(a))).  There you've got a system where it's really impossible to
tell anything.  Maybe b() returns a table, so a.b.c.d could mean
subattribute d in column c in the table returned by b(a).

Somehow we need to do at least one of three things:
1. Require parentheses after function calls.
2. Use a different operator to invoke function calls (SQL uses ->).
3. Require users to register functions as "methods" with the data type
before being able to say a.b for b(a).  This also takes care of having to
specify the schema of b because that's declared when you define the
method.

SQL99 does 2 and 3 (but not 1).
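
To make the ambiguity concrete (table, column, and function names made
up; this is the Berkeley-derived behavior we have today):

    CREATE TABLE a (b int4, c int4);
    CREATE FUNCTION b(a) RETURNS int4
        AS 'SELECT $1.c + 1' LANGUAGE 'sql';

    SELECT a.b FROM a;    -- column b of a, or b(a)?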

I say, forget Oracle.  Oracle doesn't have all the extensibility
functionality that PostgreSQL has.  Let's build a system that is
consistent, orthogonal, and easy to use for *our* users, and those that
want to convert will quickly see the value.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> Here's a tricky question:  In what situations is a.b valid to mean b(a)?

I defined that in my first message on these issues: the last element
of a dotted-name string can be either a field name or a function
(which is applied to a table that's the next-to-last item).  The
next-to-last element is always a table name.

> Because in a general object-like system you could write a.b.c.d to mean
> d(c(b(a))).

Indeed, that can happen now in Postgres, and as I pointed out we have
to get rid of it.  That doesn't mean we need to eliminate the base case,
however.

> Somehow we need to do at least one of three things:
> 1. Require parentheses after function calls.

Breaks existing code unnecessarily.

> 2. Use a different operator to invoke function calls (SQL uses ->).

Breaks existing code unnecessarily.

> 3. Require users to register functions as "methods" with the data type
> before being able to say a.b for b(a).  This also takes care of having to
> specify the schema of b because that's declared when you define the
> method.

Doesn't buy you anything unless you intend to reject function
overloading too.  With overloading you may have multiple functions
b(something), so you still have to be able to determine what a is
without any context.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Bruce Momjian
Date:
Peter Eisentraut wrote:
> Tom Lane writes:
> 
> > It'd be nice though if
> > the behavior could be changed after connection by a SET statement, which
> > would be lots easier if the setting were GUC-controlled.  Peter, you see
> > any way to resolve that?
> 
> We had a text[] field to pg_shadow and/or pg_database containing
> name=value assignments which are executed just before the session starts.
> Doesn't look terribly difficult, and it's something I've always wanted to
> do anyway.

Sorry, I see what you are saying now, that the name=value pairs would be
set in pg_database and pg_shadow and get executed on session startup. 
Very good.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: RFD: schemas and different kinds of Postgres objects

From
"Zeugswetter Andreas SB SD"
Date:
> Tom Lane wrote:
> 
> > 
> > This would be a *whole* lot simpler if we forgot the notion of "any"
> > and made the search order look like
> > 
> >     (temp, private, public, system)

I am starting to see the advantages and like it. I also like the exact 
name "public" for the public schema.

Andreas


Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> I am starting to see the advantages and like it. I also like the exact 
>> name "public" for the public schema.

> I wonder if we should think about a 'group' area so people in a group
> can create things that others in the group can see, but not people
> outside the group.

I see no reason to hard-wire such a concept.  Given createable
namespaces, ACLs for namespaces, and a settable namespace search path,
people can set up group namespaces or anything else they want.

The (temp, private, public, system) path is suggested as default because
it's the minimum we need to support both SQL92 and backwards-compatible
behaviors.  I don't think we should put in special-purpose features
beyond that, when we can instead offer a general mechanism with which
people can build the special-purpose features they want.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Bruce Momjian
Date:
Zeugswetter Andreas SB SD wrote:
> 
> > Tom Lane wrote:
> > 
> > > 
> > > This would be a *whole* lot simpler if we forgot the notion of "any"
> > > and made the search order look like
> > > 
> > >     (temp, private, public, system)
> 
> I am starting to see the advantages and like it. I also like the exact 
> name "public" for the public schema.

I wonder if we should think about a 'group' area so people in a group
can create things that others in the group can see, but not people
outside the group.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026


Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Fri, 25 Jan 2002, Gavin Sherry wrote:

> This makes a lot of sense and suggests the possibility of 'schema enabled'
> databases. That is, a switch 'bool withschemas' (which defaults to
> false) could be added to pg_database. If true, the parser and ownership
> model reflects that of SQL'99 and/or the Postgres schema model. If false,
> the existing 'schema' model is assumed.
>
> This should allow existing users to migrate their data and applications to
> 7.3 without having to modify either.
>
> Its not an ideal solution but backward compatibility is generally results
> in compromise ;).

I guess my frustration with this idea is that we don't really need it. We
can achieve the same global namespace for an old app without it. All we
need is a tool which turns old dumps into new ones (which we probably need
anyway) that merges all of the schemas together w/ PATH statements. Or
maybe (new idea) a tool which looks at the schemas in a DB and updates
their PATHs so they act unified.

We can achieve the old behavior w/o having to build it into the backend.
So why add code to the backend when we don't have to? Among other things,
it would complicate the system schema as we'd have to keep track of
ownership values we wouldn't otherwise need to.

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
"Zeugswetter Andreas SB SD"
Date:
> > I am starting to see the advantages and like it. I also like the exact 
> > name "public" for the public schema.
> 
> I wonder if we should think about a 'group' area so people in a group
> can create things that others in the group can see, but not people
> outside the group.

A group simply chooses a special schema name for their group.

Maybe an extra in the ACL area so you can grant privs for a whole 
schema.

grant select on schema blabla to "JoeLuser"

Andreas


Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Thu, 24 Jan 2002, Bruce Momjian wrote:

> I think we could set the database default at db creation time, then
> allow SET to modify that default per session;  seems flexible enough.
> It is basically a GUC value who's default is stored in pg_database
> rather than postgresql.conf.   You could use postgresql.conf to set the
> default schema type at db creation time.

Specifically to the question of schema pathing, why would you want it to
be session-settable? Either your DB app is designed to work w/ schemas, or
it isn't. That's a pretty fundamental design concept. Given that, I don't
see how it can make sense to try to operate in the opposite mode as the
app was designed for - that'll only lead to chaos.

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
Bill Studenmund <wrstuden@netbsd.org> writes:
> Specifically to the question of schema pathing, why would you want it to
> be session-settable? Either your DB app is designed to work w/ schemas, or
> it isn't.

So that you can set the correct mode for your client application.  It is
silly to suppose that an installation-wide or even database-wide setting
is sufficient.  Consider for example a database shared by multiple
pieces of client software; wouldn't you like to be able to upgrade them
to schema-awareness one at a time?

You could possibly make a case for a single setting per user, but even
that makes an assumption (user == client software) that I think is not
reasonable for us to force on all Postgres installations.

Basically I haven't got a lot of patience for arguments that say we do
not need flexibility.  There are more people out there, using Postgres
in more different ways, than either you or I know about.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Fri, 25 Jan 2002, Tom Lane wrote:

> Bill Studenmund <wrstuden@netbsd.org> writes:
> > Specifically to the question of schema pathing, why would you want it to
> > be session-settable? Either your DB app is designed to work w/ schemas, or
> > it isn't.
>
> So that you can set the correct mode for your client application.  It is
> silly to suppose that an installation-wide or even database-wide setting
> is sufficient.  Consider for example a database shared by multiple
> pieces of client software; wouldn't you like to be able to upgrade them
> to schema-awareness one at a time?

What exactly does it mean to upgrade to schema-awareness? I know the gist
of what you mean, but what does it entail? What steps? I ask because, when
I think through the practical steps, an upgraded app won't have problems
with extra stuff pathed in. So the upgraded app will be fine with the
pathing set up to include all (not-upgraded) schemas. And if it does have
a problem with stuff pathed in, then since you can easily set the new apps
up to live in different schemas than the old ones, you can have both
upgraded and from-before schemas.

> You could possibly make a case for a single setting per user, but even
> that makes an assumption (user == client software) that I think is not
> reasonable for us to force on all Postgres installations.

But we will have the ability to set the path per schema. Since
schema-aware apps should be able to choose which schema they connect to (I
envision it being a connect parameter), the different apps can implicitly
get different behaviors by connecting to schemas that are designed to be
schema-savvy, or connecting to ones which aren't (i.e. have all of the
schema-unaware stuff pathed in).

> Basically I haven't got a lot of patience for arguments that say we do
> not need flexibility.  There are more people out there, using Postgres
> in more different ways, than either you or I know about.

Tom, please listen to what I'm saying. I'm trying to be as clear as I can,
and to make sure I'm not working from details that are in my head but not
in my posts. I'm sorry if it isn't clear, but I'm _not_ saying that we
don't need the flexibility you describe. We do.

I'm saying that IT IS ALREADY THERE! The pathing built into schemas can be
very powerful. Powerful enough that I haven't heard of an example yet that
can't be taken care of with judicious use of pathing. And I don't think
the pathing needed is beyond mid-level admins (I don't see it as something
which only say 5 people on the lists can get right). Yes, people will have
to learn it, but it doesn't strike me as that hard a thing.

What I am saying is that we don't need the solution you & Bruce mentioned
to get the flexibility you mentioned as the reason for adding it. So why
add the feature which isn't needed?

One of my objections to a "mode" supporting the old behavior is that, as I
understand it, there would be a schema ("public") where different users
could own objects in the same schema. That goes against one of the
advantages I see for schemas: we can consolidate ownership info in the
system tables. If you know what schema something is in, you know its
owner. That means that adding schema support doesn't mean growing system
tables, just renaming a column (the user id gets turned into the schema
id).

Maybe that's not such a big deal. But it seems when we're doing things
right, things should get cleaner. Having to keep ownership info at both
the schema level and at the object level strikes me as not making things
cleaner. That just seems to be going in the wrong direction.

Especially as, AFAICT, it wouldn't be hard to let the sysadmins have all
the flexibility you want them to have (and also that I agree they should
have) in a system which is, at its core, very schema-savvy (everything in
one schema is owned by the same user or group).

I also agree that migration is important. Apps from 7.2 (and 7.1 and
earlier where possible) should run on the schema-savvy backend I describe. A
migration tool to take the dump from before and add update schema commands
to path everything in (so it looks like one namespace) should make the old
apps keep working.

The one thing I'll concede could be useful would be for createuser to be
told to automatically set the new user's schema to include all the other
schemas, and to update all the other user schemas to add this user. That
way you can add new users to your DB when you're acting as if it didn't
have schemas.

Hmmm. If we made the above behavior a per-db-configurable default, the
pg_dump file wouldn't need to be changed. That would be good. It would
make the path updates O(nusers^2) rather than O(nusers), but that probably
won't be bad. And offering both options would probably be good.

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
Bill Studenmund <wrstuden@netbsd.org> writes:
> But we will have the ability to set the path per schema.

?? I don't follow that at all. A namespace is something that's referred
to by a search path, not vice versa.  Or are you defining "schema" to
mean some higher-level concept that incorporates a search path of
multiple primitive namespaces?  Maybe that could work, but I'm not sure
I see the point yet.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Fri, 25 Jan 2002, Tom Lane wrote:

> Bill Studenmund <wrstuden@netbsd.org> writes:
> > But we will have the ability to set the path per schema.
>
> ?? I don't follow that at all. A namespace is something that's referred
> to by a search path, not vice versa.  Or are you defining "schema" to
> mean some higher-level concept that incorporates a search path of
> multiple primitive namespaces?  Maybe that could work, but I'm not sure
> I see the point yet.

Oh. That would make a difference. We've been talking past each other.

SQL schemas, as I understand the spec, are both. A schema is a container
that holds things like tables and views and functions (and for PostgreSQL
operators and aggregates and I'd suggest index operators, etc.). It also
can include a schema path specification, which defines the search path
used by routines (stored procedures & functions) contained in that schema.

So say I have schemas foo, bar, and baz. I can set the schema path for
schema bar to be foo:bar:baz:IMPLEMENTATION_SCHEMA (*), and all routines in
bar will look in those four schemas for types, functions and tables (and
everything else we use the search path for).

(*) IMPLEMENTATION_SCHEMA is required by the spec, and contains all the
built-ins. It'd be implementation_schema for pg. Also, if you have a path
that doesn't list it, the db is supposed to prepend it to the list.
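
In (roughly) the spec's syntax that would look something like this --
purely illustrative, since PostgreSQL doesn't accept any of it yet:

    CREATE SCHEMA bar PATH foo, bar, baz, implementation_schema;

so routines defined inside bar resolve unqualified names against foo,
then bar, then baz, then the built-ins.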

So when migrating an app from a schema-unaware PostgreSQL to a
schema-aware one, if we create a schema for each user, and make each
such schema path in all the other such schemas, we make it such that all
of the procedures in those schemas act like they have a unified namespace.

There is also the concept of the CURRENT_PATH, which is the schema
path used for parsed queries (like ones typed into psql). I got lost in
the spec trying to find what this is supposed to default to, but what I
understand other DBs to do is your CURRENT_PATH is set to the path of the
schema you log into.

Add to this mix that the default schema for user X is schema X (which I thought
was in the spec but I can't find now), and let's look at that example
again.

Say we had users foo, bar and baz before. We made schemas foo, bar, and
baz. We set the default paths for each of these schemas to
foo:bar:baz:IMPLEMENTATION_SCHEMA. Now the routines in each of these
schemas will see a unified namespace. Next, when we log in as users foo,
bar, or baz, our CURRENT_PATH ends up including the namespaces of the
three original users. So now all of our submitted queries also see a
unified namespace.

So with a schema-savvy backend, by adding PATH statements to the schemas
that pull in all of the previous schemas, we can make the old app behave
as if it had a unified namespace.

Does that make sense?

Take care,

Bill

P.S. Does anyone need copies of the spec? I found PDFs on the web a while
back...



Re: RFD: schemas and different kinds of Postgres objects

From
Thomas Lockhart
Date:
...
> > Wouldn't it make sense to prefer operators/functions earlier in the search
> > path for resolving ambiguity.  So if you had plus(int4, int4) in my
> > schema and plus(int8, int8) in system, and they'd otherwise cause an
> > ambiguity failure for the query, use the plus(int4, int4) on mine. It
> > seems not too far from having the search path shadow later exact matches.
> Given the complexity of the resolution rules (cf.
> http://developer.postgresql.org/docs/postgres/typeconv.html),
> it's not clear that we can determine exactly which "later" entry ought
> to be blamed for causing a resolution failure.  I'd be interested to
> hear Lockhart's opinion on this --- but my gut feeling is we don't
> want to go there.  The resolution rules are already complicated enough,
> and I think layering an additional mechanism like that onto them might
> make the behavior totally unpredictable.

(I've been following the discussion; I suspect that this part may
already have an obvious answer since "any" scoping -- equivalent to
flattening the namespace? -- may now be out of favor; I'm assuming that
we have a clearly scoped lookup scheme available).

imho there is nothing fundamentally difficult or "unpredictable" about
layering schema lookup on to the existing function resolution rules. One
might want a bit better diagnostics about *which* function was actually
chosen, but reasonable scoping and lookup rules could be constructed
which give reasonable behavior with the addition of schemas.

For example, the current function resolution rules prefer an exact
match, then start looking for approximate matches, and narrow that down
to preferring the one with the best explicit match on data types. If
more than one matches, then it rejects the query. (I've left out one or
two steps, but on the whole this is the behavior that matters.)

With schemas, one could choose to use "closest schema" as the tiebreaker
for multiple matches, but istm that an exact match should always win.
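
For instance (the schema names and the qualified CREATE FUNCTION syntax
are hypothetical), with a search path of (myschema, system):

    CREATE FUNCTION myschema.plus(int4, int4) RETURNS int4
        AS 'SELECT $1 + $2' LANGUAGE 'sql';
    CREATE FUNCTION system.plus(int8, int8) RETURNS int8
        AS 'SELECT $1 + $2' LANGUAGE 'sql';

    SELECT plus(1, 2);              -- exact match: myschema.plus(int4, int4)
    SELECT plus(1::int8, 2::int8);  -- exact match still wins: system.plus,
                                    -- even though myschema is earlier in the path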

We might want to include a mechanism that *blocks* schema lookups deeper
into the search path, to allow reliable *complete replacement* of a
function. This would be a property of the function, to be set when it is
defined in the schema. So an implementer could choose to restrict
lookups explicitly if that is deemed necessary. Again, this is not a
huge complication.

It is an interesting discussion, and the fine points will not be brought
out without having lots of back-and-forth, which seems to be happening
already ;)
                 - Thomas


Re: RFD: schemas and different kinds of Postgres objects

From
Hiroshi Inoue
Date:
Is *the path* below the same as *search path* in other
postings in this thread?

Maybe Peter's posting isn't exactly the one I should be
asking about, but there are too many postings for me to follow.

regards,
Hiroshi Inoue

Peter Eisentraut wrote:
> 
> Bill Studenmund writes:
> 
> > Does SQL'99 say anything about this?
> 
> Yes, though, as usual, you have to twist your brain a little to understand
> it.  My understanding is that for a function call of the form "foo(a, b)"
> it goes like this:
> 
> 1. Find all functions named "foo" in the current database.  This is the
> set of "possibly candidate routines".
> 
> 2. Drop all routines that you do not have EXECUTE privilege for.  This is
> the set of "executable routines".
> 
> 3. Drop all routines that do not have compatible parameter lists.  This is
> the set of "invocable routines".
> 
> 4. Drop all routines whose schema is not in the path.  This is the set of
> "candidate routines".
> 
> 5. If you have more than one routine left, eliminate some routines
> according to type precedence rules.  (We do some form of this, SQL99
> specifies something different.)  This yields the set of "candidate subject
> routines".
> 
> 6. Choose the routine whose schema is earliest in the path as the "subject
> routine".
> 
> Execute the subject routine.  Phew!
> 
> This doesn't look glaringly wrong to me, so maybe you want to consider it.
> Please note step 2.
> 
> --
> Peter Eisentraut   peter_e@gmx.net


Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Mon, 28 Jan 2002, Hiroshi Inoue wrote:

> Is *the path* below the same as "search path* in other
> postings about this thread ?

I think so. I believe the path I've been talking about is the one in step
6 below.

Take care,

Bill

> Maybe Peter's posting isn't the one exactly what I have to
> ask but there are too many postings for me to follow.
>
> regards,
> Hiroshi Inoue
>
> Peter Eisentraut wrote:
> >
> > Bill Studenmund writes:
> >
> > > Does SQL'99 say anything about this?
> >
> > Yes, though, as usual, you have to twist your brain a little to understand
> > it.  My understanding is that for a function call of the form "foo(a, b)"
> > it goes like this:
> >
> > 1. Find all functions named "foo" in the current database.  This is the
> > set of "possibly candidate routines".
> >
> > 2. Drop all routines that you do not have EXECUTE privilege for.  This is
> > the set of "executable routines".
> >
> > 3. Drop all routines that do not have compatible parameter lists.  This is
> > the set of "invocable routines".
> >
> > 4. Drop all routines whose schema is not in the path.  This is the set of
> > "candidate routines".
> >
> > 5. If you have more than one routine left, eliminate some routines
> > according to type precedence rules.  (We do some form of this, SQL99
> > specifies something different.)  This yields the set of "candidate subject
> > routines".
> >
> > 6. Choose the routine whose schema is earliest in the path as the "subject
> > routine".
> >
> > Execute the subject routine.  Phew!
> >
> > This doesn't look glaringly wrong to me, so maybe you want to consider it.
> > Please note step 2.
> >
> > --
> > Peter Eisentraut   peter_e@gmx.net
>



Re: RFD: schemas and different kinds of Postgres objects

From
Hiroshi Inoue
Date:
Bill Studenmund wrote:
> 
> On Mon, 28 Jan 2002, Hiroshi Inoue wrote:
> 
> > Is *the path* below the same as "search path* in other
> > postings about this thread ?
> 
> I think so. I believe the path I've been talking about is the one in step
> 6 below.

What I can find in SQL99 is SQL-path.
Does *the path* (i.e. search path) mean SQL-path?
They don't seem the same to me.

regards,
Hiroshi Inoue


Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Tue, 29 Jan 2002, Hiroshi Inoue wrote:

> Bill Studenmund wrote:
> >
> > On Mon, 28 Jan 2002, Hiroshi Inoue wrote:
> >
> > > Is *the path* below the same as "search path* in other
> > > postings about this thread ?
> >
> > I think so. I believe the path I've been talking about is the one in step
> > 6 below.
>
> What I can find in SQL99 is SQL-path.
> Does *the path*(i.e search path) mean SQL-path ?
> They don't seem the same to me.

While we may not have been using the terminology of the spec, I think we
have been talking about schema paths from SQL99.

One difference between our discussions and SQL99 I've noticed is that
we've spoken of having the path find functions (and operators and
aggregates), types, _and_tables_. SQL99 doesn't have tables in there
AFAICT, but I think it makes sense.

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Hiroshi Inoue
Date:
Bill Studenmund wrote:
> 
> On Tue, 29 Jan 2002, Hiroshi Inoue wrote:
> 
> > What I can find in SQL99 is SQL-path.
> > Does *the path*(i.e search path) mean SQL-path ?
> > They don't seem the same to me.
> 
> While we may have not been using the terminology of the spec, I think we
> have been talking about schema paths from SQL99.
> 
> One difference between our discussions and SQL99 I've noticed is that
> we've spoken of having the path find functions (and operators and
> aggregates), types, _and_tables_.

My understanding is the same.
Tom, Peter is it right ?

> SQL99 doesn't have tables in there
> AFAICT, but I think it makes sense.

It seems to make sense, but they are different, and
our *path* is never an extension of SQL-path.
Where is the difference, or the relevance, referred
to in this thread?

regards,
Hiroshi Inoue


Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Wed, 30 Jan 2002, Hiroshi Inoue wrote:

> Bill Studenmund wrote:
> >
> > On Tue, 29 Jan 2002, Hiroshi Inoue wrote:
> > SQL99 doesn't have tables in there
> > AFAICT, but I think it makes sense.
>
> It seems to make sense but they are different and
> our *path* is never an extension of SQL-path.
> Where are the difference or the relevance referred
> to in this thread ?

How is our path not an extension of SQL-path? Or at least how is the path
I've been pushing not an SQL-path?

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
[ just catching up on this thread after a couple days thinking about
other things ]

Stephan Szabo <sszabo@megazone23.bigpanda.com> writes:
> AFAIK there's no way to specify that I want to make the function
> complex(integer) such that I can do CAST(1 as complex) but not as an
> implicit cast.

You may have forgotten that I recently suggested adding just such a
feature, ie a boolean flag on pg_proc rows to indicate whether they can
be considered for implicit casts.  I think we'd agreed that it would be
a good thing to do in 7.3.
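
Purely to illustrate the idea (the attribute name, the WITH syntax, and
the complex type with its two-argument constructor are all invented here;
nothing is decided):

    CREATE FUNCTION complex(integer) RETURNS complex
        AS 'SELECT complex($1, 0.0)' LANGUAGE 'sql'
        WITH (implicit_coercion = false);  -- usable by CAST(1 AS complex),
                                           -- never applied silently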

However, that doesn't bear very much on the general argument of the
thread.  The bottom line is that we've put a whole lot of sweat into
developing rules for resolving ambiguous operator and function calls,
and I don't think we're going to be willing to toss all that effort into
the scrap heap.  But making namespace search order the dominant factor
in choosing a function/operator (as Bill seems to want) would certainly
break all that carefully-crafted effort.  If we put the system namespace
at the front of the search list then users would be unable to override
standard operators with schema-local substitutes; clearly that's no
good.  But if we put it at the back, then a schema-local user operator
would dominate all system entries of the same operator name, even for
quite different types, and thereby it would break the resolution
behavior.

So I'm still of the opinion that my original suggestion is the only
workable one: collect candidates across all available namespaces,
discarding only those that are exact matches to candidates in earlier
namespaces, and then apply the existing resolution rules to the
collection.  AFAICS this need not be any slower than what we do now,
if the catalog is set up so that we can collect candidates in one
indexscan without regard to namespace.  The case where there actually
are any exact matches to discard should be uncommon, so we can deal with
it later on in the resolution process.
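
To make that concrete (names invented), with a path of (private, public):

    private.sqrt(float8)   vs.  public.sqrt(float8)
        -- exact duplicate: the public copy is discarded, so private.sqrt
        -- shadows it
    private.sqrt(numeric)  vs.  public.sqrt(float8)
        -- not duplicates: both stay in the candidate set, and the existing
        -- resolution rules decide which one sqrt(2.0) ends up calling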
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Wed, 30 Jan 2002, Tom Lane wrote:

> [ just catching up on this thread after a couple days thinking about
> other things ]
>
> However, that doesn't bear very much on the general argument of the
> thread.  The bottom line is that we've put a whole lot of sweat into
> developing rules for resolving ambiguous operator and function calls,
> and I don't think we're going to be willing to toss all that effort into
> the scrap heap.  But making namespace search order the dominant factor
> in choosing a function/operator (as Bill seems to want) would certainly
> break all that carefully-crafted effort.  If we put the system namespace
> at the front of the search list then users would be unable to override
> standard operators with schema-local substitutes; clearly that's no
> good.  But if we put it at the back, then a schema-local user operator
> would dominate all system entries of the same operator name, even for
> quite different types, and thereby it would break the resolution
> behavior.

I've changed my mind. :-)

> So I'm still of the opinion that my original suggestion is the only
> workable one: collect candidates across all available namespaces,
> discarding only those that are exact matches to candidates in earlier
> namespaces, and then apply the existing resolution rules to the
> collection.  AFAICS this need not be any slower than what we do now,
> if the catalog is set up so that we can collect candidates in one
> indexscan without regard to namespace.  The case where there actually
> are any exact matches to discard should be uncommon, so we can deal with
> it later on in the resolution process.

Sounds like the thing to do, and it matches the spec. :-)

Oh, you can make a path with your namespace before the built-in one. It's
just that if you don't include the built-in one (IMPLEMENTATION_SCHEMA) in
a path, you're supposed to prepend it to the specified list.

Take care,

Bill




Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> Bill Studenmund wrote:
>> While we may have not been using the terminology of the spec, I think we
>> have been talking about schema paths from SQL99.
>> 
>> One difference between our discussions and SQL99 I've noticed is that
>> we've spoken of having the path find functions (and operators and
>> aggregates), types, _and_tables_.

> My understanding is the same.
> Tom, Peter is it right ?

SQL99's SQL-path is very clearly stated to be used only for looking up
routines and user-defined type names.  Extending it to cover tables,
operators, and so forth makes sense to me, but we have to recognize
that it is a spec extension and therefore not all the answers we need
can be found in the spec.

I also find it curious that they exclude standard type names from the
search path.  It would seem obvious to treat the standard type names
as included in a schema that is part of the search path, but AFAICT
this is not done in the spec.  Postgres *has to* do it that way,
however, or give up our whole approach to datatypes; surely we don't
want to hardwire the SQL-standard datatypes into the parser to the
exclusion of the not-so-standard ones.

IMHO, the spec's artificial distinction between system and user types
limits its usefulness as a guide to the questions we're debating here.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Wed, 30 Jan 2002, Tom Lane wrote:

> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > Bill Studenmund wrote:
> >> While we may have not been using the terminology of the spec, I think we
> >> have been talking about schema paths from SQL99.
> >>
> >> One difference between our discussions and SQL99 I've noticed is that
> >> we've spoken of having the path find functions (and operators and
> >> aggregates), types, _and_tables_.
>
> > My understanding is the same.
> > Tom, Peter is it right ?
>
> SQL99's SQL-path is very clearly stated to be used only for looking up
> routines and user-defined type names.  Extending it to cover tables,
> operators, and so forth makes sense to me, but we have to recognize
> that it is a spec extension and therefore not all the answers we need
> can be found in the spec.

True. I think that extending the path to be used for operators and
aggregates makes sense as they are special types of function calls. The
searching for tables might need to be a configurable parameter (defaulting
to yes), though. I think it makes sense to do, but I can imagine cases
where apps need to not.

> I also find it curious that they exclude standard type names from the
> search path.  It would seem obvious to treat the standard type names
> as included in a schema that is part of the search path, but AFAICT
> this is not done in the spec.  Postgres *has to* do it that way,
> however, or give up our whole approach to datatypes; surely we don't
> want to hardwire the SQL-standard datatypes into the parser to the
> exclusion of the not-so-standard ones.
>
> IMHO, the spec's artificial distinction between system and user types
> limits its usefulness as a guide to the questions we're debating here.

True.

Does SQL99 support types as flexible as the ones we do? I know types in
Oracle are basically special cases of already built-in ones...

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Hiroshi Inoue
Date:
Bill Studenmund wrote:
> 
> On Wed, 30 Jan 2002, Hiroshi Inoue wrote:
> 
> > Bill Studenmund wrote:
> > >
> > > On Tue, 29 Jan 2002, Hiroshi Inoue wrote:
> > > SQL99 doesn't have tables in there
> > > AFAICT, but I think it makes sense.
> >
> > It seems to make sense but they are different and
> > our *path* is never an extension of SQL-path.
> > Where are the difference or the relevance referred
> > to in this thread ?
> 
> How is our path not an extention of SQL-path? Or at least how is the path
> I've been pushing not an SQL-path?

IMHO _table_-like objects must fundamentally be guarded
from such a search mechanism. I don't object to
the use of our *path*, but it should be distinguished
from SQL-path.

For example, the PATH environment variable is used
only to search for executables, not files. Is it
preferable for *rm a_file* to search all the directories
in the PATH? If the purpose is different, a different
*path* is needed, of course.

regards,
Hiroshi Inoue


Re: RFD: schemas and different kinds of Postgres objects

From
Hiroshi Inoue
Date:
Tom Lane wrote:
> 
> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > Bill Studenmund wrote:
> >> While we may have not been using the terminology of the spec, I think we
> >> have been talking about schema paths from SQL99.
> >>
> >> One difference between our discussions and SQL99 I've noticed is that
> >> we've spoken of having the path find functions (and operators and
> >> aggregates), types, _and_tables_.
> 
> > My understanding is the same.
> > Tom, Peter is it right ?
> 
> SQL99's SQL-path is very clearly stated to be used only for looking up
> routines and user-defined type names.  Extending it to cover tables,
> operators, and so forth makes sense to me,

I have no objection to the point that it makes sense to use
such *path*s internally, but I think it is also significant
that SQL-path does not look up _table_-like objects.
I think they have been different from the start, and we should
not (need not) manage the system with one *path*.

BTW I see few references to *catalog*. Would the concept
of catalog be introduced as well? If so, what would be
contained in the current database?

regards,
Hiroshi Inoue


Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> I have no objection to the point it makes sense to use
> such *path*s internally but I think it also has a siginificance
> for SQL-path to not look up _tables_like objects. 
> I think they are different from the first and we should(need)
> not manage the system with one *path*.

I'm unconvinced.  We must search for datatypes and tables on the same
path because tables have associated datatypes; it will definitely not
do to look for a table's datatype and get the wrong type.  And I think
that functions and operators should be looked for on the same path
as datatypes, because a type should be pretty closely associated with
the functions/operators for it.  So it seems to me that the apparent
flexibility of having more than one path is just a way to shoot yourself
in the foot.  Why are you concerned that we keep them separate?

> BTW I see few references to *catalog*. Would the concept
> of catalog be introduced together. If so what would be
> contained in the current database.

My thought is that we will consider catalog == database.  As far as
I can tell, that is a legitimate implementation-defined way of
interpreting the spec.  (It's not clear to me what the value is of
having more than one level of schema hierarchy; or at least, if you want
hierarchical namespaces, there's no argument for stopping at depth two.
But I digress.)  To satisfy the spec we must allow a (purely decorative)
specification of the current database name as the catalog level of a
qualified name, but that's as far as I want to go.  In this round,
anyway.  Cross-database access is not something to tackle for 7.3.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Hiroshi Inoue
Date:
Tom Lane wrote:
> 
> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > I have no objection to the point it makes sense to use
> > such *path*s internally but I think it also has a siginificance
> > for SQL-path to not look up _tables_like objects.
> > I think they are different from the first and we should(need)
> > not manage the system with one *path*.
> 
> I'm unconvinced.  We must search for datatypes and tables on the same
> path because tables have associated datatypes;

Isn't the table definition a part of the datatype in
such a case ?

> it will definitely not
> do to look for a table's datatype and get the wrong type.  And I think
> that functions and operators should be looked for on the same path
> as datatypes, because a type should be pretty closely associated with
> the functions/operators for it.  So it seems to me that the apparent
> flexibility of having more than one path is just a way to shoot yourself
> in the foot.  Why are you concerned that we keep them separate?

For example, doesn't 'DROP table a_table' drop the 
a_table table in a schema in the *path* if there's
no a_table table in the current schema ?

If we were never going to introduce SQL-paths (in the future)
there would be no problem.

regards,
Hiroshi Inoue


Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> Just a confirmation.
> We can't see any catalog.schema.object notation in 7.3,
> can we ?

No, what I meant was you could write catalog.schema.object --- but for
7.3, the system will only accept it if the catalog name is the same as
the current database.  This satisfies the minimum requirements of the
spec, and it leaves notational room to use the catalog name as the cue
for cross-database access, if we ever decide we want to try to do that.
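
For example (names invented), while connected to database mydb:

    SELECT * FROM mydb.myschema.mytab;      -- accepted: catalog name matches
                                            -- the current database
    SELECT * FROM otherdb.myschema.mytab;   -- rejected: cross-database access
                                            -- isn't supported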
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Tom Lane
Date:
Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> For example, doesn't 'DROP table a_table' drop the 
> a_table table in a schema in the *path* if there's
> no a_table table in the current schema ?

Sure.  And that's exactly what it should do, IMHO.
Otherwise the notion that you can ignore your private
schema (at the front of the path) if you're not using
it falls down.  Also, we wouldn't be able to implement
temp tables via a backend-local schema at the front of
the path.
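
For instance (names invented), with a path of (temp, private, public)
and a_table existing only in public, either of

    DROP TABLE a_table;           -- found via the path, drops public.a_table
    DROP TABLE public.a_table;    -- schema-qualified, if you want to be explicit

drops the same table.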

Any security concerns here should be addressed by putting
ACLs on the schemas you don't want altered; not by contorting
the notion of a search path to work for some operations and
not others.
        regards, tom lane


Re: RFD: schemas and different kinds of Postgres objects

From
Hiroshi Inoue
Date:
Tom Lane wrote:
> 
> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> 
> > BTW I see few references to *catalog*. Would the concept
> > of catalog be introduced together. If so what would be
> > contained in the current database.
> 
> My thought is that we will consider catalog == database.  As far as
> I can tell, that is a legitimate implementation-defined way of
> interpreting the spec.  (It's not clear to me what the value is of
> having more than one level of schema hierarchy; or at least, if you want
> hierarchical namespaces, there's no argument for stopping at depth two.
> But I digress.)  To satisfy the spec we must allow a (purely decorative)
> specification of the current database name as the catalog level of a
> qualified name, but that's as far as I want to go.  In this round,
> anyway.  Cross-database access is not something to tackle for 7.3.

Just a confirmation.
We can't see any catalog.schema.object notation in 7.3,
can we ?

regards,
Hiroshi Inoue


Re: RFD: schemas and different kinds of Postgres objects

From
Hiroshi Inoue
Date:
Tom Lane wrote:
> 
> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > For example, doesn't 'DROP table a_table' drop the
> > a_table table in a schema in the *path* if there's
> > no a_table table in the current schema ?
> 
> Sure.  And that's exactly what it should do, IMHO.
> Otherwise the notion that you can ignore your private
> schema (at the front of the path) if you're not using
> it falls down.  Also, we wouldn't be able to implement
> temp tables via a backend-local schema at the front of
> the path.

I don't think it's useful for tables other than temp
ones and I wouldn't use it other than for temp ones.

When we type 'rm a_file' in a shell environment,
does the *rm* command search the PATH to find
the a_file file? Even if we needed to implement
such a search mechanism, we would use another path,
different from the executable search PATH. I don't
think our *path* is an extension of SQL-path.

I wouldn't complain unless we called the *path*
SQL-path or an extension of SQL-path.

regards,
Hiroshi Inoue


Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Thu, 31 Jan 2002, Hiroshi Inoue wrote:

> Tom Lane wrote:
> >
> > Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > SQL99's SQL-path is very clearly stated to be used only for looking up
> > routines and user-defined type names.  Extending it to cover tables,
> > operators, and so forth makes sense to me,
>
> I have no objection to the point it makes sense to use
> such *path*s internally but I think it also has a siginificance
> for SQL-path to not look up _tables_like objects.
> I think they are different from the first and we should(need)
> not manage the system with one *path*.

I'm confused. Are you suggesting multiple paths? i.e. a function/type path
and a table one?

I think calling our path an SQL path is fine. Yes, we extend it by using
it for tables too, but it strikes me as still fundamentally an SQL path.
So I don't see why we should not call it that.

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Thu, 31 Jan 2002, Hiroshi Inoue wrote:

> Tom Lane wrote:
> >
> > Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > > For example, doesn't 'DROP table a_table' drop the
> > > a_table table in a schema in the *path* if there's
> > > no a_table table in the current schema ?
> >
> > Sure.  And that's exactly what it should do, IMHO.
> > Otherwise the notion that you can ignore your private
> > schema (at the front of the path) if you're not using
> > it falls down.  Also, we wouldn't be able to implement
> > temp tables via a backend-local schema at the front of
> > the path.
>
> I don't think it's useful for tables other than temp
> ones and I wouldn't use it other than for temp ones.

I agree.

> When we type 'rm a_file' in a shell environment
> does the *rm* command search the PATH in finding
> the a_file file ? Even though we need to implement
> such a search mechanism we would use another path
> different from the executable search PATH. I don't
> think our *path* is an extension of SQL-path.
>
> I wouldn't complain unless we call the *path*
> as SQL-path or an extension of SQL-path.

I still don't get this. The path we're talking about is the same thing
(with the same environment name and operational syntax) as SQL-paths,
except that we use it to find tables too. Why does that make it not an SQL
path?

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Wed, 30 Jan 2002, Tom Lane wrote:

> Hiroshi Inoue <Inoue@tpf.co.jp> writes:
> > For example, doesn't 'DROP table a_table' drop the
> > a_table table in a schema in the *path* if there's
> > no a_table table in the current schema ?
>
> Sure.  And that's exactly what it should do, IMHO.
> Otherwise the notion that you can ignore your private
> schema (at the front of the path) if you're not using
> it falls down.  Also, we wouldn't be able to implement
> temp tables via a backend-local schema at the front of
> the path.

Well, I disagree on this one. :-) I'd vote drop should need a specific
schema if it's not the current one. But I won't push the point. :-)

> Any security concerns here should be addressed by putting
> ACLs on the schemas you don't want altered; not by contorting
> the notion of a search path to work for some operations and
> not others.

I'm not so concerned about security as being sure of operator intent. ACLs
address security (and should be used), but they don't address making sure
you delete exactly what you wanted.

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Bill Studenmund
Date:
On Thu, 31 Jan 2002, Hiroshi Inoue wrote:

> > it will definitely not
> > do to look for a table's datatype and get the wrong type.  And I think
> > that functions and operators should be looked for on the same path
> > as datatypes, because a type should be pretty closely associated with
> > the functions/operators for it.  So it seems to me that the apparent
> > flexibility of having more than one path is just a way to shoot yourself
> > in the foot.  Why are you concerned that we keep them separate?
>
> For example, doesn't 'DROP table a_table' drop the
> a_table table in a schema in the *path* if there's
> no a_table table in the current schema ?
>
> If we would never introduce SQL-paths (in the future)
> there would be problem.

??

We're talking about adding them now. Why would we add them twice?

Take care,

Bill



Re: RFD: schemas and different kinds of Postgres objects

From
Hiroshi Inoue
Date:
Bill Studenmund wrote:
> 
> On Thu, 31 Jan 2002, Hiroshi Inoue wrote:
> >
> > I wouldn't complain unless we call the *path*
> > as SQL-path or an extension of SQL-path.
> 
> I still don't get this. The path we're talking about is the same thing
> (with the same envirnment name and operational syntax) as SQL-paths,
> except that we use it to find tables too. Why does that make it not an SQL
> path?

I don't think it's always good to follow the standard.
However, it's very wrong to change the meaning of words
in the standard. It seems impossible to introduce SQL-path
using our *path*. The *path* is PostgreSQL-specific, and
it would be configurable for us to be SQL99-compatible
(without SQL-path) or SQL99-incompatible using the *path*.

regards,
Hiroshi Inoue