RFD: schemas and different kinds of Postgres objects - Mailing list pgsql-hackers

From Tom Lane
Subject RFD: schemas and different kinds of Postgres objects
Date
Msg-id 11381.1011648592@sss.pgh.pa.us
Whole thread Raw
Responses Re: RFD: schemas and different kinds of Postgres objects
Re: RFD: schemas and different kinds of Postgres objects
List pgsql-hackers
Continuing to think about implementing SQL schemas for 7.3 ...

Today's topic for discussion: which types of Postgres objects should
belong to schemas, and which ones should have other name scopes?

Relations (tables, indexes, views, sequences) clearly belong to schemas.
Since each relation has an associated datatype with the same name, it
seems that datatypes must belong to schemas as well.  (Even if that
argument doesn't convince you, SQL99 says that user-defined datatypes
belong to schemas.)  However the situation is murkier for other kinds of
objects.

Here are all the kinds of named objects that exist in Postgres today,
with some comments on whether they should belong to schemas or not:

relations        Must be in schemas
types            Must be in schemas
databases        Databases contain schemas, not vice versa
users            Users are cross-database, so not in schemas
groups            User groups are cross-database, so not in schemas
languages        Probably should not be in schemas
access methods        Probably should not be in schemas
opclasses        See below
operators        See below
functions/procedures    See below
aggregates        Should treat same as regular functions
constraints        See below
rules            See below
triggers        See below
NOTIFY conditions    See below

Languages and access methods are not trivial to add to the system, so
there's not much risk of name conflicts, and no reason to make their name
scope less than global.

The situation is a lot murkier for operators and functions.  These should
probably be treated alike, since operators are just syntactic sugar for
functions.  I think the basic argument for making them schema-local is
that different users might conceivably want to define conflicting
functions or operators of the same name.  Against that, however, there
are a number of reasons for wanting to keep these objects database-wide.
First off there are syntactic problems.  Do you really want to write    A schemaname.+ B
to qualify an ambiguous "+" operator?  Looks way too much like a syntax
error to me.  Allowing this would probably turn a lot of simple syntax
errors into things that get past the grammar and end up producing truly
confusing error messages.  Qualified function names also pose some
problems, not so much with    schemaname.function(args)
which seems reasonable, but with the Berkeley-derived syntax that allows
"foo.function" to mean "function(foo)" --- there's no way to squeeze a
schema-name for the function into that.  (And you'll recall from my note
of the other day that we don't want to abandon this syntax entirely,
because people would like us to support "sequencename.nextval" for Oracle
compatibility.)  Notice that we are not forced to make functions/operators
schema-local just because datatypes are, because overloading will save the
day.  func(schema1.type1) and func(schema2.type1) are distinct functions
because the types are different, even if they live in the same function
namespace.  Finally, SQL99 doesn't appear to think that operator and
function names are schema-local; though that may just be because it hasn't
got user-defined operators AFAICT.

I am leaning towards keeping functions/operators database-wide, but would
like to hear comments.  Is there any real value in, eg, allowing different
users to define different "+" operators *on the same datatypes*?

Not sure about index opclasses.  Given that datatype names are
schema-local, one can think of scenarios where two users define similar
datatypes and then try to use the same index opclass name for both.
But it seems pretty unlikely.  I'd prefer to leave opclass names
database-wide for simplicity.  Comments?

As for constraints, currently we only support table-level constraints,
and we do not enforce any uniqueness at all on their names; multiple
constraints for the same table can have the same name (and if so, ALTER
TABLE DROP CONSTRAINT drops all matching names).  SQL92 requires named
constraints to have names that are unique within their schema, which is
okay for standalone assertions (which we haven't got) but seems quite
unnecessary for constraints attached to tables.  And what's really odd,
it appears to allow a table constraint to belong to a different schema
than the table it is on!  This is pretty bogus.  I'd prefer to ignore the
part of the spec that says that table constraint names can be qualified
names, and either keep our existing behavior or require constraint names
to be unique per-table.  Thoughts?

Rewrite rules are currently required to have a name unique within their
database.  We clearly don't want that to still be true in the schema
environment.  Personally I would like to make rules' names unique only
among rules on the same table (like we handle triggers).  That would
mean an incompatible change in the syntax of DROP RULE: it'd have to
become DROP RULE rule ON table, much like DROP TRIGGER.  Is that okay?
If not, probably we must make rulenames local to schemas and say they
implicitly belong to the schema of the associated table.

Triggers are already handled as being named uniquely among the triggers
of the same table.  This behavior is fine with me, and doesn't need to
be changed for schema support.

I can see some advantage to considering NOTIFY condition names to be local
to a schema, but I can also see risks of breaking existing applications.
Currently, "NOTIFY foo" will signal to "LISTEN foo" even if the two
processes belong to different users.  With an implicit schema qualifier
attached to foo, very likely this would fail to work.  Since NOTIFY names
aren't officially registered anywhere, the implicit qualifier would have
to correspond to the front schema of one's schema search path, and there'd
be no way for such processes to communicate if their search paths didn't
match.  I think people would end up always qualifying NOTIFY names with
a single schema name, which means we might as well continue to consider
them global.  On the other hand, if you assume that NOTIFY names are often
the names of tables, it'd make sense to allow them to be qualified.  Any
thoughts on this?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: adding regression tests for PLs
Next
From: Bruce Momjian
Date:
Subject: Re: [GENERAL] PostgreSQL Licence: GNU/GPL