Thread: Re: schema support, was Package support for Postgres

Re: schema support, was Package support for Postgres

From
Bill Studenmund
Date:
On Wed, 24 Oct 2001, Peter Eisentraut wrote:

> Bill Studenmund writes:
>
> > So I am a "naive" programmer because I mention intent above?
>
> No.

Sorry, that's the way it came across. As you've said that was not your
intent, please disregard my response; I was responding to something you
did not mean.

> > So if we have INFORMATION_SCHEMA with the right vies in it, we are fine
> > doing whatever we want.
>
> I think some interpretation of the SQL standard can be used to prove that
> a new schema should not contain any objects.  So you're going to have to
> stick to the two predefined schemas to put the system catalogs in.  Then
> again, other interpretations may be used to prove other things.  But to me
> the intent of the standard is clear that system catalogs are meant to go
> into the defintion schema, and I don't see a reason why this could not be
> so.

I had been thining that we could have the built-in objects (functions,
types, operators, etc.) in whatever was the "default.master" package, but
it looks like SQL99 doesn't like that. You're right that built-in things
have to be in a different schema than user-added things.

Section 10.4 contains text:

ii) If RN contains a <schema name> SN, then

Case:

1) If SN is INFORMATION_SCHEMA, then the single candidate routine of RI is
the built-in function identified by <routine name>.

Actually 4.24 is more exact. It defines a built-in function as a routine
which is returned from teh query:

SELECT DISTINCT ROUTINE_NAME
FROM INFORMATION_SCHEMA.ROUTINES
WHERE SPECIFIC_SCHEMA = INFORMATION_SCHEMA

Actually, since we have to have an INFORMATION_SCHEMA, and
"INFORMATION_SCHEMA" gets thrown around a lot, I think it'd be easiest to
make "INFORMATION_SCHEMA" the schema containing built-in things. Otherwise
(among other things) we have to replace DEFINTION_SCHEMA with
INFORMATION_SCHEMA in the above-defined view (and in a lot of other
places).

Thoughts?

> > I stil think we can't do that, since someone other than the schema owner
> > can add a package to a schema. :-) Or at least that's the assumption I'm
> > running on; we allow users other than PGUID to create functions (and
> > operators and aggregates and types) in the default (whatever it will be
> > called) schema, so why shouldn't they be allowed to add packages?
>
> Because SQL says so.  All objects in a schema belong to the owner of the
> schema.  In simple setups you have one schema per user with identical
> names.  This has well-established use patterns in other SQL RDBMS.

Then implimenting schemas will cause a backwards-incompatabile change
regarding who can add/own functions (and operators and ..).

Mainly because when we introduce schemas, all SQL transactions will have
to be performed in the context of *some* schema. I think "DEFAULT" was the
name you mentioned for when there was no schema matching the username. As
"DEFAULT" (or whatever we call it) will be made by the PG super user (it
will actually be added as part of initdb), then that means that only the
super user will own functions. That's not how things are now, and imposing
that on upgrading users will likely cause pain.

Think about a dump/restore upgrade from 7.2 to 7.3. Right now users other
than PGUID can own functions (and triggers, etc.). When you do the
restore, though, since your dump had no schema support, it all goes into
DEFAULT. Which will be owned by PGUID. So now we either have a schema with
things owned by a user other than the schema owner, or we have a broken
restore.

Or we have to special case the DEFAULT schema. Which strikes me as a bad
thing to do.

For now, I'd suggest letting users other than a schema owner own things in
a schema, and later on add controls over who can add things to a schema.
Then when you do a "CREATE SCHEMA" command, you will implicitly be adding
restrictions prohibiting someone other than the owner from adding things
(including packages/subschemas).

> I agree that this might not be what everyone would want, but it seems
> extensible.  However, I feel we're trying to design too many things at
> once.  Let's do schemas first the way they're in the SQL standard, and
> then we can try to tack on ownership or subschemas or package issues.

Well, the packages changes can easily be turned into schema support for
functions and aggregates, so we are part way there. Also, the packages
changes illustrate how to make system-wide internal schema changes of the
type adding SQL schemas will need. Plus, packages as they are now are
useful w/o schema support.

And there's the fact that schemas were wanted for 7.2, and didn't happen.
Withouth external adgitation, will they happen for 7.3? Given the size of
the job, I understand why they didn't happen (the package changes so far
represent over 3 months of full-time programming). We've got some momentum
now, I'd say let's run with it. :-)

Take care,

Bill



Re: schema support, was Package support for Postgres

From
"Ross J. Reedstrom"
Date:
On Tue, Oct 23, 2001 at 08:43:32AM -0700, Bill Studenmund wrote:
> 
> And there's the fact that schemas were wanted for 7.2, and didn't happen.
> Withouth external adgitation, will they happen for 7.3? Given the size of
> the job, I understand why they didn't happen (the package changes so far
> represent over 3 months of full-time programming). We've got some momentum
> now, I'd say let's run with it. :-)
> 

I feel much better about my unsucessfully attempt at a naive schema
implementation, last Christmas holidays: I had no where _near_ 3 months
of time in on that.

;-)
Ross

-- 
Ross Reedstrom, Ph.D.                                 reedstrm@rice.edu
Executive Director                                  phone: 713-348-6166
Gulf Coast Consortium for Bioinformatics              fax: 713-348-6182
Rice University MS-39
Houston, TX 77005


Re: schema support, was Package support for Postgres

From
Peter Eisentraut
Date:
Bill Studenmund writes:

> > Because SQL says so.  All objects in a schema belong to the owner of the
> > schema.  In simple setups you have one schema per user with identical
> > names.  This has well-established use patterns in other SQL RDBMS.
>
> Then implimenting schemas will cause a backwards-incompatabile change
> regarding who can add/own functions (and operators and ..).
>
> Mainly because when we introduce schemas, all SQL transactions will have
> to be performed in the context of *some* schema.  I think "DEFAULT" was the
> name you mentioned for when there was no schema matching the username. As
> "DEFAULT" (or whatever we call it) will be made by the PG super user (it
> will actually be added as part of initdb), then that means that only the
> super user will own functions.

If you want to own the function you should create it in your schema.  If
you want to create a function and let someone else own it, then ask
someone else for write access to their schema.  (This should be a rare
operation and I don't think SQL provides for it, so we can ignore it in
the beginning.)  If there is no schema you have write access to then you
cannot create things.  People have been dying for that kind of feature,
and schemas will enable us to have it.

Think about it this way:  In its simplest implementation (which is in fact
the Entry Level SQL92, AFAIR), a schema can only have the name of the user
that owns it.  I suspect that this is because SQL has no CREATE USER, so
CREATE SCHEMA is sort of how you become a user that can do things.  At the
same time, schemas would space off the things each user creates, and if
you want to access someone else's stuff you have to prefix it with the
user's name <user>.<table>, sort of like ~user/file.  The generic
"namespace" nature of schemas only comes from the fact that in higher
SQL92 levels a user can own more than one schema with different names.

(Interesting thesis:  It might be that our users are in fact schemas
(minus the parser changes) and we can forget about the whole thing.)

Now what does this spell for the cooperative development environments you
described?  Difficult to tell, but perhaps some of these would do, none of
which are standard, AFAIK:

* schemas owned by groups/roles

* access privileges to schemas, perhaps some sort of sticky bit functionality

> Or we have to special case the DEFAULT schema. Which strikes me as a bad
> thing to do.

I don't necessarily think of the DEFAULT schemas as a real schema.  It
might just be there so that *some* schema context is set if you don't have
one set otherwise, but you don't necessarily have write access to it.
But it might not be necessary at all.

-- 
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter



Re: schema support, was Package support for Postgres

From
Bill Studenmund
Date:
On Thu, 25 Oct 2001, Peter Eisentraut wrote:

> Bill Studenmund writes:
>
> > Mainly because when we introduce schemas, all SQL transactions will have
> > to be performed in the context of *some* schema.  I think "DEFAULT" was the
> > name you mentioned for when there was no schema matching the username. As
> > "DEFAULT" (or whatever we call it) will be made by the PG super user (it
> > will actually be added as part of initdb), then that means that only the
> > super user will own functions.
>
> If you want to own the function you should create it in your schema.  If
> you want to create a function and let someone else own it, then ask
> someone else for write access to their schema.  (This should be a rare
> operation and I don't think SQL provides for it, so we can ignore it in
> the beginning.)  If there is no schema you have write access to then you
> cannot create things.  People have been dying for that kind of feature,
> and schemas will enable us to have it.

I think I understand your descriptions of what you will be *able* to do
with schemas. And also that they may describe how you *should* do thing
with schema. I'm not disagreeing with you about that. But that's not the
angle I'm working.

I guess to get at my point, I can ask this question, "Will schema support
invalidate existing PostgreSQL database designs."

I would like the answer to be no. I would like our users to be able to
dump a pre-schema-release db, upgrade, and then restore into a
schema-aware PostgreSQL. And have their restore work.

Since the admin is restoring a db which was made before schema support,
there are no CREATE SCHEMA commands in it (or certainly not ones which do
a real schema create - right now CREATE SCHEMA is a synonym for CREATE
DATABASE). So the restore will create everything in the "DEFAULT" schema
(The schema where creates done w/o a CREATE SCHEMA go).

But right now, we can have different users owning things in one database.
So there will be restores out there which will have different users owning
things in the same restored-to schema, which will be "DEFAULT".

So we have to have (or just retail) the ability to have different users
owning things in one schema.

> Think about it this way:  In its simplest implementation (which is in fact
> the Entry Level SQL92, AFAIR), a schema can only have the name of the user
> that owns it.  I suspect that this is because SQL has no CREATE USER, so
> CREATE SCHEMA is sort of how you become a user that can do things.  At the
> same time, schemas would space off the things each user creates, and if
> you want to access someone else's stuff you have to prefix it with the
> user's name <user>.<table>, sort of like ~user/file.  The generic
> "namespace" nature of schemas only comes from the fact that in higher
> SQL92 levels a user can own more than one schema with different names.
>
> (Interesting thesis:  It might be that our users are in fact schemas
> (minus the parser changes) and we can forget about the whole thing.)

Hmmm... I don't think so, but hmmm..

> Now what does this spell for the cooperative development environments you
> described?  Difficult to tell, but perhaps some of these would do, none of
> which are standard, AFAIK:
>
> * schemas owned by groups/roles

I think that schemas owned by roles are part of SQL99.

> * access privileges to schemas, perhaps some sort of sticky bit
>   functionality
>
> > Or we have to special case the DEFAULT schema. Which strikes me as a bad
> > thing to do.
>
> I don't necessarily think of the DEFAULT schemas as a real schema.  It
> might just be there so that *some* schema context is set if you don't have
> one set otherwise, but you don't necessarily have write access to it.
> But it might not be necessary at all.

While if we were starting over, we might be able to (maybe should have)
design(ed) things so we don't need it, I think a "DEFAULT" schema would
help give users of the schema-aware PostgreSQL an experience similar to
what they have now.

And getting back to where this all started, I think we do need to have the
ability to have users other than the schema owner own things in the
schema, so we should keep the owner id column in the pg_package table. I'm
not against, when things are all said and done, having the default be that
only the schema owner can add things. But that's a policy decision. :-)

Take care,

Bill



Re: schema support, was Package support for Postgres

From
Gunnar Rønning
Date:
* Bill Studenmund <wrstuden@netbsd.org> wrote:

| I would like the answer to be no. I would like our users to be able to
| dump a pre-schema-release db, upgrade, and then restore into a
| schema-aware PostgreSQL. And have their restore work.


Important point. Also having a standard is fine, but by limiting ourselves
to it we are ignoring issues that might be very useful. Draw the line. 

-- 
Gunnar Rønning - gunnar@polygnosis.com
Senior Consultant, Polygnosis AS, http://www.polygnosis.com/


Re: schema support, was Package support for Postgres

From
Peter Eisentraut
Date:
Bill Studenmund writes:

> I guess to get at my point, I can ask this question, "Will schema support
> invalidate existing PostgreSQL database designs."
>
> I would like the answer to be no. I would like our users to be able to
> dump a pre-schema-release db, upgrade, and then restore into a
> schema-aware PostgreSQL. And have their restore work.

I think this can work.  Assume a database like this:

user1:  CREATE TABLE foo ( );
user2:  CREATE TABLE bar ( );

The dump of this would be something like:

\c - user1
CREATE TABLE foo ( );

\c - user2
CREATE TABLE bar ( );

So the tables would be created in the appropriate schema context for each
user.  The remaining problem then is that the two schemas user1 and user2
would need to be created first, but we could make this implicit somewhere.
For instance, a user creation would automatically create a schema for the
user in template1.  Or at least the dump could be automatically massaged
to this effect.

> But right now, we can have different users owning things in one database.
> So there will be restores out there which will have different users owning
> things in the same restored-to schema, which will be "DEFAULT".

This would fundamentally undermine what an SQL schema is and don't help
interoperability a bit.  If we want to implement our own namespace
mechanism we can call it NAMESPACE.  But if we want something called
SCHEMA then we should implement it the way it's standardized, and there is
certainly a tight coupling between schemas and ownership.  In fact, as
I've said already, a schema *is* the ownership; a user is just a weird
PostgreSQL invention.

> I think that schemas owned by roles are part of SQL99.

Correct.

-- 
Peter Eisentraut   peter_e@gmx.net   http://funkturm.homeip.net/~peter



Re: schema support, was Package support for Postgres

From
Bill Studenmund
Date:
On Fri, 26 Oct 2001, Peter Eisentraut wrote:

> Bill Studenmund writes:
>
> > I guess to get at my point, I can ask this question, "Will schema support
> > invalidate existing PostgreSQL database designs."
> >
> > I would like the answer to be no. I would like our users to be able to
> > dump a pre-schema-release db, upgrade, and then restore into a
> > schema-aware PostgreSQL. And have their restore work.
>
> I think this can work.  Assume a database like this:
>
> user1:  CREATE TABLE foo ( );
> user2:  CREATE TABLE bar ( );
>
> The dump of this would be something like:
>
> \c - user1
> CREATE TABLE foo ( );
>
> \c - user2
> CREATE TABLE bar ( );
>
> So the tables would be created in the appropriate schema context for each
> user.  The remaining problem then is that the two schemas user1 and user2
> would need to be created first, but we could make this implicit somewhere.
> For instance, a user creation would automatically create a schema for the
> user in template1.  Or at least the dump could be automatically massaged
> to this effect.
>
> > But right now, we can have different users owning things in one database.
> > So there will be restores out there which will have different users owning
> > things in the same restored-to schema, which will be "DEFAULT".
>
> This would fundamentally undermine what an SQL schema is and don't help
> interoperability a bit.  If we want to implement our own namespace
> mechanism we can call it NAMESPACE.  But if we want something called
> SCHEMA then we should implement it the way it's standardized, and there is
> certainly a tight coupling between schemas and ownership.  In fact, as
> I've said already, a schema *is* the ownership; a user is just a weird
> PostgreSQL invention.

Hmmm.... I've been looking into this, and you are right. All of the views
in INFORMATION_SCHEMA that I looked at contain text like

WHERE (SCHEMA_OWNER = CURRENT_USER OR SCHEMA_OWNER IN (SELECT ROLL_NAMEFROM ENABLED_ROLES) )

So then we'll need a tool to massage old-style dumps to:

1) create the schema, and

2) path all of the schemas together by default.

Well, at least a number of tables won't gain a new colum as a result of
this; the owner column will become the schema_id column. :-)

Take care,

Bill