Thread: Re: schema support, was Package support for Postgres
On Wed, 24 Oct 2001, Peter Eisentraut wrote: > Bill Studenmund writes: > > > So I am a "naive" programmer because I mention intent above? > > No. Sorry, that's the way it came across. As you've said that was not your intent, please disregard my response; I was responding to something you did not mean. > > So if we have INFORMATION_SCHEMA with the right vies in it, we are fine > > doing whatever we want. > > I think some interpretation of the SQL standard can be used to prove that > a new schema should not contain any objects. So you're going to have to > stick to the two predefined schemas to put the system catalogs in. Then > again, other interpretations may be used to prove other things. But to me > the intent of the standard is clear that system catalogs are meant to go > into the defintion schema, and I don't see a reason why this could not be > so. I had been thining that we could have the built-in objects (functions, types, operators, etc.) in whatever was the "default.master" package, but it looks like SQL99 doesn't like that. You're right that built-in things have to be in a different schema than user-added things. Section 10.4 contains text: ii) If RN contains a <schema name> SN, then Case: 1) If SN is INFORMATION_SCHEMA, then the single candidate routine of RI is the built-in function identified by <routine name>. Actually 4.24 is more exact. It defines a built-in function as a routine which is returned from teh query: SELECT DISTINCT ROUTINE_NAME FROM INFORMATION_SCHEMA.ROUTINES WHERE SPECIFIC_SCHEMA = INFORMATION_SCHEMA Actually, since we have to have an INFORMATION_SCHEMA, and "INFORMATION_SCHEMA" gets thrown around a lot, I think it'd be easiest to make "INFORMATION_SCHEMA" the schema containing built-in things. Otherwise (among other things) we have to replace DEFINTION_SCHEMA with INFORMATION_SCHEMA in the above-defined view (and in a lot of other places). Thoughts? > > I stil think we can't do that, since someone other than the schema owner > > can add a package to a schema. :-) Or at least that's the assumption I'm > > running on; we allow users other than PGUID to create functions (and > > operators and aggregates and types) in the default (whatever it will be > > called) schema, so why shouldn't they be allowed to add packages? > > Because SQL says so. All objects in a schema belong to the owner of the > schema. In simple setups you have one schema per user with identical > names. This has well-established use patterns in other SQL RDBMS. Then implimenting schemas will cause a backwards-incompatabile change regarding who can add/own functions (and operators and ..). Mainly because when we introduce schemas, all SQL transactions will have to be performed in the context of *some* schema. I think "DEFAULT" was the name you mentioned for when there was no schema matching the username. As "DEFAULT" (or whatever we call it) will be made by the PG super user (it will actually be added as part of initdb), then that means that only the super user will own functions. That's not how things are now, and imposing that on upgrading users will likely cause pain. Think about a dump/restore upgrade from 7.2 to 7.3. Right now users other than PGUID can own functions (and triggers, etc.). When you do the restore, though, since your dump had no schema support, it all goes into DEFAULT. Which will be owned by PGUID. So now we either have a schema with things owned by a user other than the schema owner, or we have a broken restore. Or we have to special case the DEFAULT schema. Which strikes me as a bad thing to do. For now, I'd suggest letting users other than a schema owner own things in a schema, and later on add controls over who can add things to a schema. Then when you do a "CREATE SCHEMA" command, you will implicitly be adding restrictions prohibiting someone other than the owner from adding things (including packages/subschemas). > I agree that this might not be what everyone would want, but it seems > extensible. However, I feel we're trying to design too many things at > once. Let's do schemas first the way they're in the SQL standard, and > then we can try to tack on ownership or subschemas or package issues. Well, the packages changes can easily be turned into schema support for functions and aggregates, so we are part way there. Also, the packages changes illustrate how to make system-wide internal schema changes of the type adding SQL schemas will need. Plus, packages as they are now are useful w/o schema support. And there's the fact that schemas were wanted for 7.2, and didn't happen. Withouth external adgitation, will they happen for 7.3? Given the size of the job, I understand why they didn't happen (the package changes so far represent over 3 months of full-time programming). We've got some momentum now, I'd say let's run with it. :-) Take care, Bill
On Tue, Oct 23, 2001 at 08:43:32AM -0700, Bill Studenmund wrote: > > And there's the fact that schemas were wanted for 7.2, and didn't happen. > Withouth external adgitation, will they happen for 7.3? Given the size of > the job, I understand why they didn't happen (the package changes so far > represent over 3 months of full-time programming). We've got some momentum > now, I'd say let's run with it. :-) > I feel much better about my unsucessfully attempt at a naive schema implementation, last Christmas holidays: I had no where _near_ 3 months of time in on that. ;-) Ross -- Ross Reedstrom, Ph.D. reedstrm@rice.edu Executive Director phone: 713-348-6166 Gulf Coast Consortium for Bioinformatics fax: 713-348-6182 Rice University MS-39 Houston, TX 77005
Bill Studenmund writes: > > Because SQL says so. All objects in a schema belong to the owner of the > > schema. In simple setups you have one schema per user with identical > > names. This has well-established use patterns in other SQL RDBMS. > > Then implimenting schemas will cause a backwards-incompatabile change > regarding who can add/own functions (and operators and ..). > > Mainly because when we introduce schemas, all SQL transactions will have > to be performed in the context of *some* schema. I think "DEFAULT" was the > name you mentioned for when there was no schema matching the username. As > "DEFAULT" (or whatever we call it) will be made by the PG super user (it > will actually be added as part of initdb), then that means that only the > super user will own functions. If you want to own the function you should create it in your schema. If you want to create a function and let someone else own it, then ask someone else for write access to their schema. (This should be a rare operation and I don't think SQL provides for it, so we can ignore it in the beginning.) If there is no schema you have write access to then you cannot create things. People have been dying for that kind of feature, and schemas will enable us to have it. Think about it this way: In its simplest implementation (which is in fact the Entry Level SQL92, AFAIR), a schema can only have the name of the user that owns it. I suspect that this is because SQL has no CREATE USER, so CREATE SCHEMA is sort of how you become a user that can do things. At the same time, schemas would space off the things each user creates, and if you want to access someone else's stuff you have to prefix it with the user's name <user>.<table>, sort of like ~user/file. The generic "namespace" nature of schemas only comes from the fact that in higher SQL92 levels a user can own more than one schema with different names. (Interesting thesis: It might be that our users are in fact schemas (minus the parser changes) and we can forget about the whole thing.) Now what does this spell for the cooperative development environments you described? Difficult to tell, but perhaps some of these would do, none of which are standard, AFAIK: * schemas owned by groups/roles * access privileges to schemas, perhaps some sort of sticky bit functionality > Or we have to special case the DEFAULT schema. Which strikes me as a bad > thing to do. I don't necessarily think of the DEFAULT schemas as a real schema. It might just be there so that *some* schema context is set if you don't have one set otherwise, but you don't necessarily have write access to it. But it might not be necessary at all. -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
On Thu, 25 Oct 2001, Peter Eisentraut wrote: > Bill Studenmund writes: > > > Mainly because when we introduce schemas, all SQL transactions will have > > to be performed in the context of *some* schema. I think "DEFAULT" was the > > name you mentioned for when there was no schema matching the username. As > > "DEFAULT" (or whatever we call it) will be made by the PG super user (it > > will actually be added as part of initdb), then that means that only the > > super user will own functions. > > If you want to own the function you should create it in your schema. If > you want to create a function and let someone else own it, then ask > someone else for write access to their schema. (This should be a rare > operation and I don't think SQL provides for it, so we can ignore it in > the beginning.) If there is no schema you have write access to then you > cannot create things. People have been dying for that kind of feature, > and schemas will enable us to have it. I think I understand your descriptions of what you will be *able* to do with schemas. And also that they may describe how you *should* do thing with schema. I'm not disagreeing with you about that. But that's not the angle I'm working. I guess to get at my point, I can ask this question, "Will schema support invalidate existing PostgreSQL database designs." I would like the answer to be no. I would like our users to be able to dump a pre-schema-release db, upgrade, and then restore into a schema-aware PostgreSQL. And have their restore work. Since the admin is restoring a db which was made before schema support, there are no CREATE SCHEMA commands in it (or certainly not ones which do a real schema create - right now CREATE SCHEMA is a synonym for CREATE DATABASE). So the restore will create everything in the "DEFAULT" schema (The schema where creates done w/o a CREATE SCHEMA go). But right now, we can have different users owning things in one database. So there will be restores out there which will have different users owning things in the same restored-to schema, which will be "DEFAULT". So we have to have (or just retail) the ability to have different users owning things in one schema. > Think about it this way: In its simplest implementation (which is in fact > the Entry Level SQL92, AFAIR), a schema can only have the name of the user > that owns it. I suspect that this is because SQL has no CREATE USER, so > CREATE SCHEMA is sort of how you become a user that can do things. At the > same time, schemas would space off the things each user creates, and if > you want to access someone else's stuff you have to prefix it with the > user's name <user>.<table>, sort of like ~user/file. The generic > "namespace" nature of schemas only comes from the fact that in higher > SQL92 levels a user can own more than one schema with different names. > > (Interesting thesis: It might be that our users are in fact schemas > (minus the parser changes) and we can forget about the whole thing.) Hmmm... I don't think so, but hmmm.. > Now what does this spell for the cooperative development environments you > described? Difficult to tell, but perhaps some of these would do, none of > which are standard, AFAIK: > > * schemas owned by groups/roles I think that schemas owned by roles are part of SQL99. > * access privileges to schemas, perhaps some sort of sticky bit > functionality > > > Or we have to special case the DEFAULT schema. Which strikes me as a bad > > thing to do. > > I don't necessarily think of the DEFAULT schemas as a real schema. It > might just be there so that *some* schema context is set if you don't have > one set otherwise, but you don't necessarily have write access to it. > But it might not be necessary at all. While if we were starting over, we might be able to (maybe should have) design(ed) things so we don't need it, I think a "DEFAULT" schema would help give users of the schema-aware PostgreSQL an experience similar to what they have now. And getting back to where this all started, I think we do need to have the ability to have users other than the schema owner own things in the schema, so we should keep the owner id column in the pg_package table. I'm not against, when things are all said and done, having the default be that only the schema owner can add things. But that's a policy decision. :-) Take care, Bill
* Bill Studenmund <wrstuden@netbsd.org> wrote: | I would like the answer to be no. I would like our users to be able to | dump a pre-schema-release db, upgrade, and then restore into a | schema-aware PostgreSQL. And have their restore work. Important point. Also having a standard is fine, but by limiting ourselves to it we are ignoring issues that might be very useful. Draw the line. -- Gunnar Rønning - gunnar@polygnosis.com Senior Consultant, Polygnosis AS, http://www.polygnosis.com/
Bill Studenmund writes: > I guess to get at my point, I can ask this question, "Will schema support > invalidate existing PostgreSQL database designs." > > I would like the answer to be no. I would like our users to be able to > dump a pre-schema-release db, upgrade, and then restore into a > schema-aware PostgreSQL. And have their restore work. I think this can work. Assume a database like this: user1: CREATE TABLE foo ( ); user2: CREATE TABLE bar ( ); The dump of this would be something like: \c - user1 CREATE TABLE foo ( ); \c - user2 CREATE TABLE bar ( ); So the tables would be created in the appropriate schema context for each user. The remaining problem then is that the two schemas user1 and user2 would need to be created first, but we could make this implicit somewhere. For instance, a user creation would automatically create a schema for the user in template1. Or at least the dump could be automatically massaged to this effect. > But right now, we can have different users owning things in one database. > So there will be restores out there which will have different users owning > things in the same restored-to schema, which will be "DEFAULT". This would fundamentally undermine what an SQL schema is and don't help interoperability a bit. If we want to implement our own namespace mechanism we can call it NAMESPACE. But if we want something called SCHEMA then we should implement it the way it's standardized, and there is certainly a tight coupling between schemas and ownership. In fact, as I've said already, a schema *is* the ownership; a user is just a weird PostgreSQL invention. > I think that schemas owned by roles are part of SQL99. Correct. -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
On Fri, 26 Oct 2001, Peter Eisentraut wrote: > Bill Studenmund writes: > > > I guess to get at my point, I can ask this question, "Will schema support > > invalidate existing PostgreSQL database designs." > > > > I would like the answer to be no. I would like our users to be able to > > dump a pre-schema-release db, upgrade, and then restore into a > > schema-aware PostgreSQL. And have their restore work. > > I think this can work. Assume a database like this: > > user1: CREATE TABLE foo ( ); > user2: CREATE TABLE bar ( ); > > The dump of this would be something like: > > \c - user1 > CREATE TABLE foo ( ); > > \c - user2 > CREATE TABLE bar ( ); > > So the tables would be created in the appropriate schema context for each > user. The remaining problem then is that the two schemas user1 and user2 > would need to be created first, but we could make this implicit somewhere. > For instance, a user creation would automatically create a schema for the > user in template1. Or at least the dump could be automatically massaged > to this effect. > > > But right now, we can have different users owning things in one database. > > So there will be restores out there which will have different users owning > > things in the same restored-to schema, which will be "DEFAULT". > > This would fundamentally undermine what an SQL schema is and don't help > interoperability a bit. If we want to implement our own namespace > mechanism we can call it NAMESPACE. But if we want something called > SCHEMA then we should implement it the way it's standardized, and there is > certainly a tight coupling between schemas and ownership. In fact, as > I've said already, a schema *is* the ownership; a user is just a weird > PostgreSQL invention. Hmmm.... I've been looking into this, and you are right. All of the views in INFORMATION_SCHEMA that I looked at contain text like WHERE (SCHEMA_OWNER = CURRENT_USER OR SCHEMA_OWNER IN (SELECT ROLL_NAMEFROM ENABLED_ROLES) ) So then we'll need a tool to massage old-style dumps to: 1) create the schema, and 2) path all of the schemas together by default. Well, at least a number of tables won't gain a new colum as a result of this; the owner column will become the schema_id column. :-) Take care, Bill