Thread: Package support for Postgres
Zembu has decided to release the result of a recent Postgres developement project to the Postgres project. This project (for which I was the lead developer) adds Oracle-like package support to Postgres. I'm in the process of making a version of the patch which is relative to the current cvs tree. The change is fairly encompasing, weighing in at around 800k of unified diffs, of which about 200k are the real meat. Before I send it in, though, I thought I'd see what people think of the idea. Oh, this feature would definitly be a 7.3 feature, 7.2 is too close to the door for this to go in. :-) This message is rather long. I've divided it into sections which start with "**". ** What are Packages So what are packages? In Oracle, they are a feature which helps developers make stored procedures and functions. They provide a name space for functions local to the package, session-specific package variables, and initialization routines which are run before any other routine in the package. Also, all parts of a package are loaded and unloaded at once - you can't have it partially installed. All of these features make life much easier for stored-procedure developers. The name space feature means that you can name the routines in your package whatever you want, and they won't conflict with the names either in other packages or with functions not in a package. All you need to do is ensure that no other package has the same name as yours. ** What did I do, and what does a package declaration look like? What I've done is impliment Oracle packages with a Postgres flair. There is a new command, CREATE PACKAGE <name> AS which defines a package. For those of you up on Oracle packages, this command duplicates the Oracle CREATE PACKAGE BODY command - there is no Postgres equivalent of the Oracle CREATE PACKAGE command. Packages are listed in a new system table, pg_package, and are referenced in other tables by the oid of the row in pg_package. There are seven different components which can be present in a package, and so a CREATE PACKAGE command contains seven stanza types. A package can be made up of functions, types, operators, aggregates, package-global variables, initialization routines, and functions usable for type declarations. Four of the stanzas are easy to understand; to create a function, a type, an aggregate, or an operator, you include a stanza which is the relevant CREATE command without the CREATE keyword. Thus the FUNCTION stanza creates a function, the TYPE stanza creates a type, AGGREGATE => an aggregate, and OPERATOR => an operator. The initializer routines and package-global variables are done a bit differently than in Oracle, reflecting Postgres's strength at adding languages. Postgres supports six procedural languages (plpgsql, pltcl, pltclu, plperl, plperlu, and plpython) whereas I think Oracle only supports two (PL/SQL and I herad they added a java PL). The main difference is that the variables and the initializer routines are language specific. So you can have different variables for plpgsql than for pltcl. Likewise for initializers. Package-global variables are defined as: DECLARE <variable name> '<variable type>' [, <next name> '<next type>' ] LANGUAGE 'langname' The type is delimited by single quotes so that the postgres parser didn't have to learn the syntax of each procedural language's variable types. Initializer routines are declared like normal functions, except the function name and signature (number & type of arguements and return type) are not given. The name is automatically generated (it is __packinit_ followed by the language name) and the function signature should not be depended on. It is to take no parameters and return an int4 for now, but that should probably change whenever PG supports true procedures. Initializer routines are declared as: BODY AS 'function body' LANGUAGE 'lanname' [with <with options>] I'm attaching a sample showing a package initialization routine and global variable declaration. There's a syntax error in it, which I asked about in another EMail. The last component of a package are the functions usable for type declarations. They are declared as: BEFORE TYPE FUNCTION <standard package function declaration> They are useful as the normal functions in a package are declared after the types are declared, so that they can use a type newly-defined in a package. Which is fine, except that to define a type, you have to give an input and an output function. BEFORE TYPE FUNCTIONs are used to define those functions. Other than exactly when they are created in package loading, they are just like other functions in the package. I'm attaching an example which defines the type 'myint4' (using the internal int4 routines) and proceeds to declare routines using the new type. ** So how do I use things in a package? You don't have to do anything special to use a type or an operator defined in a package - you just use it. Getting technical, operators and types in packages are in the same name space as are types and operators not in packages. To follow along with the example I attached above, the 'myint4' type is usable in the typetest package, in tables, in other packages, and in "normal" functions. For functions and aggregates, things are a little more complicated. First off, there is a package called "standard" which contains all types, aggregates, operators, and functions which aren't in a specific package. This includes all of the standard Postgres routines, and anything created with CREATE FUNCTION, CREATE AGGREGATE, CREATE OPERATOR, and CREATE TYPE. Secondly, parsing is always done in terms of a specified package context. If we are parsing an equation in a routine inside of a package, then the package context is that package. If we are just typing along in psql, then the package context is "standard". When you specify a function or aggregate, you have two choices. One is to specify a package, and a function in that package, like "nametest.process" to specify the "process" function in the "nametest" package. The other choice is to just give the function's name. The first place Postgres will look is in the package context used for parsing. If it's not there (and that context wasn't "standard"), then it will look in "standard". So for example in the type declaration example attached, the type stanza uses "myint4in" and "myint4out" as the input and output routines, and finds the ones declared as part of the package. I've attached a sample showing off namespaces. It has two non-package routines, and one package named "nametest". Here's a sample session: testing=# select standard.process(4); process ------------------I am in standard (1 row) testing=# select nametest.process(4); process ---------------------I am in the package (1 row) testing=# select nametest.docheck(); docheck ---------------------I am in the package (1 row) First we see that the standard.process() routine says it is in the "standard" package, and that the nametest.process() routine says it is in the package. Then we call the nametest.docheck() routine. It evaluates "process(changer(4));" in the context of the nametest package. We find the process() routine in the package, and use it. The changer routine is there to test how typecasting works. It verifies that Postgres would typecast the return of changer into a different integer and call the process() routine in the package rather than call the process() routine in standard. This behavior matches Oracle's. The other routines in the package show of some examples of how sql will parse according to the above rules. Initialization routines: There is only one recomended way to use them: call a function written in the same PL in the package. That will cause the initialization routine to be run. Assuming there are no errors, the routine you call won't be executed until after the initialization routine finishes. Of course the non-recomended way is to manually call __packinit_<langname> directly. The problem with that is that you are depending on implimentation details which might change. Like exactly how the name is generated (which probably won't change) and the calling convention (which hopefully will if procedures are ever suported). Package-global variables: Just use them. Assuming that the procedural language supports global variables, they just work. Note that as with Oracle, each backend will get its own set of variables. No effort is made to coordinate values across backends. But chances are you don't want to do that, and if you did, just make a table. :-) ** So what is the state of the diffs? The diffs contain changes to last week's current (I'll cvs update before sending out) which add package support to the backend, plpgsql, the SPI interface, initdb, and pg_dump. The changes also include modifying the system schema to support packages (pg_package which lists packages, pg_packglobal which list global variables, and adding a package identifier to pg_aggretage, pg_operator, pg_proc and pg_type). The big things missing are documentation, and regression tests which explicitly test packages. Also, plpgsql is the only PL with package support. Adding package support doesn't make sense for the 'C' and 'internal' languages, as you can manually add "global" variables and initialization routines yourself. It also doesn't make sense for 'sql' as sql doesn't support variables. The other languages need to gain package support, and I'll appreciate help from their authors. :-) So I'd better wrap up here. Monday I'll send the diffs to the patches list, and also send a message talking about more of the details of the changes. What do folks think? Take care, Bill
>What do folks think? >Take care, >Bill Hello Bill, The community have been waiting for packages for a long time. I don't believe you did it!!! IMHO most applications do not fully benefit from the power of PostgreSQL because transactions are performed at application lever (PHP/asp/Java/Application server). Sometimes, libraries are mapped to database structure, which is nonsense when a simple view with left joins can solve a problem. Most applications should be developed/ported at PostgreSQL level using the full range of available tools (transactions, triggers, views, foreign keys, rules and off course PL/pgSQL). This is much easier and powerful. Then, all you need is to display information using a good object-oriented language (Java/PHP). With the help of packages, a lot of developers will probably release GPL libraries and PostgreSQL will become the #1 database in the world. At pgAdmin team, we were thinking of developing packages at client level. This is nonsense when reading your paper. The ability of defining context levels is a great feature. Question: how do you map package to PostgreSQL objects (tables, views, triggers)? Is there any possibility of defining templates? Can this be added to packages in the future with little impact on PostgreSQL internals? Now, we can only thank you for bringing Packages to PostgreSQL. Best regards, Jean-Michel POURE pgAdmin Team
Bill Studenmund <wrstuden@netbsd.org> writes: > ... operators and types in > packages are in the same name space as are types and operators not in > packages. > For functions and aggregates, things are a little more complicated. First > off, there is a package called "standard" which contains all types, > aggregates, operators, and functions which aren't in a specific package. > This includes all of the standard Postgres routines, and anything created > with CREATE FUNCTION, CREATE AGGREGATE, CREATE OPERATOR, and CREATE TYPE. > Secondly, parsing is always done in terms of a specified package context. > If we are parsing an equation in a routine inside of a package, then the > package context is that package. If we are just typing along in psql, then > the package context is "standard". > When you specify a function or aggregate, you have two choices. One is to > specify a package, and a function in that package, like > "nametest.process" to specify the "process" function in the "nametest" > package. > The other choice is to just give the function's name. The first place > Postgres will look is in the package context used for parsing. If it's not > there (and that context wasn't "standard"), then it will look in > "standard". Hmm. How does/will all of this interact with SQL-style schemas? The reason I'm concerned is that if we want to retain the present convention that the rowtype of a table has the same name as the table, I think we are going to have to make type names schema-local, just like table names will be. And if type names are local to schemas then so must be the functions that operate on those types, and therefore also operators (which are merely syntactic sugar for functions). This seems like it will overlap and possibly conflict with the decisions you've made for packages. It also seems possible that a package *is* a schema, if schemas are defined that way --- does a package bring anything more to the table? I also wonder how the fixed, single-level namespace search path you describe interacts with the SQL rules for schema search. (I don't actually know what those rules are offhand; haven't yet read the schema parts of the spec in any detail...) Also, both operators and functions normally go through ambiguity resolution based on the types of their inputs. How does the existence of a name search path affect this --- are candidates nearer the front of the search path preferred? Offhand I'm not sure if they should get any preference or not. I'd like to see schemas implemented per the spec in 7.3, so we need to coordinate all this stuff. regards, tom lane
Bill Studenmund writes: > So what are packages? In Oracle, they are a feature which helps developers > make stored procedures and functions. I think you have restricted yourself too much to functions and procedures. A package could/should also be able to contain views, tables, and such. > They provide a name space for functions local to the package, Namespacing is the task of schemas. I think of packages as a bunch of objects that can be addressed under a common name (think RPMs). But it seems like some of this work could be used to implement schema support. > session-specific package variables, I think this is assuming a little too much about how a PL might operate. Some PLs already support this in their own language-specific way, with or without packages. Thus, I don't think packages should touch this. Actually, I think you could easily set up session variables in the package initializer function. > The last component of a package are the functions usable for type > declarations. They are declared as: > BEFORE TYPE FUNCTION <standard package function declaration> > > They are useful as the normal functions in a package are declared after > the types are declared, so that they can use a type newly-defined in a > package. I think it would make much more sense to allow the creation of objects in the CREATE PACKAGE command in any order. PostgreSQL has not so far had a concept of "functions suitable for type declarations" and we shouldn't add one. -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
Tom Lane writes: > This seems like it will overlap and possibly conflict with the decisions > you've made for packages. It also seems possible that a package *is* > a schema, if schemas are defined that way --- does a package bring > anything more to the table? I have been pondering a little about something I called "package", completely independent of anything previously implemented. What I would like to get out of a package is the same thing I get out of package systems on operating systems, namely that I can remove all the things that belong to the package with one command. Typical packages on PostgreSQL could be the PgAccess admin tables or the ODBC catalog extensions. One might think that this could also be done with schemas. I'm thinking using schemas for this would be analogous to installing one package per directory. Now since we don't have to deal with command search paths or file system mount points there might be nothing wrong with that. Packages typically also have post-install/uninstall code, as does this proposed implementation, so that would have to be fit in somewhere. This is basically where my thinking has stopped... ;-) Now I'm also confused as to what this package system really represents: Is it a namespace mechanisms -- but Oracle does have schemas; or is it a package manager like I had in mind -- for that it does too many things that don't belong there; or is it a mechanism to set up global variables -- that already exists and doesn't need "packages". -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
On Sat, 13 Oct 2001, Tom Lane wrote: > Bill Studenmund <wrstuden@netbsd.org> writes: > > For functions and aggregates, things are a little more complicated. First > > off, there is a package called "standard" which contains all types, > > aggregates, operators, and functions which aren't in a specific package. > > This includes all of the standard Postgres routines, and anything created > > with CREATE FUNCTION, CREATE AGGREGATE, CREATE OPERATOR, and CREATE TYPE. > > > Secondly, parsing is always done in terms of a specified package context. > > If we are parsing an equation in a routine inside of a package, then the > > package context is that package. If we are just typing along in psql, then > > the package context is "standard". > > > When you specify a function or aggregate, you have two choices. One is to > > specify a package, and a function in that package, like > > "nametest.process" to specify the "process" function in the "nametest" > > package. > > > The other choice is to just give the function's name. The first place > > Postgres will look is in the package context used for parsing. If it's not > > there (and that context wasn't "standard"), then it will look in > > "standard". > > Hmm. How does/will all of this interact with SQL-style schemas? Independent as I understand it. Schemas (as I understand Oracle schemas) operate at a level above the level where packages operate. > The reason I'm concerned is that if we want to retain the present > convention that the rowtype of a table has the same name as the table, > I think we are going to have to make type names schema-local, just > like table names will be. And if type names are local to schemas > then so must be the functions that operate on those types, and therefore > also operators (which are merely syntactic sugar for functions). > > This seems like it will overlap and possibly conflict with the decisions > you've made for packages. It also seems possible that a package *is* > a schema, if schemas are defined that way --- does a package bring > anything more to the table? I don't think it conflicts. My understanding of schemas is rather simplistic and practical. As I understand it, they correspond roughly to databases in PG. So with schema support, one database can essentially reach into another one. Package support deals with the functions (and types and in this case aggregates and operators) that schema support would find in the other schemas/databases. > I also wonder how the fixed, single-level namespace search path you > describe interacts with the SQL rules for schema search. (I don't > actually know what those rules are offhand; haven't yet read the schema > parts of the spec in any detail...) Should be independent. The searching only happens when you are not in the "standard" package, and you give just a function name for a function. The searching would only happen in the current schems. If you give a schema name, then I'd expect PG to look in that schema, in standard, for that function. If you give both a schema and package name, then PG would look in that package in that schema. > Also, both operators and functions normally go through ambiguity > resolution based on the types of their inputs. How does the existence > of a name search path affect this --- are candidates nearer the front > of the search path preferred? Offhand I'm not sure if they should get > any preference or not. There is no name spacing for operators in my implimentation as to have one strikes me as reducing the utility of having types and operators in a package. For functions (and aggregates), I tried to touch on that in the latter part of my message; that's what the example with "process(changer(4))" was about. PG will try to type coerce a function in the current package before it looks in standard. So yes, candidates nearer the front are prefered. > I'd like to see schemas implemented per the spec in 7.3, so we need to > coordinate all this stuff. Sounds good. I don't think it will be that hard, though. :-) Take care, Bill
On Sat, 13 Oct 2001, Jean-Michel POURE wrote: > >What do folks think? > >Take care, > >Bill > > Hello Bill, > > The community have been waiting for packages for a long time. I don't > believe you did it!!! > > IMHO most applications do not fully benefit from the power of PostgreSQL > because transactions are performed at application lever > (PHP/asp/Java/Application server). Sometimes, libraries are mapped to > database structure, which is nonsense when a simple view with left joins > can solve a problem. > > Most applications should be developed/ported at PostgreSQL level using the > full range of available tools (transactions, triggers, views, foreign keys, > rules and off course PL/pgSQL). This is much easier and powerful. Then, all > you need is to display information using a good object-oriented language > (Java/PHP). > > With the help of packages, a lot of developers will probably release GPL > libraries and PostgreSQL will become the #1 database in the world. Yep. PostgreSQL is within reach of really challenging the commercial databases. I think the core developers are working on the changes needed to challenge the commercial db's in terms of speed and performance for big datastores (WAL, working to prevent OID rollover, etc.). Packages address a different side of what will be needed to challenge the big boys - better stored procedure support. :-) > At pgAdmin team, we were thinking of developing packages at client level. > This is nonsense when reading your paper. The ability of defining context > levels is a great feature. Question: how do you map package to PostgreSQL > objects (tables, views, triggers)? Is there any possibility of defining > templates? Can this be added to packages in the future with little impact > on PostgreSQL internals? Packages don't really map to DB objects (tables, views, triggers) at the moment. Have you used Oracle much? These packages are a direct translation of Oracle packages, with a few PostgreSQL extentions thrown in (Oracle doesn't have PostgreSQL's ability to add aggregates, operators, and system types AFAIK, so their packages likewise don't, and types in packages AFAIK are package-specific). I forget who said it, but operators (and aggregates) are basically just sugar wrapped around functions; these packages are another form of sugar wrapped around functions. To start adding views and tables and triggers makes packages more than just special sugar around functions. Also, my big concern is that if we start adding tables and views and triggers to packages, pg_dump becomes a nightmare. > Now, we can only thank you for bringing Packages to PostgreSQL. You're welcome. Take care, Bill
On Sun, 14 Oct 2001, Peter Eisentraut wrote: > I have been pondering a little about something I called "package", > completely independent of anything previously implemented. What I would > like to get out of a package is the same thing I get out of package > systems on operating systems, namely that I can remove all the things that > belong to the package with one command. Typical packages on PostgreSQL > could be the PgAccess admin tables or the ODBC catalog extensions. > > One might think that this could also be done with schemas. I'm thinking > using schemas for this would be analogous to installing one package per > directory. Now since we don't have to deal with command search paths or > file system mount points there might be nothing wrong with that. > > Packages typically also have post-install/uninstall code, as does this > proposed implementation, so that would have to be fit in somewhere. > > This is basically where my thinking has stopped... ;-) > > Now I'm also confused as to what this package system really represents: > Is it a namespace mechanisms -- but Oracle does have schemas; or is it a > package manager like I had in mind -- for that it does too many things > that don't belong there; or is it a mechanism to set up global variables > -- that already exists and doesn't need "packages". It is an implimentation of Oracle Packages for PostgreSQL, taking advantage of some of PostgreSQL's abilities (the aggregates & operators in a package bit is new). It is a tool to help developers create large projects and/or reuse code. It is not schema support; schema support operates on a level above package support. It is also not the package support you had in mind. That support is different. What you describe above is packaging which primarily helps the admin, while this packaging primarily helps the procedure developer. That difference in emphasis is why this package support does things an administrator-focused package system wouldn't. Also, please note that while many of PostgreSQL's procedure languages might not need global variable support, PL/pgSQL does. Take care, Bill
Bill Studenmund writes: > I disagree. Views and tables are the purview of schemas, which as I > mentioned to Tom, strike me as being different from packages. Well, obviously schemas are a namespacing mechanism for tables and views. And apparently the "packages" you propose are (among other things) a namespacing mechanism for functions. But the fact is that schemas already provide a namespacing mechanism for functions. (That's what SQL says and that's how it's going to happen.) Now perhaps you want to have a namespacing mechanism *below* schemas. But then I think this could be done with nested schemas, since the sub-schemas would really be the same concept as a top-level schema. That would be a much more general mechanism. > Packages basically are modules which make life easier for functions > (and types and aggregates and operators). Obviously there is a large number of ideas that "make life easier". But I'm still missing a clear statement what exactly the design idea behind these packages is. So far I understood namespace and global variables for PL/pgSQL. For the namespace thing we've already got a different design. For global variables, see below. > If we really want to make tables and views and triggers part of packages, > we can. My big concern is that it then makes pg_dump harder. I'll go into > that more below. That has never stopped us from doing anything. ;-) > Regrettablely Oracle beat you to it with what "packages" are in terms of > Oracle, and I suspect also in the minds of many DBAs. Oracle appears to have beaten us to define the meaning of quite a few things, but that doesn't mean we have to accept them. We don't re-implement Oracle here. And exactly because all Oracle has is procedures and PL/SQL, whereas PostgreSQL has operators, types, and such, and user-defined procedural languages, designs may need to be changed or thrown out. It wouldn't be the first time. > I agree that some PLs might do things their own way and so package > variables won't be as useful. If these variables are not appropriate to a > PL, it can ignore them. > > PL/pgSQL is a counter-example, though, showing that something needs to be > done. Then PL/pgSQL should be fixed. But that doesn't need a such a large concept as "packages". It could be as easy as DECLARE GLOBAL... BEGIN... END > It is not set up to support global variables; each code block > generates its own namespace, and removes it on the way out. Thus I can > not see a clean way to add package global variables to say the > initialization routine - this routine's exit code would need to not > destroy the context. That strikes me as a mess. The language handler should have no problem creating persistent storage -- I don't see that as a problem. If the language is misdesigned that it cannot be done (which I doubt, but consider the theoretical case) then the language should be replaced by something better, but please keep in mind that it's a PL/pgSQL problem only. Maybe if you're from an Oracle background this separation is not quite as natural. > I think you misread me slightly. BEFORE TYPE FUNCTION functions are > "usable" for type declarations, not "suitable" for them. Also, I didn't > say one key clause, "in this package". The main difference is when in the > creation of the package the functions are created; they get created before > the types, rather than after. Right, that's why I suggested allowing the CREATE statements in any order so you could order them yourself to have the function before the types or whatever you want. > This concept is new to PostgreSQL because PostgreSQL has never before > chained creations together like this. Externally perhaps not, but internally these things happen all the time. -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
On Tue, 16 Oct 2001, Peter Eisentraut wrote: > Bill Studenmund writes: > > > I disagree. Views and tables are the purview of schemas, which as I > > mentioned to Tom, strike me as being different from packages. > > Well, obviously schemas are a namespacing mechanism for tables and views. > And apparently the "packages" you propose are (among other things) a > namespacing mechanism for functions. But the fact is that schemas already > provide a namespacing mechanism for functions. (That's what SQL says and > that's how it's going to happen.) Now perhaps you want to have a > namespacing mechanism *below* schemas. But then I think this could be > done with nested schemas, since the sub-schemas would really be the same > concept as a top-level schema. That would be a much more general > mechanism. Yes, I want a namespace below schemas. The difference between packages and schemas is that schemas encapsulate everything. As Tom pointed out, that includes types (and I'd assume operators too). Packages do not encapsulate types and operators. That's what makes them different from a sub-schema (assuming a sub-schema is a schema within a schema). > Obviously there is a large number of ideas that "make life easier". But > I'm still missing a clear statement what exactly the design idea behind > these packages is. So far I understood namespace and global variables for > PL/pgSQL. For the namespace thing we've already got a different design. > For global variables, see below. See above. > > I agree that some PLs might do things their own way and so package > > variables won't be as useful. If these variables are not appropriate to a > > PL, it can ignore them. > > > > PL/pgSQL is a counter-example, though, showing that something needs to be > > done. > > Then PL/pgSQL should be fixed. But that doesn't need a such a large Why is PL/pgSQL broken? It has a very clean design element; you enter a code block, you get a new namespace. You can declare variables in that namespace if you want. When you use a variable name, PL/pgSQL looks in the current namespace, then the parent, and so on. You exit a code block, the namespace goes away. That's how C works, for instance. > concept as "packages". It could be as easy as > > DECLARE GLOBAL > ... > BEGIN > ... > END > > > It is not set up to support global variables; each code block > > generates its own namespace, and removes it on the way out. Thus I can > > not see a clean way to add package global variables to say the > > initialization routine - this routine's exit code would need to not > > destroy the context. That strikes me as a mess. > > The language handler should have no problem creating persistent storage -- > I don't see that as a problem. If the language is misdesigned that it > cannot be done (which I doubt, but consider the theoretical case) then the > language should be replaced by something better, but please keep in mind > that it's a PL/pgSQL problem only. Maybe if you're from an Oracle > background this separation is not quite as natural. The problem is not creating persistent storage; the issue is that the langyage was designed to not use it. What you're proposing could be done, but would effectivly be shoving the change in with a hammer. Also, any other PLs which are based on languages with strict namespaces will have the same problem. Look at C for instance. What you're describing is the equivalent to letting a function or procedure in C declare global variables. That's not now the language works, and no one seems to mind. :-) > Right, that's why I suggested allowing the CREATE statements in any order > so you could order them yourself to have the function before the types or > whatever you want. My concern with that is that then we have to make sure to dump it in the same order you entered it. Right now, in general, pg_dump dumps objects in stages; all of the languages are dumped, then all of the types, then the functions, and so on. Functions needed for types and languages get dumped right before the type or language which needs it. If we go with strict package order mattering, then pg_dump needs to be able to recreate that order. That means that it has to look in pg_proc, pg_operator, pg_type, and pg_aggreagate, sort things (in the package being dumped) by oid, and dump things in order of increasing oid. Nothing else in pg_dump works like that. I'd rather not start. I have however come up with another way to make BEFORE TYPE FUNCTION go away. I'll just scan the types in a package (I doubt there will be many), get a set of candidate names, and scan the functions in the package for them. If they are found, they get added before the types do. So then the decision as to when a function should get added is implicit, rather than explicit. I'll see about adding this before I send in the patch (it is the only thing left). Take care, Bill
On Sat, 13 Oct 2001, Peter Eisentraut wrote: > Bill Studenmund writes: > > > session-specific package variables, > > I think this is assuming a little too much about how a PL might operate. > Some PLs already support this in their own language-specific way, with or > without packages. Thus, I don't think packages should touch this. > Actually, I think you could easily set up session variables in the package > initializer function. Could you please give me an example of how to do this, say for plperl or plpython? Just showing how two functions made with CREATE FUNCTION can use global variables will be fine. This example will help me understand how they work. Take care, Bill
are on their way to the patches list. Given the mail delay we've been seeing, they'll take a while to get there. Oh, it turns out there _is_ a size limit for patches, so it'll need to get approved. There are still a few warts in the code. 1) One wart is that I needed to make an identifier for the oid for the "standard" package. The oid in question is 10, and the identifier is STANDARDPackageId. I think I will change it to StandardPackageId. The question I have is in which file should I store the define defining it? 2) Another problem is dealing with the ambiguity between relation.attribute and package.functionname. The present code does it by changing scan.l to recognize ${identifier}\.${identifier}, and if the first identifier isn't a key word, look to see if it is a package (scan pg_packages for the name). If so, the scanner returns a different token, PACKID, than IDENT. I'll see what I can do about moving all of this into the parser, and defering the pg_packages scan until later. I think I got rid of all of the debugging comments; please let me know if I didn't. Take care, Bill
Bill Studenmund writes: > Yes, I want a namespace below schemas. > > The difference between packages and schemas is that schemas encapsulate > everything. As Tom pointed out, that includes types (and I'd assume > operators too). Packages do not encapsulate types and operators. Of course nobody is forcing you to put types into subschemas. But the user would have the freedom to spread things around as he sees fit. > > > I agree that some PLs might do things their own way and so package > > > variables won't be as useful. If these variables are not appropriate to a > > > PL, it can ignore them. > > > > > > PL/pgSQL is a counter-example, though, showing that something needs to be > > > done. > > > > Then PL/pgSQL should be fixed. But that doesn't need a such a large > > Why is PL/pgSQL broken? Maybe read "fixed" as "enhanced". > The problem is not creating persistent storage; the issue is that the > langyage was designed to not use it. What you're proposing could be done, > but would effectivly be shoving the change in with a hammer. Also, any > other PLs which are based on languages with strict namespaces will have > the same problem. Other PLs have shown that storing global data in a language-typical way *is* possible. I read your argumentation as "PL/pgSQL is not designed to have global variables, so I'm going to implement 'packages' as a way to make some anyway". Either PL/pgSQL is not designed for it, then there should not be any -- at all. Or it can handle them after all, but then it's the business of the language handler to deal with it. > Look at C for instance. What you're describing is the equivalent to > letting a function or procedure in C declare global variables. That's not > now the language works, and no one seems to mind. :-) What you're describing is the equivalent of declaring global variables in C but inventing a whole new mechanism in the operating system for it called "package" (which also happens to be a namespace mechanism as a second job). Now, there are ways to exchange data between separate programs or modules in C, such as message queues or shared memory. (If you think about it, what PL/Python is doing with its global dictionary is just the same as shared memory.) This works fine, but it doesn't affect PostgreSQL proper in any way. So, do something with PL/pgSQL. Implement a global dictionary, or shared memory, or some other way to share data between functions. Call it "package" if you like. (Java has packages which are somewhat like that.) > My concern with that is that then we have to make sure to dump it in the > same order you entered it. pg_dump can do dependency ordering if you ask it nicely. ;-) When we implement schemas we'll have to make sure it works anyway. Thinking about pg_dump when designing backend features is usually not worthwhile. -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
On Wed, 17 Oct 2001, Peter Eisentraut wrote: > Bill Studenmund writes: > > > Yes, I want a namespace below schemas. > > > > The difference between packages and schemas is that schemas encapsulate > > everything. As Tom pointed out, that includes types (and I'd assume > > operators too). Packages do not encapsulate types and operators. > > Of course nobody is forcing you to put types into subschemas. But the > user would have the freedom to spread things around as he sees fit. ??? > > > Then PL/pgSQL should be fixed. But that doesn't need a such a large > > > > Why is PL/pgSQL broken? > > Maybe read "fixed" as "enhanced". > > > The problem is not creating persistent storage; the issue is that the > > langyage was designed to not use it. What you're proposing could be done, > > but would effectivly be shoving the change in with a hammer. Also, any > > other PLs which are based on languages with strict namespaces will have > > the same problem. > > Other PLs have shown that storing global data in a language-typical way > *is* possible. I read your argumentation as "PL/pgSQL is not designed to > have global variables, so I'm going to implement 'packages' as a way to > make some anyway". Either PL/pgSQL is not designed for it, then there > should not be any -- at all. Or it can handle them after all, but then > it's the business of the language handler to deal with it. Do you really think that my employer paid me for three months to come up with an 800k diff _just_ to add global variables to PL/pgSQL? While part of it, global variables are only one part of the work. I would actually say it is a minor one. Honestly, I do not understand why "global variables" have been such a sore point for you. PLs for which they don't make sense like this don't have to do it, and Oracle, on whom our Pl/pgSQL was based, thinks that they make perfect sense for the language we copied. Also, remember that this is an implimentation of Oracle packages for Postgres. One of our goals was to make it so that you can mechanically transform an Oracle package into a Postgres one, and vis versa. This implimentation does a good job of that. To make the change you suggest would not. > > My concern with that is that then we have to make sure to dump it in the > > same order you entered it. > > pg_dump can do dependency ordering if you ask it nicely. ;-) When we > implement schemas we'll have to make sure it works anyway. Thinking about > pg_dump when designing backend features is usually not worthwhile. The thing is what you're talking about is more than just dependency ordering (which I taught pg_dump to do for packages). "doing things in the order you list" to me means that things get dumped in the exact same order. Say you added some functions and then some operators and then some functions. If order matters, then the operators should get generated in the dump before the functions, even though there's no dependency-reason to do so. Maybe I'm taking that a bit more literal than you mean, but how it comes across to me is unnecessarily difficult. We cah achieve the same thing other ways. I did however take your point that BEFORE TYPE FUNCTION should go away; the patch I sent in does not have it. In the patch, stanzas in the CREATE PACKAGE command are gathered, and done in sequence according to kind. First the global variables are defined, then the initialization routines, then functions which are needed for types in the package, then types, then functions (other than the ones already done), aggregates, and operators. Take care, Bill
Bill Studenmund writes: > Could you please give me an example of how to do this, say for plperl or > plpython? Just showing how two functions made with CREATE FUNCTION can use > global variables will be fine. This example will help me understand how > they work. For PL/Tcl you use regular Tcl global variables: create function produce(text) returns text as ' global foo; set foo $1; ' language pltcl; create function consume() returns text as ' global foo; return $foo; ' language pltcl; There is also a mechanism for one procedure to save private data across calls. For PL/Python you use a global dictionary: create function produce(text) returns text as ' GD["key"] = args[0] ' language plpython; create function consume() returns text as ' return GD["key"] ' language plpython; There is also a dictionary for private data. For PL/Perl I'm not sure if something has been implemented. In C you can use shared memory, and for PL/sh you would use temp files of course. ;-) -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
On Thu, 18 Oct 2001, Peter Eisentraut wrote: > Bill Studenmund writes: > > > Could you please give me an example of how to do this, say for plperl or > > plpython? Just showing how two functions made with CREATE FUNCTION can use > > global variables will be fine. This example will help me understand how > > they work. > > For PL/Tcl you use regular Tcl global variables: > > create function produce(text) returns text as ' > global foo; set foo $1; > ' language pltcl; > > create function consume() returns text as ' > global foo; return $foo; > ' language pltcl; > > There is also a mechanism for one procedure to save private data across > calls. > > For PL/Python you use a global dictionary: > > create function produce(text) returns text as ' > GD["key"] = args[0] > ' language plpython; > > create function consume() returns text as ' > return GD["key"] > ' language plpython; > > There is also a dictionary for private data. Private to what? > For PL/Perl I'm not sure if something has been implemented. In C you can > use shared memory, and for PL/sh you would use temp files of course. ;-) Thank you. I can now experiment with them to see how they do. I've never thought of adding package variables for C routines; there are other options open. :-) Oh, by shared memory, do you mean SYSV Shared Memory (like how the backends talk) or just memory shared between routines? I ask as part of the idea with these variables is that they are backend-specific. So C routines actually should NOT used SYSV Shared Mem. :-) Take care, Bill
Bill Studenmund writes: > Honestly, I do not understand why "global variables" have been such a sore > point for you. My point is that the proposed "package support" introduces two features that are a) independent, and b) already exist, at least in design. Schemas are already planned as a namespace mechanism. Global variables in PLs already exist in some PLs. Others can add it if they like. There aren't any other features introduced by "package support" that I can see or that you have explicitly pointed out. So the two questions I ask myself are: 1. Are package namespaces "better" than schemas? The answer to that is no, because schemas are more standard and more general. 2. Are global variables via packages "better" than the existing setups? My answer to that is again no, because the existing setups respect language conventions, maintain the separation of the backend and the language handlers, and of course they are already there and used. So as a consequence we have to ask ourselves, 3. Do "packages" add anything more to the table than those two elementary features? Please educate us. 4. Would it make sense to provide "packages" alongside the existing mechanisms that accomplish approximately the same thing. That could be debated, in case we agree that they are approximately the same thing. -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
On Thu, 18 Oct 2001, Peter Eisentraut wrote: > Bill Studenmund writes: > > > Honestly, I do not understand why "global variables" have been such a sore > > point for you. > > My point is that the proposed "package support" introduces two features > that are a) independent, and b) already exist, at least in design. > Schemas are already planned as a namespace mechanism. Global variables in > PLs already exist in some PLs. Others can add it if they like. There > aren't any other features introduced by "package support" that I can see > or that you have explicitly pointed out. Then my explanations didn't click. Please let me try again. The main feature of package support is that it greatly facilitates developing large, complicated db applications. Like ones which require multiple full-time developers to develop. I think PostgreSQL has the internals to run these apps, and it should provide a developement environment to encourage them. That's what packages are about. I have never developed an application that large. But I have talked to our DBAs who have worked with such things in Oracle, and a few who have worked on (developed) such large applications. They all have agreed that something akin to packages is needed to make it work. The seperate namespaces (for function names and for variables) mean that different programmers don't have to coordinate the names of functions. Or that the names have to have some de-ambiguating prefix to make them different. All that has to happen is that different packages have different names. When you through in the idea of developers releasing libraries (packages) on the net, the minimality of coordination is even more important. The fact (for PostgreSQL i.e. this implimentation) that types and operators aren't namespaced off means that they effectively leak into the enclosing database (or schema when we have them) so that making and supporting new types can be the aim/result of the package. For comaprison with other languages, packages strike me as comparable to libraries (in C) or modules (say in Perl or Python). Neither libraries nor modules realy do anything that can't be achieved otherwise in the language. Yet they are a prefered method of developing code, especially reused code. When you're making a program/application, you don't need to concern yourself with (many) details about the code; you use the module and that's it. Likewise here, an application developer/integrator need only load a module, and then all the routines in it are available. You don't for instance have to worry if the routines have names which overlap ones you were using, or ones used worse yet by another set of routines you want to use. I think Jean-Michael's comments were right. While I'm not sure if things will be as overwhelming as he predicted, packages (even as implimented in my patch) will help people develop code libraries for PostgreSQL. And that will make PostgreSQL applications easier. Also, as I've come to understand what schemas are and aren't, I've realized that they can be readily leveraged to help with schema support. Schemas, at least according to the SQL92 spec I have looked at (I'd love to see a later spec), are namespaces only for tables and views (and character sets and a number of other things which PostreSQL doesn't support). They don't touch on functions. Sure, PostgreSQL could decide to do something with functions, but if we do, we're improvising, and I see no reason to improvise differently than other DBMSs have done. There may be one, but I don't see it. Also, as I understand schemas (which could be wrong), there is a difference in emphasis between schemas and packages. Schemas are a way to partition your database, so that different parts of an application see only a subsection of the whole database. You can have some parts only able to access one or another schema, while other parts can access multiple schemas. Packages however are designed to help you build the tools to make the applications work (providing toolchests of code for instance). It's like schemas are a more top-down design element, and packages are bottom-up. Where I see the interaction is that we want to have different schemas have schema-specific functions, we just have a package implicitly assosciated with each schema which contains the traditional functions and aggregates (and types and operators) of that schema. > So the two questions I ask myself are: > > 1. Are package namespaces "better" than schemas? The answer to that is > no, because schemas are more standard and more general. See above; I never said packages were better than schemas (nor worse). I said they were different parts of the puzzle. I think they are both important and valuable. > 2. Are global variables via packages "better" than the existing setups? > My answer to that is again no, because the existing setups respect > language conventions, maintain the separation of the backend and the > language handlers, and of course they are already there and used. All package variables are to the backend are entries in a table, pg_packglobal, provided for the convenience of the language handler. If the handler doesn't want to do anything with them, then it doesn't and that's no loss. > So as a consequence we have to ask ourselves, > > 3. Do "packages" add anything more to the table than those two elementary > features? Please educate us. See above. > 4. Would it make sense to provide "packages" alongside the existing > mechanisms that accomplish approximately the same thing. That could be > debated, in case we agree that they are approximately the same thing. I don't agree that they are approximatly the same thing, though I agree that many of the things packages do can be cobbled together (more painfully) without them. Hopefully these explanations will come across clearer. Take care, Bill
On Sat, 13 Oct 2001, Tom Lane wrote: > Bill Studenmund <wrstuden@netbsd.org> writes: > > The other choice is to just give the function's name. The first place > > Postgres will look is in the package context used for parsing. If it's not > > there (and that context wasn't "standard"), then it will look in > > "standard". > > Hmm. How does/will all of this interact with SQL-style schemas? > > The reason I'm concerned is that if we want to retain the present > convention that the rowtype of a table has the same name as the table, > I think we are going to have to make type names schema-local, just > like table names will be. And if type names are local to schemas > then so must be the functions that operate on those types, and therefore > also operators (which are merely syntactic sugar for functions). Ahhh... There's the operators == sugar comment. I agree with you above; types and functions need to be schema-specific. > This seems like it will overlap and possibly conflict with the decisions > you've made for packages. It also seems possible that a package *is* > a schema, if schemas are defined that way --- does a package bring > anything more to the table? I'm repeating myself a little. :-) Packages aren't schemas. What they bring to the table is they facilitate making stored procedures (functions). You can have twelve different developers working on twenty different packages, with no fear of name conflicts. The package names will have to be different, so there can be functions with the same names in different pacakges. This ability isn't that important in small development projects, but is really important for big ones. Think about big db applications, like Clarify. Any project with multiple procedure authors. Without something like packages, you'd need to spend a lot of effort coordinating names & such so that they didn't conflict. With packages, it's rather easy. Also, I think PostgreSQL can challenge the commercial databases for these applications. But to do so, changing over to PG will need to be easy. Having packages there will greatly help. > I'd like to see schemas implemented per the spec in 7.3, so we need to > coordinate all this stuff. For the most part, I think packages and schemas are orthogonal. I'm taking a cue from Oracle here. Oracle considers packages to be a schema-specific object. Take care, Bill
On Sat, 13 Oct 2001, Bill Studenmund wrote: > On Sat, 13 Oct 2001, Tom Lane wrote: > > > I also wonder how the fixed, single-level namespace search path you > > describe interacts with the SQL rules for schema search. (I don't > > actually know what those rules are offhand; haven't yet read the schema > > parts of the spec in any detail...) > > Should be independent. The searching only happens when you are not in the > "standard" package, and you give just a function name for a function. > The searching would only happen in the current schems. If > you give a schema name, then I'd expect PG to look in that schema, in > standard, for that function. If you give both a schema and package name, > then PG would look in that package in that schema. My description of namespaces seems to have caused a fair bit of confusion. Let me try again. The ability of the package changes to automatically check standard when you give an ambiguous function name while in a package context is a convenience for the procedure author. Nothing more. It means that when you want to use one of the built in functions (date_part, abs, floor, sqrt etc.) you don't have to prefix it with "standard.". You can just say date_part(), abs(), floor(), sqrt(), etc. The only time you need to prefix a call with "standard." is if you want to exclude any so-named routines in your own package. I've attached a copy of a package I wrote as part of testing package initializers and package global variables. It is an adaptation of the Random package described in Chapter 8 of _Oracle8 PL/SQL Programming_ by Scott Urman. Other than adapting it to PostgreSQL, I also tweaked the RandMax routine to give a flat probability. Note the use of date_part() in the BODY AS section, and the use of rand() in randmax(). Both of these uses are the ambiguous sort of function naming which can trigger the multiple searching. Since they are in plpgsql code, they get parsed in the context of the random package. So when each of them gets parsed, parse_func first looks in the random package. For rand(), it will find the rand() function and use it. But for date_part(), since there isn't a date_part function in the package, we use the one in standard. If we didn't have this ability, one of the two calls would need to have had an explicit package with it. There are two choices (either "standard." would be needed for date_part(), or "random." for rand()), but I think both would lead to problems. Either choice makes the syntax heavy, for little gain. Also, if we scatter the package name throughout the package, if we ever want to change it, we have more occurences to change. Does that make more sense? Take care, Bill
On Sat, 13 Oct 2001, Peter Eisentraut wrote: > Bill Studenmund writes: > > > So what are packages? In Oracle, they are a feature which helps developers > > make stored procedures and functions. > > I think you have restricted yourself too much to functions and procedures. > A package could/should also be able to contain views, tables, and such. I disagree. Views and tables are the purview of schemas, which as I mentioned to Tom, strike me as being different from packages. Packages basically are modules which make life easier for functions (and types and aggregates and operators). If we really want to make tables and views and triggers part of packages, we can. My big concern is that it then makes pg_dump harder. I'll go into that more below. > > They provide a name space for functions local to the package, > > Namespacing is the task of schemas. I think of packages as a bunch of > objects that can be addressed under a common name (think RPMs). Regrettablely Oracle beat you to it with what "packages" are in terms of Oracle, and I suspect also in the minds of many DBAs. I also think that you and Tom have something different in mind about the namespacing in packages. It is purely a convenience for the package developer; whenever you want to use a function built into the database, you _don't_ have to type "standard." everywhere. Think what a PITA it would be to have to say "standard.abs(" instead of "abs(" in your functions! I'm sorry if my explanation went abstract quickly & making that unclear. > But it seems like some of this work could be used to implement schema > support. I think the big boost this will have to schema support is that it shows how to make a far-reaching change to PostgreSQL. :-) It's an internal schema change and more, just as schema support will be. > > session-specific package variables, > > I think this is assuming a little too much about how a PL might operate. > Some PLs already support this in their own language-specific way, with or > without packages. Thus, I don't think packages should touch this. > Actually, I think you could easily set up session variables in the package > initializer function. I agree that some PLs might do things their own way and so package variables won't be as useful. If these variables are not appropriate to a PL, it can ignore them. PL/pgSQL is a counter-example, though, showing that something needs to be done. It is not set up to support global variables; each code block generates its own namespace, and removes it on the way out. Thus I can not see a clean way to add package global variables to say the initialization routine - this routine's exit code would need to not destroy the context. That strikes me as a mess. > > The last component of a package are the functions usable for type > > declarations. They are declared as: > > BEFORE TYPE FUNCTION <standard package function declaration> > > > > They are useful as the normal functions in a package are declared after > > the types are declared, so that they can use a type newly-defined in a > > package. > > I think it would make much more sense to allow the creation of objects in > the CREATE PACKAGE command in any order. PostgreSQL has not so far had a > concept of "functions suitable for type declarations" and we shouldn't add > one. I think you misread me slightly. BEFORE TYPE FUNCTION functions are "usable" for type declarations, not "suitable" for them. Also, I didn't say one key clause, "in this package". The main difference is when in the creation of the package the functions are created; they get created before the types, rather than after. This concept is new to PostgreSQL because PostgreSQL has never before chained creations together like this. Thinking about it though it would be feasable to scan the list of types in the package, and see if there are references to functions declared in that package, and if so to create them before the types get declared. That would remove the need for BEFORE TYPE FUNCTION and also make pg_dump a little simpler. Take care, Bill
* Bill Studenmund <wrstuden@netbsd.org> wrote: | | Packages aren't schemas. What they bring to the table is they facilitate | making stored procedures (functions). You can have twelve different | developers working on twenty different packages, with no fear of name | conflicts. The package names will have to be different, so there can be | functions with the same names in different pacakges. Hmm. But if we had schema support can't we just package those procedures into a schema with a given name ? Maybe my stored procedures needs some other resources as well that should not conflict with other packages, like temp tables or such. It then seems to me that using schemas can solve everything that packages do and more ? | For the most part, I think packages and schemas are orthogonal. I'm taking | a cue from Oracle here. Oracle considers packages to be a schema-specific | object. What is really the difference functionality wise of making a subschema and package ? In both cases you deal with the namespace issues. -- Gunnar Rønning - gunnar@polygnosis.com Senior Consultant, Polygnosis AS, http://www.polygnosis.com/
Gunnar Rønning <gunnar@polygnosis.com> writes: > Hmm. But if we had schema support can't we just package those procedures > into a schema with a given name ? Maybe my stored procedures needs some other > resources as well that should not conflict with other packages, like temp > tables or such. It then seems to me that using schemas can solve everything > that packages do and more ? Yeah. I am wondering whether we couldn't support Oracle-style packages as a thin layer of syntactic sugar on top of schemas. I am concerned about the prospect that "foo.bar" might mean either "object bar in schema foo" or "object bar in package foo". regards, tom lane
* Tom Lane <tgl@sss.pgh.pa.us> wrote: | | > resources as well that should not conflict with other packages, like temp | > tables or such. It then seems to me that using schemas can solve everything | > that packages do and more ? | | Yeah. I am wondering whether we couldn't support Oracle-style packages | as a thin layer of syntactic sugar on top of schemas. I am concerned | about the prospect that "foo.bar" might mean either "object bar in | schema foo" or "object bar in package foo". Agreed, and in Sybase you may declare a procedure in a schema(or database which is the Sybase term). If you want it global you declare it in the "master" schema. -- Gunnar Rønning - gunnar@polygnosis.com Senior Consultant, Polygnosis AS, http://www.polygnosis.com/
On 19 Oct 2001, Gunnar [iso-8859-1] R�nning wrote: > * Bill Studenmund <wrstuden@netbsd.org> wrote: > | > | Packages aren't schemas. What they bring to the table is they facilitate > | making stored procedures (functions). You can have twelve different > | developers working on twenty different packages, with no fear of name > | conflicts. The package names will have to be different, so there can be > | functions with the same names in different pacakges. > > Hmm. But if we had schema support can't we just package those procedures > into a schema with a given name ? Maybe my stored procedures needs some other > resources as well that should not conflict with other packages, like temp > tables or such. It then seems to me that using schemas can solve everything > that packages do and more ? Assuming that schema support covers functions (which Tom, I, evidently you, and Oracle think it should but which isn't mentioned at least in SQL92), you could do that. And if you're adding tables, you probably should. But a lot of times you don't need to go to the effort of namespacing off a whole new schema, and I can think of some cool things to do when you don't. One example is a large, complicated db app with multiple programmers. For each general area of the app, you can create a package. That way you modularize the code into more managable pieces. But since the are all in the same schema, they can maintain/interact with the same tables. So that's an arguement for packages/subschemas. > | For the most part, I think packages and schemas are orthogonal. I'm taking > | a cue from Oracle here. Oracle considers packages to be a schema-specific > | object. > > What is really the difference functionality wise of making a subschema and > package ? In both cases you deal with the namespace issues. A matter of what is subspaced. I'd assume that a subschema namespaces off everything a schema does. A package however only namespaces off functions and aggregates. Packages, at least as I've implimented them, do *not* namespace off types nor operators they contain. Technically, the package oid is a key in the name index for pg_proc and pg_aggregate, while it is not for pg_type and pg_operator. I admit, I took a minor liberty here. Oracle packages do have types, but Oracle types are not as rich as PostgreSQL's So when I was translating packages, I made the types in them match PostgreSQL's. Also, since I'd added aggregates and types, adding operators seemed like a reasonable thing. Both from the point of view of the parser (they are all done about the same way), and from the point of utility. PostgreSQL's ability to add types is really cool, and the ability to add operators makes new types convenient to use. If packages could add types and support functions but not operators, that'd seem lame. The reason that packages don't namespace off types and operators is I think it makes them more useful. Think about the complex number example in the programmer's guide. I can think of scientific applications which could use them. But having to say package.complex for the type would be combersome. And even worse, having to say package.+ or package.- would be bad. And package.* might be ambiguous to the parser! So that's why I made pacakges not be subschemas. Packages were designed to help with writing stored procedures, and to do it well. :-) Take care, Bill
On 19 Oct 2001, Gunnar [iso-8859-1] R�nning wrote: > * Tom Lane <tgl@sss.pgh.pa.us> wrote: > | > | Yeah. I am wondering whether we couldn't support Oracle-style packages > | as a thin layer of syntactic sugar on top of schemas. I am concerned > | about the prospect that "foo.bar" might mean either "object bar in > | schema foo" or "object bar in package foo". > > Agreed, and in Sybase you may declare a procedure in a schema(or > database which is the Sybase term). If you want it global you declare it > in the "master" schema. Oh cool. I knew that Oracle used "standard" for the name of the built-in package, but I didn't know a name for the built-in schema. "master" sounds good. Take care, Bill
On Fri, 19 Oct 2001, Tom Lane wrote: > Yeah. I am wondering whether we couldn't support Oracle-style packages > as a thin layer of syntactic sugar on top of schemas. I am concerned > about the prospect that "foo.bar" might mean either "object bar in > schema foo" or "object bar in package foo". See my note to Gunnar for why I think packages should be inside of schemas rather than renamed schemas. Types and expecially operators would be much more useful to the enclosing schema that way (I think). Yes, there is an ambiguity between schema foo and package foo. I can think of a few ways to deal with this. 1) Do whatever Oracle does, assuming it's not grotesque. Yes, I've said that a lot. But I think PostgreSQL can really take some applications away from the commercial DBMSs, and Oracle is #1 in that market. So Oracle represents Prior Art of least surprise. :-) 2) If there is both a schema named foo and a package named foo, then foo.bar should always take foo to be the schema. If we let a package in the local schema named foo be found before the schema foo, then we would get different results in said schema and another one (which didn't have a package named foo in it). 3) Don't let schemas and packages have the same name. I actually believe this is what Oracle does, though I haven't checked. I _have_ checked that packages and tables can't have the same name, and built that into the packages patches. I think requiring schemas to have names different from tables and packages is a good thing, and would reduce ambiguity. As an aside the reason I suspect this is what Oracle does is that Oracle has a system table which contains a list of named objects. Tables and packages show up as entries in this table, and I'd expect schemas would too. Take care, Bill
Bill Studenmund writes: > > create function produce(text) returns text as ' > > GD["key"] = args[0] > > ' language plpython; > > > > create function consume() returns text as ' > > return GD["key"] > > ' language plpython; > > > > There is also a dictionary for private data. > > Private to what? Private to the procedure, but saved across calls (during one session). > Oh, by shared memory, do you mean SYSV Shared Memory (like how the > backends talk) or just memory shared between routines? I ask as part of > the idea with these variables is that they are backend-specific. So C > routines actually should NOT used SYSV Shared Mem. :-) Yes, you're right. Actually, sharing data across PostgreSQL C functions is trivial because you can just use global variables in your dlopen modules. -- Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter
>I think Jean-Michael's comments were right. While I'm not sure if things >will be as overwhelming as he predicted, packages (even as implimented in >my patch) will help people develop code libraries for PostgreSQL. And that >will make PostgreSQL applications easier. PostgreSQL is a fantastic tool which lacks a few features to become #1. IMHO, these features are :> Beginners: ability to drop and reorganize columns. I know this sounds stupid for hackers, but this is #1 need when migrating from beginner tools such as MySQL or Access. Candidates?> Advanced users: PACKAGE support to create and distribute software libraries. CREATE OR REPLACE VIEW, CREATE OR REPLACE TRIGGER, etc... PL/pgSQL installation by default with infinite loop protection.> Professionnal user: PostgreSQL does not lack many things.Maybe server-side Java would be great in terms of object/inheritence approach. I run several databases, one being hosted on a double Pentium Linux box with U2W discs. When using triggers, views, rules and PL/pgSQL, applications can be optimized so much that you "hardly" reach the hardware limits.> Power users: load balancing, replication, tablespace.I can't really say. I first discovered PostgreSQL when localizing Oracle8i to French. We asked Oracle if I could use their software to help us during the translation process. They answered "OK, but you have to pay $xx.xxx because you have a double processor box". This was about twice the price we were getting paid. That day, I understood Oracle did not care about its users and was only interested in fast, short term profit. Cheers, Jean-Michel
----- Original Message ----- From: Bill Studenmund <wrstuden@netbsd.org> Sent: Sunday, October 14, 2001 8:22 AM > My description of namespaces seems to have caused a fair bit of confusion. > Let me try again. > > The ability of the package changes to automatically check standard when > you give an ambiguous function name while in a package context is a > convenience for the procedure author. Nothing more. > > It means that when you want to use one of the built in functions > (date_part, abs, floor, sqrt etc.) you don't have to prefix it with > "standard.". You can just say date_part(), abs(), floor(), sqrt(), etc. > The only time you need to prefix a call with "standard." is if you want to > exclude any so-named routines in your own package. Quick question: would it be possible then create a 'system' package and 'system' (or 'master' if you will) schema (when it's implemented), move over all the system tables (pg_*) into the master schema and functions into the 'system' package, so that no name conflicts will arise when creating types, functions, tables, etc with the same names as system ones? -- S.
On Sat, 20 Oct 2001, Peter Eisentraut wrote: > Yes, you're right. Actually, sharing data across PostgreSQL C functions > is trivial because you can just use global variables in your dlopen > modules. Exactly. That's why I never envisioned "C" or "internal" functions using package global variables. :-) Take care, Bill
On Sat, 20 Oct 2001, Serguei Mokhov wrote: > > It means that when you want to use one of the built in functions > > (date_part, abs, floor, sqrt etc.) you don't have to prefix it with > > "standard.". You can just say date_part(), abs(), floor(), sqrt(), etc. > > The only time you need to prefix a call with "standard." is if you want to > > exclude any so-named routines in your own package. > > Quick question: would it be possible then create a 'system' package > and 'system' (or 'master' if you will) schema (when it's implemented), > move over all the system tables (pg_*) into the master schema > and functions into the 'system' package, so that no name conflicts will arise > when creating types, functions, tables, etc with the same names as system ones? Yes. That is part of my plan actually. :-) In the patch I sent in last week, all of the built-in functions and aggregates are in the "standard" package, and you can infact reference them as standard.foo. Moving types, operators, and relations (and whatever else should go there) into "master" was part of my plan for schemas. Take care, Bill
But what if you want a C function to set a variable which can be accessed using an SQL, perl, PLpgSQL or other function type? Shouldn't a global variable be global between all types of functions? -- Rod Taylor There are always four sides to every story: your side, their side, the truth, and what really happened. ----- Original Message ----- From: "Bill Studenmund" <wrstuden@netbsd.org> To: "Peter Eisentraut" <peter_e@gmx.net> Cc: "PostgreSQL Development" <pgsql-hackers@postgresql.org> Sent: Friday, October 19, 2001 1:59 PM Subject: Re: [HACKERS] Package support for Postgres > On Sat, 20 Oct 2001, Peter Eisentraut wrote: > > > Yes, you're right. Actually, sharing data across PostgreSQL C functions > > is trivial because you can just use global variables in your dlopen > > modules. > > Exactly. That's why I never envisioned "C" or "internal" functions using > package global variables. :-) > > Take care, > > Bill > > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/users-lounge/docs/faq.html >
----- Original Message ----- From: Bill Studenmund <wrstuden@netbsd.org> Sent: Friday, October 19, 2001 2:04 PM > > > It means that when you want to use one of the built in functions > > > (date_part, abs, floor, sqrt etc.) you don't have to prefix it with > > > "standard.". You can just say date_part(), abs(), floor(), sqrt(), etc. > > > The only time you need to prefix a call with "standard." is if you want to > > > exclude any so-named routines in your own package. > > > > Quick question: would it be possible then create a 'system' package > > and 'system' (or 'master' if you will) schema (when it's implemented), > > move over all the system tables (pg_*) into the master schema > > and functions into the 'system' package, so that no name conflicts will arise > > when creating types, functions, tables, etc with the same names as system ones? > > Yes. That is part of my plan actually. :-) Hmm. I see. Then there won't be a problem of creating any DB object with the system name. > In the patch I sent in last week, Yeah, I remember that one. Took me a couple of minutes to download. You know, it never hurts to compress things: then the patch would be ~10 times less in size, and you wouldn't have to worry about PINE messing up with your code in the message body... :) And that would reduce the bounce rate too. Just a kind and gentle cry to reduce the size of patches sent to my mailbox and save some bandwidth on the way :) > all of the built-in functions and > aggregates are in the "standard" package, and you can infact reference > them as standard.foo. When you refer to it just foo(), and you have foo() defined in more than one package, how do you resolve this? Do you also have a notion of a global package and sub-packages? -- Serguei
On Sun, 21 Oct 2001, Serguei Mokhov wrote: > ----- Original Message ----- > From: Bill Studenmund <wrstuden@netbsd.org> > Sent: Friday, October 19, 2001 2:04 PM > > > > Quick question: would it be possible then create a 'system' package > > > and 'system' (or 'master' if you will) schema (when it's implemented), > > > move over all the system tables (pg_*) into the master schema > > > and functions into the 'system' package, so that no name conflicts will arise > > > when creating types, functions, tables, etc with the same names as system ones? > > > > Yes. That is part of my plan actually. :-) Oh, one reason that needs to happen is that everything needs to be in a package or a schema; for the tables where they do namespacing the schema or package is part of the primary key. > Hmm. I see. Then there won't be a problem of creating any DB object > with the system name. It will work, though if you start creating tables named "pg_class", I think you might make your head hurt. Also, your own int4 type might not be such a good idea... > > In the patch I sent in last week, > > Yeah, I remember that one. Took me a couple of minutes > to download. You know, it never hurts to compress things: > then the patch would be ~10 times less in size, and you wouldn't > have to worry about PINE messing up with your code in the message body... :) > And that would reduce the bounce rate too. > > Just a kind and gentle cry to reduce the size of patches sent to > my mailbox and save some bandwidth on the way :) Ok. :-) Next time I will either compress it or I'll mail in a URL. > > all of the built-in functions and > > aggregates are in the "standard" package, and you can infact reference > > them as standard.foo. > > When you refer to it just foo(), and you have foo() defined > in more than one package, how do you resolve this? Do you also have > a notion of a global package and sub-packages? There is a very simple search path system. If you are in a package (in a function that is part of a package), you look for foo in that package. If you don't find it there, you look in stadard. If it's not there, you don't find it. To look in other packages than the one you're in, you have to say which one it is. With schemas, if your package is not in "master" or whatever it is called, you look first in your package, then in your_schema.standard, then in master.standard. Take care, Bill
On Sat, 20 Oct 2001, Rod Taylor wrote: > But what if you want a C function to set a variable which can be > accessed using an SQL, perl, PLpgSQL or other function type? > Shouldn't a global variable be global between all types of functions? No. Doing that requires that all languages have the same internal storage of variables. And it's more than just an int4 takes up 4 bytes. Look in the plpgsql source, at struct PLpgSQL_var. There is a fair amount of into about a variable. While we could harmonize the info storage, making globals global across all languages would also would mean breaking down a lot of the isolation between PLs. Right now they are their own independent entities. To tie them together like this would, in my opinion, make them fragilly-interconnected. My suggestion is to just add a get and a set routine in one language, and have it store the global. :-) Take care, Bill
Based on this email, I am assuming we don't want to add Package support but instead will do it with schemas, which Tom is already working on. --------------------------------------------------------------------------- Peter Eisentraut wrote: > Bill Studenmund writes: > > > Honestly, I do not understand why "global variables" have been such a sore > > point for you. > > My point is that the proposed "package support" introduces two features > that are a) independent, and b) already exist, at least in design. > Schemas are already planned as a namespace mechanism. Global variables in > PLs already exist in some PLs. Others can add it if they like. There > aren't any other features introduced by "package support" that I can see > or that you have explicitly pointed out. > > So the two questions I ask myself are: > > 1. Are package namespaces "better" than schemas? The answer to that is > no, because schemas are more standard and more general. > > 2. Are global variables via packages "better" than the existing setups? > My answer to that is again no, because the existing setups respect > language conventions, maintain the separation of the backend and the > language handlers, and of course they are already there and used. > > So as a consequence we have to ask ourselves, > > 3. Do "packages" add anything more to the table than those two elementary > features? Please educate us. > > 4. Would it make sense to provide "packages" alongside the existing > mechanisms that accomplish approximately the same thing. That could be > debated, in case we agree that they are approximately the same thing. > > -- > Peter Eisentraut peter_e@gmx.net http://funkturm.homeip.net/~peter > > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster > -- Bruce Momjian | http://candle.pha.pa.us pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026