Re: SQL-Invoked Procedures for 8.1 - Mailing list pgsql-hackers
From | Josh Berkus |
---|---|
Subject | Re: SQL-Invoked Procedures for 8.1 |
Date | |
Msg-id | 200409231112.18941.josh@agliodbs.com Whole thread Raw |
In response to | SQL-Invoked Procedures for 8.1 (Gavin Sherry <swm@linuxworld.com.au>) |
Responses |
Re: SQL-Invoked Procedures for 8.1
Re: SQL-Invoked Procedures for 8.1 Re: SQL-Invoked Procedures for 8.1 Re: SQL-Invoked Procedures for 8.1 |
List | pgsql-hackers |
Gavin, Neil, > Following is a proposal to implement what SQL2003 calls 'SQL-Invoked > Procedures' and what most people refer to as stored procedures. Fujitsu > will be funding Neil Conway and I to work on this feature. Which, by the way, is way keen. My comments are based on having professionally written several hundred thousand lines of procedural code for PostgreSQL, SQL Server, and Oracle. I believe that your interpretation of the spec is correct but that there are several things not covered by the spec, but implemented by other RDBMSes, which make stored procedures *useful* which have been omitted. I feel strongly that these things will make a large difference to people thinking of migrating to PostgreSQL from other DBMSes, and want to make sure that Neil's implementation does not make them harder, instead of easier, to to implement later. > Procedures are nearly identical to functions. IMHO, this is largely because the spec regards a great deal of SP functionality to be "implementation-defined", and is thus kept as vague as possible. In practice, other DBMSes which have both SPs and Functions treat them *very* differently. > 3) Procedures can be run in the same savepoint level as the caller when > OLD SAVEPOINT LEVEL is specified at creation time. According to SQL2003, > functions must be run on a new savepoint level. From my understanding, we > do not do this currently. > > Work will focus on 1 and 2 until we have the concept of savepoint levels > with functions. Its possible that we will implement this too if there is > demand. One of the things which differentiates SPs on other DBs from PostgreSQL Functions is transactionality. In SQL Server and Oracle, SPs are not automatically a transaction; instead, they contain transactions within them. This is vitally important to DBAs who want to use SPs to automate database maintenance, loads, transformations, and other activities which require checkpointing within the course of a program. For example, I run a nightly data transformation for one client which requires 16 steps with VACUUMs, ANALYZEs and error-handling between them. Currently, the only way I can implement this for PostgreSQL is to have an external program (Perl, in my case) manage this and call each step as a separate function. It would be far easier to manage if I could put all of the steps, including the vaccums inside one long-running SP, but the required transaction container prevents this. > If we go down the route of saying that procedures are a type of function, > we have the option of allowing users access to OUT and INOUT in functions. > This would make procedures simply a subset of functions. What do people > think? Well, to be frank, my first thought is, why bother? If you're just going to implement some syntatic sugar on top of the current Function feature, why bother at all? Given the opportunity, I would far prefer to set us on a road that would allow us to replicate -- and exceed -- the functionality of Oracle's PL/SQL. This proposal does not do that; in fact, if someone were to start implementing such functionality later they might find this code a stumbling block. > There will be cases when we need to identify whether a routine is a > function or a procedure. This could be done two ways. We could say that > any proc in pg_proc which returns void is a procedure or we could store > this in some 'protype' column. Thoughts? Well, see my thoughts above on differentiating SPs from Functions. I certainly don't think we should be using the same table. A second point, which I brought up with you on IRC, is to eliminate overloading and allow named parameter calls on SPs. This is extremely useful functionality in T-SQL and PL/SQL; in fact, I'd say that it's essential for any operation that wants to create an SP-centric middleware as only named parameter calls allow developers to add parameters to existing procedures without breaking existing calls. For anyone who doesn't know what I'm talking about, it's this form: CREATE PROCEDURE do_some_work ( alpha INT, beta INT, gamma TEXT ) etc. Where you can: CALL do_some_work( alpha = 5, beta = 7 ) .. and then gamma picks up its default, if any, or even: CALL do_some_work( gamma = 'Lord Fontleroy', beta = 7, alpha = 1 ) The idea being that for SPs, schema.name is unique regardless of the parameters. Even if implementing named parameter calls is beyond the current spec, I will argue strongly in favor of eliminating overloading for SPs. Overloading serves no purpose for them and prohibits the whole concept of default values. > Other databases, and SQL2003, support a few different implementations what > could be called variables. In SQL2003, there are a few types of > 'variables': I completely follow your thinking about variables, and find it makes sense. > SET VARIABLE <varname> = <val> > SHOW VARIABLE <varname> The problem with this is name collisions with the GUC -- especially as it now allows add-ins to create their own GUC variables. However intuitive SET and SHOW are, they will lead to problems. Maybe SETVAR and SHOWVAR? Or using the PL/pgsql assignment operator for the first, and DISPLAY for the second, e.g.: num_logos := 917; DISPLAY num_logos; Also, you do realize that this is going to lead to requests for SELECT .... INTO on the SQL command line, don't you? > The other option is that we do it at the protocol level and modify libpq > (and psql) to add support. This would allow us something like: This may be a better approach. I've personally never been comfortable with the use of variables outside of SPs and packages; it seems orthagonal to the declaritive nature of SQL. However, this is a aesthic thing and not really based on practical considerations. The more practical consideration is, where will OUT and INOUT parameters be used? Do we want them returned to the SQL session or directly to the calling client? I would think that practicality would argue in favor of the latter; I can't see needing variables in SQL except for testing, and having them in psql will allow me that. > That might need some massaging but you get the idea. The commands to psql > translate to lower level protocol level commands which are: create > variable, set (ie, modify the value of) variable and describe variable. > Neil thinks we should have create and set in one step. This is great, since > it covers most cases, but I'm not sure how we have uninitialised values. > Perhaps we set to NULL? Yes. > The only other question (that I can think of now) with respect to > variables is how they are affected by transactions. My gut feel is > that they should have transactional semantics. For example: I agree strongly with this, especially since we'll be using Savepoints inside the SPs. Having one's data mods roll back but not the variable values would be confusing and lead to a *lot* of debugging. > We can use the same permissions as for functions. Agreed. -- Josh Berkus Aglio Database Solutions San Francisco
pgsql-hackers by date: