Thread: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Bryn Llewellyn
Date:
Folks who develop applications for Oracle Database have had the features that the subject line of this email lists sincethe arrival of PL/SQL in the early nineties. The advantages are self-evident to these programmers; and their lack comesas a shocking disappointment when they start to write application code for PostgreSQL*. The absence of packages andinner subprograms is huge. The absence of parameterizable anonymous blocks is a smaller limitation. Notice that this point is entirely separable from the endeavor of migrating an extant application. It has first and foremostto do with how you think of designing code. I’ve heard rumors that some contributors to the PostgreSQL implementation are interested in bringing the PL/pgSQL featuresthat I mentioned. If there is any such thinking, please let me know. I’m not a C coder but I’d be very interestedin reader any ordinary prose that describes how these features might be exposed to the PostgreSQL application developer. ________________________________________________________________________________ * Full disclosure: I was the product manager for PL/SQL, working at Oracle HQ, from about 2000 through 2019 when I startedwith Yugabyte, Inc. At least some people on this list have heard of YugabyteDB and know that it uses Postgres’s SQLprocessing code “as is” (currently Version 11.2, but presently Version 13) on top of its own implementation of a distributedstorage layer (inspired by Google Spanner).
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Adrian Klaver
Date:
On 12/16/21 11:22, Bryn Llewellyn wrote: > Folks who develop applications for Oracle Database have had the features that the subject line of this email lists sincethe arrival of PL/SQL in the early nineties. The advantages are self-evident to these programmers; and their lack comesas a shocking disappointment when they start to write application code for PostgreSQL*. The absence of packages andinner subprograms is huge. The absence of parameterizable anonymous blocks is a smaller limitation. > > Notice that this point is entirely separable from the endeavor of migrating an extant application. It has first and foremostto do with how you think of designing code. > > I’ve heard rumors that some contributors to the PostgreSQL implementation are interested in bringing the PL/pgSQL featuresthat I mentioned. If there is any such thinking, please let me know. I’m not a C coder but I’d be very interestedin reader any ordinary prose that describes how these features might be exposed to the PostgreSQL application developer. Not following. To be exposed they have to exist and that is not the case in the community Postgres. The relevant question would seem to be, how do I get these features built? > ________________________________________________________________________________ > > * Full disclosure: I was the product manager for PL/SQL, working at Oracle HQ, from about 2000 through 2019 when I startedwith Yugabyte, Inc. At least some people on this list have heard of YugabyteDB and know that it uses Postgres’s SQLprocessing code “as is” (currently Version 11.2, but presently Version 13) on top of its own implementation of a distributedstorage layer (inspired by Google Spanner). > > > -- Adrian Klaver adrian.klaver@aklaver.com
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Bryn Llewellyn
Date:
adrian.klaver@aklaver.com wrote:bryn@yugabyte.com wrote:Folks who develop applications for Oracle Database have had the features that the subject line of this email lists since the arrival of PL/SQL in the early nineties. The advantages are self-evident to these programmers; and their lack comes as a shocking disappointment when they start to write application code for PostgreSQL*. The absence of packages and inner subprograms is huge. The absence of parameterizable anonymous blocks is a smaller limitation.
Notice that this point is entirely separable from the endeavor of migrating an extant application. It has first and foremost to do with how you think of designing code.
I’ve heard rumors that some contributors to the PostgreSQL implementation are interested in bringing the PL/pgSQL features that I mentioned. If there is any such thinking, please let me know. I’m not a C coder but I’d be very interested in reader any ordinary prose that describes how these features might be exposed to the PostgreSQL application developer.
Not following. To be exposed they have to exist and that is not the case in the community Postgres. The relevant question would seem to be, how do I get these features built?Bryn continued:
* Full disclosure: I was the product manager for PL/SQL, working at Oracle HQ, from about 2000 through 2019 when I started with Yugabyte, Inc. At least some people on this list have heard of YugabyteDB and know that it uses Postgres’s SQL processing code “as is” (currently Version 11.2, but presently Version 13) on top of its own implementation of a distributed storage layer (inspired by Google Spanner).
Oops. I did a typo. I’d meant to write “I’d be very interested in *reading* any ordinary prose…”
I can’t parse your “To be exposed they have to exist and that is not the case…” Do you mean that the rumor that I heard is wrong and that nobody has said to the Postgres community that they’ve embarked on, or at least are interested in, implementing what I’m asking about?
I had assumed that the answer to “How do I get these features built?” was “Write a C implementation and submit it for consideration”. But I can’t do that. The obvious Google searches like “Submit enhancement request for PostgreSQL” turn up only informal emails to lists like this. Is there a better answer?
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Adrian Klaver
Date:
On 12/16/21 12:36, Bryn Llewellyn wrote: > // >> /adrian.klaver@aklaver.com <mailto:adrian.klaver@aklaver.com> wrote: >> / >> >> >> Not following. To be exposed they have to exist and that is not the >> case in the community Postgres. The relevant question would seem to >> be, how do I get these features built? >> >>> /Bryn continued:/ >>> >>> * Full disclosure: I was the product manager for PL/SQL, working at > > Oops. I did a typo. I’d meant to write “I’d be very interested in > *reading* any ordinary prose…” That I figured out:) > > I can’t parse your “To be exposed they have to exist and that is not the > case…” Do you mean that the rumor that I heard is wrong and that nobody > has said to the Postgres community that they’ve embarked on, or at least > are interested in, implementing what I’m asking about? Not that I know of. What was being danced around is that Oracle compatibility is a key feature of EDB's business model: https://www.enterprisedb.com/products/edb-postgres-advanced-server-secure-ha-oracle-compatible > > I had assumed that the answer to “How do I get these features built?” > was “Write a C implementation and submit it for consideration”. But I > can’t do that. The obvious Google searches like “Submit enhancement > request for PostgreSQL” turn up only informal emails to lists like this. > Is there a better answer? > Not really, though if you want to bring this up --hackers is the best list. -- Adrian Klaver adrian.klaver@aklaver.com
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Peter Geoghegan
Date:
On Thu, Dec 16, 2021 at 11:22 AM Bryn Llewellyn <bryn@yugabyte.com> wrote: > The advantages are self-evident to these programmers; and their lack comes as a shocking disappointment when they startto write application code for PostgreSQL*. The absence of packages and inner subprograms is huge. Why are those things huge? It's not self-evident to me. I can only speak for myself, but throwing around terms like "shocking disappointment" is never going to convince me of anything. You can make similar statements about many other things. Any functionality that gets added to PostgreSQL should ideally be compelling to users that have not worked with Oracle in the past. Maybe that happens to result in a feature that very much looks like what you already have in mind, or maybe there are significant differences. If you're not going to meet others closer to where they are, then how can you expect it for yourself? -- Peter Geoghegan
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Pavel Stehule
Date:
čt 16. 12. 2021 v 20:22 odesílatel Bryn Llewellyn <bryn@yugabyte.com> napsal:
Folks who develop applications for Oracle Database have had the features that the subject line of this email lists since the arrival of PL/SQL in the early nineties. The advantages are self-evident to these programmers; and their lack comes as a shocking disappointment when they start to write application code for PostgreSQL*. The absence of packages and inner subprograms is huge. The absence of parameterizable anonymous blocks is a smaller limitation.
I don't think Postgres needs packages - this is a redundant concept in Postgres, when Postgres has schemas (different from Oracle's schemas) and extensions.
There are a lot of successful migrations from Oracle to Postgres that shows so that the absence of mentioned features isn't too huge. Postgres is just not compatible with Oracle. The compatibility with Oracle is not possible without monstrous increasing size and complexity, and this is a benefit just for a small part of users. A lot of packages and concepts in Oracle are obsolete, or maybe not too well designed (from today's perspective). After my experience I think there are a lot of things that are possible in stored procedures, but I am sure it is not good to do it, and I don't think we need to promote these patterns in Postgres.
This doesn't mean that we can stop developing functionality around stored procedures in Postgres. I am working on session variables (an alternative to package variables), and a few years ago I proposed a concept of schema's private objects. Oracle has packages as inheritance of Ada language. The stored procedures in Oracle use a slightly different conceptual model. It is more like an application executed on the server side. On the other hand, the schema in Postgres has different functionality than in Oracle, and allows a more precious game with access rights, and although the syntax of stored procedures is very near to Oracle, the concept is different, because PL/pgSQL is executed inprocess inside SQL engine (all PL). More - PLpgSQL is an easy fully interpreted language without optimization (the environment is very well integrated with SQL engine, but still with low complexity), and trying to repeat some patterns from PL/SQL can be very counterproductive.
Notice that this point is entirely separable from the endeavor of migrating an extant application. It has first and foremost to do with how you think of designing code.
I’ve heard rumors that some contributors to the PostgreSQL implementation are interested in bringing the PL/pgSQL features that I mentioned. If there is any such thinking, please let me know. I’m not a C coder but I’d be very interested in reader any ordinary prose that describes how these features might be exposed to the PostgreSQL application developer.
________________________________________________________________________________
* Full disclosure: I was the product manager for PL/SQL, working at Oracle HQ, from about 2000 through 2019 when I started with Yugabyte, Inc. At least some people on this list have heard of YugabyteDB and know that it uses Postgres’s SQL processing code “as is” (currently Version 11.2, but presently Version 13) on top of its own implementation of a distributed storage layer (inspired by Google Spanner).
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Mladen Gogala
Date:
On 12/16/21 16:48, Pavel Stehule wrote: > I don't think Postgres needs packages - this is a redundant concept in > Postgres, when Postgres has schemas (different from Oracle's schemas) > and extensions. > Yes, ORAFCE uses schemas as the package names. However, one very practical thing is missing: session variables. Yes, you can emulate those with ON COMMIT PRESERVE ROWS temporary tables, but that's a rather ugly hack. On the other hand, packages can easily be emulated by using Python. Having packages would make PLPg/SQL programming much prettier. It would be much prettier to group related routines into a package than to have them laying around without anything indicating that the routines are related. On the plus side, packages would make it much easier to migrate from Oracle to Postgres. And you do want that, don't you? I am very well aware of the animosity toward the Oracle community and the reasons for that animosity. Oracle wiped the floor with the predecessor of Postgres, the database called "Ingres". Sandra Kurtzig, the Ingres CEO at the the time, and Michael Stonebraker were both involved in very public spat with Larry Ellison. Stonebraker is still very much a factor in Postgres community and I doubt that his feelings toward the Larry and his company have got any warmer with years. Postgres was created with the idea of fighting the Dark Lord in the land of Belmont, CA. However, if it is the goal of the community to have users convert from Oracle to Postgres, than making that conversion easy would be conducive to that goal. A long time ago a former pastry baker wrote the following article: https://www.toolbox.com/tech/data-management/blogs/why-postgresql-doesnt-have-query-hints-020411/ Disclosure: I am the Oracle DBA that Josh Berkus is talking about in that article. Recent events with "advice" to a very well known Oracle personality testify to the fact that the emotions haven't changed much since that article was written. Of course, Oracle is still the most popular database in the world. -- Mladen Gogala Database Consultant Tel: (347) 321-1217 https://dbwhisperer.wordpress.com
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Peter Geoghegan
Date:
On Thu, Dec 16, 2021 at 6:39 PM Mladen Gogala <gogala.mladen@gmail.com> wrote: > I am very well aware of the animosity toward the Oracle community and > the reasons for that animosity. Oracle wiped the floor with the > predecessor of Postgres, the database called "Ingres". Sandra Kurtzig, > the Ingres CEO at the the time, and Michael Stonebraker were both > involved in very public spat with Larry Ellison. These events happened several years before I was born. > Stonebraker is still > very much a factor in Postgres community and I doubt that his feelings > toward the Larry and his company have got any warmer with years. No, he isn't. I think that Stonebraker appeared at a few events that were hosted by EDB, but to the best of my knowledge that's the extent of his involvement in the project after university POSTGRES. I've personally never met the man, or communicated with him online, and I've worked on Postgres more or less full time for a full decade now. As far as I'm aware he hasn't ever publicly posting to any of the mailing lists. -- Peter Geoghegan
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Pavel Stehule
Date:
Hi
pá 17. 12. 2021 v 3:39 odesílatel Mladen Gogala <gogala.mladen@gmail.com> napsal:
On 12/16/21 16:48, Pavel Stehule wrote:
> I don't think Postgres needs packages - this is a redundant concept in
> Postgres, when Postgres has schemas (different from Oracle's schemas)
> and extensions.
>
Yes, ORAFCE uses schemas as the package names. However, one very
practical thing is missing: session variables. Yes, you can emulate
those with ON COMMIT PRESERVE ROWS temporary tables, but that's a rather
ugly hack.
Regards
Pavel
On the other hand, packages can easily be emulated by using Python.
Having packages would make PLPg/SQL programming much prettier. It would
be much prettier to group related routines into a package than to have
them laying around without anything indicating that the routines are
related. On the plus side, packages would make it much easier to migrate
from Oracle to Postgres. And you do want that, don't you?
I am very well aware of the animosity toward the Oracle community and
the reasons for that animosity. Oracle wiped the floor with the
predecessor of Postgres, the database called "Ingres". Sandra Kurtzig,
the Ingres CEO at the the time, and Michael Stonebraker were both
involved in very public spat with Larry Ellison. Stonebraker is still
very much a factor in Postgres community and I doubt that his feelings
toward the Larry and his company have got any warmer with years.
Postgres was created with the idea of fighting the Dark Lord in the land
of Belmont, CA. However, if it is the goal of the community to have
users convert from Oracle to Postgres, than making that conversion easy
would be conducive to that goal. A long time ago a former pastry baker
wrote the following article:
https://www.toolbox.com/tech/data-management/blogs/why-postgresql-doesnt-have-query-hints-020411/
Disclosure: I am the Oracle DBA that Josh Berkus is talking about in
that article. Recent events with "advice" to a very well known Oracle
personality testify to the fact that the emotions haven't changed much
since that article was written. Of course, Oracle is still the most
popular database in the world.
--
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Laurenz Albe
Date:
On Thu, 2021-12-16 at 11:22 -0800, Bryn Llewellyn wrote: > The advantages are self-evident to these programmers I am not trying to belittle this, but when you are used to system A and start working with system B you always miss some features of A, until you get to know B better and figure out how to do things there. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Bryn Llewellyn
Date:
laurenz.albe@cybertec.at wrote:
bryn@yugabyte.com wrote:
The advantages are self-evident to these programmers…
I am not trying to belittle this, but when you are used to system A and start working with system B you always miss some features of A, until you get to know B better and figure out how to do things there.
Several people have responded with various points—some technical and some political. Thanks to all of you. I’m simply using the most recent turn in the thread as the hook for my present turn.
I made a terrible essay design choice with my “advantages of packages are self-evident”. I used this as a wrong-headed shortcut to save myself the effort of writing about modular software design principles—and to save you all the effort of (re)reading that stuff. So I’ll make the briefest attempt here.
«Modular design recommends exposing functionality through a purpose oriented interface and hiding all implementation details from the API’s user. A package achieves this with declarative syntax via the spec/body separation. The body encapsulates as many (top-level) subprograms as you want. Each of these is visible to all of its peers. But none is visible outside of the package unless the spec declares that it should be. This is a simple opt-in scheme.Inner subprograms are a very nice bonus because they allow locally useful helpers to be implemented right where they’re needed in one subprogram so that they’re invisible to all other subprograms in the body. The scope rules here a simply an extension of what PG already supports with block statements inside block statements.There’s also the business of globals, at any level, and package-level globals that bring package state. Some programmers have religious objections here. But the value of constants declared in the package spec seems to be seen by all who program using packages as only useful.»
I firmly believe that the intrinsic value of all of this has nothing to do with Oracle Database, with migrating from it to PG, or with Ada. It’s just that Oracle’s PL/SQL has a working implementation. And many people find it easier to think when they can experiment with something concrete rather than trying to hold, and run, a pretty big abstract model entirely in their head.
Anyway… enough of this. I fear that even what I said above will annoy some folks on this list.
It seems to me that there’s sufficiently vigorous opposition to anything like packages for PL/pgSQL that such an enhancement will never happen. So I must just accept this and (as Laurenz recommends) learn the best design patterns for singleton PL/pgSQL functions and procedures in PG as it presently is.
When I first started to use PG, I read “43.13. Porting from Oracle PL/SQL” (www.postgresql.org/docs/current/plpgsql-porting.html). These are the relevant bullets: «Instead of packages, use schemas to organize your functions into groups.» and «Since there are no packages, there are no package-level variables either. This is somewhat annoying. You can keep per-session state in temporary tables instead.»
This says nothing about how to model the spec/body distinction. I experimented with various schemes. For example, one schema for the exposed API and another one for the to-be-hidden implementation. This depends on a careful, practice-based, use of the privileges scheme and implies using “security definer” units. But you can’t do transaction control in such a unit—and this brings its own problems. I tried to model package state using temporary tables but I hit what seemed to be a bootstrap conundrum. How, in the absence of a trigger that fires when a session starts, can I make sure that the table that I want is in place?
Can anybody please recommend a whitepaper, or similar, that explains the recommended practice in this space?
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Peter Geoghegan
Date:
On Fri, Dec 17, 2021 at 11:43 AM Bryn Llewellyn <bryn@yugabyte.com> wrote: > Modular design recommends exposing functionality through a purpose oriented interface and hiding all implementation detailsfrom the API’s user. A package achieves this with declarative syntax via the spec/body separation. The body encapsulatesas many (top-level) subprograms as you want. Each of these is visible to all of its peers. But none is visibleoutside of the package unless the spec declares that it should be. This is a simple opt-in scheme. I still don't get it. It sounds like you're mostly talking about encapsulation, or Information hiding, for stored procedures. I can certainly see how plpgsql doesn't do those things very well, but it still seems like there might be a lot of nuance that isn't getting across. The list of specific features that seem to be missing are not unreasonable, individually, and yet it feels like I cannot see some bigger picture that's apparent to you. Maybe you should explain your position by way of a motivating example, involving a real world use case. Something that makes the issues concrete. Are these items compelling because of how they allow an organization to deploy a program in a production environment, complete with version control? Does it have something to do with decoupling the mutable business data stored in tables from the programs contained/run in the same database? -- Peter Geoghegan
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Bryn Llewellyn
Date:
pg@bowt.ie wrote:bryn@yugabyte.com wrote:
Modular design recommends exposing functionality through a purpose oriented interface and hiding all implementation details from the API’s user. A package achieves this with declarative syntax via the spec/body separation. The body encapsulates as many (top-level) subprograms as you want. Each of these is visible to all of its peers. But none is visible outside of the package unless the spec declares that it should be. This is a simple opt-in scheme.
I still don’t get it. It sounds like you’re mostly talking about encapsulation, or Information hiding, for stored procedures. I can certainly see how plpgsql doesn’t do those things very well, but it still seems like there might be a lot of nuance that isn’t getting across. The list of specific features that seem to be missing are not unreasonable, individually, and yet it feels like I cannot see some bigger picture that's apparent to you.
Maybe you should explain your position by way of a motivating example, involving a real world use case. Something that makes the issues concrete. Are these items compelling because of how they allow an organization to deploy a program in a production environment, complete with version control? Does it have something to do with decoupling the mutable business data stored in tables from the programs contained/run in the same database?
I can certainly make up an example. I’ll do this over the weekend. However, I fear that it will be time wasted because at least some of the addressees here who’ve expressed strong opposition to the notion of PL/pgSQL packages must understand very well what they’re objecting to. For example, pavel.stehule@gmail.com with his “schema variables, LET command” work.
Anyway… I’ll give it my best shot. I’ll try to address your specific questions in my follow-up reply. Hang on for a couple of days, please.
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Bryn Llewellyn
Date:
bryn@yugabyte.com wrote:
pg@bowt.ie wrote:bryn@yugabyte.com wrote:
Modular design recommends exposing functionality through a purpose oriented interface and hiding all implementation details from the API’s user. A package achieves this with declarative syntax via the spec/body separation. The body encapsulates as many (top-level) subprograms as you want. Each of these is visible to all of its peers. But none is visible outside of the package unless the spec declares that it should be. This is a simple opt-in scheme.
I still don’t get it. It sounds like you’re mostly talking about encapsulation, or Information hiding, for stored procedures. I can certainly see how plpgsql doesn’t do those things very well, but it still seems like there might be a lot of nuance that isn’t getting across. The list of specific features that seem to be missing are not unreasonable, individually, and yet it feels like I cannot see some bigger picture that's apparent to you.
Maybe you should explain your position by way of a motivating example, involving a real world use case. Something that makes the issues concrete. Are these items compelling because of how they allow an organization to deploy a program in a production environment, complete with version control? Does it have something to do with decoupling the mutable business data stored in tables from the programs contained/run in the same database?I can certainly make up an example. I’ll do this over the weekend. However, I fear that it will be time wasted because at least some of the addressees here who’ve expressed strong opposition to the notion of PL/pgSQL packages must understand very well what they’re objecting to. For example, pavel.stehule@gmail.com with his “schema variables, LET command” work.Anyway… I’ll give it my best shot. I’ll try to address your specific questions in my follow-up reply. Hang on for a couple of days, please.
I made a start on this. But I want to think carefully about the example use case(s). So I won’t promise a delivery date. Like I said, I don’t expect to change anybody’s mind. But I do hope that I might get some useful suggestions on how, using PG Version 14, I can best meet the requirements that I’ll aim to explain.
I’m still hoping that I might get some pointers to whitepapers or blog posts that expand on those bullets that I quoted from the PG doc: «Instead of packages, use schemas to organize your functions into groups.» and «Since there are no packages, there are no package-level variables either. This is somewhat annoying. You can keep per-session state in temporary tables instead.»
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Pavel Stehule
Date:
Hi
I’m still hoping that I might get some pointers to whitepapers or blog posts that expand on those bullets that I quoted from the PG doc: «Instead of packages, use schemas to organize your functions into groups.» and «Since there are no packages, there are no package-level variables either. This is somewhat annoying. You can keep per-session state in temporary tables instead.»
I fixed a patch https://commitfest.postgresql.org/36/1608/ so you can check it.
Using temporary tables instead of session variables is not too practical. There are more alternative ways - a) one extension, b) using global variables from Perl, c) using global configuration variables.
The @c is most common today
Regards
Pavel
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Bryn Llewellyn
Date:
pavel.stehule@gmail.com wrote:bryn@yugabyte.com wrote:
Yes, I did read your “schema variables” post on your site “Pavel Stehule’s blog — Some notes about PostgreSQL”. It gives me a very good idea about what you have in mind.
But as I’ve come to understand the term “Functional Spec”, this denotes a formal deliverable that a product development (sub)group owns and maintains collectively. Especially, it has a status which, eventually, is set to “Approved”. And apart from minor bug fixes, no code makes its way into a released version of the (sub)product in question unless the group has approved its Functional Spec.
Is this model not used for the PostgreSQL system?
Development model in Postgres is not too formal, the specification can be changed until the last moment, the main word has commetter with an agreement of the author of patch and all other people. The development of Postgres is much more agile than waterfall.bryn continued:
Your blogpost and all the comments are dated Feb 2018—so almost four years ago now. What needs to happen for your project to be exposed in a PG Release? And which release might this be?
Technically, this patch is not too complex, but a) it was harder to find cleaner with good performance implementation in architecture (the current architecture knows queries and utility commands, and LET is something between). b) it was harder to find an agreement about specification because global temporal objects like global temporary tables or session variables are not in Postgres today (there is not long experience with this feature). Unfortunately 1. the related part of standard SQL/PSM is not widely accepted and this part of the standard is almost dead , 2. the introduced modules are +/- Postgres's schema, so it is a really redundant concept, 3. SET command (in SQL/PSM (ANSI/SQL) is used for different purpose in Postgres, and I had to use keyword LET (there is not possibility to use keyword SET without compatibility break), c) there was long discussion if variables should be transactional or non transactional (I strongly support not transactional by default - like any other databases does, and transactional behavior will be optional in next step).
I invite any help with code and documentation review (and support in discussion) - I am not native speaker, and my English is very poor. I hope there is a change to commit this patch in Postgres 15. But it depends on commiter's capacite - and there are a lot of patches in the queue.
You can watch the progress of this work on pgsql-hackers mailing list
older https://www.postgresql.org/message-id/flat/CAFj8pRDY+m9OOxfO10R7J0PAkCCauM-TweaTrdsrsLGMb1VbEQ@mail.gmail.com
current https://www.postgresql.org/message-id/flat/CAFj8pRD053CY_N4=6SvPe7ke6xPbh=K50LUAOwjC3jm8Me9Obg@mail.gmail.com
bryn continued: Back to the substance of your proposal,
1. I see that a schema variable will be another kind of object in the class that has tables, views, sequences, indexes, UDTs, procedures, functions, and so on. So it’s persisted within a database; it’s owned by a user; and it’s localized in a schema. Does it have its own namespace too? For example, can a table x, a function x(), and a schema variable x all live in the same schema s in some database. And can I use the normal qualified name syntax to access a schema variable?
variables have their own namespace, because they have their own catalog table. It is designed like any catalog object - so you can use (or you don't need) to use a qualified identifier. But there is not any schema scope in Postgres now. Anything depends on SEARCH_PATH setting. So variables can be in the same schema with tables and functions (the access rules are the same).bryn continued: 2. It seems, then, that all the familiar notions, and especially the ability for a non-owner to access it apply. I see that the SELECT privilege governs who can read the value of a schema variable. But there seems to be no mechanism that governs who can change the value of a schema variable. It looks like only the owner can change the value—using the new LET SQL statement. Does this mean both that a top-level call from a client session that’s authorized as the owner can change it and a “security definer” subprogram with the same owner can change it—but that there is no explicit (say, WRITE) privilege for this. Is this what you intend? If so, why not have a WRITE privilege?
In the last patch I renamed schema variables to session variables (on the community request). You can grant READ or WRITE rights to other users by command GRANT, or you can write a security owner function, if you need more precious work with sensitive data.bryn continued: 4. You said “security definer function”. Is this what you mean? Or do you mean ANY function or procedure as long as the current user (i.e. the owner of the most tightly enclosing security definer unit) is the schema variable’s owner?
inside the security owner function you are running under functions's owner identity. It can have the same identity as the variable's owner. It is common pattern in Postgres (not just in Postgres)bryn continued: 5. Could you please consider allowing a CONSTANT schema variable (with the usual syntax and requirement for an initialization expression)? One very popular use of a spec-level package variable is for a universal constant that has to be accessed in several places—like, for example, the conversion factor between some metric unit and some imperial unit.
There is it - it is not named CONSTANT, but IMMUTABLE (because CONSTANT needs to introduction of new SQL keyword, and IMMUTABLE is already used keyword (new keywords can introduce some compatibility issues))
CREATE IMMUTABLE VARIABLE iv AS int DEFAULT 100;
-- should to fail
LET iv = 10000;
ERROR: session variable "iv" is declared IMMUTABLEbryn continued: 3. What is the precedence scheme? For example, if a SQL statement in a PL/pgSQL unit has a restriction like this:
…where col = x…
and x is both the name of an in-scope variable (or formal parameter) in the unit and the name of a schema variable? When the table has a column called x, then there’s (at least sometimes) no way round a run-time collision error except to rename one of the X’s. (Qualifying the names doesn’t fix it.) Will it be the same with schema variables? The question extends to ordinary assignment statements that become SQL statements under the covers:
v := a + b;
where b happens to be both an in-scope variable and a schema variable.
You can handle collisions by using qualified identifiers. For PL/pgSQL you can use block labels, for other objects schema.
From PostgreSQL’s SQL perspective the session variables are common database objects (contra PL/SQL where package variables are PL/SQL language objects), and SQL disallows ambiguity. This is a little bit more complex problem, because session variables can be used everywhere in Postgres (not just in PL/pgSQL).
PL/pgSQL doesn’t see session variables like something special - PL/pgSQL runtime doesn’t see session variables ever (the work with session variables are done one level deeper) (the usage of session variables are fully transparent for this environment) so there are the precedence rules without change, but you can use with session variables inside PL/pgSQL (but from implementation perspective it is just any other SQL object).
https://www.postgresql.org/docs/9.6/plpgsql-implementation.html#PLPGSQL-VAR-SUBST
and you can set different precedence#variable_conflict error#variable_conflict use_variable#variable_conflict use_column
but I strongly don't advise it. From my perspective, allowing collisions was a significant Oracle's PL/SQL design error.
Thank you very much indeed for this careful reply, Pavel—and for the links to the threads on the Hackers list. A great deal is now clarified for me. You said “I am not native speaker, and my English is very poor”. You’re far, far, too modest. I am a native English speaker. And I often see writing from other native English speakers that’s pretty awful. Your writing is a lot better than theirs. Of course, I don’t know a single word of Czech.
I’m glad to see that the non-transactional behavior is still part of the plan—even if, later, a transactional variant is added. I’m glad to see, too, that READ/WRITE privileges and IMMUTABLE are now part of the plan.
Just a detail, now. You sad this:
variables have their own namespace, because they have their own catalog table. It is designed like any catalog object - so you can use (or you don't need) to use a qualified identifier. But there is not any schema scope
I probably misused the phrase “schema scope”. I meant only to say that, if two schemas, s1 and s2, exist in the same database, then it’s OK to have two session variables called s1.x and s2.x. And you confirmed that this is the case. It seems that you can also have a session variable called s1.x and a table called s1.x. In a separate reply to my email, david.g.johnston@gmail.com thought that this would be disallowed because session variables would share the same relations namespace that tables and views already do. (I have no opinion here.) Thanks for the clarification.
All is clear about collisions and name qualification now. And, yes: I agree with your advice to leave those settings that you mentioned at their shipped defaults and to go with the native behavior (collisions simply cause a run-time error). Then to fix such errors by ordinary spelling changes (esp. schema qualification) in one’s code.
You said “I invite any help with code and documentation review”. I’m not a C coder. But I’d be happy to help with documentation review when the time comes.
Finally, I’m well advanced with my promised mini-project to prepare and explain a realistic use-case to show the benefits of a package construct in PL/pgSQL. I’ll wait until after Xmas before I share it with the General list. Having said this, I already appreciate that the chances of bringing packages to some future PG release are vanishingly small.
Re: Packages, inner subprograms, and parameterizable anonymous blocks for PL/pgSQL
From
Pavel Stehule
Date:
Thank you very much indeed for this careful reply, Pavel—and for the links to the threads on the Hackers list. A great deal is now clarified for me. You said “I am not native speaker, and my English is very poor”. You’re far, far, too modest. I am a native English speaker. And I often see writing from other native English speakers that’s pretty awful. Your writing is a lot better than theirs. Of course, I don’t know a single word of Czech.
Thank you. Gmail auto correction (ai) does a lot of good work :-)
I’m glad to see that the non-transactional behavior is still part of the plan—even if, later, a transactional variant is added. I’m glad to see, too, that READ/WRITE privileges and IMMUTABLE are now part of the plan.Just a detail, now. You sad this:variables have their own namespace, because they have their own catalog table. It is designed like any catalog object - so you can use (or you don't need) to use a qualified identifier. But there is not any schema scopeI probably misused the phrase “schema scope”. I meant only to say that, if two schemas, s1 and s2, exist in the same database, then it’s OK to have two session variables called s1.x and s2.x. And you confirmed that this is the case. It seems that you can also have a session variable called s1.x and a table called s1.x. In a separate reply to my email, david.g.johnston@gmail.com thought that this would be disallowed because session variables would share the same relations namespace that tables and views already do. (I have no opinion here.) Thanks for the clarification.
There are more possible collisions than like the mentioned. You can have composite variable x with field y, and you can have schema x with table y - and then what is x.y in a query when you don't know context? Good news is fact, so we know context - and we know so FROM clause can contain just table identifiers (so some collisions can be solved automatically with context knowledge) . Fortunately, implicit FROM clause functionality was removed a few years ago.
Inside the patch is a routine that calculates possible applications of identifiers or quoted identifiers. When there is more than one valid interpretation, the exception about ambiguous identifiers is raised.
Unfortunately a new badly named session variable can break working queries, but this is not a new situation in SQL. New badly named column of table can do this break too.
postgres=# create variable x as int default 10;
CREATE VARIABLE
postgres=# create table x (a int);
CREATE TABLE
postgres=# select * from x;
┌───┐
│ a │
╞═══╡
└───┘
(0 rows)
CREATE VARIABLE
postgres=# create table x (a int);
CREATE TABLE
postgres=# select * from x;
┌───┐
│ a │
╞═══╡
└───┘
(0 rows)
postgres=# select x.a from x;
┌───┐
│ a │
╞═══╡
└───┘
(0 rows)
┌───┐
│ a │
╞═══╡
└───┘
(0 rows)
postgres=# select x, x.a from x;
ERROR: column reference "x" is ambiguous (note - there is collision with hidden column x of table x)
LINE 1: select x, x.a from x;
^
DETAIL: The qualified identifier can be column reference or session variable reference
postgres=# select public.x, x.a from x;
┌───┬───┐
│ x │ a │
╞═══╪═══╡
└───┴───┘
(0 rows)
postgres=# insert into x values(1000);
INSERT 0 1
postgres=# select public.x, x.a from x;
┌────┬──────┐
│ x │ a │
╞════╪══════╡
│ 10 │ 1000 │
└────┴──────┘
(1 row)
ERROR: column reference "x" is ambiguous (note - there is collision with hidden column x of table x)
LINE 1: select x, x.a from x;
^
DETAIL: The qualified identifier can be column reference or session variable reference
postgres=# select public.x, x.a from x;
┌───┬───┐
│ x │ a │
╞═══╪═══╡
└───┴───┘
(0 rows)
postgres=# insert into x values(1000);
INSERT 0 1
postgres=# select public.x, x.a from x;
┌────┬──────┐
│ x │ a │
╞════╪══════╡
│ 10 │ 1000 │
└────┴──────┘
(1 row)
All is clear about collisions and name qualification now. And, yes: I agree with your advice to leave those settings that you mentioned at their shipped defaults and to go with the native behavior (collisions simply cause a run-time error). Then to fix such errors by ordinary spelling changes (esp. schema qualification) in one’s code.You said “I invite any help with code and documentation review”. I’m not a C coder. But I’d be happy to help with documentation review when the time comes.
documentation check or just check in code comments can be more than good enough (or testing), or just voice in discussion in the mailing list.
Finally, I’m well advanced with my promised mini-project to prepare and explain a realistic use-case to show the benefits of a package construct in PL/pgSQL. I’ll wait until after Xmas before I share it with the General list. Having said this, I already appreciate that the chances of bringing packages to some future PG release are vanishingly small.
I am sure packages have some advantages - this is an important feature of ADA language. The possibility of private objects is important and interesting. Possibility to sharing code is interesting too.
But Postgres already has schemas (a little bit different from Oracle) and extensions. And internal implementation of PL/pgSQL disallow any sharing across databases. So introduction of packages to Postgres is introducing some not trivial and partially redundant concept. Currently, Postgres is relatively small and very very consistent software - and I believe so is one of the reasons why Postgres is popular. It is easy to learn, easy to use. The internal complexity is well solved and hidden. This is a long goal for community Postgres. The compatibility with Oracle should not be important after 20 years (although it is very important for a lot of current users and for users who can leave Oracle). If we miss some feature in Postgres, we should to implement it, but with respect to current features.
Regards
Pavel
Thread start URL: https://www.postgresql.org/message-id/48A5C07D-6A66-4B96-A48C-75C530CB9DCF@yugabyte.comOn 16-Dec-2021, bryn@yugabyte.com wrote:
Folks who develop applications for Oracle Database have had the features that the subject line of this email lists since the arrival of PL/SQL in the early nineties. The advantages are self-evident to these programmers; and their lack comes as a shocking disappointment when they start to write application code for PostgreSQL*. The absence of packages and inner subprograms is huge. The absence of parameterizable anonymous blocks is a smaller limitation.
Notice that this point is entirely separable from the endeavor of migrating an extant application. It has first and foremost to do with how you think of designing code.
I’ve heard rumors that some contributors to the PostgreSQL implementation are interested in bringing the PL/pgSQL features that I mentioned. If there is any such thinking, please let me know. I’m not a C coder but I’d be very interested in reading any ordinary prose that describes how these features might be exposed to the PostgreSQL application developer.
________________________________________________________________________________
* Full disclosure: I was the product manager for PL/SQL, working at Oracle HQ, from about 2000 through 2019 when I started with Yugabyte Inc. At least some people on this list have heard of YugabyteDB and know that it uses Postgres’s SQL processing code “as is” (currently Version 11.2, but presently Version 13) on top of its own implementation of a distributed storage layer (inspired by Google Spanner).
…On 23-Dec-2021, pavel.stehule@gmail.com wrote:I am sure packages have some advantages — this is an important feature of ADA language. The possibility of private objects is important and interesting. Possibility to sharing code is interesting too.But Postgres already has schemas (a little bit different from Oracle) and extensions. And internal implementation of PL/pgSQL disallow any sharing across databases. So introduction of packages to Postgres is introducing some not trivial and partially redundant concept. Currently, Postgres is relatively small and very very consistent software — and I believe so is one of the reasons why Postgres is popular. It is easy to learn, easy to use. The internal complexity is well solved and hidden. This is a long goal for community Postgres. The compatibility with Oracle should not be important after 20 years (although it is very important for a lot of current users and for users who can leave Oracle). If we miss some feature in Postgres, we should to implement it, but with respect to current features.
My message, sent six weeks ago, started a thread. I believe that Pavel’s message, sent about a week later, is the most recent. In this turn, I’ve tried to copy everybody who contributed to the thread.
In one of the turns, I promised a proper write-up of my case for PL/pgSQL packages. It took me some time because of the usual reasons (the Holiday period and ordinary work). It’s done now—attached as case-for-plpgsql-packages.zip. I decided to exclude the “parameterizable anonymous blocks” part of what I wrote to start this thread from my write-up. It’s an entirely separable notion. I allowed myself to change the “Subject” of this reply to reflect this..
The .zip contains these files:
case-for-plpgsql-packages.pdf — the prose write-up. It describes a use-case that is nicely implemented using a package (and package state). It presents the elided code of a working PL/SQL implementation. It transliterates this code naïvely into a strawman PL/pgSQL package. And it presents runnable the code that emulates the package like the PG doc section “ Porting from Oracle PL/SQL” recommends (except that I use a user-defined run-time parameter rather than a temporary table for the state.) The complete, non-ellided code is provided in the following .sql files.orcl----run-me-in-Oracle-Database.sqlstrawman-pg----pseudocode-cannot-be-run.sqlrunnable-pg----run-me-in-PostgreSQL.sql
I’d be delighted to hear suggestions for a better runnable PL/pgSQL implementation—and happy to revise my code and write-up to use this. I’d also welcome general feedback (ordinarily in email, of course) and I’d be happy to revise my work and make a new .zip.
Finally, here are snippets from some of the responses. I hope that my essay, taken in its entirety, addresses all of these questions.
pg@bowt.ie wrote: Why are those things huge? It’s not self-evident to me. I can only speak for myself, but throwing around terms like “shocking disappointment” is never going to convince me of anything. You can make similar statements about many other things.
pavel.stehule@gmail.com wrote: There are a lot of successful migrations from Oracle to Postgres that shows so that the absence of mentioned features isn’t too huge. Postgres is just not compatible with Oracle. The compatibility with Oracle is not possible without monstrous increasing size and complexity, and this is a benefit just for a small part of users. A lot of packages and concepts in Oracle are obsolete, or maybe not too well designed (from today’s perspective). After my experience I think there are a lot of things that are possible in stored procedures, but I am sure it is not good to do it, and I don’t think we need to promote these patterns in Postgres.
gogala.mladen@gmail.com wrote: ORAFCE uses schemas as the package names. However, one very practical thing is missing: session variables. Yes, you can emulate those with ON COMMIT PRESERVE ROWS temporary tables, but that’s a rather ugly hack. On the other hand, packages can easily be emulated by using Python. Having packages would make PLPg/SQL programming much prettier. It would be much prettier to group related routines into a package than to have them laying around without anything indicating that the routines are related. On the plus side, packages would make it much easier to migrate from Oracle to Postgres.
laurenz.albe@cybertec.at wote: I am not trying to belittle this, but when you are used to system A and start working with system B you always miss some features of A, until you get to know B better and figure out how to do things there.
bryn@yugabyte.com wrote: I firmly believe that the intrinsic value of all of this has nothing to do with Oracle Database, with migrating from it to PG, or with Ada. It’s just that Oracle’s PL/SQL has a working implementation. And many people find it easier to think when they can experiment with something concrete rather than trying to hold, and run, a pretty big abstract model entirely in their head.
pg@bowt.ie wrote: Maybe you should explain your position by way of a motivating example, involving a real world use case. Something that makes the issues concrete. Are these items compelling because of how they allow an organization to deploy a program in a production environment, complete with version control? Does it have something to do with decoupling the mutable business data stored in tables from the programs contained/run in the same database?