Re: support for CREATE MODULE - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: support for CREATE MODULE
Date
Msg-id CAFj8pRArQiR2GuThr5b7F-D2tEiOWAa0S8WB2QsG1vO=AAu9kw@mail.gmail.com
Whole thread Raw
In response to Re: support for CREATE MODULE  (Swaha Miller <swaha.miller@gmail.com>)
List pgsql-hackers


čt 3. 2. 2022 v 20:21 odesílatel Swaha Miller <swaha.miller@gmail.com> napsal:
Thank you for the feedback Pavel and Julien. I'll try to explain some of the issues and points you raise to the best of my understanding.

The reason for modules is that it would serve as an organizational unit that can allow setting permissions on those units. So, for example, all functions in a module can be subject to setting access permissions on for some user(s) or group(s). I didn't explain it well in the sgml docs, but along with module syntax, I'm proposing introducing privileges to grant/revoke on modules and routines in modules. And why modules for this purpose? Because its in the SQL spec so seems like a way to do it.

This part of the standard is dead - there is no strong reason to implement it.
 

I'm adding comments inline for the list of functionality you mentioned. I look forward to discussing this more and figuring out how to make a useful contribution to the community.

On Wed, Feb 2, 2022 at 11:22 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:


čt 3. 2. 2022 v 5:46 odesílatel Julien Rouhaud <rjuju123@gmail.com> napsal:
Hi,

On Thu, Feb 03, 2022 at 05:25:27AM +0100, Pavel Stehule wrote:
>
> čt 3. 2. 2022 v 3:28 odesílatel Swaha Miller <swaha.miller@gmail.com>
> napsal:
>
> > Hi,
> >
> > I'm following up from Jim's POC for adding MODULE to PostgreSQL. [1]
> >
> > My proposal implements modules as schema objects to be stored in a new
> > system catalog pg_module with new syntax for CREATE [OR REPLACE] MODULE,
> > ALTER MODULE, DROP MODULE and for GRANT and REVOKE for privileges on
> > modules and module routines. I am attempting to follow the SQL spec.
> > However, for right now, I'm proposing to support only routines as module
> > contents, with local temporary tables and path specifications as defined
> > in the SQL spec, to be supported in a future submission. We could also
> > include support for variables depending on its status. [2]
>
> I dislike this feature. The modules are partially redundant to schemas and
> to extensions in Postgres, and I am sure, so there is no reason to
> introduce this.
>
> What is the benefit against schemas and extensions?

I agree with Pavel.  It seems that it's mainly adding another namespacing layer
between schemas and objects, and it's likely going to create a mess.
That's also going to be problematic if you want to add support for module
variables, as you won't be able to use e.g.
dbname.schemaname.modulename.variablename.fieldname.


I haven't yet added support for variables so will need to look into the problems with this if we're going to do that.
 
Also, my understanding was that the major interest of modules (at least for the
routines part) was the ability to make some of them private to the module, but
it doesn't look like it's doing that, so I also don't see any real benefit
compared to schemas and extensions.


Yes, that is indeed the goal/use-case with setting permissions with grant and revoke. Right now, I have proposed create and reference as the kinds of access that can be controlled on modules, and reference as the kind of access that can be controlled on routines inside modules.
 
The biggest problem is coexistence of Postgres's SEARCH_PATH  object identification, and local and public scopes used in MODULEs or in Oracle's packages.


I am not extremely familiar with Oracle's packages, but do know of them. I'm wondering if local and public scopes for MODULE is in the SQL spec? (I will check for that...) My thinking was to implement functionality that conforms to the SQL spec, not try to match Oracle's package which differs from the spec in some ways.
 
I can imagine MODULES as third level of database unit object grouping with following functionality

1. It should support all database objects like schemas

Do you mean that schemas should be groupable under modules? My thinking was to follow what the SQL spec says about what objects should be in modules, and I started with routines as one of the objects that there are use cases for. Such a controlling access permissions on routines at some granularity that is not an entire schema and not individual functions/procedures.

SQLspec says so there can be just temporary tables and routines. It is useless. Unfortunately SQL/PSM came too late and there is no progress in the last 20 years.  It is a dead horse.

 
2. all public objects should be accessed directly when outer schema is in SEARCH_PATH

Yes, an object inside a module is in a schema and can be accessed with schemaname.func() as well as modulename.func() as well as schemaname.modulename.func(). I think you are saying it should be accessible with func() without a qualifying schemaname or modulename if the schemaname is in the search path, and that sounds reasonable too. Unless, of course, func() was created in a module, in which case access permissions for the module and module contents will determine whether func() should be directly accessible. In my current proposal, a previously created func() can't be added to a module created later. The purpose of creating routines inside a module (either when the module is created or after the module is created) would be with the intent of setting access permissions on those routines differently than for the outer schema.
 
3. private objects cannot be accessed from other modules

Yes, I hope that is going to be the case with setting permissions with grant and revoke. Right now, I have proposed create and reference as the kinds of access that can be controlled on modules, and reference as the kind of access that can be controlled on routines inside modules.

The permission is not enough strategy -  if I implement some private objects in the module, and I push this module to the schema on the search path, the private objects need to be invisible. I don't want to allow shadowing of public objects by private objects.
 
 
4. modules should be movable between schemas, databases without a loss of functionality

pg_dump will dump modules so that can provide ways of moving them between databases. I hadn't envisioned moving modules between schemas, but can think of ways that can be supported. Would the objects within the modules also move implicitly to the new schema?

I thought more about extending the CREATE EXTENSION command to support modules.
 
 
5. modules should to support renaming without loss of functionality

yes renaming of modules is supported in my proposal

But if I call a module function from the same module, this should work after renaming.  That's mean so there should be some mechanism how to implement routine call without necessity to use absolute path

 
6. there should be redefined some rules of visibility, because there can be new identifier's collisions and ambiguities

I'm not sure I understand this point. Can you please explain more?

I can have function fx in schema s, and then I can have module s in public schema with function fx. What will be called when I write s.fx() ?
 
 

7. there should be defined relation of modules's objects and schema's objects. Maybe an introduction of the default module can be a good idea.

I was thinking of module as a unit of organization (with the goal of controlling access to it) of objects that are still in some schema, and the module itself as an object that is also in a schema.

I understand, but just this is not enough benefit for implementation, when Postgres supports schemas and extensions already. The benefit can be better encapsulation or better isolation than we have with schemas.
 
 

I had the opportunity to see a man who modified routines in pgAdmin. It can be hell, but if we introduce a new concept (and it is an important concept), then there should be strong benefits - for example - possibility of strong encapsulation of code inside modules (or some units - the name is not important).

The problem with pgAdmin maybe can be solved better by adding some annotations to database objects that allows more user friendly organization in the object tree in pgAdmin (and similar tools). Maybe it can be useful to have more tries (defined by locality, semantic, quality, ...).

Regards

Pavel

Best regards,
Swaha 

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Adding CI to our tree
Next
From: Robert Haas
Date:
Subject: Re: archive modules