Thread: Documentation about PL transforms
Hi, It looks to me as if the transforms feature for PLs was added in cac7658, and support was added to plperl and plpython at that time, with documentation added for CREATE FUNCTION, CREATE/DROP TRANSFORM, and the catalogs, but nothing was added at that time to plhandler.sgml to clarify that a PL handler itself must add code to look up and apply such transforms, or nothing happens. Without that information, a reader not in the know (it happened to me) could get the idea from the CREATE TRANSFORM docs that the support is more general than it is, and an incoming argument could already contain the 'internal' something-specific-to-the-language-implementation returned by a specified transform. (Which does lead to follow-on questions like "who says an arbitrary language implementation's transform result could be safely represented in an mmgr-managed argument list?"—it really does make better sense upon learning the PL handler has to do the deed. But I haven't spotted any place where the docs *say* that.) I'm thinking plhandler.sgml is the only place that really needs to be said; readers looking up CREATE TRANSFORM and using an existing PL that supports it don't need to know how the sausage is made. (Maybe it is worth mentioning there, though, that it isn't possible to develop transforms for an arbitrary PL unless that PL applies transforms.) I noticed the CREATE TRANSFORM docs show the argument list as (argument_type [, ...]) even though check_transform_function will reject any argument list of length other than 1 or with type other than internal. Is that an overly-generic way to show the syntax, or is that a style with precedent elsewhere in the docs? (I checked a couple obvious places like CREATE OPERATOR, CREATE FUNCTION, CREATE TYPE but did not see similar examples). As long as a PL handler has the sole responsibility for invoking its transforms, I wonder if it would make sense to allow a PL to define what its transforms should look like, maybe replacing check_transform_function with a transform_validator for the PL. I'm not proposing such a patch here, but I am willing to prepare one for plhandler.sgml and maybe pltemplate.c documenting the current situation, if nobody tells me I'm wrong about something here. Regards, -Chap
On 05.02.22 00:55, Chapman Flack wrote: > I'm thinking plhandler.sgml is the only place that really needs to be > said; readers looking up CREATE TRANSFORM and using an existing PL that > supports it don't need to know how the sausage is made. (Maybe it is > worth mentioning there, though, that it isn't possible to develop > transforms for an arbitrary PL unless that PL applies transforms.) makes sense > I noticed the CREATE TRANSFORM docs show the argument list as > (argument_type [, ...]) even though check_transform_function will reject > any argument list of length other than 1 or with type other than internal. > Is that an overly-generic way to show the syntax, or is that a style > with precedent elsewhere in the docs? That could be corrected. > As long as a PL handler has the sole responsibility for invoking > its transforms, I wonder if it would make sense to allow a PL to > define what its transforms should look like, maybe replacing > check_transform_function with a transform_validator for the PL. > I'm not proposing such a patch here, but I am willing to prepare > one for plhandler.sgml and maybe pltemplate.c documenting the current > situation, if nobody tells me I'm wrong about something here. Maybe. This kind of thing would mostly be driven what a PL wants in actual practice, and then how that could be generalized to many/all PLs.
On 02/07/22 10:59, Peter Eisentraut wrote: > On 05.02.22 00:55, Chapman Flack wrote: >> I'm thinking plhandler.sgml is the only place that really needs to be >> ... >> worth mentioning there, though, that it isn't possible to develop >> transforms for an arbitrary PL unless that PL applies transforms.) > > makes sense > >> (argument_type [, ...]) even though check_transform_function will reject >> any argument list of length other than 1 or with type other than internal. > > That could be corrected. I'll work on some doc patches. >> As long as a PL handler has the sole responsibility for invoking >> its transforms, I wonder if it would make sense to allow a PL to >> define what its transforms should look like, maybe replacing >> check_transform_function with a transform_validator for the PL. > > Maybe. This kind of thing would mostly be driven what a PL wants in actual > practice, and then how that could be generalized to many/all PLs. It has since occurred to me that another benefit of having a transform_validator per PL would be immediate error reporting if someone, for whatever reason, tries out CREATE TRANSFORM for a PL that doesn't grok transforms. Regards, -Chap
On 02/07/22 15:14, Chapman Flack wrote: > It has since occurred to me that another benefit of having a > transform_validator per PL would be immediate error reporting > if someone, for whatever reason, tries out CREATE TRANSFORM > for a PL that doesn't grok transforms. The same could be achieved, I guess, by an event trigger, though that would take positive effort to set up, where the benefit of a per-PL transform_validator would be that if a given PL does not bother to provide one, CREATE TRANSFORM for it would automatically fail. (And such a validator would not have to spend most of its life ignoring other DDL.) That does reveal another documentation gap: table 40.1 does not show CREATE or DROP TRANSFORM being supported for event triggers. I've confirmed they work, though. I'll tack that onto the doc patch. I notice our transforms lack the named groups of 9075-2. With those (plus our LANGUAGE clause), I could write, for a made-up example: CREATE TRANSFORM FOR hstore LANGUAGE plpython3u asdict ( FROM SQL WITH FUNCTION hstore_to_plpython3dict, TO SQL WITH FUNCTION plpython3dict_to_hstore) asnamedtuple ( FROM SQL WITH FUNCTION hstore_to_plpython3namedtup, TO SQL WITH FUNCTION plpython3namedtup_to_hstore); CREATE FUNCTION f1(val hstore) RETURNS int LANGUAGE plpython3u TRANSFORM GROUP asdict FOR TYPE hstore ... CREATE FUNCTION f2(val hstore) RETURNS int LANGUAGE plpython3u TRANSFORM GROUP asnamedtuple FOR TYPE hstore ... It seems to me that could be useful, in cases where a PL offers more than one good choice for representing a PostgreSQL type and the preferred one could depend on the function. Was that considered and rejected for our transforms, or were ours just based on an earlier 9075-2 without the named groups? Also, I am doubtful of our Compatibility note, "There is a CREATE TRANSFORM command in the SQL standard, but it is for adapting data types to client languages." In my reading of 9075-2, I do see the transforms used for client languages (all the <embedded SQL Foo program>s), but I also see the to-sql and from-sql functions being applied in <routine invocation> whenever "R is an external routine" and the type "is a user-defined type". The latter doesn't seem much different from our usage. The differences I see are (1) our LANGUAGE clause, (2) we don't have a special "user-defined type" category limiting what types can have transforms (and (3), we don't have the named groups). And we are applying them /only/ for server-side routines, rather than for server and client code. The ISO transforms work by mapping the ("user-defined") type to some existing SQL type for which the PL has a standard mapping already. Ours work by mapping the type directly to some suitable type in the PL. Am I reading this accurately? Regards, -Chap
On 02/07/22 15:14, Chapman Flack wrote: > I'll work on some doc patches. It seems a bit of an impedance mismatch that there is a get_func_trftypes producing a C Oid[] (with its length returned separately), while get_transform_fromsql and get_transform_tosql both expect a List. There is only one in-core caller of get_func_trftypes (it is print_function_trftypes in ruleutils.c), while both in-core PLs that currently support transforms are reduced to rolling their own List by grabbing the protrftypes Datum and hitting it with oid_array_to_list. Would it be reasonable to deprecate get_func_trftypes and instead provide a get_call_trftypes (following the naming of get_call_result_type where _call_ encompasses functions and procedures) that returns a List, and use that consistently? Regards, -Chap
On 02/21/22 16:09, Chapman Flack wrote: > On 02/07/22 15:14, Chapman Flack wrote: >> I'll work on some doc patches. > > It seems a bit of an impedance mismatch that there is a get_func_trftypes > producing a C Oid[] (with its length returned separately), while > get_transform_fromsql and get_transform_tosql both expect a List. > ... > Would it be reasonable to deprecate get_func_trftypes and instead > provide a get_call_trftypes It would have been painful to write documentation of get_func_trftypes saying its result isn't what get_transform_{from.to}sql expect, so patch 1 does add a get_call_trftypes that returns a List *. Patch 2 then updates the docs as discussed in this thread. It turned out plhandler.sgml was kind of a long monolith of text even before adding transform information, so I broke it into sections first. This patch adds the section markup without reindenting, so the changes aren't obscured. The chapter had also fallen behind the introduction of procedures, so I have changed many instances of 'function' to the umbrella term 'routine'. Patch 3 simply reindents for the new section markup and rewraps. Whitespace only. Patch 4 updates src/test/modules/plsample to demonstrate handling of transforms (and to add some more comments generally). I'll add this to the CF. Regards, -Chap
Attachment
On 2022-02-22 12:59, Chapman Flack wrote: > It would have been painful to write documentation of get_func_trftypes > saying its result isn't what get_transform_{from.to}sql expect, so > patch 1 does add a get_call_trftypes that returns a List *. > > Patch 2 then updates the docs as discussed in this thread. It turned > out > plhandler.sgml was kind of a long monolith of text even before adding > transform information, so I broke it into sections first. This patch > adds > the section markup without reindenting, so the changes aren't obscured. > > The chapter had also fallen behind the introduction of procedures, so > I have changed many instances of 'function' to the umbrella term > 'routine'. > > Patch 3 simply reindents for the new section markup and rewraps. > Whitespace only. > > Patch 4 updates src/test/modules/plsample to demonstrate handling of > transforms (and to add some more comments generally). Here is a rebase.
Attachment
Hi
I am doing an review of this patch and I have two comments
1. I am not sure if get_call_trftypes is a good name - the prefix get_call is used when some runtime data is processed. This function just returns reformatted data from the system catalogue. Maybe get_func_trftypes_list, or just replace function get_func_trftypes (now, the list is an array, so there should not be performance issues). For consistency, the new function should be used in plperl and plpython too. Probably this function is not widely used, so I don't see why we need to have two almost equal functions in code.
2.
+ It also contains the OID of the intended procedural language and whether
+ that procedural language is declared as <literal>TRUSTED</literal>, useful
+ if a single inline handler is supporting more than one procedural language.
I am not sure if this sentence is in the correct place. Maybe can be mentioned separately,
so generally handlers can be used by more than one procedural language. But maybe
I don't understand this sentence.
+ that procedural language is declared as <literal>TRUSTED</literal>, useful
+ if a single inline handler is supporting more than one procedural language.
I am not sure if this sentence is in the correct place. Maybe can be mentioned separately,
so generally handlers can be used by more than one procedural language. But maybe
I don't understand this sentence.
Regards
Pavel
On 2022-07-13 00:26, Pavel Stehule wrote: > 1. I am not sure if get_call_trftypes is a good name - the prefix > get_call > is used when some runtime data is processed. I guess I hadn't caught on that the prefix carried that meaning. To me, it appeared to be a prefix used to avoid being specific to 'function' or 'procedure'. > This function just returns > reformatted data from the system catalogue. Maybe > get_func_trftypes_list, > or just replace function get_func_trftypes (now, the list is an array, > so > there should not be performance issues). For consistency, the new > function > should be used in plperl and plpython too. Probably this function is > not If it is acceptable to replace get_func_trftypes like that, I can produce such a patch. > 2. > > + It also contains the OID of the intended procedural language and > whether > + that procedural language is declared as > <literal>TRUSTED</literal>, > useful > + if a single inline handler is supporting more than one procedural > language. > > I am not sure if this sentence is in the correct place. Maybe can be > mentioned separately, > so generally handlers can be used by more than one procedural language. > But > maybe > I don't understand this sentence. My point was that, if the structure did /not/ include the OID of the PL and its TRUSTED property, then it would not be possible for a single inline handler to support more than one PL. So that is why it is a good thing that those are included in the structure, and why it would be a bad thing if they were not. Would it be clearer to say: It also contains the OID of the intended procedural language and whether that procedural language is declared as <literal>TRUSTED</literal>. While these values are redundant if the inline handler is serving only a single procedural language, they are necessary to allow one inline handler to serve more than one PL. Regards, -Chap
čt 28. 7. 2022 v 4:06 odesílatel <chap@anastigmatix.net> napsal:
On 2022-07-13 00:26, Pavel Stehule wrote:
> 1. I am not sure if get_call_trftypes is a good name - the prefix
> get_call
> is used when some runtime data is processed.
I guess I hadn't caught on that the prefix carried that meaning.
To me, it appeared to be a prefix used to avoid being specific to
'function' or 'procedure'.
I am sure this function is in Postgres significantly longer than Postgres has support for 'procedures'.
> This function just returns
> reformatted data from the system catalogue. Maybe
> get_func_trftypes_list,
> or just replace function get_func_trftypes (now, the list is an array,
> so
> there should not be performance issues). For consistency, the new
> function
> should be used in plperl and plpython too. Probably this function is
> not
If it is acceptable to replace get_func_trftypes like that, I can
produce
such a patch.
> 2.
>
> + It also contains the OID of the intended procedural language and
> whether
> + that procedural language is declared as
> <literal>TRUSTED</literal>,
> useful
> + if a single inline handler is supporting more than one procedural
> language.
>
> I am not sure if this sentence is in the correct place. Maybe can be
> mentioned separately,
> so generally handlers can be used by more than one procedural language.
> But
> maybe
> I don't understand this sentence.
My point was that, if the structure did /not/ include the OID of
the PL and its TRUSTED property, then it would not be possible
for a single inline handler to support more than one PL. So that
is why it is a good thing that those are included in the structure,
and why it would be a bad thing if they were not.
Would it be clearer to say:
It also contains the OID of the intended procedural language and whether
that procedural language is declared as <literal>TRUSTED</literal>.
While these values are redundant if the inline handler is serving only
a single procedural language, they are necessary to allow one inline
handler to serve more than one PL.
ok
Regards
Pavel
Regards,
-Chap
On 2022-07-28 01:51, Pavel Stehule wrote: >> Would it be clearer to say: >> >> It also contains the OID of the intended procedural language and >> whether >> that procedural language is declared as <literal>TRUSTED</literal>. >> While these values are redundant if the inline handler is serving only >> a single procedural language, they are necessary to allow one inline >> handler to serve more than one PL. > > ok I will probably be unable to produce a new patch for this CF. Family obligation. Regards, -Chap
pá 29. 7. 2022 v 18:35 odesílatel <chap@anastigmatix.net> napsal:
On 2022-07-28 01:51, Pavel Stehule wrote:
>> Would it be clearer to say:
>>
>> It also contains the OID of the intended procedural language and
>> whether
>> that procedural language is declared as <literal>TRUSTED</literal>.
>> While these values are redundant if the inline handler is serving only
>> a single procedural language, they are necessary to allow one inline
>> handler to serve more than one PL.
>
> ok
I will probably be unable to produce a new patch for this CF.
Family obligation.
ok
Regards
Pavel
Regards,
-Chap