Thread: Documentation about PL transforms

Documentation about PL transforms

From

Chapman Flack

Date:

05 February 2022, 02:55:42

Hi,

It looks to me as if the transforms feature for PLs was added in
cac7658, and support was added to plperl and plpython at that time,
with documentation added for CREATE FUNCTION, CREATE/DROP TRANSFORM,
and the catalogs, but nothing was added at that time to plhandler.sgml
to clarify that a PL handler itself must add code to look up and apply
such transforms, or nothing happens.

Without that information, a reader not in the know (it happened to me)
could get the idea from the CREATE TRANSFORM docs that the support
is more general than it is, and an incoming argument could already
contain the 'internal' something-specific-to-the-language-implementation
returned by a specified transform. (Which does lead to follow-on questions
like "who says an arbitrary language implementation's transform result
could be safely represented in an mmgr-managed argument list?"—it really
does make better sense upon learning the PL handler has to do the deed.
But I haven't spotted any place where the docs *say* that.)

I'm thinking plhandler.sgml is the only place that really needs to be
said; readers looking up CREATE TRANSFORM and using an existing PL that
supports it don't need to know how the sausage is made. (Maybe it is
worth mentioning there, though, that it isn't possible to develop
transforms for an arbitrary PL unless that PL applies transforms.)

I noticed the CREATE TRANSFORM docs show the argument list as
(argument_type [, ...]) even though check_transform_function will reject
any argument list of length other than 1 or with type other than internal.
Is that an overly-generic way to show the syntax, or is that a style
with precedent elsewhere in the docs? (I checked a couple obvious places
like CREATE OPERATOR, CREATE FUNCTION, CREATE TYPE but did not see similar
examples).

As long as a PL handler has the sole responsibility for invoking
its transforms, I wonder if it would make sense to allow a PL to
define what its transforms should look like, maybe replacing
check_transform_function with a transform_validator for the PL.
I'm not proposing such a patch here, but I am willing to prepare
one for plhandler.sgml and maybe pltemplate.c documenting the current
situation, if nobody tells me I'm wrong about something here.

Regards,
-Chap

Re: Documentation about PL transforms

From

Peter Eisentraut

Date:

07 February 2022, 18:59:48

On 05.02.22 00:55, Chapman Flack wrote:
> I'm thinking plhandler.sgml is the only place that really needs to be
> said; readers looking up CREATE TRANSFORM and using an existing PL that
> supports it don't need to know how the sausage is made. (Maybe it is
> worth mentioning there, though, that it isn't possible to develop
> transforms for an arbitrary PL unless that PL applies transforms.)

makes sense

> I noticed the CREATE TRANSFORM docs show the argument list as
> (argument_type [, ...]) even though check_transform_function will reject
> any argument list of length other than 1 or with type other than internal.
> Is that an overly-generic way to show the syntax, or is that a style
> with precedent elsewhere in the docs?

That could be corrected.

> As long as a PL handler has the sole responsibility for invoking
> its transforms, I wonder if it would make sense to allow a PL to
> define what its transforms should look like, maybe replacing
> check_transform_function with a transform_validator for the PL.
> I'm not proposing such a patch here, but I am willing to prepare
> one for plhandler.sgml and maybe pltemplate.c documenting the current
> situation, if nobody tells me I'm wrong about something here.

Maybe.  This kind of thing would mostly be driven what a PL wants in 
actual practice, and then how that could be generalized to many/all PLs.

Re: Documentation about PL transforms

From

Chapman Flack

Date:

07 February 2022, 23:14:43

On 02/07/22 10:59, Peter Eisentraut wrote:
> On 05.02.22 00:55, Chapman Flack wrote:
>> I'm thinking plhandler.sgml is the only place that really needs to be
>> ...
>> worth mentioning there, though, that it isn't possible to develop
>> transforms for an arbitrary PL unless that PL applies transforms.)
> 
> makes sense
> 
>> (argument_type [, ...]) even though check_transform_function will reject
>> any argument list of length other than 1 or with type other than internal.
> 
> That could be corrected.

I'll work on some doc patches.

>> As long as a PL handler has the sole responsibility for invoking
>> its transforms, I wonder if it would make sense to allow a PL to
>> define what its transforms should look like, maybe replacing
>> check_transform_function with a transform_validator for the PL.
> 
> Maybe.  This kind of thing would mostly be driven what a PL wants in actual
> practice, and then how that could be generalized to many/all PLs.

It has since occurred to me that another benefit of having a
transform_validator per PL would be immediate error reporting
if someone, for whatever reason, tries out CREATE TRANSFORM
for a PL that doesn't grok transforms.

Regards,
-Chap

Re: Documentation about PL transforms

From

Chapman Flack

Date:

08 February 2022, 22:03:21

On 02/07/22 15:14, Chapman Flack wrote:
> It has since occurred to me that another benefit of having a
> transform_validator per PL would be immediate error reporting
> if someone, for whatever reason, tries out CREATE TRANSFORM
> for a PL that doesn't grok transforms.

The same could be achieved, I guess, by an event trigger, though that
would take positive effort to set up, where the benefit of a per-PL
transform_validator would be that if a given PL does not bother to
provide one, CREATE TRANSFORM for it would automatically fail. (And
such a validator would not have to spend most of its life ignoring
other DDL.)

That does reveal another documentation gap: table 40.1 does not show
CREATE or DROP TRANSFORM being supported for event triggers. I've
confirmed they work, though. I'll tack that onto the doc patch.

I notice our transforms lack the named groups of 9075-2. With those
(plus our LANGUAGE clause), I could write, for a made-up example:

CREATE TRANSFORM FOR hstore LANGUAGE plpython3u
  asdict (
    FROM SQL WITH FUNCTION hstore_to_plpython3dict,
    TO SQL WITH FUNCTION plpython3dict_to_hstore)
  asnamedtuple (
    FROM SQL WITH FUNCTION hstore_to_plpython3namedtup,
    TO SQL WITH FUNCTION plpython3namedtup_to_hstore);

CREATE FUNCTION f1(val hstore) RETURNS int
LANGUAGE plpython3u
TRANSFORM GROUP asdict FOR TYPE hstore
...

CREATE FUNCTION f2(val hstore) RETURNS int
LANGUAGE plpython3u
TRANSFORM GROUP asnamedtuple FOR TYPE hstore
...

It seems to me that could be useful, in cases where a PL offers
more than one good choice for representing a PostgreSQL type and
the preferred one could depend on the function.

Was that considered and rejected for our transforms, or were ours
just based on an earlier 9075-2 without the named groups?

Also, I am doubtful of our Compatibility note, "There is a CREATE
TRANSFORM command in the SQL standard, but it is for adapting data
types to client languages."

In my reading of 9075-2, I do see the transforms used for client
languages (all the <embedded SQL Foo program>s), but I also see
the to-sql and from-sql functions being applied in <routine invocation>
whenever "R is an external routine" and the type "is a user-defined type".
The latter doesn't seem much different from our usage. The differences
I see are (1) our LANGUAGE clause, (2) we don't have a special
"user-defined type" category limiting what types can have transforms
(and (3), we don't have the named groups). And we are applying them
/only/ for server-side routines, rather than for server and client code.

The ISO transforms work by mapping the ("user-defined") type to some
existing SQL type for which the PL has a standard mapping already. Ours
work by mapping the type directly to some suitable type in the PL.

Am I reading this accurately?

Regards,
-Chap

Re: Documentation about PL transforms

From

Chapman Flack

Date:

22 February 2022, 00:09:52

On 02/07/22 15:14, Chapman Flack wrote:
> I'll work on some doc patches.

It seems a bit of an impedance mismatch that there is a get_func_trftypes
producing a C Oid[] (with its length returned separately), while
get_transform_fromsql and get_transform_tosql both expect a List.

There is only one in-core caller of get_func_trftypes (it is
print_function_trftypes in ruleutils.c), while both in-core PLs that
currently support transforms are reduced to rolling their own List by
grabbing the protrftypes Datum and hitting it with oid_array_to_list.

Would it be reasonable to deprecate get_func_trftypes and instead
provide a get_call_trftypes (following the naming of get_call_result_type
where _call_ encompasses functions and procedures) that returns a List,
and use that consistently?

Regards,
-Chap

Re: Documentation about PL transforms

From

Chapman Flack

Date:

22 February 2022, 20:59:06

On 02/21/22 16:09, Chapman Flack wrote:
> On 02/07/22 15:14, Chapman Flack wrote:
>> I'll work on some doc patches.
> 
> It seems a bit of an impedance mismatch that there is a get_func_trftypes
> producing a C Oid[] (with its length returned separately), while
> get_transform_fromsql and get_transform_tosql both expect a List.
> ...
> Would it be reasonable to deprecate get_func_trftypes and instead
> provide a get_call_trftypes

It would have been painful to write documentation of get_func_trftypes
saying its result isn't what get_transform_{from.to}sql expect, so
patch 1 does add a get_call_trftypes that returns a List *.

Patch 2 then updates the docs as discussed in this thread. It turned out
plhandler.sgml was kind of a long monolith of text even before adding
transform information, so I broke it into sections first. This patch adds
the section markup without reindenting, so the changes aren't obscured.

The chapter had also fallen behind the introduction of procedures, so
I have changed many instances of 'function' to the umbrella term
'routine'.

Patch 3 simply reindents for the new section markup and rewraps.
Whitespace only.

Patch 4 updates src/test/modules/plsample to demonstrate handling of
transforms (and to add some more comments generally).

I'll add this to the CF.

Regards,
-Chap

Attachment

Re: Documentation about PL transforms

From

chap@anastigmatix.net

Date:

08 April 2022, 04:22:11

On 2022-02-22 12:59, Chapman Flack wrote:
> It would have been painful to write documentation of get_func_trftypes
> saying its result isn't what get_transform_{from.to}sql expect, so
> patch 1 does add a get_call_trftypes that returns a List *.
> 
> Patch 2 then updates the docs as discussed in this thread. It turned 
> out
> plhandler.sgml was kind of a long monolith of text even before adding
> transform information, so I broke it into sections first. This patch 
> adds
> the section markup without reindenting, so the changes aren't obscured.
> 
> The chapter had also fallen behind the introduction of procedures, so
> I have changed many instances of 'function' to the umbrella term
> 'routine'.
> 
> Patch 3 simply reindents for the new section markup and rewraps.
> Whitespace only.
> 
> Patch 4 updates src/test/modules/plsample to demonstrate handling of
> transforms (and to add some more comments generally).

Here is a rebase.

Attachment

Re: Documentation about PL transforms

From

Pavel Stehule

Date:

13 July 2022, 07:26:47

I am doing an review of this patch and I have two comments

1. I am not sure if get_call_trftypes is a good name - the prefix get_call is used when some runtime data is processed. This function just returns reformatted data from the system catalogue. Maybe get_func_trftypes_list, or just replace function get_func_trftypes (now, the list is an array, so there should not be performance issues). For consistency, the new function should be used in plperl and plpython too. Probably this function is not widely used, so I don't see why we need to have two almost equal functions in code.

+ It also contains the OID of the intended procedural language and whether
+ that procedural language is declared as <literal>TRUSTED</literal>, useful
+ if a single inline handler is supporting more than one procedural language.

I am not sure if this sentence is in the correct place. Maybe can be mentioned separately,
so generally handlers can be used by more than one procedural language. But maybe
I don't understand this sentence.

Regards

Pavel

Re: Documentation about PL transforms

From

chap@anastigmatix.net

Date:

28 July 2022, 05:05:58

On 2022-07-13 00:26, Pavel Stehule wrote:

> 1. I am not sure if get_call_trftypes is a good name - the prefix 
> get_call
> is used when some runtime data is processed.

I guess I hadn't caught on that the prefix carried that meaning.
To me, it appeared to be a prefix used to avoid being specific to
'function' or 'procedure'.

> This function just returns
> reformatted data from the system catalogue. Maybe 
> get_func_trftypes_list,
> or just replace function get_func_trftypes (now, the list is an array, 
> so
> there should not be performance issues). For consistency, the new 
> function
> should be used in plperl and plpython too. Probably this function is 
> not

If it is acceptable to replace get_func_trftypes like that, I can 
produce
such a patch.

> 2.
> 
> +    It also contains the OID of the intended procedural language and
> whether
> +    that procedural language is declared as 
> <literal>TRUSTED</literal>,
> useful
> +    if a single inline handler is supporting more than one procedural
> language.
> 
> I am not sure if this sentence is in the correct place. Maybe can be
> mentioned separately,
> so generally handlers can be used by more than one procedural language. 
> But
> maybe
> I don't understand this sentence.

My point was that, if the structure did /not/ include the OID of
the PL and its TRUSTED property, then it would not be possible
for a single inline handler to support more than one PL. So that
is why it is a good thing that those are included in the structure,
and why it would be a bad thing if they were not.

Would it be clearer to say:

It also contains the OID of the intended procedural language and whether
that procedural language is declared as <literal>TRUSTED</literal>.
While these values are redundant if the inline handler is serving only
a single procedural language, they are necessary to allow one inline
handler to serve more than one PL.

Regards,
-Chap

Re: Documentation about PL transforms

From

Pavel Stehule

Date:

28 July 2022, 08:51:12

čt 28. 7. 2022 v 4:06 odesílatel <chap@anastigmatix.net> napsal:

On 2022-07-13 00:26, Pavel Stehule wrote:

> 1. I am not sure if get_call_trftypes is a good name - the prefix
> get_call
> is used when some runtime data is processed.

I guess I hadn't caught on that the prefix carried that meaning.
To me, it appeared to be a prefix used to avoid being specific to
'function' or 'procedure'.

I am sure this function is in Postgres significantly longer than Postgres has support for 'procedures'.

> This function just returns
> reformatted data from the system catalogue. Maybe
> get_func_trftypes_list,
> or just replace function get_func_trftypes (now, the list is an array,
> so
> there should not be performance issues). For consistency, the new
> function
> should be used in plperl and plpython too. Probably this function is
> not

If it is acceptable to replace get_func_trftypes like that, I can
produce
such a patch.

> 2.
>
> + It also contains the OID of the intended procedural language and
> whether
> + that procedural language is declared as
> <literal>TRUSTED</literal>,
> useful
> + if a single inline handler is supporting more than one procedural
> language.
>
> I am not sure if this sentence is in the correct place. Maybe can be
> mentioned separately,
> so generally handlers can be used by more than one procedural language.
> But
> maybe
> I don't understand this sentence.

My point was that, if the structure did /not/ include the OID of
the PL and its TRUSTED property, then it would not be possible
for a single inline handler to support more than one PL. So that
is why it is a good thing that those are included in the structure,
and why it would be a bad thing if they were not.

Would it be clearer to say:

It also contains the OID of the intended procedural language and whether
that procedural language is declared as <literal>TRUSTED</literal>.
While these values are redundant if the inline handler is serving only
a single procedural language, they are necessary to allow one inline
handler to serve more than one PL.

Regards

Pavel

Regards,
-Chap

Re: Documentation about PL transforms

From

chap@anastigmatix.net

Date:

29 July 2022, 19:35:20

On 2022-07-28 01:51, Pavel Stehule wrote:
>> Would it be clearer to say:
>> 
>> It also contains the OID of the intended procedural language and 
>> whether
>> that procedural language is declared as <literal>TRUSTED</literal>.
>> While these values are redundant if the inline handler is serving only
>> a single procedural language, they are necessary to allow one inline
>> handler to serve more than one PL.
> 
> ok

I will probably be unable to produce a new patch for this CF.
Family obligation.

Regards,
-Chap

Re: Documentation about PL transforms

From

Pavel Stehule

Date:

29 July 2022, 21:47:59

pá 29. 7. 2022 v 18:35 odesílatel <chap@anastigmatix.net> napsal:

On 2022-07-28 01:51, Pavel Stehule wrote:
>> Would it be clearer to say:
>>
>> It also contains the OID of the intended procedural language and
>> whether
>> that procedural language is declared as <literal>TRUSTED</literal>.
>> While these values are redundant if the inline handler is serving only
>> a single procedural language, they are necessary to allow one inline
>> handler to serve more than one PL.
>
> ok

I will probably be unable to produce a new patch for this CF.
Family obligation.

Regards

Pavel

Regards,
-Chap