Re: [PATCH] Add transforms feature - Mailing list pgsql-hackers

From Peter Eisentraut
Subject Re: [PATCH] Add transforms feature
Date
Msg-id 528CE8A0.3060207@gmx.net
Whole thread Raw
In response to Re: [PATCH] Add transforms feature  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
Responses Re: [PATCH] Add transforms feature  (Robert Haas <robertmhaas@gmail.com>)
Re: [PATCH] Add transforms feature  (Dimitri Fontaine <dimitri@2ndQuadrant.fr>)
List pgsql-hackers
On 11/15/13, 11:04 AM, Dimitri Fontaine wrote:
>   - Documentation style seems to be to be different from the "man page"
>     or "reference docs" style that we use elsewhere, and is instead
>     deriving the general case from examples. Reads strange.

Which specific section do you have in mind?  It's hard to explain this
feature in abstract terms, I think.

>   - The internal datatype argument and return type discussion for
>     function argument looks misplaced, but I don't have a better
>     proposition for that.

OK, maybe I'll put that in parentheses or a separate paragraph.

>   - Do we need an ALTER TRANSFORM command?
> 
>     Usually we have at least an Owner for the new objects and a command
>     to change the owner. Then should we be able to change the
>     function(s) used in a transform?

We don't have ALTER CAST either, and no one's been too bothered about
that.  It's possible, of course.

>   - Should transform live in a schema?
> 
>     At first sight, no reason why, but see next point about a use case
>     that we might be able to solve doing that.

Transforms don't have a name, so I don't quite see what you mean here.

>   - SQL Standard has something different named the same thing,
>     targetting client side types apparently. Is there any reason why we
>     would want to stay away from using the same name for something
>     really different in PostgreSQL?

Let's review that, as there as been some confusion about that.  The SQL
standard syntax is

CREATE TRANSFORM FOR <type> <groupname> (...details...);

and then there is

SET DEFAULT TRANSFORM GROUP <groupname>
SET TRANSFORM GROUP FOR TYPE <type> <groupname>

This is essentially an elaborate way to have custom input/output
formats, like DateStyle or bytea_output.

(You can find examples of this in the IBM DB2 documentation.  Some of
their clients apparently set a certain transform group automatically,
allowing you to set per-interface output formats.)

The proposed syntax in the other hand is

CREATE TRANSFORM FOR <type> LANGUAGE <lang> (...details...);

So you could consider LANGUAGE <lang> to be the implicit transform group
of language <lang>, if you like.

Or you could consider that this is a situation like VIEW vs.
MATERERIALIZED VIEW: they sound the same, they are a bit alike, but the
implementation details are different.

All obvious synonyms of "transform" (conversion, translation, etc.) are
already in use.

> On the higher level design, the big question here is about selective
> behavior. As soon as you CREATE TRANSFORM FOR hstore LANGUAGE plperl
> then any plperl function will now receive its hstore arguments as a
> proper perl hash rather than a string.
> 
> Any pre-existing plperl function with hstore arguments or return type
> then needs to be upgraded to handle the new types nicely, and some of
> those might not be under the direct control of the DBA running the
> CREATE TRANSFORM command, when using some plperl extensions for example.

I had proposed disallowing installing a transform that would affect
existing functions.  That was rejected or deemed unnecessary.  You can't
have it both ways. ;-)

> A mechanism allowing for the transform to only be used in some functions
> but not others might be useful. The simplest such mechanism I can think
> of is modeled against the PL/Java classpath facility as specified in the
> SQL standard: you attach a classpath per schema.

Anything that's a problem per-database would also be a problem per schema.

> Should using the schema to that ends be frowned upon, then we need a way
> to register each plperl function against using or not using the
> transform facility, defaulting to not using anything. Maybe something
> like the following:
> 
>   CREATE FUNCTION foo(hash hstore, x ltree)
>      RETURNS hstore
>      LANGUAGE plperl
>      USING TRANSFORM FOR hstore, ltree
>   AS $$ … $$;

This is a transition problem.  Nobody is required to install the
transforms into their existing databases.  They probably shouldn't.

How many people actually use hstore with PL/Perl or PL/Python now?
Probably not many, because it's weird.

I like to think about how this works for new development:  Here is my
extension type, here is how it interfaces with languages.  Once you have
established that, you don't want to have to repeat that every time you
write a function.  That's error prone and cumbersome.  And anything
that's set per schema or higher is a dependency tracking and caching mess.

Also, extension types should work the same as built-in types.
Eventually, I'd like to rip out the hard-coded data type support in
PL/Python and replace it with built-in transforms.  Even if we don't
actually do it, conceptually it should be possible.  Now if we require
"USING TRANSFORM FOR int, bytea" every time, we'd have taken a big step
back.  Effectively, we already have built-in transforms in PL/Python.
We have added a few more over the years.  It's been a bit of a pain from
time to time.  At least, with this feature we'd be moving this decision
into user space and give people a way to fix things.  (Incidentally, if
you add a lot of transforms, you are probably dealing with a strongly
typed language.  And a strongly typed language is more likely to cleanly
catch type errors resulting from changes in the transforms.)




pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: Data corruption issues using streaming replication on 9.0.14/9.2.5/9.3.1
Next
From: Heikki Linnakangas
Date:
Subject: Re: GIN improvements part 1: additional information