Thread: Transform groups

Transform groups

From
Peter Eisentraut
Date:
There was some talk in the context of the new protocol to do some
enhancements in the area of the "binary" format to make it useful across
different architectures etc.

The SQL standard defines a concept called "transform groups", which are
basically pairs of input/output functions that can be switched between.
The standard talks about different transform groups for different host
languages, so this essentially selects between different binary output
formats.

I think this would fit naturally with many things we are doing and want to
do.  We could have a (default) "text" transform group, a "binary little
endian" transform group, a "binary big endian" transform group.
Transform groups could map more sophisticated data types into native
programming language structures (think timestamp or inet).  Users could
also plug in their own transform groups for existing data types if they
hate the input or output format.  It would also reduce the pressure to add
more ad-hoc format-twiddling parameters like extra_float_digits or even
datestyle.

Is this something worth investigating?

-- 
Peter Eisentraut   peter_e@gmx.net



Re: Transform groups

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> The SQL standard defines a concept called "transform groups", which are
> basically pairs of input/output functions that can be switched between.

> Is this something worth investigating?

Perhaps, but there is little time left to get it done for 7.4.  Can we
lay out a strategy that will let us add such things incrementally,
without another protocol break later?

What I had personally been planning to do was restore the typsend and
typreceive columns of pg_type, defined on the order of
	typsend(mytype) returns bytea
	typreceive(bytea) returns mytype

and then define some simple conventions, probably network-byte-order
based, that these functions would be expected to adhere to for the
external binary representation.  The on-the-wire representation is
a network-byte-order length followed by whatever contents the
send routines put into their bytea results.
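
The framing described here can be sketched in a few lines of Python (the int4 send/receive pair and all names below are illustrative stand-ins, not PostgreSQL code):

```python
import struct

def frame_field(payload: bytes) -> bytes:
    """Wrap a typsend-style bytea result in a network-byte-order
    (big-endian) int32 length header, as it would appear on the wire."""
    return struct.pack("!i", len(payload)) + payload

def int4send(value: int) -> bytes:
    """Illustrative send routine: int4 as 4 bytes, network byte order."""
    return struct.pack("!i", value)

def int4recv(payload: bytes) -> int:
    """Matching illustrative receive routine."""
    (value,) = struct.unpack("!i", payload)
    return value

wire = frame_field(int4send(1025))
# first 4 bytes are the length (4), remaining bytes are the contents
length = struct.unpack("!i", wire[:4])[0]
assert length == 4
assert int4recv(wire[4 : 4 + length]) == 1025
```

The point of the convention is that a middle layer can skip over any field knowing only the length word, without understanding the contents.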

I don't think this would scale very well to multiple external
representations, though; sure you could define a switch somewhere, but
what are the odds that N different send/receive functions would all pay
attention to it?  And how would you know if they didn't?

The implementations of these functions would probably mostly depend on a
small set of conversion routines similar to the existing pq_getint and
pq_putint routines, so it might be possible to do simple things like
switching endianness in a centralized way.  But I'm not sure what
happens if you want bigger changes than that.
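
For instance, if every send routine funnels its integer output through one pq_putint-style helper, a single endianness switch covers them all; a hedged sketch (the signature here is an assumption for illustration, not the backend's actual pq_putint):

```python
import struct

def pq_putint(value: int, width: int, big_endian: bool = True) -> bytes:
    """Sketch of a centralized integer-output helper: because all
    callers go through this one function, switching endianness is a
    one-line change rather than an audit of N send routines."""
    fmt = {2: "h", 4: "i"}[width]
    return struct.pack(("!" if big_endian else "<") + fmt, value)

assert pq_putint(258, 2) == b"\x01\x02"                    # network order
assert pq_putint(258, 2, big_endian=False) == b"\x02\x01"  # byte-swapped
```
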
        regards, tom lane



Re: Transform groups (more FE/BE protocol issues)

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> The SQL standard defines a concept called "transform groups", which are
> basically pairs of input/output functions that can be switched between.
> The standard talks about different transform groups for different host
> languages, so this essentially selects between different binary output
> formats.

> I think this would fit naturally with many things we are doing and want to
> do.

I've been thinking about this a little more; it seems to open up some
questions about the current design of the new FE/BE protocol.

* There are two places presently in the protocol where the client can
specify text/binary via a boolean (represented as int8).  To move to a
transform-group world, we could redefine that field as a group selector:
0 = standard text representation, 1 = standard binary representation,
other values = reserved for future implementation.  The obvious
question is whether we should widen these fields to more than 8 bits.
Are we likely to need more than 256 transform groups?  More than 64K?
(Read on before you answer, since some of the points below suggest we
might be transmitting a lot more of these fields than at present;
keeping them narrow might be important for bandwidth reasons.)

* The DataRow/BinaryRow distinction obviously doesn't scale to multiple
transform groups.  I propose dropping the BinaryRow message type in
protocol 3.0, and instead carrying the format code (group selector)
somewhere else.  A straightforward conversion would be to add it to the
body of DataRow, but I'm not convinced that's the best place; again,
read on.

* At what granularity do you wish to select the transform group type for
data being transferred in or out?  Right now we've essentially assumed
that you only need to specify it once for an entire command result, but
it's fairly easy to imagine scenarios where this isn't what you want.
For example, very many people are going to want to send or receive bytea
fields as raw binary, since that's more or less the native
representation for the client (nobody likes escaping or unescaping).
It does not follow that they want raw binary for, say, timestamp fields
appearing in the same table.  The problem gets even more pressing if you
want to use transform groups as a substitute for things like DateStyle,
as Peter suggested in the above-quoted message.

* ISTM the most natural granularity for specifying transform group is at
the column level.  I can't see a good use-case for varying transform
type across rows of a select result, but being able to select it for
each column has clear usefulness.

* That leaves us with two issues: where does the client say what it
wants, and where does the backend report the actual transform group used
for each column?  For SELECTs, from an efficiency point of view it'd be
nicest to have the client request desired transforms in Bind, and then
we could have RowDescription report the actual transforms used for each
column.  This way there'd be no need to include transform info in
DataRow, which would be redundant if one doesn't expect per-row changes
in transform.  I'd suggest allowing Bind to specify either a single
transform group to be applied to all columns, or per-column groups.
We'd remove the output-is-binary field from Execute.
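
The "single group for all columns, or per-column groups" encoding could look like this Python sketch (the int16 count-plus-codes layout is an assumption for illustration; it mirrors the compact form proposed above):

```python
import struct

def encode_result_formats(codes):
    """Sketch of the proposed Bind field: an int16 count followed by
    int16 format codes.  count == 0 means default (text) for every
    column, count == 1 applies that one code to all columns, otherwise
    there is one code per column."""
    out = struct.pack("!h", len(codes))
    for code in codes:
        out += struct.pack("!h", code)
    return out

def decode_result_formats(buf, ncols):
    """Expand the compact encoding back to one format code per column."""
    (count,) = struct.unpack("!h", buf[:2])
    codes = struct.unpack("!%dh" % count, buf[2 : 2 + 2 * count])
    if count == 0:
        return [0] * ncols          # all default (text)
    if count == 1:
        return [codes[0]] * ncols   # one code applied to every column
    return list(codes)

assert decode_result_formats(encode_result_formats([1]), 3) == [1, 1, 1]
assert decode_result_formats(encode_result_formats([0, 1, 0]), 3) == [0, 1, 0]
```
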

* More or less the same considerations apply for parameter values being
sent in a Bind message.  Here I'd opt for always sending a transform
group for each parameter value being sent.

* The client can hardly be expected to select per-column transforms in
Bind if it doesn't know the result column datatypes yet.  In the
protocol document as it stands today, there's no way to find out the
result datatypes except a portal Describe --- which requires that you've
already done Bind.  I took out the result datatypes in
prepared-statement Describe because it seemed unnecessarily complicated
to implement (there's no support in the backend right now to derive a
tupdesc from a plan without starting the executor).  Clearly that'll
have to be put back though.  Presumably the RowDescription returned by
prepared-statement Describe will return default (zero == text) transform
groups for all columns, and the client will have to know to believe its
own requests instead if it doesn't trouble to do a portal Describe after
Bind.

* Textual COPY doesn't need any changes since it'll presumably always
use transform group zero, but what do we do with binary COPY?  Probably
the best thing is to add an optional header field showing the transform
group for each column, with the default assumption being that all
columns are transform group 1 (standard binary).  I don't know what the
user does in the COPY TO command to select other transform groups, but
that's not a protocol-level issue so it need not be solved today.
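
A hypothetical shape for such a header, sketched in Python (the signature constant matches the binary COPY file signature; the flags word and the per-column extension area are assumptions along the lines proposed above, not an implemented format):

```python
import struct

SIGNATURE = b"PGCOPY\n\xff\r\n\x00"  # binary COPY file signature

def copy_header(column_formats=None):
    """Hypothetical binary-COPY header: after the signature and a flags
    word, an optional extension area carries one int16 transform-group
    code per column.  If the extension is absent (length 0), readers
    assume every column is group 1 (standard binary)."""
    flags = 0
    if column_formats is None:
        ext = b""
    else:
        ext = struct.pack("!%dh" % len(column_formats), *column_formats)
    return SIGNATURE + struct.pack("!ii", flags, len(ext)) + ext
```

With no argument the header carries no per-column information and defaults apply; passing e.g. `[1, 1, 2]` would mark the third column as using a nonstandard group.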

Comments?  In particular I need some feedback about how wide to make the
transform-group fields ...
        regards, tom lane



Re: Transform groups (more FE/BE protocol issues)

From
Peter Eisentraut
Date:
Tom Lane writes:

> I've been thinking about this a little more; it seems to open up some
> questions about the current design of the new FE/BE protocol.

A transform group is a named object attached to a type.  A transform group
converts a user-defined type to a standard SQL type or vice versa.  (The
rationale being that clients know how to handle standard SQL types.)  A
client selects a transform group as session state, either individually for
each type (SET TRANSFORM GROUP FOR TYPE mytype 'mygroup') or using a
default mechanism (SET DEFAULT TRANSFORM GROUP 'mygroup'), which means
that all types that have a transform group with the given name activate
that group.

> * There are two places presently in the protocol where the client can
> specify text/binary via a boolean (represented as int8).  To move to a
> transform-group world, we could redefine that field as a group selector:
> 0 = standard text representation, 1 = standard binary representation,
> other values = reserved for future implementation.

I don't think we need any more representations than those two, and the
transform group feature can be independent from this.

Here's an example:  a set of transform groups (for various data types)
that convert SQL data types to more C-like types, for example timestamp to
struct tm.  Would you want to pass struct tm-data over the wire as a blob
of 36 bytes?  I would rather get a standard binary representation with a
length word, so a middle layer can still copy this data around without
having to be data type-aware.
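
Concretely, struct tm's nine int fields at 4 bytes each are the 36-byte blob in question; a Python sketch of wrapping it with a length word so that a type-unaware middle layer can still shuffle it around (field order and int32 widths are assumptions for illustration):

```python
import struct

# struct tm's nine int fields, 4 bytes each: the 36-byte blob mentioned above
TM_FIELDS = ("tm_sec", "tm_min", "tm_hour", "tm_mday", "tm_mon",
             "tm_year", "tm_wday", "tm_yday", "tm_isdst")

def tm_to_wire(tm: dict) -> bytes:
    """Pack a struct tm-like value as a network-byte-order length word
    followed by nine int32 fields.  The middle layer needs only the
    length word to copy the datum; only the endpoints care what is
    inside."""
    payload = struct.pack("!9i", *(tm[f] for f in TM_FIELDS))
    return struct.pack("!i", len(payload)) + payload

wire = tm_to_wire(dict.fromkeys(TM_FIELDS, 0))
assert struct.unpack("!i", wire[:4])[0] == 36   # 9 fields * 4 bytes
```
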

In fact, this suggests that the format field should not be represented at
all in the protocol and only be handled as session state.  Users can just
say "I want these types in text and these types in binary" and that is
respected for the rest of the session.  Obviously there are some backward
compatibility problems with this.

> * The DataRow/BinaryRow distinction obviously doesn't scale to multiple
> transform groups.  I propose dropping the BinaryRow message type in
> protocol 3.0, and instead carrying the format code (group selector)
> somewhere else.  A straightforward conversion would be to add it to the
> body of DataRow, but I'm not convinced that's the best place; again,
> read on.

We would need a format code for each returned column.

It would be the decision of the transform group about which format to
return.  If you want a transform group that provides your client
application with binary data, then you write your transform functions to
return bytea.  If you want a text format, write them to return cstring.
(Other, less subtle flagging mechanisms could be invented.)
DataRow/BinaryRow basically only tells the client which method is used to
signal the end of data.  Both methods have some use, but there aren't a
lot of other methods that will come into use soon.

> * At what granularity do you wish to select the transform group type for
> data being transferred in or out?  Right now we've essentially assumed
> that you only need to specify it once for an entire command result, but
> it's fairly easy to imagine scenarios where this isn't what you want.

As mentioned above, it would be per data type and session.  I think that
makes sense.  As you mentioned, certainly you would want to have different
choices for different data types.  Also, session makes sense, because if
an application is set up to handle a given type in a given format in one
place, then it probably wants to handle it like that every time.  This
would move away the format decision from the command level to the session
level.  Most configuration parameters are session level, so that makes
sense.

> * That leaves us with two issues: where does the client say what it
> wants,

SET-like commands; see at the top.

> and where does the backend report the actual transform group used for
> each column?

If you want to be pedantic, you don't report it at all, because the client
selected it, so it got it.  The client will know how to read the data
pieces, because it knows whether they are text (terminating zero byte) or
binary (starts with length byte).  What exactly is inside doesn't
necessarily need to be reported.  (We don't report the date style in each
result either.)  If you want to report it, the RowDescription with one OID
for each column would seem the best place.

> * More or less the same considerations apply for parameter values being
> sent in a Bind message.  Here I'd opt for always sending a transform
> group for each parameter value being sent.

Same here; you don't need it.  If you set the format to X, then you send
the data in format X, and things work.  If the data does not conform to
format X, the transform function will tell you soon enough.

> * The client can hardly be expected to select per-column transforms in
> Bind if it doesn't know the result column datatypes yet.  In the
> protocol document as it stands today, there's no way to find out the
> result datatypes except a portal Describe --- which requires that you've
> already done Bind.

This is not a problem if transform groups are per-type, not per-column.


The consequence for the protocol:  Keep the text/binary distinction, but
make it per-column in the result.  For backward compatibility, the client
can still choose between text and binary on a command-level basis, but we
should move this to a session parameter, and if command and session
settings are incompatible, one prevails or we signal an error.

-- 
Peter Eisentraut   peter_e@gmx.net



Re: Transform groups (more FE/BE protocol issues)

From
Tom Lane
Date:
Peter Eisentraut <peter_e@gmx.net> writes:
> A transform group is a named object attached to a type.  A transform group
> converts a user-defined type to a standard SQL type or vice versa.  (The
> rationale being that clients know how to handle standard SQL types.)

Hmm, I hope they didn't restrict the design so that transform groups
could do *only* that; if so, they'd be far less useful than I thought.
(Your example below doesn't work if so, because "struct tm" isn't SQL.)

> Here's an example:  a set of transform groups (for various data types)
> that convert SQL data types to more C-like types, for example timestamp to
> struct tm.  Would you want to pass struct tm-data over the wire as a blob
> of 36 bytes?  I would rather get a standard binary representation with a
> length word, so a middle layer can still copy this data around without
> having to be data type-aware.

Sure, but the length word now becomes part of the protocol wrapper
rather than being considered part of the data.  A struct tm
representation seems like a fine thing to me (at least if you like
struct tm, which I don't especially --- but if that's what your app is
coded to use, then that is exactly what you want).

> In fact, this suggests that the format field should not be represented at
> all in the protocol and only be handled as session state.  Users can just
> say "I want these types in text and these types in binary" and that is
> respected for the rest of the session.

I don't think I like that, because it assumes that the client side of
things is monolithic, which it is not.  The client library (libpq, jdbc,
etc) may or may not be aware of what the client application has done,
and neither of them may be aware of what the user has done via GUC
settings or interactive SQL commands.  One of the things we're trying to
fix in this protocol revision is to ensure that the client library has
enough information to do its job, *without* any assumptions about what
the user is doing.

Barry gave a fine example of this, which was that JDBC has a hard time
parsing timestamp values (as it's supposed to be able to do) without
knowing what DateStyle is set to.  What you're proposing above would
add even more hidden state to the protocol.

Perhaps we could add ParameterStatus support for these things, though.
I see your point that it's unlikely the client's desires will change
much over a session, so having to repeat the request with each query
might be overkill.

> The consequence for the protocol:  Keep the text/binary distinction, but
> make it per-column in the result.  For backward compatibility, the client
> can still choose between text and binary on a command-level basis, but we
> should move this to a session parameter, and if command and session
> settings are incompatible, one prevails or we signal an error.

I was actually envisioning that text and binary would become
interchangeable as far as the protocol is concerned: DataRow carries a
length and some bytes, and what's in the bytes is interpreted based on
format codes.  So I don't see a need to think in terms of the formats
being classified as text and binary.
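
That view can be sketched in Python: the protocol wrapper is format-agnostic, and only the format code from RowDescription tells the client how to read the bytes (the codes and decoders here are illustrative):

```python
import struct

def parse_datarow_field(raw: bytes, format_code: int):
    """Sketch of a format-code-driven DataRow field: the wire carries
    just a length and some bytes; the format code reported elsewhere
    (e.g. in RowDescription) selects the interpretation."""
    if format_code == 0:            # standard text representation
        return raw.decode("ascii")
    if format_code == 1:            # say, standard binary int4
        return struct.unpack("!i", raw)[0]
    raise ValueError("unknown format code %d" % format_code)

# the same kind of length-delimited bytes, two interpretations
assert parse_datarow_field(b"42", 0) == "42"
assert parse_datarow_field(struct.pack("!i", 42), 1) == 42
```
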
        regards, tom lane



Re: Transform groups (more FE/BE protocol issues)

From
Sailesh Krishnamurthy
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
   Tom> Peter Eisentraut <peter_e@gmx.net> writes:
   >> A transform group is a named object attached to a type.  A
   >> transform group converts a user-defined type to a standard SQL
   >> type or vice versa.  (The rationale being that clients know how
   >> to handle standard SQL types.)

   Tom> Hmm, I hope they didn't restrict the design so that transform
   Tom> groups could do *only* that; if so, they'd be far less useful
   Tom> than I thought.  (Your example below doesn't work if so,
   Tom> because "struct tm" isn't SQL.)

I'm pretty sure that's how it is.  Associated with each transform group
is a FROM_SQL and a TO_SQL function, and each function, of course,
only returns an SQL type. That's how it got implemented in db2 .. 
   >> example timestamp to struct tm.  Would you want to pass struct
   >> tm-data over the wire as a blob of 36 bytes?  I would rather
   >> get a standard binary representation with a length word, so a
   >> middle layer can still copy this data around without having to
   >> be data type-aware.

There was some talk about standardizing wire representations, but
there was never enough momentum to get that going. By that time, the
standard became fairly Java-biased. IIRC, the decision was that this
would be addressed by the call-level interface bindings. So for
instance, with JDBC/SQLJ there are the SQLOutput and SQLInput streams
(similar in spirit to ObjectOutput and ObjectInput). The idea is to
map each database structured type to a Java class that implements the
SQLData interface. All this is very clearly laid out in the SQLJ
standard which is quite readable. 

So, while things are well standardized in the Java world, they're
practically non-existent for the other client bindings.

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh