Thread: plperl vs. bytea

plperl vs. bytea

From
Andrew Dunstan
Date:
I have been talking with Theo some more about his recent problem with 
bytea arguments and results (see recent discussion on -bugs and also 
recent docs patch). What he needs is a way to have bytea (and possibly 
other unknown types) passed as binary data to and from plperl. The 
conversion overhead is too big, both computationally and in increased 
memory usage. After discussing some possibilities, we decided that maybe 
the best approach would be to allow a custom GUC variable that would 
specify a list of types to be passed in binary form with no conversion, e.g.
 plperl.pass_as_binary = 'bytea, other-type'

This would affect function args, trigger data, return results, and I 
think it should also apply to arguments for SPI prepared queries and to 
SPI returned results.

If this seems like a good idea maybe it should go on the TODO list in 
whatever is the current incarnation.
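
To make the conversion overhead concrete, here is a rough sketch in Python
(not plperl) of what a text round trip costs for binary data. It assumes
bytea's traditional "escape" text form, in which a backslash is doubled and
any byte outside printable ASCII becomes a backslash plus three octal
digits. The function names are made up for illustration and are not part of
PostgreSQL or plperl.

```python
def bytea_escape_out(data: bytes) -> str:
    """Approximate bytea's 'escape' style text output (illustrative only)."""
    parts = []
    for b in data:
        if b == 0x5C:               # backslash is doubled
            parts.append("\\\\")
        elif 0x20 <= b <= 0x7E:     # printable ASCII is emitted as-is
            parts.append(chr(b))
        else:                       # everything else becomes \ooo (octal)
            parts.append("\\%03o" % b)
    return "".join(parts)


def bytea_escape_in(text: str) -> bytes:
    """Reverse of bytea_escape_out, for the strings it produces."""
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] != "\\":
            out.append(ord(text[i]))
            i += 1
        elif text[i + 1] == "\\":   # doubled backslash
            out.append(0x5C)
            i += 2
        else:                       # three octal digits
            out.append(int(text[i + 1:i + 4], 8))
            i += 4
    return bytes(out)


payload = bytes(range(256))         # every possible byte value once
as_text = bytea_escape_out(payload)
print(len(payload), "bytes ->", len(as_text), "characters of text")
```

For this payload the text form is nearly three times the size of the
original, and both directions walk the data byte by byte, which is the
kind of cost a binary pass-through would avoid.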


cheers

andrew




Re: plperl vs. bytea

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> After discussing some possibilities, we decided that maybe 
> the best approach would be to allow a custom GUC variable that would 
> specify a list of types to be passed in binary form with no conversion, e.g.

>   plperl.pass_as_binary = 'bytea, other-type'

At minimum this GUC would have to be superuser-only, and even then the
security risks seem a bit high.  But the real problem with this thinking
is the same one I already pointed out to Theo: why do you think this
issue is plperl-specific?
        regards, tom lane


Re: plperl vs. bytea

From
Andrew Dunstan
Date:

Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>   
>> After discussing some possibilities, we decided that maybe 
>> the best approach would be to allow a custom GUC variable that would 
>> specify a list of types to be passed in binary form with no conversion, e.g.
>>     
>
>   
>>   plperl.pass_as_binary = 'bytea, other-type'
>>     
>
> At minimum this GUC would have to be superuser-only, and even then the
> security risks seem a bit high.  But the real problem with this thinking
> is the same one I already pointed out to Theo: why do you think this
> issue is plperl-specific?
>
>     
>   

It's not. If we really want to tackle this root and branch without 
upsetting legacy code, I think we'd need to have a way of marking data 
items as binary in the grammar, e.g.
    create function myfunc(myarg binary bytea) returns binary bytea
        language plperl as $$ ...$$;

That's what I originally suggested to Theo. It would be a lot more work, 
though :-)

cheers

andrew


Re: plperl vs. bytea

From
Peter Eisentraut
Date:
Andrew Dunstan wrote:
> It's not. If we really want to tackle this root and branch without
> upsetting legacy code, I think we'd need to have a way of marking
> data items as binary in the grammar, e.g.
>
>   create function myfunc(myarg binary bytea) returns binary bytea
> language plperl as $$ ...$$;

This ought to be a property of data type plus language, not a property 
of a function.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/


Re: plperl vs. bytea

From
Andrew Dunstan
Date:

Peter Eisentraut wrote:
> Andrew Dunstan wrote:
>   
>> It's not. If we really want to tackle this root and branch without
>> upsetting legacy code, I think we'd need to have a way of marking
>> data items as binary in the grammar, e.g.
>>
>>   create function myfunc(myarg binary bytea) returns binary bytea
>> language plperl as $$ ...$$;
>>     
>
> This ought to be a property of data type plus language, not a property 
> of a function.
>
>   

Why should it?

And how would you do it in such a way that it didn't break legacy code?

My GUC proposal would have made it language+type specific, but Tom 
didn't like that approach.

cheers

andrew


Re: plperl vs. bytea

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Peter Eisentraut wrote:
>> This ought to be a property of data type plus language, not a property 
>> of a function.

> Why should it?

> And how would you do it in such a way that it didn't break legacy code?

> My GUC proposal would have made it language+type specific, but Tom 
> didn't like that approach.

It may indeed need to be language+type specific; what I was objecting to
was the proposal of an ad-hoc plperl-specific solution without any
consideration for other languages (or other data types for that matter).
I think that's working at the wrong level of detail, at least for
initial design.

What we've basically got here is a complaint that the default
textual-representation-based method for transmitting PL function
parameters and results is awkward and inefficient for bytea.
So the first question is whether this is really localized to only
bytea, and if not which other types have got similar issues.
(Even if you make the case that no other scalar types need help,
what of bytea[] and composite types containing bytea or bytea[]?)

After that we have to look at which PLs have the issue.  I think
this is largely driven by what the PL's internal type system is 
like, in particular does it have a datatype that is a natural
conversion target for bytea, or other types with the same issue?
(Tcl for instance once did not have 8-bit-clean strings, though
I think it does today.)

After we've got a handle on the scope of the problem we can start
to think about solutions.
        regards, tom lane


Re: plperl vs. bytea

From
"Pavel Stehule"
Date:
> What we've basically got here is a complaint that the default
> textual-representation-based method for transmitting PL function
> parameters and results is awkward and inefficient for bytea.
> So the first question is whether this is really localized to only
> bytea, and if not which other types have got similar issues.
> (Even if you make the case that no other scalar types need help,
> what of bytea[] and composite types containing bytea or bytea[]?)
>

It can be a solution for known issues. The current textual representation
is more of an ugly hack than anything else.

Regards
Pavel Stehule


Re: plperl vs. bytea

From
Martijn van Oosterhout
Date:
On Sun, May 06, 2007 at 08:48:28PM -0400, Tom Lane wrote:
> What we've basically got here is a complaint that the default
> textual-representation-based method for transmitting PL function
> parameters and results is awkward and inefficient for bytea.
> So the first question is whether this is really localized to only
> bytea, and if not which other types have got similar issues.
> (Even if you make the case that no other scalar types need help,
> what of bytea[] and composite types containing bytea or bytea[]?)

I must say I was indeed surprised by the idea that bytea is passed by
text, since Perl handles embedded nulls in strings without any problem
at all. Does this mean integers are passed as text also? I would have
expected an array argument to be passed as an array, but now I'm not so
sure.

So I'm with Tom on this one: there needs to be a serious discussion
about how types are passed to Perl and the costs associated with it.

I do have one problem though: for bytea/integers/floats Perl has
appropriate internal representations. But what about other user-defined
types? Say the user-defined UUID type, it should probably also be passed
as a byte string, yet how could Perl know that? That would imply that
user-defined types need to be able to specify how they are passed to
PLs, to *any* PL.

So fixing it for bytea is one thing, but there's a bigger issue here
that needs discussion.

Have a nice day,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Re: plperl vs. bytea

From
Tino Wildenhain
Date:
Martijn van Oosterhout wrote:
...
> I do have one problem though: for bytea/integers/floats Perl has
> appropriate internal representations. But what about other user-defined
> types? Say the user-defined UUID type, it should probably also be passed
> as a byte string, yet how could Perl know that? That would imply that
> user-defined types need to be able to specify how they are passed to
> PLs, to *any* PL.
>

Yes, exactly. One way could be to pass the type as binary and provide
a wrapper class for the PLs, which then calls the type's input/output
routines at the string boundaries unless overridden by a
user implementation. So default handling could be done in the string
representation of the type, whatever that is, and for a defined set
of types every PL could implement special treatment such as
mapping to natural types.

This handling can be done independently for every PL implementation,
since for most types it would just move the current type treatment
a bit closer to the user code instead of doing all of it
in the call handler.
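
The idea above could be sketched roughly as follows. This is only an
illustration in Python of "text by default, native where the PL registers
something better". The class name, method names and the stand-in text I/O
callables are all hypothetical and do not correspond to any existing
PostgreSQL API.

```python
class TypeBridge:
    """Hypothetical per-PL registry of native conversions with a text fallback."""

    def __init__(self, text_out, text_in):
        # text_out/text_in stand in for the SQL type's own I/O functions.
        self._text_out = text_out
        self._text_in = text_in
        self._to_native = {}
        self._from_native = {}

    def register(self, type_name, to_native, from_native):
        """Declare special treatment for one SQL type in this PL."""
        self._to_native[type_name] = to_native
        self._from_native[type_name] = from_native

    def to_pl(self, type_name, datum):
        """Convert a value going into the PL; fall back to text."""
        conv = self._to_native.get(type_name)
        return conv(datum) if conv else self._text_out(type_name, datum)

    def from_pl(self, type_name, value):
        """Convert a value coming back from the PL; fall back to text."""
        conv = self._from_native.get(type_name)
        return conv(value) if conv else self._text_in(type_name, value)


# Stand-in text I/O: a real handler would call the type's I/O routines.
bridge = TypeBridge(text_out=lambda t, d: str(d), text_in=lambda t, s: s)
bridge.register("bytea", to_native=bytes, from_native=bytes)
bridge.register("float8", to_native=float, from_native=float)

print(bridge.to_pl("bytea", b"\x00\x01"))   # registered: stays a byte string
print(bridge.to_pl("point", (1, 2)))        # not registered: goes through text
```

Types nobody has registered keep today's behaviour, so legacy code would
see no change unless a PL opts a type in.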

The second problem is the language interface for scripting outside of the
database. Efficient and lossless type handling there would improve some
situations; maybe a similar approach could be taken here.

Regards
Tino


Re: plperl vs. bytea

From
Andrew Dunstan
Date:

Tom Lane wrote:
>
>> My GUC proposal would have made it language+type specific, but Tom 
>> didn't like that approach.
>>     
>
> It may indeed need to be language+type specific; what I was objecting to
> was the proposal of an ad-hoc plperl-specific solution without any
> consideration for other languages (or other data types for that matter).
> I think that's working at the wrong level of detail, at least for
> initial design.
>
> What we've basically got here is a complaint that the default
> textual-representation-based method for transmitting PL function
> parameters and results is awkward and inefficient for bytea.
> So the first question is whether this is really localized to only
> bytea, and if not which other types have got similar issues.
> (Even if you make the case that no other scalar types need help,
> what of bytea[] and composite types containing bytea or bytea[]?)
>   

Well, the proposal would have allowed the user to specify the types to 
be passed binary, so it wouldn't have been bytea only.

Array types are currently passed as text. This item used to be on the 
TODO list but it disappeared at some stage:

. Pass arrays natively instead of as text between plperl and postgres

(Perhaps it's naughty of me to observe that if we had a tracker we might 
know why it disappeared). Arrays can be returned as arrayrefs, and 
plperl has a little postprocessing magic that turns that into text which 
will in turn be parsed back into a postgres array. Not very efficient 
but it's a placeholder until we get better array support.
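
As a rough picture of that postprocessing step, the sketch below (in
Python, not plperl's actual code) renders a nested list as PostgreSQL
array literal text, which the server would then have to parse back into an
array. The function name is made up, and the quoting is deliberately
simplified: every string element is double-quoted.

```python
def pg_array_literal(value):
    """Render a (possibly nested) list as PostgreSQL array literal text."""
    if isinstance(value, list):
        return "{" + ",".join(pg_array_literal(v) for v in value) + "}"
    if value is None:
        return "NULL"
    if isinstance(value, (int, float)):
        return str(value)
    # Strings are always double-quoted here, with backslash and quote escaped.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


print(pg_array_literal([[1, 2], [3, 4]]))
print(pg_array_literal(["a b", 'x"y', None]))
```

Every element is stringified on the way out and parsed again on the way
in, which is the inefficiency native array passing would remove.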

Composites are in fact passed as hashrefs and can be returned as 
hashrefs. Unfortunately, this is not true recursively - a composite 
within a composite will be received as text.

Another aspect of this is how we deal with SPI arguments and results.  I 
need to look into that, but sufficient unto the day ...


cheers

andrew


Re: plperl vs. bytea

From
Andrew Dunstan
Date:

Tino Wildenhain wrote:
> Martijn van Oosterhout wrote:
> ...
> > I do have one problem though: for bytea/integers/floats Perl has
> > appropriate internal representations. But what about other user-defined
> > types? Say the user-defined UUID type, it should probably also be passed
> > as a byte string, yet how could Perl know that? That would imply that
> > user-defined types need to be able to specify how they are passed to
> > PLs, to *any* PL.
> >
> Yes exactly. One way could be to pass the type binary and provide
> a hull class for the PL/languages which then call the input/output
> routines on the string boundaries of the type unless overridden by
> user implementation. So default handling could be done in string
> representation of the type whatever that is and for a defined set
> of types every pl/language could implement special treatment like
> mapping to natural types.
>
> This handling can be done independently for every pl implementation
> since it would for the most types just move the current type treatment
> just a bit closer to the user code instead of doing all of it
> in the call handler.
>
> 2nd problem is language interface for outside of the database scripting.
> Efficient and lossless type handling there would improve some
> situations - maybe a similar approach could be taken here.
>
>

This seems like an elaborate piece of scaffolding for a relatively small 
problem.

This does not need to be over-engineered, IMNSHO.

cheers

andrew




Re: plperl vs. bytea

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> On Sun, May 06, 2007 at 08:48:28PM -0400, Tom Lane wrote:
>> What we've basically got here is a complaint that the default
>> textual-representation-based method for transmitting PL function
>> parameters and results is awkward and inefficient for bytea.

> I must say I was indeed surprised by the idea that bytea is passed by
> text, since Perl handles embedded nulls in strings without any problem
> at all. Does this mean integers are passed as text also?

Pretty much everything is passed as text.  This is a historical
accident, in part: our first PL with an external interpreter was pltcl,
and Tcl of the day had no other variable type besides "text string".
(They've gotten smarter since then, but from a user's-eye point of view
it's still true that every value in Tcl is a string.)  So it was natural
to decree that the value transmission protocol was just to convert to
text and back with the SQL datatype I/O functions.  Later PLs copied
that decision without thinking hard about it.  We've wedged a few bits
of custom transmission protocol into plperl for arrays and records, but
it's been pretty ad-hoc each time.  Seems it's time to take a step back
and question the assumptions.
        regards, tom lane


Re: plperl vs. bytea

From
Tino Wildenhain
Date:
Andrew Dunstan wrote:
> 
> 
> Tino Wildenhain wrote:
>> Martijn van Oosterhout wrote:
>> ...
>> > I do have one problem though: for bytea/integers/floats Perl has
>> > appropriate internal representations. But what about other user-defined
>> > types? Say the user-defined UUID type, it should probably also be passed
>> > as a byte string, yet how could Perl know that? That would imply that
>> > user-defined types need to be able to specify how they are passed to
>> > PLs, to *any* PL.
>> >
>> Yes exactly. One way could be to pass the type binary and provide
>> a hull class for the PL/languages which then call the input/output
>> routines on the string boundaries of the type unless overridden by
>> user implementation. So default handling could be done in string
>> representation of the type whatever that is and for a defined set
>> of types every pl/language could implement special treatment like
>> mapping to natural types.
>>
>> This handling can be done independently for every pl implementation
>> since it would for the most types just move the current type treatment
>> just a bit closer to the user code instead of doing all of it
>> in the call handler.
>>
>> 2nd problem is language interface for outside of the database scripting.
>> Efficient and lossless type handling there would improve some
>> situations - maybe a similar approach could be taken here.
>>
>>
> 
> This seems like an elaborate piece of scaffolding for a relatively small 
> problem.
> 
> This does not need to be over-engineered, IMNSHO.

Well could you explain where it would appear over-engineered?
All I was proposing is to move the rather hard-coded
type mapping to a softer approach where the language
is able to support it.

Is there any deficiency in Perl that makes it harder to
do in a clean way?

Regards
Tino



Re: plperl vs. bytea

From
Andrew Dunstan
Date:

Tino Wildenhain wrote:
> Andrew Dunstan wrote:
>>
>>
>> Tino Wildenhain wrote:
>>> Martijn van Oosterhout wrote:
>>> ...
>>> > I do have one problem though: for bytea/integers/floats Perl has
>>> > appropriate internal representations. But what about other user-defined
>>> > types? Say the user-defined UUID type, it should probably also be passed
>>> > as a byte string, yet how could Perl know that? That would imply that
>>> > user-defined types need to be able to specify how they are passed to
>>> > PLs, to *any* PL.
>>> >
>>> Yes exactly. One way could be to pass the type binary and provide
>>> a hull class for the PL/languages which then call the input/output
>>> routines on the string boundaries of the type unless overridden by
>>> user implementation. So default handling could be done in string
>>> representation of the type whatever that is and for a defined set
>>> of types every pl/language could implement special treatment like
>>> mapping to natural types.
>>>
>>> This handling can be done independently for every pl implementation
>>> since it would for the most types just move the current type treatment
>>> just a bit closer to the user code instead of doing all of it
>>> in the call handler.
>>>
>>> 2nd problem is language interface for outside of the database 
>>> scripting.
>>> Efficient and lossless type handling there would improve some
>>> situations - maybe a similar approach could be taken here.
>>>
>>>
>>
>> This seems like an elaborate piece of scaffolding for a relatively 
>> small problem.
>>
>> This does not need to be over-engineered, IMNSHO.
>
> Well could you explain where it would appear over-engineered?
> All I was proposing is to move the rather hard-coded
> type mapping to a softer approach where the language
> is able to support it.
>
> Is there any deficiency in Perl that makes it harder to
> do in a clean way?
>
>

Anything that imposes extra requirements on type creators seems undesirable.

I'm not sure either that the UUID example is a very good one. This whole 
problem arose because of performance problems handling large gobs of 
data, not just anything that happens to be binary.


cheers

andrew


Re: plperl vs. bytea

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Tino Wildenhain wrote:
>> Andrew Dunstan wrote:
>>> This does not need to be over-engineered, IMNSHO.
>> 
>> Well could you explain where it would appear over-engineered?

> Anything that imposes extra requirements on type creators seems undesirable.

> I'm not sure either that the UUID example is a very good one. This whole 
> problem arose because of performance problems handling large gobs of 
> data, not just anything that happens to be binary.

Well, we realize that bytea has got a performance problem, but are we so
sure that nothing else does?  I don't want to stick in a one-purpose
wart only to find later that we need a few more warts of the same kind.

An example of something else we ought to be considering is binary
transmission of float values.  The argument in favor of that is not
so much performance (although text-and-back conversion is hardly cheap)
as it is that the conversion is potentially lossy, since float8out
doesn't by default generate enough digits to ensure a unique
back-conversion.
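
The lossiness can be shown with a small sketch in Python rather than in
the server itself. It assumes the default text output carries about 15
significant digits, whereas an IEEE double needs up to 17 to round-trip.
The helper name is made up for illustration.

```python
import struct

x = 0.1 + 0.2   # a double with no short exact decimal form

text15 = "%.15g" % x   # 15 significant digits (assumed default output)
text17 = "%.17g" % x   # 17 significant digits

lossy = float(text15)  # value after a 15-digit text round trip
exact = float(text17)  # value after a 17-digit text round trip


def same_bits(a, b):
    """True if two doubles have identical binary representations."""
    return struct.pack("<d", a) == struct.pack("<d", b)


print(text15, same_bits(x, lossy))
print(text17, same_bits(x, exact))
```

With 15 digits the value that comes back is a different double from the
one that went in; with 17 digits, or with binary transmission, it is
bit-for-bit the same.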

ISTM there are three reasons for considering non-text-based
transmission:

1. Performance, as in the bytea case
2. Avoidance of information loss, as for float
3. Providing a natural/convenient mapping to the PL's internal data types,
   as we already do --- but incompletely --- for arrays and records

It's clear that the details of #3 have to vary across PLs, but I'd
like it not to vary capriciously.  For instance plperl currently has
special treatment for returning perl arrays as SQL arrays, but AFAICS
from the manual not for going in the other direction; plpython and
pltcl overlook arrays entirely, even though there are natural mappings
they could and should be using.

I don't know to what extent we should apply point #3 to situations other
than arrays and records, but now is the time to think about it.  An
example: working with the geometric types in a PL function is probably
going to be pretty painful for lack of simple access to the constituent
float values (not to mention the lossiness problem).

We should also be considering some non-core PLs such as PL/Ruby and
PL/R; they might provide additional examples to influence our thinking.
        regards, tom lane


Re: plperl vs. bytea

From
Andrew Dunstan
Date:

Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
>   
>> Tino Wildenhain wrote:
>>     
>>> Andrew Dunstan wrote:
>>>       
>>>> This does not need to be over-engineered, IMNSHO.
>>>>         
>>> Well could you explain where it would appear over-engineered?
>>>       
>
>   
>> Anything that imposes extra requirements on type creators seems undesirable.
>>     
>
>   
>> I'm not sure either that the UUID example is a very good one. This whole 
>> problem arose because of performance problems handling large gobs of 
>> data, not just anything that happens to be binary.
>>     
>
> Well, we realize that bytea has got a performance problem, but are we so
> sure that nothing else does?  I don't want to stick in a one-purpose
> wart only to find later that we need a few more warts of the same kind.
>
> An example of something else we ought to be considering is binary
> transmission of float values.  The argument in favor of that is not
> so much performance (although text-and-back conversion is hardly cheap)
> as it is that the conversion is potentially lossy, since float8out
> doesn't by default generate enough digits to ensure a unique
> back-conversion.
>
> ISTM there are three reasons for considering non-text-based
> transmission:
>
> 1. Performance, as in the bytea case
> 2. Avoidance of information loss, as for float
> 3. Providing a natural/convenient mapping to the PL's internal data types,
>    as we already do --- but incompletely --- for arrays and records
>
> It's clear that the details of #3 have to vary across PLs, but I'd
> like it not to vary capriciously.  For instance plperl currently has
> special treatment for returning perl arrays as SQL arrays, but AFAICS
> from the manual not for going in the other direction; plpython and
> pltcl overlook arrays entirely, even though there are natural mappings
> they could and should be using.
>
> I don't know to what extent we should apply point #3 to situations other
> than arrays and records, but now is the time to think about it.  An
> example: working with the geometric types in a PL function is probably
> going to be pretty painful for lack of simple access to the constituent
> float values (not to mention the lossiness problem).
>
> We should also be considering some non-core PLs such as PL/Ruby and
> PL/R; they might provide additional examples to influence our thinking.
>   

OK, we have a lot of work to do here, then.

I can really only speak with any significant knowledge on the perl 
front. Fundamentally, it has 3 types of scalars: IV, NV and PV (integer, 
float, string). IV can accommodate at least the largest integer or 
pointer type on the platform, NV a double, and PV an arbitrary string of 
bytes.

As for structured types, as I noted elsewhere we have some of the work 
done for plperl. My suggestion would be to complete it for plperl and 
get it fully orthogonal and then retrofit that to plpython/pltcl.

I've actually been worried for some time that the conversion glue was 
probably imposing significant penalties on the non-native PLs, so I'm 
glad to see this getting some attention.


cheers

andrew


Re: plperl vs. bytea

From
Bruce Momjian
Date:
Added to TODO:

    o Allow data to be passed in native language formats, rather
      than only text
      http://archives.postgresql.org/pgsql-hackers/2007-05/msg00289$


-- 
  Bruce Momjian  <bruce@momjian.us>          http://momjian.us
  EnterpriseDB                               http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: plperl vs. bytea

From
Hannu Krosing
Date:
One fine day, Mon, 2007-05-07 at 13:57, Andrew Dunstan wrote:
> 
> Tom Lane wrote:
> > Andrew Dunstan <andrew@dunslane.net> writes:
> >   
> >> Tino Wildenhain wrote:
> >>     
> >>> Andrew Dunstan wrote:
> >>>       
> >>>> This does not need to be over-engineered, IMNSHO.
> >>>>         
> >>> Well could you explain where it would appear over-engineered?
> >>>       
> >
> >   
> >> Anything that imposes extra requirements on type creators seems undesirable.
> >>     
> >
> >   
> >> I'm not sure either that the UUID example is a very good one. This whole 
> >> problem arose because of performance problems handling large gobs of 
> >> data, not just anything that happens to be binary.
> >>     
> >
> > Well, we realize that bytea has got a performance problem, but are we so
> > sure that nothing else does?  I don't want to stick in a one-purpose
> > wart only to find later that we need a few more warts of the same kind.
> >
> > An example of something else we ought to be considering is binary
> > transmission of float values.  The argument in favor of that is not
> > so much performance (although text-and-back conversion is hardly cheap)
> > as it is that the conversion is potentially lossy, since float8out
> > doesn't by default generate enough digits to ensure a unique
> > back-conversion.
> >
> > ISTM there are three reasons for considering non-text-based
> > transmission:
> >
> > 1. Performance, as in the bytea case
> > 2. Avoidance of information loss, as for float
> > 3. Providing a natural/convenient mapping to the PL's internal data types,
> >    as we already do --- but incompletely --- for arrays and records
> >
> > It's clear that the details of #3 have to vary across PLs, but I'd
> > like it not to vary capriciously.  For instance plperl currently has
> > special treatment for returning perl arrays as SQL arrays, but AFAICS
> > from the manual not for going in the other direction; plpython and
> > pltcl overlook arrays entirely, even though there are natural mappings
> > they could and should be using.

plpy (from http://python.projects.postgresql.org/project/be.html ) goes
to the other extreme and exposes the whole postgresql type system to the
embedded python interpreter.

> > I don't know to what extent we should apply point #3 to situations other
> > than arrays and records, but now is the time to think about it.  

If we can avoid copying/converting large(ish) values between postgresql
and the embedded language, we should try to do it. The main problems seem
to be the differences in memory management (alloc/free, palloc,
refcounting/GC) between pg and the embedded languages.

> > An
> > example: working with the geometric types in a PL function is probably
> > going to be pretty painful for lack of simple access to the constituent
> > float values (not to mention the lossiness problem).

Of course we should provide access to subparts of pg types, either by
writing some wrapper class/accessor functions or by providing access
through postgresql's existing functions.

> > We should also be considering some non-core PLs such as PL/Ruby and
> > PL/R; they might provide additional examples to influence our thinking.
> >   
> 
> OK, we have a lot of work to do here, then.
> 
> I can really only speak with any significant knowledge on the perl 
> front. Fundamentally, it has 3 types of scalars: IV, NV and PV (integer, 
> float, string). IV can accommodate at least the largest integer or 
> pointer type on the platform, NV a double, and PV an arbitrary string of 
> bytes.

OTOH python has had an extensible type system from the start (i.e.
everything is an object), and thus could be painlessly (just SMOP)
extended to use postgresql's native types when there is no 1:1 match with
existing types.

> As for structured types, as I noted elsewhere we have some of the work 
> done for plperl. My suggestion would be to complete it for plperl and 
> get it fully orthogonal and then retrofit that to plpython/pltcl.
> 
> I've actually been worried for some time that the conversion glue was 
> probably imposing significant penalties on the non-native PLs, so I'm 
> glad to see this getting some attention.
> 
> 
> cheers
> 
> andrew
> 