Thread: pg_proc.probin should become text?

pg_proc.probin should become text?

From
Tom Lane
Date:
pg_proc.probin is currently declared as being type bytea.  Perhaps that
had some real utility once upon a time, but no currently defined
procedural language stores binary data there.  In fact, the only
thing that gets stored there is the shared-library file name for
C functions.  To make things even more fun, none of the code touching
probin is doing anything to honor the special escaping conventions for
bytea.  This was pretty broken already, and might perhaps explain some
of the weirder error reports we've gotten.  It is now obvious to me
that no shlib file name containing non-ASCII characters will correctly
survive a dump/reload, in any existing PG release.  Backslashes
accidentally fail to fail, at least for some values of
standard_conforming_strings.

However, with the hex bytea output patch in place, it's *completely*
broken.  I offer the following pg_dump output:

CREATE FUNCTION interpt_pp(path, path) RETURNS point   LANGUAGE c   AS
'\\x2f686f6d652f706f7374677265732f706773716c2f7372632f746573742f726567726573732f726567726573732e736c','interpt_pp';
 

which should of course have looked like this:

CREATE FUNCTION interpt_pp(path, path) RETURNS point   LANGUAGE c   AS
'/home/postgres/testversion/src/test/regress/regress.sl','interpt_pp';
 

I think that the least painful solution might be to change
pg_proc.probin to be declared as text.  Otherwise we're going to need
version-specific klugery in pg_dump and who knows where else.

Comments?
        regards, tom lane


Re: pg_proc.probin should become text?

From
Robert Haas
Date:
On Mon, Aug 3, 2009 at 10:03 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> pg_proc.probin is currently declared as being type bytea.  Perhaps that
> had some real utility once upon a time, but no currently defined
> procedural language stores binary data there.  In fact, the only
> thing that gets stored there is the shared-library file name for
> C functions.  To make things even more fun, none of the code touching
> probin is doing anything to honor the special escaping conventions for
> bytea.  This was pretty broken already, and might perhaps explain some
> of the weirder error reports we've gotten.  It is now obvious to me
> that no shlib file name containing non-ASCII characters will correctly
> survive a dump/reload, in any existing PG release.  Backslashes
> accidentally fail to fail, at least for some values of
> standard_conforming_strings.
>
> However, with the hex bytea output patch in place, it's *completely*
> broken.  I offer the following pg_dump output:
>
> CREATE FUNCTION interpt_pp(path, path) RETURNS point
>    LANGUAGE c
>    AS '\\x2f686f6d652f706f7374677265732f706773716c2f7372632f746573742f726567726573732f726567726573732e736c',
'interpt_pp';
>
> which should of course have looked like this:
>
> CREATE FUNCTION interpt_pp(path, path) RETURNS point
>    LANGUAGE c
>    AS '/home/postgres/testversion/src/test/regress/regress.sl', 'interpt_pp';
>
> I think that the least painful solution might be to change
> pg_proc.probin to be declared as text.  Otherwise we're going to need
> version-specific klugery in pg_dump and who knows where else.
>
> Comments?

Will that require a special hack in pg_migrator?

...Robert


Re: pg_proc.probin should become text?

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Mon, Aug 3, 2009 at 10:03 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
>> I think that the least painful solution might be to change
>> pg_proc.probin to be declared as text. �Otherwise we're going to need
>> version-specific klugery in pg_dump and who knows where else.

> Will that require a special hack in pg_migrator?

No, pg_dump (and hence pg_migrator) wouldn't care.

If we thought we were going to try to back-patch a fix for the
non-ASCII-char problem into older releases, then some other approach
would be preferable.  But given the fact that this hasn't gotten
complained of (at least not enough to be identified) in all the years
since Berkeley, I can't see doing the pushups that would be needed to
back-patch a fix.  It looks to me like everyone has been effectively
assuming that probin stores text, and we should just make it really
truly do that.
        regards, tom lane


Re: pg_proc.probin should become text?

From
David Fetter
Date:
On Mon, Aug 03, 2009 at 10:03:11PM -0400, Tom Lane wrote:
> However, with the hex bytea output patch in place, it's *completely*
> broken.  I offer the following pg_dump output:
> 
> CREATE FUNCTION interpt_pp(path, path) RETURNS point
>     LANGUAGE c
>     AS '\\x2f686f6d652f706f7374677265732f706773716c2f7372632f746573742f726567726573732f726567726573732e736c',
'interpt_pp';
> 
> which should of course have looked like this:
> 
> CREATE FUNCTION interpt_pp(path, path) RETURNS point
>     LANGUAGE c
>     AS '/home/postgres/testversion/src/test/regress/regress.sl', 'interpt_pp';

Oh, of course!  How could I have been so dense? ;)

> I think that the least painful solution might be to change
> pg_proc.probin to be declared as text.  Otherwise we're going to need
> version-specific klugery in pg_dump and who knows where else.
> 
> Comments?

+1 on changing it to text.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


Re: pg_proc.probin should become text?

From
Robert Haas
Date:
On Mon, Aug 3, 2009 at 10:40 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Mon, Aug 3, 2009 at 10:03 PM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
>>> I think that the least painful solution might be to change
>>> pg_proc.probin to be declared as text.  Otherwise we're going to need
>>> version-specific klugery in pg_dump and who knows where else.
>
>> Will that require a special hack in pg_migrator?
>
> No, pg_dump (and hence pg_migrator) wouldn't care.
>
> If we thought we were going to try to back-patch a fix for the
> non-ASCII-char problem into older releases, then some other approach
> would be preferable.  But given the fact that this hasn't gotten
> complained of (at least not enough to be identified) in all the years
> since Berkeley, I can't see doing the pushups that would be needed to
> back-patch a fix.  It looks to me like everyone has been effectively
> assuming that probin stores text, and we should just make it really
> truly do that.

Sounds good to me, then.

...Robert


Re: pg_proc.probin should become text?

From
Pavel Stehule
Date:
2009/8/4 Tom Lane <tgl@sss.pgh.pa.us>:
> pg_proc.probin is currently declared as being type bytea.  Perhaps that
> had some real utility once upon a time, but no currently defined
> procedural language stores binary data there.  In fact, the only
> thing that gets stored there is the shared-library file name for
> C functions.  To make things even more fun, none of the code touching
> probin is doing anything to honor the special escaping conventions for
> bytea.  This was pretty broken already, and might perhaps explain some
> of the weirder error reports we've gotten.  It is now obvious to me
> that no shlib file name containing non-ASCII characters will correctly
> survive a dump/reload, in any existing PG release.  Backslashes
> accidentally fail to fail, at least for some values of
> standard_conforming_strings.
>
> However, with the hex bytea output patch in place, it's *completely*
> broken.  I offer the following pg_dump output:
>
> CREATE FUNCTION interpt_pp(path, path) RETURNS point
>    LANGUAGE c
>    AS '\\x2f686f6d652f706f7374677265732f706773716c2f7372632f746573742f726567726573732f726567726573732e736c',
'interpt_pp';
>
> which should of course have looked like this:
>
> CREATE FUNCTION interpt_pp(path, path) RETURNS point
>    LANGUAGE c
>    AS '/home/postgres/testversion/src/test/regress/regress.sl', 'interpt_pp';
>
> I think that the least painful solution might be to change
> pg_proc.probin to be declared as text.  Otherwise we're going to need
> version-specific klugery in pg_dump and who knows where else.
>
> Comments?
>

-1

correct solution is moving the path to prosrc col or create new colum
"externalproc". I thing so probin should be useful for plpgsql -
sometime we should to store parser result from plpgsql compilation
stage in this column. So my option is don't change native sense of
this column. If we actually have not using, the we should to drop it,
not create some breaks in future.

Pavel Stehule

>                        regards, tom lane
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>


Re: pg_proc.probin should become text?

From
Tom Lane
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> 2009/8/4 Tom Lane <tgl@sss.pgh.pa.us>:
>> I think that the least painful solution might be to change
>> pg_proc.probin to be declared as text.

> correct solution is moving the path to prosrc col or create new colum
> "externalproc". I thing so probin should be useful for plpgsql -
> sometime we should to store parser result from plpgsql compilation
> stage in this column. So my option is don't change native sense of
> this column. If we actually have not using, the we should to drop it,
> not create some breaks in future.

[ shrug... ]  Can't get excited about it.  What you propose would be a
significant amount of work and added complexity, because pg_dump or
somebody would have to take care to translate existing function
definitions properly.  And the benefit is purely speculative.
I seriously doubt that serializing plpgsql compilation trees into a
bytea representation would be worth anyone's time.  The original
Berkeley authors probably had some idea of storing compiled machine code
in probin, but nobody's done that either, and I'll bet long odds that
nobody ever will.  (The advantage compared to external C functions
implemented as we currently know them seems negligible.  The Berkeley
folk probably did not foresee the prevalence of loadable-shared-library
support, nor the general security-driven move away from allowing
execution of writable memory.)
        regards, tom lane


Re: pg_proc.probin should become text?

From
Pavel Stehule
Date:
2009/8/4 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> 2009/8/4 Tom Lane <tgl@sss.pgh.pa.us>:
>>> I think that the least painful solution might be to change
>>> pg_proc.probin to be declared as text.
>
>> correct solution is moving the path to prosrc col or create new colum
>> "externalproc". I thing so probin should be useful for plpgsql -
>> sometime we should to store parser result from plpgsql compilation
>> stage in this column. So my option is don't change native sense of
>> this column. If we actually have not using, the we should to drop it,
>> not create some breaks in future.
>
> [ shrug... ]  Can't get excited about it.  What you propose would be a
> significant amount of work and added complexity, because pg_dump or
> somebody would have to take care to translate existing function
> definitions properly.  And the benefit is purely speculative.
> I seriously doubt that serializing plpgsql compilation trees into a
> bytea representation would be worth anyone's time.  The original
> Berkeley authors probably had some idea of storing compiled machine code
> in probin, but nobody's done that either, and I'll bet long odds that
> nobody ever will.  (The advantage compared to external C functions
> implemented as we currently know them seems negligible.  The Berkeley
> folk probably did not foresee the prevalence of loadable-shared-library
> support, nor the general security-driven move away from allowing
> execution of writable memory.)
>
>                        regards, tom lane
>


Re: pg_proc.probin should become text?

From
Pavel Stehule
Date:
2009/8/4 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> 2009/8/4 Tom Lane <tgl@sss.pgh.pa.us>:
>>> I think that the least painful solution might be to change
>>> pg_proc.probin to be declared as text.
>
>> correct solution is moving the path to prosrc col or create new colum
>> "externalproc". I thing so probin should be useful for plpgsql -
>> sometime we should to store parser result from plpgsql compilation
>> stage in this column. So my option is don't change native sense of
>> this column. If we actually have not using, the we should to drop it,
>> not create some breaks in future.
>
> [ shrug... ]  Can't get excited about it.  What you propose would be a
> significant amount of work and added complexity, because pg_dump or
> somebody would have to take care to translate existing function
> definitions properly.  And the benefit is purely speculative.
> I seriously doubt that serializing plpgsql compilation trees into a
> bytea representation would be worth anyone's time.  The original
> Berkeley authors probably had some idea of storing compiled machine code
> in probin, but nobody's done that either, and I'll bet long odds that
> nobody ever will.  (The advantage compared to external C functions
> implemented as we currently know them seems negligible.  The Berkeley
> folk probably did not foresee the prevalence of loadable-shared-library
> support, nor the general security-driven move away from allowing
> execution of writable memory.)

I agree, so information about patch would be store in text field. But
I am not sure, if your fix isn't too simply. I haven't plan to compile
plpgsql to C or to binary code. But could be interesting link postgres
with some virtual machine like parrot or lua vm, and translate plpgsql
to p code. It's maybe far future.

Early future is integration main SQL parser to plpgsql. I am not sure,
but maybe we will need some persistent cache for store parametrized
sql queries. I though about store it in probin column.

Pavel

>
>                        regards, tom lane
>


Re: pg_proc.probin should become text?

From
Tom Lane
Date:
Pavel Stehule <pavel.stehule@gmail.com> writes:
> I agree, so information about patch would be store in text field. But
> I am not sure, if your fix isn't too simply. I haven't plan to compile
> plpgsql to C or to binary code. But could be interesting link postgres
> with some virtual machine like parrot or lua vm, and translate plpgsql
> to p code. It's maybe far future.

> Early future is integration main SQL parser to plpgsql. I am not sure,
> but maybe we will need some persistent cache for store parametrized
> sql queries. I though about store it in probin column.

Well, probin is the wrong place for any such thing anyhow.  Regardless
of datatype issues, the system is clearly built on the assumption that
the value of probin is to be specified *by the user* in CREATE FUNCTION.
If you want a cache for derived information it would need to be a new
column.

(I remain of the opinion that caching such stuff in pg_proc would be
a bad design decision.  Every data structure that goes to disk is
another data structure that you cannot easily change.  There just isn't
enough gain there to justify the maintenance headaches.)
        regards, tom lane


Re: pg_proc.probin should become text?

From
Alvaro Herrera
Date:
Pavel Stehule escribió:

> I agree, so information about patch would be store in text field. But
> I am not sure, if your fix isn't too simply. I haven't plan to compile
> plpgsql to C or to binary code. But could be interesting link postgres
> with some virtual machine like parrot or lua vm, and translate plpgsql
> to p code. It's maybe far future.

In this case I think it would make more sense to compile the code and
keep the p-code in backend local memory.  Keeping compiled code around
in more permanent or global storage would only make sense if you had
large amounts of code, and in that case I think we'd want to have a more
concrete and complete proposal.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: pg_proc.probin should become text?

From
Pavel Stehule
Date:
2009/8/4 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> I agree, so information about patch would be store in text field. But
>> I am not sure, if your fix isn't too simply. I haven't plan to compile
>> plpgsql to C or to binary code. But could be interesting link postgres
>> with some virtual machine like parrot or lua vm, and translate plpgsql
>> to p code. It's maybe far future.
>
>> Early future is integration main SQL parser to plpgsql. I am not sure,
>> but maybe we will need some persistent cache for store parametrized
>> sql queries. I though about store it in probin column.
>
> Well, probin is the wrong place for any such thing anyhow.  Regardless
> of datatype issues, the system is clearly built on the assumption that
> the value of probin is to be specified *by the user* in CREATE FUNCTION.
> If you want a cache for derived information it would need to be a new
> column.
>
> (I remain of the opinion that caching such stuff in pg_proc would be
> a bad design decision.  Every data structure that goes to disk is
> another data structure that you cannot easily change.  There just isn't
> enough gain there to justify the maintenance headaches.)

ook, I agree - but I am not sure, so some cache will be necessary.

Pavel


>
>                        regards, tom lane
>