Re: Bytea PL/Perl transform - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: Bytea PL/Perl transform
Date
Msg-id CAFj8pRB9S7gT-hkb4cUwwv0ApJs4mZ6Td8QhxekcPBmHN03U0w@mail.gmail.com
Whole thread Raw
In response to Re: Bytea PL/Perl transform  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: Bytea PL/Perl transform
List pgsql-hackers
Hi

út 30. 1. 2024 v 18:35 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:


út 30. 1. 2024 v 18:26 odesílatel Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> napsal:
Pavel Stehule <pavel.stehule@gmail.com> writes:

> út 30. 1. 2024 v 17:46 odesílatel Dagfinn Ilmari Mannsåker <
> ilmari@ilmari.org> napsal:
>
>> Pavel Stehule <pavel.stehule@gmail.com> writes:
>>
>> > út 30. 1. 2024 v 17:18 odesílatel Dagfinn Ilmari Mannsåker <
>> > ilmari@ilmari.org> napsal:
>> >
>> >> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> >>
>> >> > út 30. 1. 2024 v 16:43 odesílatel Dagfinn Ilmari Mannsåker <
>> >> > ilmari@ilmari.org> napsal:
>> >> >
>> >> >> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> >> >>
>> >> >> > I inserted perl reference support - hstore_plperl and json_plperl
>> does
>> >> >> it.
>> >> >> >
>> >> >> > +<->/* Dereference references recursively. */
>> >> >> > +<->while (SvROK(in))
>> >> >> > +<-><-->in = SvRV(in);
>> >> >>
>> >> >> That code in hstore_plperl and json_plperl is only relevant because
>> they
>> >> >> deal with non-scalar values (hashes for hstore, and also arrays for
>> >> >> json) which must be passed as references.  The recursive nature of
>> the
>> >> >> dereferencing is questionable, and masked the bug fixed by commit
>> >> >> 1731e3741cbbf8e0b4481665d7d523bc55117f63.
>> >> >>
>> >> >> bytea_plperl only deals with scalars (specifically strings), so
>> should
>> >> >> not concern itself with references.  In fact, this code breaks
>> returning
>> >> >> objects with overloaded stringification, for example:
>> >> >>
>> >> >> CREATE FUNCTION plperlu_overload() RETURNS bytea LANGUAGE plperlu
>> >> >>   TRANSFORM FOR TYPE bytea
>> >> >>   AS $$
>> >> >>     package StringOverload { use overload '""' => sub { "stuff" }; }
>> >> >>     return bless {}, "StringOverload";
>> >> >>   $$;
>> >> >>
>> >> >> This makes the server crash with an assertion failure from Perl
>> because
>> >> >> SvPVbyte() was passed a non-scalar value:
>> >> >>
>> >> >> postgres: ilmari regression_bytea_plperl [local] SELECT: sv.c:2865:
>> >> >> Perl_sv_2pv_flags:
>> >> >> Assertion `SvTYPE(sv) != SVt_PVAV && SvTYPE(sv) != SVt_PVHV &&
>> >> SvTYPE(sv)
>> >> >> != SVt_PVFM' failed.
>> >> >>
>> >> >> If I remove the dereferincing loop it succeeds:
>> >> >>
>> >> >> SELECT encode(plperlu_overload(), 'escape') AS string;
>> >> >>  string
>> >> >> --------
>> >> >>  stuff
>> >> >> (1 row)
>> >> >>
>> >> >> Attached is a v2 patch which removes the dereferencing and includes
>> the
>> >> >> above example as a test.
>> >> >>
>> >> >
>> >> > But without dereference it returns bad value.
>> >>
>> >> Where exactly does it return a bad value?  The existing tests pass, and
>> >> the one I included shows that it does the right thing in that case too.
>> >> If you pass it an unblessed reference it returns the stringified version
>> >> of that, as expected.
>> >>
>> >
>> > ugly test code
>> >
>> > (2024-01-30 13:44:28) postgres=# CREATE or replace FUNCTION
>> > perl_inverse_bytes(bytea) RETURNS bytea
>> > TRANSFORM FOR TYPE bytea
>> > AS $$ my $bytes =  pack 'H*', '0123'; my $ref = \$bytes;
>>
>> You are returning a reference, not a string.
>>
>
> I know, but for this case, should not be raised an error?

I don't think so, as I explained in my previous reply:

> There's no reason to ban references, that would break every Perl
> programmer's expectations.

To elaborate on this: when a function is defined to return a string
(which bytea effectively is, as far as Perl is converned), I as a Perl
programmer would expect PL/Perl to just stringify whatever value I
returned, according to the usual Perl rules.

ok

Pavel
 

I also said:

> If we really want to be strict, we should at least allow references to
> objects that overload stringification, as they are explicitly designed
> to be well-behaved as strings.  But that would be a lot of extra code
> for very little benefit over just letting Perl stringify everything.

By "a lot of code", I mean everything `string_amg`-related in the
amagic_applies() function
(https://github.com/Perl/perl5/blob/v5.38.0/gv.c#L3401-L3545).  We can't
just call it: it's only available since Perl 5.38 (released last year),
and we support Perl versions all the way back to 5.14.

- ilmari

I marked this patch as ready for committer.

It is almost trivial, make check-world, make doc passed

Regards

Pavel

pgsql-hackers by date:

Previous
From: Sergey Prokhorenko
Date:
Subject: Re: UUID v7
Next
From: Robert Haas
Date:
Subject: Re: Flushing large data immediately in pqcomm