Thread: BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4

BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4

From
"Tim Bunce"
Date:
The following bug has been logged online:

Bug reference:      5334
Logged by:          Tim Bunce
Email address:      Tim.Bunce@pobox.com
PostgreSQL version: 8.4.2
Operating system:   OS X
Description:        Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL
8.4
Details:

Version 2.22 of the Perl Safe module breaks PostgreSQL 8.4 (and probably
earlier versions) that have a default encoding of UTF-8.

Safe 2.2x added extra security for closures created inside Safe but then
returned and executed from outside Safe (which is what PostgreSQL does).
These closures are now wrapped so that they are executed with Safe
restrictions in effect.

This extra security is causing the 'utf8fix' code in PostgreSQL 8.4 to die
(with the error "'require' trapped by operation mask").

I'm investigating this currently but don't yet have a complete fix. (The
immediate problem appears to be easily fixed by switching to the simpler
utf8fix code I added for PostgreSQL 9.0. But a failure still happens if
warn() is called with utf8 string.)

Re: BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4

From
Tim Bunce
Date:
On Thu, Feb 18, 2010 at 10:50:23AM +0000, Tim Bunce wrote:
>
> The following bug has been logged online:
>
> Bug reference:      5334
> Logged by:          Tim Bunce
> Email address:      Tim.Bunce@pobox.com
> PostgreSQL version: 8.4.2
> Operating system:   OS X
> Description:        Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL
> 8.4
> Details:
>
> Version 2.22 of the Perl Safe module breaks PostgreSQL 8.4 (and probably
> earlier versions) that have a default encoding of UTF-8.
>
> Safe 2.2x added extra security for closures created inside Safe but then
> returned and executed from outside Safe (which is what PostgreSQL does).
> These closures are now wrapped so that they are executed with Safe
> restrictions in effect.
>
> This extra security is causing the 'utf8fix' code in PostgreSQL 8.4 to die
> (with the error "'require' trapped by operation mask").
>
> I'm investigating this currently but don't yet have a complete fix. (The
> immediate problem appears to be easily fixed by switching to the simpler
> utf8fix code I added for PostgreSQL 9.0. But a failure still happens if
> warn() is called with utf8 string.)

It took a depressingly large number of intense hours to work out what
was going on and then more to try to work out a relatively clean solution.

The underlying problem is in perl and Safe but sadly there's no
reasonable way to fix Safe such that PostgreSQL would work without
changes.

The attached patch (over REL8_4_STABLE) works around the problem.
The key line is:

    *PLPerl::utf8::SWASHNEW = \&utf8::SWASHNEW;

This allows the perl regex logic to call the SWASHNEW method that's
called when information from the Unicode character database is needed.
(The lack of that method was causing the regex logic to think that the
utf8 module wasn't loaded, so it would try to 'require' it but fail due
to the restrictions of the Safe compartment.)

The rest of the patch is updates the surrounding code to the same
simplified 'utf8fix' logic used in PostgreSQL 9.0, and the same Safe
version checks.

Tim.

Attachment

Re: BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4

From
Alex Hunsaker
Date:
On Thu, Feb 18, 2010 at 11:09, Tim Bunce <Tim.Bunce@pobox.com> wrote:
> The key line is:
>
> =C2=A0 =C2=A0*PLPerl::utf8::SWASHNEW =3D \&utf8::SWASHNEW;

Hrm... It seems to work for me in HEAD and AFAICS we dont have that
line.  Did I just miss it?  Or did you happen to fix it in another way
with your refactoring?

Another Idea that comes to mind would be instead of (in ::mksafefunc):
my $subref =3D ->reval(sub {} );
$subref->();

do:
my $subref =3D ->reval(sub {});
return sub { ->reval("$subreb->();"); }

or something...

I did a few quick tests but it failed miserably for me...  Im also not
fond of adding yet another closure. :)

> This allows the perl regex logic to call the SWASHNEW method that's
> called when information from the Unicode character database is needed.
> (The lack of that method was causing the regex logic to think that the
> utf8 module wasn't loaded, so it would try to 'require' it but fail due
> to the restrictions of the Safe compartment.)

Makes me think we might just be able to share some of utf8 package in the s=
afe?

> The rest of the patch is updates the surrounding code to the same
> simplified 'utf8fix' logic used in PostgreSQL 9.0, and the same Safe
> version checks.

=46rom a quick look it looks ok.
On Thu, Feb 18, 2010 at 01:36:59PM -0800, David E. Wheeler wrote:
> On Feb 18, 2010, at 10:09 AM, Tim Bunce wrote:
>
> > It took a depressingly large number of intense hours to work out what
> > was going on and then more to try to work out a relatively clean solution.
> >
> > The underlying problem is in perl and Safe but sadly there's no
> > reasonable way to fix Safe such that PostgreSQL would work without
> > changes.
>
> Hrm. I don't have this bug with Safe 3.19, FWIW.

That's because Safe 1.19 (which I presume you meant) doesn't execute
closures 'inside' the Safe compartment. So when the regex executes at
runtime the C code looks up the utf8::SWASHNEW method without a problem.

Tim.

Re: BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4

From
Tim Bunce
Date:
On Thu, Feb 18, 2010 at 11:32:38AM -0700, Alex Hunsaker wrote:
> On Thu, Feb 18, 2010 at 11:09, Tim Bunce <Tim.Bunce@pobox.com> wrote:
> > The key line is:
> >
> >    *PLPerl::utf8::SWASHNEW = \&utf8::SWASHNEW;
>
> Hrm... It seems to work for me in HEAD and AFAICS we dont have that
> line.  Did I just miss it?  Or did you happen to fix it in another way
> with your refactoring?

To be honest I'm not sure. I plan to look into that today.

> Another Idea that comes to mind would be instead of (in ::mksafefunc):
> my $subref = ->reval(sub {} );
> $subref->();
>
> do:
> my $subref = ->reval(sub {});
> return sub { ->reval("$subreb->();"); }
>
> or something...
>
> I did a few quick tests but it failed miserably for me...  Im also not
> fond of adding yet another closure. :)

No amount of closure wrapping will fix the problem.

> > This allows the perl regex logic to call the SWASHNEW method that's
> > called when information from the Unicode character database is needed.
> > (The lack of that method was causing the regex logic to think that the
> > utf8 module wasn't loaded, so it would try to 'require' it but fail due
> > to the restrictions of the Safe compartment.)
>
> Makes me think we might just be able to share some of utf8 package in the safe?

I tried. The perl utf8.c code does a method lookup of SWASHNEW to decide
if the utf8 module has been loaded. So if SWASHNEW is shared _before_
utf8 is loaded *and used* then the method lookup works (it finds the
shared stub) and the utf8 module never gets loaded.

(The *and used* complication is due to utf8.pm being a thin wrapper with
an AUTOLOAD trampoline that loads utf8_heavy.pl on first use. More fun.)

I do have a patch for Safe that, if utf8 is loaded, will ensure it's
'used' and then auto-share SWASHNEW. That won't fix PostgreSQL 8.x
though because utf8 isn't loaded at the time Safe->new is called.

> > The rest of the patch is updates the surrounding code to the same
> > simplified 'utf8fix' logic used in PostgreSQL 9.0, and the same Safe
> > version checks.
>
> From a quick look it looks ok.

Thanks.

Tim.

Re: BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4

From
Tim Bunce
Date:
I've got the patch to Safe ready but the more I think about it the more
I think the right fix is for Safe to automatically fully load utf8.pm
(and utf8_heavy.pl) and to always share SWASHNEW itself.

Assuming perl5-porters agree then the next release of Safe will do that
ad this patch won't be needed. (Other than it possibly being worthwhile
to detect the 'bad' versions of Safe.)

Tim.

On Fri, Feb 19, 2010 at 09:30:41AM +0000, Tim Bunce wrote:
> On Thu, Feb 18, 2010 at 11:32:38AM -0700, Alex Hunsaker wrote:
> > On Thu, Feb 18, 2010 at 11:09, Tim Bunce <Tim.Bunce@pobox.com> wrote:
> > > The key line is:
> > >
> > >    *PLPerl::utf8::SWASHNEW = \&utf8::SWASHNEW;
> >
> > Hrm... It seems to work for me in HEAD and AFAICS we dont have that
> > line.  Did I just miss it?  Or did you happen to fix it in another way
> > with your refactoring?
>
> To be honest I'm not sure. I plan to look into that today.
>
> > Another Idea that comes to mind would be instead of (in ::mksafefunc):
> > my $subref = ->reval(sub {} );
> > $subref->();
> >
> > do:
> > my $subref = ->reval(sub {});
> > return sub { ->reval("$subreb->();"); }
> >
> > or something...
> >
> > I did a few quick tests but it failed miserably for me...  Im also not
> > fond of adding yet another closure. :)
>
> No amount of closure wrapping will fix the problem.
>
> > > This allows the perl regex logic to call the SWASHNEW method that's
> > > called when information from the Unicode character database is needed.
> > > (The lack of that method was causing the regex logic to think that the
> > > utf8 module wasn't loaded, so it would try to 'require' it but fail due
> > > to the restrictions of the Safe compartment.)
> >
> > Makes me think we might just be able to share some of utf8 package in the safe?
>
> I tried. The perl utf8.c code does a method lookup of SWASHNEW to decide
> if the utf8 module has been loaded. So if SWASHNEW is shared _before_
> utf8 is loaded *and used* then the method lookup works (it finds the
> shared stub) and the utf8 module never gets loaded.
>
> (The *and used* complication is due to utf8.pm being a thin wrapper with
> an AUTOLOAD trampoline that loads utf8_heavy.pl on first use. More fun.)
>
> I do have a patch for Safe that, if utf8 is loaded, will ensure it's
> 'used' and then auto-share SWASHNEW. That won't fix PostgreSQL 8.x
> though because utf8 isn't loaded at the time Safe->new is called.
>
> > > The rest of the patch is updates the surrounding code to the same
> > > simplified 'utf8fix' logic used in PostgreSQL 9.0, and the same Safe
> > > version checks.
> >
> > From a quick look it looks ok.
>
> Thanks.
>
> Tim.
>

Re: [Tigerlead] BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4

From
"David E. Wheeler"
Date:
On Feb 18, 2010, at 10:09 AM, Tim Bunce wrote:

> It took a depressingly large number of intense hours to work out what
> was going on and then more to try to work out a relatively clean solution.
>
> The underlying problem is in perl and Safe but sadly there's no
> reasonable way to fix Safe such that PostgreSQL would work without
> changes.

Hrm. I don't have this bug with Safe 3.19, FWIW.

Best,

David

Re: BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4

From
Alex Hunsaker
Date:
On Fri, Feb 19, 2010 at 02:30, Tim Bunce <Tim.Bunce@pobox.com> wrote:
> On Thu, Feb 18, 2010 at 11:32:38AM -0700, Alex Hunsaker wrote:
> > On Thu, Feb 18, 2010 at 11:09, Tim Bunce <Tim.Bunce@pobox.com> wrote:
> >    *PLPerl::utf8::SWASHNEW =3D \&utf8::SWASHNEW;
> >
> > Hrm... It seems to work for me in HEAD and AFAICS we dont have that
> > line.  Did I just miss it?  Or did you happen to fix it in another way
> > with your refactoring?

> To be honest I'm not sure. I plan to look into that today.

My hunch is it has to do with the require strict;  require feature;
That's the only major difference I see (other than the require_op and
it being in its own package/file).

>> I did a few quick tests but it failed miserably for me... =C2=A0Im also =
not
>> fond of adding yet another closure. :)
>
> No amount of closure wrapping will fix the problem.

Yeah, brain fart... That's essentially what Safe.pm does now (and why
there is a problem :) )

>> Makes me think we might just be able to share some of utf8 package in th=
e safe?
>
> I tried. The perl utf8.c code does a method lookup of SWASHNEW to decide
> if the utf8 module has been loaded. So if SWASHNEW is shared _before_
> utf8 is loaded *and used* then the method lookup works (it finds the
> shared stub) and the utf8 module never gets loaded.

Hrm...  That seems wrong to me. Let me see If I can explain why.  The
below is what you seem to be saying:

package utf8;
sub import {  # or maybe this is a BEGIN
  return if(\&{'utf8::SWASHNEW'}; # already loaded
  # ok not loaded open the Unicode database and do junk which will
'trap' in safe
  do 'utf8_heavy.pl';
}

So if we define SWASHNEW without loading the unicode database how will
utf8/unicode work exactly?  I guess as long as it gets loaded at some
point it works.  So for postgres because we do the utf8 fix after
Safe->new and at that point we cant have any 'bad' strings, it will
work. (with your hack).  Sound right?

It seems to me a more correct fix would be to require utf8; inside of
the safe like we do strict.  Sorry thats a bit handwavy.  You have
obviously spent more time then me looking into this...

Im thinking (in pseudo code):

#define SAFE_OK
....
sub ::mksafefunc {
   permit->(qw(caller require));
   reval->('require utf8; 1;');
   deny->(qw(caller require));
...
}
sub ::mk_strict_safefunc {
    ...reval->('use strict; require utf8;)

}

static void
plperl_safe_init
{
    if (GetDatabaseEncoding() =3D=3D PG_UTF8)
   {
        eval_pv("my $a=3Dpack('U',0xC4); $a =3D~ /\\xE4\\d/i;", FALSE);
   }

   eval_pv(SAFE_MODULE, FALSE);
   eval_pv(SAFE_OK, FALSE)
}

One thing that stinks is while we might not do the utf8fix if we are
not PG_UTF8 we would always require utf8;.  And I dont see an easy way
around that in 8.4 :(  Also note that is all entirely untested :(  If
you think its sane (and it might not be) Im happy to work up a patch.
Id favor this approach as if you have utf8 strings the likely hood
that you want ::upgrade, ::downgrade, ::encode, ::valid or ::is_utf8
is fairly high.  Then again, no one has complained thus far...  Maybe
thats just me :)

Thoughts?

Anywhoo I cant reproduce this outside of postgres.  Maybe you can give
me a pointer?

use Safe();
binmode(STDOUT, ':utf8');
print $Safe::VERSION . "\n";
my $safe =3D Safe->new('t');
$safe->permit('print');
$safe->reval('sub { print "\x{263a}\n"; }')->();
print $@ ."\n" if($@);
-----
2.22
=E2=98=BA

Re: BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4

From
Alex Hunsaker
Date:
On Fri, Feb 19, 2010 at 06:06, Tim Bunce <Tim.Bunce@pobox.com> wrote:
> I've got the patch to Safe ready but the more I think about it the more
> I think the right fix is for Safe to automatically fully load utf8.pm
> (and utf8_heavy.pl) and to always share SWASHNEW itself.

It seems cleaner if we could just share say utf8::VERSION.  SWASHNEW
seems likely to be changed as it "feels" more like a implementation
detail.  But if thats what utf8 checks... well then thats what it
checks.

> Assuming perl5-porters agree then the next release of Safe will do that
> ad this patch won't be needed. (Other than it possibly being worthwhile
> to detect the 'bad' versions of Safe.)

It seems safer if there was some way to 'opt' in say if utf8 was
loaded then make safe do the above.  Or maybe a pragma? use utf8
qw(utf8);   We would still have to patch postgres...  But I can
imagine there are some users of utf8 that dont want utf8 strings.  BTW
as I could not reproduce this does this mean that reval->('"\x{}...")
works while reval->('sub { "\x{}"}') does not ?  Or is it before the
first one failed while the closure based one worked?

Re: BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4

From
Alex Hunsaker
Date:
On Fri, Feb 19, 2010 at 09:18, Alex Hunsaker <badalex@gmail.com> wrote:
> It seems to me a more correct fix would be to require utf8; inside of
> the safe like we do strict.
> ....
> Id favor this approach as if you have utf8 strings the likely hood
> that you want ::upgrade, ::downgrade, ::encode, ::valid or ::is_utf8
> is fairly high. =C2=A0Then again, no one has complained thus far... =C2=
=A0Maybe
> thats just me :)

On second thought, I dont think we should import any of those by
default.  And your hack for just SWASHNEW is better.

Re: [Tigerlead] BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4

From
"David E. Wheeler"
Date:
On Feb 19, 2010, at 1:13 AM, Tim Bunce wrote:

>> Hrm. I don't have this bug with Safe 3.19, FWIW.
>
> That's because Safe 1.19 (which I presume you meant) doesn't execute
> closures 'inside' the Safe compartment. So when the regex executes at
> runtime the C code looks up the utf8::SWASHNEW method without a problem.

Oh, so 2.19 is less secure in that regard, yes?

David
On Fri, Feb 19, 2010 at 09:00:59AM -0800, David E. Wheeler wrote:
> On Feb 19, 2010, at 1:13 AM, Tim Bunce wrote:
>
> >> Hrm. I don't have this bug with Safe 3.19, FWIW.
> >
> > That's because Safe 1.19 (which I presume you meant) doesn't execute
> > closures 'inside' the Safe compartment. So when the regex executes at
> > runtime the C code looks up the utf8::SWASHNEW method without a problem.
>
> Oh, so 2.19 is less secure in that regard, yes?

Yes.

When code that was compiled outside the compartment is executed by a
plperl function, including internal regex implementation code, that
code could call eval/require/do etc. and the newly compiled code
wouldn't have any restrictions. With Safe 2.20+ the newly compiled code
is subject to the same restrictions as plperl.

So what we're seeing is the knock-on effects of that tighting of security.
That's why I'd rather move forward rather than back (though there have
been times over the last 48 hours where moving back to Safe 1.19 looked
very appealing :)

Tim.

Re: BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4

From
Tim Bunce
Date:
On Fri, Feb 19, 2010 at 09:32:38AM -0700, Alex Hunsaker wrote:
> On Fri, Feb 19, 2010 at 09:18, Alex Hunsaker <badalex@gmail.com> wrote:
> > It seems to me a more correct fix would be to require utf8; inside of
> > the safe like we do strict.
> > ....
> > Id favor this approach as if you have utf8 strings the likely hood
> > that you want ::upgrade, ::downgrade, ::encode, ::valid or ::is_utf8
> > is fairly high.  Then again, no one has complained thus far...  Maybe
> > thats just me :)
>
> On second thought, I dont think we should import any of those by
> default.  And your hack for just SWASHNEW is better.

Here's the corresponding perlbug http://rt.perl.org/rt3/Ticket/Display.html?id=72942

I'll retest 8.4 and 9.0 against this on Monday.

Tim.

Re: BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4

From
Alex Hunsaker
Date:
On Fri, Feb 19, 2010 at 14:00, Tim Bunce <Tim.Bunce@pobox.com> wrote:
> On Fri, Feb 19, 2010 at 09:32:38AM -0700, Alex Hunsaker wrote:
>> On Fri, Feb 19, 2010 at 09:18, Alex Hunsaker <badalex@gmail.com> wrote:
>> > It seems to me a more correct fix would be to require utf8; inside of
>> > the safe like we do strict.
>> > ....
>> > Id favor this approach as if you have utf8 strings the likely hood
>> > that you want ::upgrade, ::downgrade, ::encode, ::valid or ::is_utf8
>> > is fairly high. =C2=A0Then again, no one has complained thus far... =
=C2=A0Maybe
>> > thats just me :)
>>
>> On second thought, I dont think we should import any of those by
>> default. =C2=A0And your hack for just SWASHNEW is better.

Funny.. Safe.pm already does this (share various utf8:: functions) so
I think there should be no question that what you did in the patch
below is correct and a bug with Safe.  Sorry for the handwaves, that
was me trying to understand the problem and your fix. :)

> Here's the corresponding perlbug http://rt.perl.org/rt3/Ticket/Display.ht=
ml?id=3D72942

Hrm... Is the require utf8; strictly needed? A reading of perldoc utf8
seems to say the do { my $unicode =3D ... } (aka load utf8_heavy.pl)
part should make it all work fine.

It also seems to still work
t/safeutf8.t ....... ok

*shrug*

> I'll retest 8.4 and 9.0 against this on Monday.

Ill see if I can squeeze in some pg 8.4 perl 5.10.1 linux x86_64
testing tonight of the above.  (Ill just reply to the perl bug )

Re: BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4

From
Tim Bunce
Date:
On Fri, Feb 19, 2010 at 02:22:33PM -0700, Alex Hunsaker wrote:
> On Fri, Feb 19, 2010 at 14:00, Tim Bunce <Tim.Bunce@pobox.com> wrote:
> > On Fri, Feb 19, 2010 at 09:32:38AM -0700, Alex Hunsaker wrote:
> >> On Fri, Feb 19, 2010 at 09:18, Alex Hunsaker <badalex@gmail.com> wrote:
> >> > It seems to me a more correct fix would be to require utf8; inside of
> >> > the safe like we do strict.
> >> > ....
> >> > Id favor this approach as if you have utf8 strings the likely hood
> >> > that you want ::upgrade, ::downgrade, ::encode, ::valid or ::is_utf8
> >> > is fairly high.  Then again, no one has complained thus far...  Maybe
> >> > thats just me :)
> >>
> >> On second thought, I dont think we should import any of those by
> >> default.  And your hack for just SWASHNEW is better.
>
> Funny.. Safe.pm already does this (share various utf8:: functions) so
> I think there should be no question that what you did in the patch
> below is correct and a bug with Safe.  Sorry for the handwaves, that
> was me trying to understand the problem and your fix. :)

No worries. It took a lot of head banging for me to understand the
problem and the fix :)

> > Here's the corresponding perlbug http://rt.perl.org/rt3/Ticket/Display.html?id=72942
>
> Hrm... Is the require utf8; strictly needed? A reading of perldoc utf8
> seems to say the do { my $unicode = ... } (aka load utf8_heavy.pl)
> part should make it all work fine.
>
> It also seems to still work
> t/safeutf8.t ....... ok
>
> *shrug*

Yesh, it's not really needed. It's harmless though, and avoids someone
thinking they can simply replace the regex with the require.

> > I'll retest 8.4 and 9.0 against this on Monday.
>
> Ill see if I can squeeze in some pg 8.4 perl 5.10.1 linux x86_64
> testing tonight of the above.  (Ill just reply to the perl bug )

The patched version of Safe works fine for me with 8.4.2 and 9.0.
So I no patch to PostgreSQL is needed. Though once Safe 2.23 is
released then Safe 2.20 should be detected and disabled, like Safe 2.21.
It's probably not very important though as I'd expect perl 5.12 to be
released with Safe 2.23 so very few people would have 2.20..2.22.

Tim.