Thread: BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4
The following bug has been logged online: Bug reference: 5334 Logged by: Tim Bunce Email address: Tim.Bunce@pobox.com PostgreSQL version: 8.4.2 Operating system: OS X Description: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4 Details: Version 2.22 of the Perl Safe module breaks PostgreSQL 8.4 (and probably earlier versions) that have a default encoding of UTF-8. Safe 2.2x added extra security for closures created inside Safe but then returned and executed from outside Safe (which is what PostgreSQL does). These closures are now wrapped so that they are executed with Safe restrictions in effect. This extra security is causing the 'utf8fix' code in PostgreSQL 8.4 to die (with the error "'require' trapped by operation mask"). I'm investigating this currently but don't yet have a complete fix. (The immediate problem appears to be easily fixed by switching to the simpler utf8fix code I added for PostgreSQL 9.0. But a failure still happens if warn() is called with utf8 string.)
On Thu, Feb 18, 2010 at 10:50:23AM +0000, Tim Bunce wrote: > > The following bug has been logged online: > > Bug reference: 5334 > Logged by: Tim Bunce > Email address: Tim.Bunce@pobox.com > PostgreSQL version: 8.4.2 > Operating system: OS X > Description: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL > 8.4 > Details: > > Version 2.22 of the Perl Safe module breaks PostgreSQL 8.4 (and probably > earlier versions) that have a default encoding of UTF-8. > > Safe 2.2x added extra security for closures created inside Safe but then > returned and executed from outside Safe (which is what PostgreSQL does). > These closures are now wrapped so that they are executed with Safe > restrictions in effect. > > This extra security is causing the 'utf8fix' code in PostgreSQL 8.4 to die > (with the error "'require' trapped by operation mask"). > > I'm investigating this currently but don't yet have a complete fix. (The > immediate problem appears to be easily fixed by switching to the simpler > utf8fix code I added for PostgreSQL 9.0. But a failure still happens if > warn() is called with utf8 string.) It took a depressingly large number of intense hours to work out what was going on and then more to try to work out a relatively clean solution. The underlying problem is in perl and Safe but sadly there's no reasonable way to fix Safe such that PostgreSQL would work without changes. The attached patch (over REL8_4_STABLE) works around the problem. The key line is: *PLPerl::utf8::SWASHNEW = \&utf8::SWASHNEW; This allows the perl regex logic to call the SWASHNEW method that's called when information from the Unicode character database is needed. (The lack of that method was causing the regex logic to think that the utf8 module wasn't loaded, so it would try to 'require' it but fail due to the restrictions of the Safe compartment.) The rest of the patch is updates the surrounding code to the same simplified 'utf8fix' logic used in PostgreSQL 9.0, and the same Safe version checks. Tim.
Attachment
On Thu, Feb 18, 2010 at 11:09, Tim Bunce <Tim.Bunce@pobox.com> wrote: > The key line is: > > =C2=A0 =C2=A0*PLPerl::utf8::SWASHNEW =3D \&utf8::SWASHNEW; Hrm... It seems to work for me in HEAD and AFAICS we dont have that line. Did I just miss it? Or did you happen to fix it in another way with your refactoring? Another Idea that comes to mind would be instead of (in ::mksafefunc): my $subref =3D ->reval(sub {} ); $subref->(); do: my $subref =3D ->reval(sub {}); return sub { ->reval("$subreb->();"); } or something... I did a few quick tests but it failed miserably for me... Im also not fond of adding yet another closure. :) > This allows the perl regex logic to call the SWASHNEW method that's > called when information from the Unicode character database is needed. > (The lack of that method was causing the regex logic to think that the > utf8 module wasn't loaded, so it would try to 'require' it but fail due > to the restrictions of the Safe compartment.) Makes me think we might just be able to share some of utf8 package in the s= afe? > The rest of the patch is updates the surrounding code to the same > simplified 'utf8fix' logic used in PostgreSQL 9.0, and the same Safe > version checks. =46rom a quick look it looks ok.
Re: [Tigerlead] BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4
From
Tim Bunce
Date:
On Thu, Feb 18, 2010 at 01:36:59PM -0800, David E. Wheeler wrote: > On Feb 18, 2010, at 10:09 AM, Tim Bunce wrote: > > > It took a depressingly large number of intense hours to work out what > > was going on and then more to try to work out a relatively clean solution. > > > > The underlying problem is in perl and Safe but sadly there's no > > reasonable way to fix Safe such that PostgreSQL would work without > > changes. > > Hrm. I don't have this bug with Safe 3.19, FWIW. That's because Safe 1.19 (which I presume you meant) doesn't execute closures 'inside' the Safe compartment. So when the regex executes at runtime the C code looks up the utf8::SWASHNEW method without a problem. Tim.
On Thu, Feb 18, 2010 at 11:32:38AM -0700, Alex Hunsaker wrote: > On Thu, Feb 18, 2010 at 11:09, Tim Bunce <Tim.Bunce@pobox.com> wrote: > > The key line is: > > > > *PLPerl::utf8::SWASHNEW = \&utf8::SWASHNEW; > > Hrm... It seems to work for me in HEAD and AFAICS we dont have that > line. Did I just miss it? Or did you happen to fix it in another way > with your refactoring? To be honest I'm not sure. I plan to look into that today. > Another Idea that comes to mind would be instead of (in ::mksafefunc): > my $subref = ->reval(sub {} ); > $subref->(); > > do: > my $subref = ->reval(sub {}); > return sub { ->reval("$subreb->();"); } > > or something... > > I did a few quick tests but it failed miserably for me... Im also not > fond of adding yet another closure. :) No amount of closure wrapping will fix the problem. > > This allows the perl regex logic to call the SWASHNEW method that's > > called when information from the Unicode character database is needed. > > (The lack of that method was causing the regex logic to think that the > > utf8 module wasn't loaded, so it would try to 'require' it but fail due > > to the restrictions of the Safe compartment.) > > Makes me think we might just be able to share some of utf8 package in the safe? I tried. The perl utf8.c code does a method lookup of SWASHNEW to decide if the utf8 module has been loaded. So if SWASHNEW is shared _before_ utf8 is loaded *and used* then the method lookup works (it finds the shared stub) and the utf8 module never gets loaded. (The *and used* complication is due to utf8.pm being a thin wrapper with an AUTOLOAD trampoline that loads utf8_heavy.pl on first use. More fun.) I do have a patch for Safe that, if utf8 is loaded, will ensure it's 'used' and then auto-share SWASHNEW. That won't fix PostgreSQL 8.x though because utf8 isn't loaded at the time Safe->new is called. > > The rest of the patch is updates the surrounding code to the same > > simplified 'utf8fix' logic used in PostgreSQL 9.0, and the same Safe > > version checks. > > From a quick look it looks ok. Thanks. Tim.
I've got the patch to Safe ready but the more I think about it the more I think the right fix is for Safe to automatically fully load utf8.pm (and utf8_heavy.pl) and to always share SWASHNEW itself. Assuming perl5-porters agree then the next release of Safe will do that ad this patch won't be needed. (Other than it possibly being worthwhile to detect the 'bad' versions of Safe.) Tim. On Fri, Feb 19, 2010 at 09:30:41AM +0000, Tim Bunce wrote: > On Thu, Feb 18, 2010 at 11:32:38AM -0700, Alex Hunsaker wrote: > > On Thu, Feb 18, 2010 at 11:09, Tim Bunce <Tim.Bunce@pobox.com> wrote: > > > The key line is: > > > > > > *PLPerl::utf8::SWASHNEW = \&utf8::SWASHNEW; > > > > Hrm... It seems to work for me in HEAD and AFAICS we dont have that > > line. Did I just miss it? Or did you happen to fix it in another way > > with your refactoring? > > To be honest I'm not sure. I plan to look into that today. > > > Another Idea that comes to mind would be instead of (in ::mksafefunc): > > my $subref = ->reval(sub {} ); > > $subref->(); > > > > do: > > my $subref = ->reval(sub {}); > > return sub { ->reval("$subreb->();"); } > > > > or something... > > > > I did a few quick tests but it failed miserably for me... Im also not > > fond of adding yet another closure. :) > > No amount of closure wrapping will fix the problem. > > > > This allows the perl regex logic to call the SWASHNEW method that's > > > called when information from the Unicode character database is needed. > > > (The lack of that method was causing the regex logic to think that the > > > utf8 module wasn't loaded, so it would try to 'require' it but fail due > > > to the restrictions of the Safe compartment.) > > > > Makes me think we might just be able to share some of utf8 package in the safe? > > I tried. The perl utf8.c code does a method lookup of SWASHNEW to decide > if the utf8 module has been loaded. So if SWASHNEW is shared _before_ > utf8 is loaded *and used* then the method lookup works (it finds the > shared stub) and the utf8 module never gets loaded. > > (The *and used* complication is due to utf8.pm being a thin wrapper with > an AUTOLOAD trampoline that loads utf8_heavy.pl on first use. More fun.) > > I do have a patch for Safe that, if utf8 is loaded, will ensure it's > 'used' and then auto-share SWASHNEW. That won't fix PostgreSQL 8.x > though because utf8 isn't loaded at the time Safe->new is called. > > > > The rest of the patch is updates the surrounding code to the same > > > simplified 'utf8fix' logic used in PostgreSQL 9.0, and the same Safe > > > version checks. > > > > From a quick look it looks ok. > > Thanks. > > Tim. >
Re: [Tigerlead] BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4
From
"David E. Wheeler"
Date:
On Feb 18, 2010, at 10:09 AM, Tim Bunce wrote: > It took a depressingly large number of intense hours to work out what > was going on and then more to try to work out a relatively clean solution. > > The underlying problem is in perl and Safe but sadly there's no > reasonable way to fix Safe such that PostgreSQL would work without > changes. Hrm. I don't have this bug with Safe 3.19, FWIW. Best, David
On Fri, Feb 19, 2010 at 02:30, Tim Bunce <Tim.Bunce@pobox.com> wrote: > On Thu, Feb 18, 2010 at 11:32:38AM -0700, Alex Hunsaker wrote: > > On Thu, Feb 18, 2010 at 11:09, Tim Bunce <Tim.Bunce@pobox.com> wrote: > > *PLPerl::utf8::SWASHNEW =3D \&utf8::SWASHNEW; > > > > Hrm... It seems to work for me in HEAD and AFAICS we dont have that > > line. Did I just miss it? Or did you happen to fix it in another way > > with your refactoring? > To be honest I'm not sure. I plan to look into that today. My hunch is it has to do with the require strict; require feature; That's the only major difference I see (other than the require_op and it being in its own package/file). >> I did a few quick tests but it failed miserably for me... =C2=A0Im also = not >> fond of adding yet another closure. :) > > No amount of closure wrapping will fix the problem. Yeah, brain fart... That's essentially what Safe.pm does now (and why there is a problem :) ) >> Makes me think we might just be able to share some of utf8 package in th= e safe? > > I tried. The perl utf8.c code does a method lookup of SWASHNEW to decide > if the utf8 module has been loaded. So if SWASHNEW is shared _before_ > utf8 is loaded *and used* then the method lookup works (it finds the > shared stub) and the utf8 module never gets loaded. Hrm... That seems wrong to me. Let me see If I can explain why. The below is what you seem to be saying: package utf8; sub import { # or maybe this is a BEGIN return if(\&{'utf8::SWASHNEW'}; # already loaded # ok not loaded open the Unicode database and do junk which will 'trap' in safe do 'utf8_heavy.pl'; } So if we define SWASHNEW without loading the unicode database how will utf8/unicode work exactly? I guess as long as it gets loaded at some point it works. So for postgres because we do the utf8 fix after Safe->new and at that point we cant have any 'bad' strings, it will work. (with your hack). Sound right? It seems to me a more correct fix would be to require utf8; inside of the safe like we do strict. Sorry thats a bit handwavy. You have obviously spent more time then me looking into this... Im thinking (in pseudo code): #define SAFE_OK .... sub ::mksafefunc { permit->(qw(caller require)); reval->('require utf8; 1;'); deny->(qw(caller require)); ... } sub ::mk_strict_safefunc { ...reval->('use strict; require utf8;) } static void plperl_safe_init { if (GetDatabaseEncoding() =3D=3D PG_UTF8) { eval_pv("my $a=3Dpack('U',0xC4); $a =3D~ /\\xE4\\d/i;", FALSE); } eval_pv(SAFE_MODULE, FALSE); eval_pv(SAFE_OK, FALSE) } One thing that stinks is while we might not do the utf8fix if we are not PG_UTF8 we would always require utf8;. And I dont see an easy way around that in 8.4 :( Also note that is all entirely untested :( If you think its sane (and it might not be) Im happy to work up a patch. Id favor this approach as if you have utf8 strings the likely hood that you want ::upgrade, ::downgrade, ::encode, ::valid or ::is_utf8 is fairly high. Then again, no one has complained thus far... Maybe thats just me :) Thoughts? Anywhoo I cant reproduce this outside of postgres. Maybe you can give me a pointer? use Safe(); binmode(STDOUT, ':utf8'); print $Safe::VERSION . "\n"; my $safe =3D Safe->new('t'); $safe->permit('print'); $safe->reval('sub { print "\x{263a}\n"; }')->(); print $@ ."\n" if($@); ----- 2.22 =E2=98=BA
On Fri, Feb 19, 2010 at 06:06, Tim Bunce <Tim.Bunce@pobox.com> wrote: > I've got the patch to Safe ready but the more I think about it the more > I think the right fix is for Safe to automatically fully load utf8.pm > (and utf8_heavy.pl) and to always share SWASHNEW itself. It seems cleaner if we could just share say utf8::VERSION. SWASHNEW seems likely to be changed as it "feels" more like a implementation detail. But if thats what utf8 checks... well then thats what it checks. > Assuming perl5-porters agree then the next release of Safe will do that > ad this patch won't be needed. (Other than it possibly being worthwhile > to detect the 'bad' versions of Safe.) It seems safer if there was some way to 'opt' in say if utf8 was loaded then make safe do the above. Or maybe a pragma? use utf8 qw(utf8); We would still have to patch postgres... But I can imagine there are some users of utf8 that dont want utf8 strings. BTW as I could not reproduce this does this mean that reval->('"\x{}...") works while reval->('sub { "\x{}"}') does not ? Or is it before the first one failed while the closure based one worked?
On Fri, Feb 19, 2010 at 09:18, Alex Hunsaker <badalex@gmail.com> wrote: > It seems to me a more correct fix would be to require utf8; inside of > the safe like we do strict. > .... > Id favor this approach as if you have utf8 strings the likely hood > that you want ::upgrade, ::downgrade, ::encode, ::valid or ::is_utf8 > is fairly high. =C2=A0Then again, no one has complained thus far... =C2= =A0Maybe > thats just me :) On second thought, I dont think we should import any of those by default. And your hack for just SWASHNEW is better.
Re: [Tigerlead] BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4
From
"David E. Wheeler"
Date:
On Feb 19, 2010, at 1:13 AM, Tim Bunce wrote: >> Hrm. I don't have this bug with Safe 3.19, FWIW. > > That's because Safe 1.19 (which I presume you meant) doesn't execute > closures 'inside' the Safe compartment. So when the regex executes at > runtime the C code looks up the utf8::SWASHNEW method without a problem. Oh, so 2.19 is less secure in that regard, yes? David
Re: [Tigerlead] BUG #5334: Version 2.22 of Perl Safe module breaks UTF8 PostgreSQL 8.4
From
Tim Bunce
Date:
On Fri, Feb 19, 2010 at 09:00:59AM -0800, David E. Wheeler wrote: > On Feb 19, 2010, at 1:13 AM, Tim Bunce wrote: > > >> Hrm. I don't have this bug with Safe 3.19, FWIW. > > > > That's because Safe 1.19 (which I presume you meant) doesn't execute > > closures 'inside' the Safe compartment. So when the regex executes at > > runtime the C code looks up the utf8::SWASHNEW method without a problem. > > Oh, so 2.19 is less secure in that regard, yes? Yes. When code that was compiled outside the compartment is executed by a plperl function, including internal regex implementation code, that code could call eval/require/do etc. and the newly compiled code wouldn't have any restrictions. With Safe 2.20+ the newly compiled code is subject to the same restrictions as plperl. So what we're seeing is the knock-on effects of that tighting of security. That's why I'd rather move forward rather than back (though there have been times over the last 48 hours where moving back to Safe 1.19 looked very appealing :) Tim.
On Fri, Feb 19, 2010 at 09:32:38AM -0700, Alex Hunsaker wrote: > On Fri, Feb 19, 2010 at 09:18, Alex Hunsaker <badalex@gmail.com> wrote: > > It seems to me a more correct fix would be to require utf8; inside of > > the safe like we do strict. > > .... > > Id favor this approach as if you have utf8 strings the likely hood > > that you want ::upgrade, ::downgrade, ::encode, ::valid or ::is_utf8 > > is fairly high. Then again, no one has complained thus far... Maybe > > thats just me :) > > On second thought, I dont think we should import any of those by > default. And your hack for just SWASHNEW is better. Here's the corresponding perlbug http://rt.perl.org/rt3/Ticket/Display.html?id=72942 I'll retest 8.4 and 9.0 against this on Monday. Tim.
On Fri, Feb 19, 2010 at 14:00, Tim Bunce <Tim.Bunce@pobox.com> wrote: > On Fri, Feb 19, 2010 at 09:32:38AM -0700, Alex Hunsaker wrote: >> On Fri, Feb 19, 2010 at 09:18, Alex Hunsaker <badalex@gmail.com> wrote: >> > It seems to me a more correct fix would be to require utf8; inside of >> > the safe like we do strict. >> > .... >> > Id favor this approach as if you have utf8 strings the likely hood >> > that you want ::upgrade, ::downgrade, ::encode, ::valid or ::is_utf8 >> > is fairly high. =C2=A0Then again, no one has complained thus far... = =C2=A0Maybe >> > thats just me :) >> >> On second thought, I dont think we should import any of those by >> default. =C2=A0And your hack for just SWASHNEW is better. Funny.. Safe.pm already does this (share various utf8:: functions) so I think there should be no question that what you did in the patch below is correct and a bug with Safe. Sorry for the handwaves, that was me trying to understand the problem and your fix. :) > Here's the corresponding perlbug http://rt.perl.org/rt3/Ticket/Display.ht= ml?id=3D72942 Hrm... Is the require utf8; strictly needed? A reading of perldoc utf8 seems to say the do { my $unicode =3D ... } (aka load utf8_heavy.pl) part should make it all work fine. It also seems to still work t/safeutf8.t ....... ok *shrug* > I'll retest 8.4 and 9.0 against this on Monday. Ill see if I can squeeze in some pg 8.4 perl 5.10.1 linux x86_64 testing tonight of the above. (Ill just reply to the perl bug )
On Fri, Feb 19, 2010 at 02:22:33PM -0700, Alex Hunsaker wrote: > On Fri, Feb 19, 2010 at 14:00, Tim Bunce <Tim.Bunce@pobox.com> wrote: > > On Fri, Feb 19, 2010 at 09:32:38AM -0700, Alex Hunsaker wrote: > >> On Fri, Feb 19, 2010 at 09:18, Alex Hunsaker <badalex@gmail.com> wrote: > >> > It seems to me a more correct fix would be to require utf8; inside of > >> > the safe like we do strict. > >> > .... > >> > Id favor this approach as if you have utf8 strings the likely hood > >> > that you want ::upgrade, ::downgrade, ::encode, ::valid or ::is_utf8 > >> > is fairly high. Then again, no one has complained thus far... Maybe > >> > thats just me :) > >> > >> On second thought, I dont think we should import any of those by > >> default. And your hack for just SWASHNEW is better. > > Funny.. Safe.pm already does this (share various utf8:: functions) so > I think there should be no question that what you did in the patch > below is correct and a bug with Safe. Sorry for the handwaves, that > was me trying to understand the problem and your fix. :) No worries. It took a lot of head banging for me to understand the problem and the fix :) > > Here's the corresponding perlbug http://rt.perl.org/rt3/Ticket/Display.html?id=72942 > > Hrm... Is the require utf8; strictly needed? A reading of perldoc utf8 > seems to say the do { my $unicode = ... } (aka load utf8_heavy.pl) > part should make it all work fine. > > It also seems to still work > t/safeutf8.t ....... ok > > *shrug* Yesh, it's not really needed. It's harmless though, and avoids someone thinking they can simply replace the regex with the require. > > I'll retest 8.4 and 9.0 against this on Monday. > > Ill see if I can squeeze in some pg 8.4 perl 5.10.1 linux x86_64 > testing tonight of the above. (Ill just reply to the perl bug ) The patched version of Safe works fine for me with 8.4.2 and 9.0. So I no patch to PostgreSQL is needed. Though once Safe 2.23 is released then Safe 2.20 should be detected and disabled, like Safe 2.21. It's probably not very important though as I'd expect perl 5.12 to be released with Safe 2.23 so very few people would have 2.20..2.22. Tim.