Thread: BUG #1831: plperl gives error after reconnect.
The following bug has been logged online:

Bug reference:      1831
Logged by:          Greg Sabino Mullane
Email address:      greg@turnstep.com
PostgreSQL version: 8.0.3
Operating system:   Linux
Description:        plperl gives error after reconnect.
Details:

Tested on 8.0.1 and in current cvs. This only happens if all the steps
below are followed, including the reconnect.

\c postgres

CREATE TABLE g (name TEXT);

CREATE OR REPLACE FUNCTION testone() RETURNS text LANGUAGE plperl AS
$$
spi_exec_query(qq{INSERT INTO g(name) VALUES ('abc')});
return "ok";
$$;

CREATE OR REPLACE FUNCTION enamer() RETURNS TRIGGER LANGUAGE plperl AS
$$
return;
$$;

CREATE TRIGGER trigtest BEFORE INSERT ON g
FOR EACH ROW EXECUTE PROCEDURE enamer();

\c postgres

select testone();

ERROR: error from Perl function: creation of Perl function failed:
(in cleanup) Undefined subroutine &main::mksafefunc called at (eval 4) line 2.
at (eval 4) line 2.
"Greg Sabino Mullane" <greg@turnstep.com> writes: > ERROR: error from Perl function: creation of Perl function failed: > (in cleanup) Undefined subroutine &main::mksafefunc called at (eval 4) line > 2. at (eval 4) line 2. I could not duplicate this in either 8.0 or HEAD branches. It looks a bit like an old bug that we had in plperl, though. Are you sure your plperl.so is up to date? regards, tom lane
On Wed, Aug 17, 2005 at 06:46:20PM -0400, Tom Lane wrote:
> "Greg Sabino Mullane" <greg@turnstep.com> writes:
> > ERROR: error from Perl function: creation of Perl function failed:
> > (in cleanup) Undefined subroutine &main::mksafefunc called at (eval 4) line
> > 2. at (eval 4) line 2.
>
> I could not duplicate this in either 8.0 or HEAD branches. It looks
> a bit like an old bug that we had in plperl, though. Are you sure your
> plperl.so is up to date?

Could this be another "depends on the junk on your stack" bug? I
get different results depending on the OS and version of PostgreSQL:

8.0.3 (from CVS), FreeBSD 4.11-STABLE, Perl 5.8.7 (from ports)
* error with or without reconnect

HEAD, FreeBSD 4.11-STABLE, Perl 5.8.7 (from ports)
* success without reconnect, error with reconnect

8.0.3 (from CVS), Solaris 9, Perl 5.8.7 (from source)
* error with or without reconnect

HEAD, Solaris 9, Perl 5.8.7 (from source)
* error with or without reconnect

The configure options for all PostgreSQL builds are nearly identical
except that the builds on FreeBSD don't have --enable-cassert.

--
Michael Fuhr
On Wed, Aug 17, 2005 at 06:49:11PM -0600, Michael Fuhr wrote:
> On Wed, Aug 17, 2005 at 06:46:20PM -0400, Tom Lane wrote:
> > "Greg Sabino Mullane" <greg@turnstep.com> writes:
> > > ERROR: error from Perl function: creation of Perl function failed:
> > > (in cleanup) Undefined subroutine &main::mksafefunc called at (eval 4) line
> > > 2. at (eval 4) line 2.
> >
> > I could not duplicate this in either 8.0 or HEAD branches. It looks
> > a bit like an old bug that we had in plperl, though. Are you sure your
> > plperl.so is up to date?
>
> Could this be another "depends on the junk on your stack" bug? I
> get different results depending on the OS and version of PostgreSQL:

Also, on my systems the trigger isn't necessary, but the function
call history is significant. This is in HEAD:

\c test

CREATE OR REPLACE FUNCTION foo() RETURNS text AS $$
return "foo";
$$ LANGUAGE plperl;

CREATE OR REPLACE FUNCTION bar() RETURNS text AS $$
my $rv = spi_exec_query("SELECT foo() AS x");
return $rv->{rows}[0]->{x};
$$ LANGUAGE plperl;

\c test

SELECT bar();
ERROR: error from Perl function: creation of Perl function failed:
(in cleanup) Undefined subroutine &main::mksafefunc called at (eval 5) line 2.
at (eval 5) line 2.

SELECT bar();
ERROR: error from Perl function: creation of Perl function failed:
(in cleanup) Undefined subroutine &main::mksafefunc called at (eval 5) line 2.
at (eval 5) line 2.

SELECT foo();
 foo
-----
 foo
(1 row)

SELECT bar();
 bar
-----
 foo
(1 row)

I verified that the postmaster is using a current plperl.so by adding
a debugging ereport() statement in plperl_call_perl_func() (output not
shown above).

--
Michael Fuhr
Michael Fuhr <mike@fuhr.org> writes: > Could this be another "depends on the junk on your stack" bug? Looks that way --- but I've still had no success in reproducing it, either on x86/Linux or PPC/Darwin. Anyone have some variant test cases? regards, tom lane
On Thu, Aug 18, 2005 at 12:26:28AM -0400, Tom Lane wrote:
> Michael Fuhr <mike@fuhr.org> writes:
> > Could this be another "depends on the junk on your stack" bug?
>
> Looks that way --- but I've still had no success in reproducing it,
> either on x86/Linux or PPC/Darwin. Anyone have some variant test
> cases?

I see different results depending on whether the calling function
is plperl or plperlu. Here again are the functions -- I'll change
only the language:

CREATE OR REPLACE FUNCTION foo() RETURNS text AS $$
return "foo";
$$ LANGUAGE plperl;

CREATE OR REPLACE FUNCTION bar() RETURNS text AS $$
my $rv = spi_exec_query("SELECT foo()");
return $rv->{rows}[0]->{foo};
$$ LANGUAGE plperl;

SELECT bar();

With HEAD on Solaris 9/sparc I don't have to reconnect before the
SELECT to get the error (I tested both ways, with and without a
reconnect, and it made no difference). Here's what I get with
various language combinations:

foo plperl,  bar plperl  - Undefined subroutine &main::mksafefunc
foo plperl,  bar plperlu - ok
foo plperlu, bar plperl  - Undefined subroutine &main::mkunsafefunc
foo plperlu, bar plperlu - ok

I get the same results on FreeBSD 4.11-STABLE/x86 but I have to
reconnect before the SELECT to get the error. On both systems, if
I execute SELECT foo() before SELECT bar() then I don't get the error.

--
Michael Fuhr
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Tom Lane asked: > I could not duplicate this in either 8.0 or HEAD branches. It looks > a bit like an old bug that we had in plperl, though. Are you sure your > plperl.so is up to date? Looks like Michael is already far along, but yes, my plperl.so was up to date. This is on a Red Hat Linux box, using --with-perl and --with-gnu-ld as the only compile options. It's a very subtle bug: on my box, simply leaving out the trigger definition, or having the function not do a spi_exec_query will not raise the error. I've worked around this locally by not using plperlu (hence the original reason to switch to another user), but I sure miss being able to do "use strict" :) - -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200508181050 https://www.biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iEYEARECAAYFAkMEoFkACgkQvJuQZxSWSsjTpwCgmt9kLApba6xDygvgl5qb/vdc Zh4AoPx1or9LLWSTUZQDcDjxJCfNBb08 =5Jt7 -----END PGP SIGNATURE-----
On Thu, Aug 18, 2005 at 02:52:02PM -0000, Greg Sabino Mullane wrote: > Tom Lane asked: > > I could not duplicate this in either 8.0 or HEAD branches. It looks > > a bit like an old bug that we had in plperl, though. Are you sure your > > plperl.so is up to date? > > Looks like Michael is already far along, but yes, my plperl.so was up to date. > This is on a Red Hat Linux box, using --with-perl and --with-gnu-ld as the > only compile options. It's a very subtle bug: on my box, simply leaving out > the trigger definition, or having the function not do a spi_exec_query will > not raise the error. Tom Lane once mentioned that "Valgrind is fairly useless for debugging postgres," but has anybody tried it for this problem? I tried using the FreeBSD port but it's having trouble (first I had to hack in support for a system call, now it's terminating the postmaster with SIGBUS on a call to setproctitle). -- Michael Fuhr
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 > Tom Lane once mentioned that "Valgrind is fairly useless for debugging > postgres," but has anybody tried it for this problem? I tried using > the FreeBSD port but it's having trouble (first I had to hack in > support for a system call, now it's terminating the postmaster with > SIGBUS on a call to setproctitle). I've got valgrind working, but not sure exactly how to use it to debug this problem. What's the procedure? - -- Greg Sabino Mullane greg@turnstep.com PGP Key: 0x14964AC8 200508190955 https://www.biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8 -----BEGIN PGP SIGNATURE----- iEYEARECAAYFAkMF5N8ACgkQvJuQZxSWSsi6eQCggFJT5i9phqGomACJk/ZIKDgS vv8AnROppubywG9bY2ZU26MMfG3lKPdj =+srT -----END PGP SIGNATURE-----
On Fri, Aug 19, 2005 at 01:56:39PM -0000, Greg Sabino Mullane wrote: > > Tom Lane once mentioned that "Valgrind is fairly useless for debugging > > postgres," but has anybody tried it for this problem? > > I've got valgrind working, but not sure exactly how to use it to debug > this problem. What's the procedure? I haven't used valgrind much, but I was thinking of initdb'ing a test cluster, loading the PL/Perl functions into it, and running valgrind on a single-user-mode postgres process in which you issue a query that causes the problem. I'm wondering if valgrind's memory checks will show the code accessing memory that hasn't been initialized. -- Michael Fuhr
On Wed, Aug 17, 2005 at 11:37:58PM -0600, Michael Fuhr wrote:
> With HEAD on Solaris 9/sparc I don't have to reconnect before the
> SELECT to get the error (I tested both ways, with and without a
> reconnect, and it made no difference).

> I get the same results on FreeBSD 4.11-STABLE/x86 but I have to
> reconnect before the SELECT to get the error.

I discovered why these systems behaved differently: the Solaris box
didn't have validator functions for plperl/plperlu. Apparently I
had createlang'ed them before the validator function was added a
couple of months ago:

http://archives.postgresql.org/pgsql-committers/2005-06/msg00322.php

I dropped and recreated plperl and plperlu on the Solaris box and
now I do have to reconnect to get the error. Here's a summary of
what I see on both systems:

* The functions are foo() and bar(), both created as plperl. bar()
  calls foo() using spi_exec_query().

* In HEAD, where plperl has a validator, if I create the functions
  and call bar() in the same session, it works. If I reconnect and
  call bar() then I get the error.

* In 8.0.3, where plperl has no validator, if I create the functions
  and call bar() in the same session, I get the error. Likewise if
  I reconnect.

* In either version, if bar() is plperlu then I don't get the error,
  regardless of whether I've reconnected since creating the functions.
  Creating foo() as plperlu has no effect other than changing the
  undefined subroutine's name to mkunsafefunc.

Something about the calling function being plperlu and something
the validator does appear to be relevant, at least on my two systems.

--
Michael Fuhr
On Fri, Aug 19, 2005 at 09:03:48PM -0600, Michael Fuhr wrote:
> Here's a summary of what I see on both systems:

I mentioned this before but forgot to include it in the summary:

* If I execute "SELECT foo()" before I execute "SELECT bar()" then
  I don't get the error.

Here's my current test case, run in HEAD in a database named "test"
that has plperl:

CREATE FUNCTION foo() RETURNS text AS $$
return "foo";
$$ LANGUAGE plperl;

CREATE FUNCTION bar() RETURNS text AS $$
my $rv = spi_exec_query("SELECT foo($_[0])");
return $rv->{rows}[0]->{foo};
$$ LANGUAGE plperl;

SELECT bar();
 bar
-----
 foo
(1 row)

\c test
You are now connected to database "test".

SELECT bar();
ERROR: error from Perl function: creation of Perl function failed:
(in cleanup) Undefined subroutine &main::mksafefunc called at (eval 7) line 2.
at (eval 7) line 2.

SELECT foo();
 foo
-----
 foo
(1 row)

SELECT bar();
 bar
-----
 foo
(1 row)

--
Michael Fuhr
I'm wondering if this is Perl version dependent. I've tried with

Fedora Core 3:
This is perl, v5.8.5 built for i386-linux-thread-multi

Darwin 10.4.2:
This is perl, v5.8.6 built for darwin-thread-multi-2level

with no failures observed... what are you guys using?

regards, tom lane
[ eyeing current plperl code... ] Would anyone like to explain what ::_plperl_to_pg_array is for, and why it's only created by loose_embedding[] and not strict_embedding[]? Not that this looks to have any immediate impact on the problem at hand, but it still looks a tad broken. regards, tom lane
On Sat, Aug 20, 2005 at 12:03:46AM -0400, Tom Lane wrote:
> I'm wondering if this is Perl version dependent. I've tried with
>
> Fedora Core 3:
> This is perl, v5.8.5 built for i386-linux-thread-multi
> Darwin 10.4.2:
> This is perl, v5.8.6 built for darwin-thread-multi-2level
>
> with no failures observed... what are you guys using?

FreeBSD 4.11-STABLE/x86
This is perl, v5.8.7 built for i386-freebsd-64int
(built from ports directory with gcc 2.95.4)

Solaris 9/sparc
This is perl, v5.8.7 built for sun4-solaris
(built from source with gcc 3.4.2)

I just built PL/Perl on the Solaris box with Perl 5.8.6 and got the
same results. I used ldd to verify that plperl.so was linked against
5.8.6, and I had one of the functions return Perl's $] variable just
to make sure.

I see that both of your Perl version strings have "thread-multi"
whereas neither of mine do. I don't know if that's relevant, but
it's something different about the Perl builds.

% perl -V | grep thread
usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef

You're doing the reconnect, right? The error appears to happen
during the SPI call if the called function hasn't been compiled
yet, either by being previously called or by being validated during
creation.

--
Michael Fuhr
Michael Fuhr <mike@fuhr.org> writes: > I see that both of your Perl version strings have "thread-multi" > whereas neither of mine do. I don't know if that's relevant, but > it's something different about the Perl builds. Hm. I also have a 5.8.7-no-threads Perl build on HPPA. I hadn't tried that because that platform sucks at debugging shared libraries. But building there now to see if I can reproduce the problem. > You're doing the reconnect, right? Sure. It seems perfectly clear that this has something to do with being the first call in a session. (The validator function would also count as the first call, which explains your previous Solaris results.) One thing I was kind of wondering about (this will betray the fact that I haven't really studied Perl since it was Perl 4) is what are the namespace issues for ::mksafefunc? In particular, could the inner invocation of foo() be executing in some context that makes the original definition of mksafefunc inaccessible? And why, if we are careful to declare mksafefunc with ::, don't we call it with :: too? regards, tom lane
On Sat, Aug 20, 2005 at 12:22:05AM -0400, Tom Lane wrote:
> Would anyone like to explain what ::_plperl_to_pg_array is for, and
> why it's only created by loose_embedding[] and not strict_embedding[]?

It looks like plperl_convert_to_pg_array() calls that Perl function
to convert a Perl list reference to the string representation of
PostgreSQL array, so functions like this can work:

CREATE FUNCTION foo() RETURNS integer[] AS $$
return [1, 2, 3];
$$ LANGUAGE plperl;

SELECT foo();
   foo
---------
 {1,2,3}
(1 row)

But this example crashes the backend if plperl.use_strict is enabled :-(

--
Michael Fuhr
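As a rough illustration of the string form mentioned above, the following C sketch formats a flat integer array as a PostgreSQL array literal such as {1,2,3}. It is not the code in plperl.c: the thread describes the real conversion as being done by the Perl function ::_plperl_to_pg_array, and nested arrays and non-integer elements are ignored here. The helper name int_array_to_pg_literal and its signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of plperl.c: write the array-literal
 * text form of a flat int array (for example {1,2,3}) into buf.
 * Returns the number of characters written, excluding the trailing
 * NUL, or -1 if buf is too small.
 */
static int
int_array_to_pg_literal(const int *vals, size_t n, char *buf, size_t buflen)
{
    size_t used = 0;
    size_t i;
    int    w;

    assert(buf != NULL && (vals != NULL || n == 0));

    /* Opening brace. */
    w = snprintf(buf, buflen, "{");
    if (w < 0 || (size_t) w >= buflen)
        return -1;
    used = (size_t) w;

    /* Elements, comma-separated, with no comma before the first one. */
    for (i = 0; i < n; i++)
    {
        w = snprintf(buf + used, buflen - used, "%s%d",
                     i > 0 ? "," : "", vals[i]);
        if (w < 0 || (size_t) w >= buflen - used)
            return -1;
        used += (size_t) w;
    }

    /* Closing brace. */
    w = snprintf(buf + used, buflen - used, "}");
    if (w < 0 || (size_t) w >= buflen - used)
        return -1;
    used += (size_t) w;

    assert(strlen(buf) == used);
    return (int) used;
}
```

Called with the values 1, 2, 3 and a sufficiently large buffer, the helper leaves the same text in buf that the SELECT foo() example displays.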
Michael Fuhr <mike@fuhr.org> writes: > But this example crashes the backend if plperl.use_strict is enabled :-( I suspect the croak() in plperl_convert_to_pg_array (the C function) ought to be an ereport(). It looks like that is not called from inside the Perl environment, and so croak is probably sending control to exit(). regards, tom lane
On Sat, Aug 20, 2005 at 01:14:02AM -0400, Tom Lane wrote:
> One thing I was kind of wondering about (this will betray the fact
> that I haven't really studied Perl since it was Perl 4) is what are
> the namespace issues for ::mksafefunc? In particular, could the
> inner invocation of foo() be executing in some context that makes
> the original definition of mksafefunc inaccessible? And why, if
> we are careful to declare mksafefunc with ::, don't we call it
> with :: too?

You might be on to something there. I just changed line 683 in
plperl.c to the following:

count = perl_call_pv((trusted ? "::mksafefunc" : "::mkunsafefunc"),

My test cases now succeed where they had been failing. But if
that's the problem, why are your systems behaving differently?

--
Michael Fuhr
On Fri, Aug 19, 2005 at 11:16:25PM -0600, Michael Fuhr wrote: > But this example crashes the backend if plperl.use_strict is enabled :-( The PL/Perl regression tests also fail if use_strict is enabled, mostly due to not using "my" in a few places. I'll work on a patch. -- Michael Fuhr
Michael Fuhr <mike@fuhr.org> writes: > You might be on to something there. I just changed line 683 in > plperl.c to the following: > count = perl_call_pv((trusted ? "::mksafefunc" : "::mkunsafefunc"), > My test cases now succeed where they had been failing. But if > that's the problem, why are your systems behaving differently? Damifino. I have just found out that the HPPA 5.8.7 no threads build fails in exactly the way you describe ... and I can't really get into it with gdb to figure out what's the problem :-( One thing I see on the HP box is that if "SELECT bar()" fails, you can do it over and over and it'll fail each time. Until you call foo directly, and then bar is OK. You didn't mention having tried that --- same for you, or not? At this point I have the impression that we may be looking at a Perl bug. I also suspect that the thread-support point is significant; though I do not know why. regards, tom lane
On Sat, Aug 20, 2005 at 01:59:59AM -0400, Tom Lane wrote: > Michael Fuhr <mike@fuhr.org> writes: > > My test cases now succeed where they had been failing. But if > > that's the problem, why are your systems behaving differently? > > Damifino. I have just found out that the HPPA 5.8.7 no threads > build fails in exactly the way you describe ... and I can't really > get into it with gdb to figure out what's the problem :-( Interesting; at least we're narrowing down the conditions. > One thing I see on the HP box is that if "SELECT bar()" fails, > you can do it over and over and it'll fail each time. Until you > call foo directly, and then bar is OK. You didn't mention having > tried that --- same for you, or not? Yep, the same. > At this point I have the impression that we may be looking at a Perl > bug. I also suspect that the thread-support point is significant; > though I do not know why. Hmmm...Perl bug, or is PL/Perl doing something that works by accident in certain Perl builds? Still, it might be interesting to run this by somebody familiar with Perl internals. I expect they'd want a simplified test case, though, so they didn't have to wade through all the irrelevant PostgreSQL stuff.... -- Michael Fuhr
On Sat, Aug 20, 2005 at 12:22:05AM -0400, Tom Lane wrote: > Would anyone like to explain what ::_plperl_to_pg_array is for, and > why it's only created by loose_embedding[] and not strict_embedding[]? loose_embedding[] and strict_embedding[] have several lines in common. Should the common elements be in a separate string or macro so they can be maintained in one place? -- Michael Fuhr
Michael Fuhr <mike@fuhr.org> writes: > loose_embedding[] and strict_embedding[] have several lines in > common. Should the common elements be in a separate string or > macro so they can be maintained in one place? Definitely, assuming you can do it without too much violence to the readability. regards, tom lane
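One way to keep the shared lines in a single place is a macro holding a string literal, relying on the C compiler's concatenation of adjacent string literals. The sketch below only illustrates that technique: the Perl fragments are placeholders rather than the actual contents of loose_embedding[] and strict_embedding[] in plperl.c, the variables are plain strings rather than whatever form plperl.c uses, and COMMON_EMBEDDING and has_common_prologue are hypothetical names.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical shared prologue. Adjacent string literals are
 * concatenated by the compiler, so the macro expands to one literal
 * that can be pasted in front of each variant.
 */
#define COMMON_EMBEDDING \
    "use vars qw(%_SHARED);" \
    "sub ::example_helper { return $_[0]; }"

/* Placeholder "loose" variant: shared prologue plus its own code. */
static const char loose_embedding[] =
    COMMON_EMBEDDING
    "sub ::mkunsafefunc { return eval(qq[ sub { $_[0] $_[1] } ]); }";

/* Placeholder "strict" variant: same prologue, then strict-only code. */
static const char strict_embedding[] =
    COMMON_EMBEDDING
    "use strict;"
    "sub ::mkunsafefunc { return eval(qq[ sub { use strict; $_[0] $_[1] } ]); }";

/* Returns 1 if s begins with the shared prologue, 0 otherwise. */
static int
has_common_prologue(const char *s)
{
    assert(s != NULL);
    return strncmp(s, COMMON_EMBEDDING, sizeof(COMMON_EMBEDDING) - 1) == 0;
}
```

Because the macro is just a literal, both variants stay readable as straight-line Perl text, which addresses the readability concern; the cost is that a debugger or grep shows the macro name rather than the expanded text at the point of use.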