Re: segmentation fault postgres 9.3.5 core dump perlu related ? - Mailing list pgsql-general

From Day, David
Subject Re: segmentation fault postgres 9.3.5 core dump perlu related ?
Date
Msg-id 401084E5E73F4241A44F3C9E6FD7942801140533AB@exch-01
In response to Re: segmentation fault postgres 9.3.5 core dump perlu related ?  (Alex Hunsaker <badalex@gmail.com>)
Responses Re: segmentation fault postgres 9.3.5 core dump perlu related ?
List pgsql-general

I am amending the "info threads" output: there are two threads.

 

I was using the wrong instance of the gdb debugger.

Program terminated with signal SIGSEGV, Segmentation fault.
(gdb) bt
#0  0x000000080bfa50a3 in Perl_fbm_instr () from /usr/local/lib/perl5/5.18/mach/CORE/libperl.so.5.18
#1  0x000000080c00ff93 in Perl_re_intuit_start () from /usr/local/lib/perl5/5.18/mach/CORE/libperl.so.5.18
#2  0x000000080bfc27a2 in Perl_pp_match () from /usr/local/lib/perl5/5.18/mach/CORE/libperl.so.5.18
#3  0x000000080bfbe6a3 in Perl_runops_standard () from /usr/local/lib/perl5/5.18/mach/CORE/libperl.so.5.18
#4  0x000000080bf57bd8 in Perl_call_sv () from /usr/local/lib/perl5/5.18/mach/CORE/libperl.so.5.18
#5  0x000000080bcfb7c7 in plperl_call_perl_func () from /usr/local/lib/postgresql/plperl.so
#6  0x000000080bcf83c2 in plperl_call_handler () from /usr/local/lib/postgresql/plperl.so
#7  0x000000000057611f in ExecMakeTableFunctionResult ()
#8  0x000000000058b6c7 in ?? ()
#9  0x000000000057bab2 in ExecScan ()
#10 0x00000000005756b8 in ExecProcNode ()
#11 0x00000000005876a8 in ExecLimit ()
#12 0x0000000000575771 in ExecProcNode ()
#13 0x0000000000573630 in standard_ExecutorRun ()
#14 0x0000000000593294 in ?? ()
#15 0x000000000059379c in SPI_execute_plan_with_paramlist ()
#16 0x00000008024f19bc in ?? () from /usr/local/lib/postgresql/plpgsql.so
#17 0x00000008024ee909 in ?? () from /usr/local/lib/postgresql/plpgsql.so
#18 0x00000008024eaf3b in ?? () from /usr/local/lib/postgresql/plpgsql.so
#19 0x00000008024ea243 in plpgsql_exec_function () from /usr/local/lib/postgresql/plpgsql.so
#20 0x00000008024e6551 in plpgsql_call_handler () from /usr/local/lib/postgresql/plpgsql.so
#21 0x000000000057611f in ExecMakeTableFunctionResult ()
#22 0x000000000058b6c7 in ?? ()
#23 0x000000000057bab2 in ExecScan ()
#24 0x00000000005756b8 in ExecProcNode ()
#25 0x0000000000573630 in standard_ExecutorRun ()
#26 0x0000000000645b0a in ?? ()
#27 0x0000000000645719 in PortalRun ()
#28 0x00000000006438ea in PostgresMain ()
#29 0x00000000005ff267 in PostmasterMain ()
#30 0x00000000005a31ba in main ()
(gdb) info thread
  Id   Target Id         Frame
* 2    Thread 802c06400 (LWP 101353) 0x000000080bfa50a3 in Perl_fbm_instr ()
   from /usr/local/lib/perl5/5.18/mach/CORE/libperl.so.5.18
* 1    Thread 802c06400 (LWP 101353) 0x000000080bfa50a3 in Perl_fbm_instr ()
   from /usr/local/lib/perl5/5.18/mach/CORE/libperl.so.5.18

 

 

Hi Alex,

 

Thanks for your input.

 

My initial, simplistic stress test (two connections calling the same suspect function in a loop) has failed to reproduce the problem, although I have not yet varied the inputs for the possible parameters. Given your thoughts on the internal mechanics, it seems unlikely that competing sessions are the cause. I'll see about varying and logging arguments in future testing. Reproducing is 90% of the battle, and unfortunately we are currently losing on that front.

 

When I type "info threads" in gdb on the most recent core file, I see:

* 1 Thread 802c06400 (LWP 101353/postgres)  0x00000000005756b8 in ExecProcNode ()

Not sure that fits with your expectations.

 

We have only two Perl functions in the database, both of which are plperlu. Both are invoked at least once in a normal usage scenario, which makes the infrequency of the segmentation fault puzzling.

 

 

Regards

 

 

Dave

 

 

 

 

From: Alex Hunsaker [mailto:badalex@gmail.com]
Sent: Thursday, January 29, 2015 12:58 AM
To: Day, David
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] segmentation fault postgres 9.3.5 core dump perlu related ?

 

 

On Wed, Jan 28, 2015 at 1:23 PM, Day, David <dday@redcom.com> wrote:

It has been some time since we have seen this problem.
See the earlier message on this subject/thread for the suspect plperl function executing
at the time of the core.

Someone on our development team suggested it might relate to some build options of perl,
in particular MULTIPLICITY or THREADS. We can have this perl function executing on
two different connections/sessions at the same time.

 

Hrm, I can't see how >1 connections/sessions could tickle the bug. Or THREADS/MULTIPLICITY, short of some perl bug. Each backend is its own process, so each perl interpreter is isolated from the others at that level. IOW each new connection has its very own perl interpreter that has no shared state with any of the others (short of using $_SHARED). But hey, if your testing finds it is easier to trigger with more connections, it just makes the bug more interesting :).

 

open, as you use it, should just be the standard pipe(); fork(); exec(); dance. And I'm fairly certain perl does not do anything magic like making a thread behind the scenes. In gdb you could also try "info threads", just to see if somehow a thread did get created.
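For readers unfamiliar with the pipe(); fork(); exec(); dance, here is a rough sketch of the pattern. It is written in Python rather than Perl, it is not how perl implements a piped open internally, and the helper name run_via_pipe is made up for this illustration. It only shows that the pattern involves a second process, not a thread.

```python
import os


def run_via_pipe(argv):
    """Run argv in a child process; return (stdout bytes, raw wait status)."""
    read_fd, write_fd = os.pipe()        # pipe(): one-way channel, child -> parent
    pid = os.fork()                      # fork(): duplicate the current process
    if pid == 0:
        # Child: make stdout the write end of the pipe, then replace ourselves.
        try:
            os.close(read_fd)
            os.dup2(write_fd, 1)
            os.close(write_fd)
            os.execvp(argv[0], argv)     # exec(): returns only if it fails
        finally:
            os._exit(127)                # reached only when exec failed
    # Parent: close our copy of the write end so read() sees EOF
    # once the child has exited.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:
            break
        chunks.append(chunk)
    os.close(read_fd)
    _, status = os.waitpid(pid, 0)       # reap the child
    return b"".join(chunks), status


if __name__ == "__main__":
    output, status = run_via_pipe(["echo", "hello"])
    print(output, status)
```

The parent and child share nothing after the fork except the pipe, which is why this pattern by itself would not be expected to add a thread to the backend.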

 

Multiplicity should only come into play if you use plperl and plperlu in the same session (without it, it should error out with "Cannot allocate multiple Perl interpreters on this platform").

 

 


I believe below is a valid stack dump:

Core was generated by `postgres'.
Program terminated with signal 11, Segmentation fault.
(gdb) bt
#0  0x000000080bfa50a3 in Perl_fbm_instr () from /usr/local/lib/perl5/5.18/mach/CORE/libperl.so.5.18
#1  0x000000080c00ff93 in Perl_re_intuit_start () from /usr/local/lib/perl5/5.18/mach/CORE/libperl.so.5.18
#2  0x000000080bfc27a2 in Perl_pp_match () from /usr/local/lib/perl5/5.18/mach/CORE/libperl.so.5.18

 

This sure makes it look like it is segfaulting on some kind of regex /not/ open.

 

Any chance you could come up with a reproducible test case? I suspect the inputs to the function might help narrow it down to something reproducible. Maybe log the arguments at the start of the function? Or perhaps in your middleware when calling the function crashes, log how it was called?
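As one possible shape for the middleware-side logging suggested above, here is a minimal sketch. It assumes, purely for illustration, that the middleware is written in Python. The names log_call, call_suspect_function and the "db_calls" logger are hypothetical, and the function body is a stand-in for the real database call.

```python
import functools
import logging

logger = logging.getLogger("db_calls")  # hypothetical logger name


def log_call(func):
    """Log the function name and arguments before the call is made."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Log first, so the record exists even if the backend crashes
        # and the call never returns normally.
        logger.info("calling %s args=%r kwargs=%r", func.__name__, args, kwargs)
        return func(*args, **kwargs)
    return wrapper


@log_call
def call_suspect_function(pattern, text):
    # Stand-in for the real database call (for example, a SELECT that
    # invokes the plperlu function); here it just echoes its inputs.
    return (pattern, text)
```

Because the log line is written before the call, the last logged entry before a crash shows the arguments that were in flight at the time.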

 
