Thread: [IBM z Systems] Getting server crash when jit_above_cost =0

[IBM z Systems] Getting server crash when jit_above_cost =0

From: tushar

Hi,

We are getting a server crash on a zLinux machine when we set jit_above_cost=0 in the postgresql.conf file, after configuring the PG v12 server with --with-llvm (llvm-toolset-6.0).

We configured the PG v12 sources with the --with-llvm switch, after setting these variables at the command prompt:
 export LD_LIBRARY_PATH=/opt/rh/llvm-toolset-6.0/root/usr/lib64:$LD_LIBRARY_PATH
 export LLVM_CONFIG=/opt/rh/llvm-toolset-6.0/root/usr/bin/llvm-config
 export CLANG=/opt/rh/llvm-toolset-6.0/root/usr/bin/clang
 export LDFLAGS="-Wl,-rpath,/opt/rh/llvm-toolset-6.0/root/lib64 ${LDFLAGS}"; export LDFLAGS

postgresql.conf file -
"
shared_preload_libraries = '$libdir/llvmjit'
jit_provider = 'llvmjit'
jit_above_cost = 0
jit = on
"

We are able to reproduce the crash with any SQL query:

psql (12.2)
Type "help" for help.

postgres=# select 5;
2020-04-21 07:33:15.980 CDT [48149] DEBUG:  StartTransaction(1) name: unnamed; blockState: DEFAULT; state: INPROGRESS, xid/subid/cid: 0/1/0
2020-04-21 07:33:15.980 CDT [48149] DEBUG:  probing availability of JIT provider at /home/edb/pg/edb/edbpsql/lib/postgresql/llvmjit.so
2020-04-21 07:33:15.980 CDT [48149] DEBUG:  successfully loaded JIT provider in current session
2020-04-21 07:33:15.981 CDT [48149] DEBUG:  LLVMJIT detected CPU "z13", with features ""
terminate called after throwing an instance of 'std::bad_function_call'
  what():  bad_function_call
server closed the connection unexpectedly
    This probably means the server terminated abnormally
    before or while processing the request.
The connection to the server was lost. Attempting reset: 2020-04-21 07:33:16.476 CDT [48137] DEBUG:  reaping dead processes

Stack trace
[edb@etpgabc bin]$ gdb -q -c data/core.31542 postgres
Reading symbols from /home/edb/pg/edb/edbpsql/bin/postgres...done.
[New LWP 31542]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
Core was generated by `postgres: edb postgres [local] SELECT        '.
Program terminated with signal 6, Aborted.
#0  0x000003ffa9841220 in raise () from /lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install glibc-2.17-260.el7.s390x libedit-3.0-12.20121213cvs.el7.s390x libffi-3.0.13-18.el7.s390x libgcc-4.8.5-39.el7.s390x libstdc++-4.8.5-39.el7.s390x llvm-toolset-6.0-llvm-libs-6.0.1-5.el7.s390x ncurses-libs-5.9-14.20130511.el7_4.s390x zlib-1.2.7-18.el7.s390x
(gdb) bt
#0  0x000003ffa9841220 in raise () from /lib64/libc.so.6
#1  0x000003ffa9842aa8 in abort () from /lib64/libc.so.6
#2  0x000003ff9f7881b4 in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6
#3  0x000003ff9f785c7e in ?? () from /lib64/libstdc++.so.6
#4  0x000003ff9f785cb6 in std::terminate() () from /lib64/libstdc++.so.6
#5  0x000003ff9f785f60 in __cxa_throw () from /lib64/libstdc++.so.6
#6  0x000003ff9f7e4468 in std::__throw_bad_function_call() () from /lib64/libstdc++.so.6
#7  0x000003ffa139e5c4 in std::function<std::unique_ptr<llvm::orc::IndirectStubsManager, std::default_delete<llvm::orc::IndirectStubsManager> > ()>::operator()() const () from /opt/rh/llvm-toolset-6.0/root/usr/lib64/libLLVM-6.0.so
#8  0x000003ffa139f2a8 in LLVMOrcCreateInstance () from /opt/rh/llvm-toolset-6.0/root/usr/lib64/libLLVM-6.0.so
#9  0x000003ffa9c8a984 in llvm_session_initialize () at llvmjit.c:670
#10 llvm_create_context (jitFlags=<optimized out>) at llvmjit.c:146
#11 0x000003ffa9c98992 in llvm_compile_expr (state=0xa8c52218) at llvmjit_expr.c:131
#12 0x0000000080219986 in ExecReadyExpr (state=state@entry=0xa8c52218) at execExpr.c:628
#13 0x000000008021cd6e in ExecBuildProjectionInfo (targetList=<optimized out>, econtext=<optimized out>, slot=<optimized out>, parent=parent@entry=0xa8c51e30, inputDesc=inputDesc@entry=0x0) at execExpr.c:472
#14 0x0000000080232ed6 in ExecAssignProjectionInfo (planstate=planstate@entry=0xa8c51e30, inputDesc=inputDesc@entry=0x0) at execUtils.c:504
#15 0x0000000080250178 in ExecInitResult (node=node@entry=0xa8c4fb98, estate=estate@entry=0xa8c51bf0, eflags=eflags@entry=16) at nodeResult.c:221
#16 0x000000008022c72c in ExecInitNode (node=node@entry=0xa8c4fb98, estate=estate@entry=0xa8c51bf0, eflags=eflags@entry=16) at execProcnode.c:164
#17 0x000000008022675e in InitPlan (eflags=16, queryDesc=0xa8c4f7d0) at execMain.c:1020
#18 standard_ExecutorStart (queryDesc=0xa8c4f7d0, eflags=16) at execMain.c:266
#19 0x0000000080388868 in PortalStart (portal=portal@entry=0xa8c91c80, params=params@entry=0x0, eflags=eflags@entry=0, snapshot=snapshot@entry=0x0) at pquery.c:518
#20 0x0000000080384b2e in exec_simple_query (query_string=query_string@entry=0xa8c06170 "select 5;") at postgres.c:1176
#21 0x00000000803852e0 in PostgresMain (argc=<optimized out>, argv=argv@entry=0xa8c55db8, dbname=0xa8c55c80 "postgres", username=<optimized out>) at postgres.c:4247
#22 0x000000008008007e in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4437
#23 BackendStartup (port=0xa8c4dc10) at postmaster.c:4128
#24 ServerLoop () at postmaster.c:1704
#25 0x000000008030c89e in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0xa8c00cc0) at postmaster.c:1377
#26 0x00000000800811f4 in main (argc=<optimized out>, argv=0xa8c00cc0) at main.c:228

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company

Re: [IBM z Systems] Getting server crash when jit_above_cost =0

From: Thomas Munro
On Wed, Apr 22, 2020 at 2:34 AM tushar <tushar.ahuja@enterprisedb.com> wrote:
> (gdb) bt
> #0  0x000003ffa9841220 in raise () from /lib64/libc.so.6
> #1  0x000003ffa9842aa8 in abort () from /lib64/libc.so.6
> #2  0x000003ff9f7881b4 in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6
> #3  0x000003ff9f785c7e in ?? () from /lib64/libstdc++.so.6
> #4  0x000003ff9f785cb6 in std::terminate() () from /lib64/libstdc++.so.6
> #5  0x000003ff9f785f60 in __cxa_throw () from /lib64/libstdc++.so.6
> #6  0x000003ff9f7e4468 in std::__throw_bad_function_call() () from /lib64/libstdc++.so.6
> #7  0x000003ffa139e5c4 in std::function<std::unique_ptr<llvm::orc::IndirectStubsManager, std::default_delete<llvm::orc::IndirectStubsManager> > ()>::operator()() const () from /opt/rh/llvm-toolset-6.0/root/usr/lib64/libLLVM-6.0.so
> #8  0x000003ffa139f2a8 in LLVMOrcCreateInstance () from /opt/rh/llvm-toolset-6.0/root/usr/lib64/libLLVM-6.0.so
> #9  0x000003ffa9c8a984 in llvm_session_initialize () at llvmjit.c:670
> #10 llvm_create_context (jitFlags=<optimized out>) at llvmjit.c:146
> #11 0x000003ffa9c98992 in llvm_compile_expr (state=0xa8c52218) at llvmjit_expr.c:131
> #12 0x0000000080219986 in ExecReadyExpr (state=state@entry=0xa8c52218) at execExpr.c:628
> #13 0x000000008021cd6e in ExecBuildProjectionInfo (targetList=<optimized out>, econtext=<optimized out>, slot=<optimized out>, parent=parent@entry=0xa8c51e30, inputDesc=inputDesc@entry=0x0) at execExpr.c:472
> #14 0x0000000080232ed6 in ExecAssignProjectionInfo (planstate=planstate@entry=0xa8c51e30, inputDesc=inputDesc@entry=0x0) at execUtils.c:504
> #15 0x0000000080250178 in ExecInitResult (node=node@entry=0xa8c4fb98, estate=estate@entry=0xa8c51bf0, eflags=eflags@entry=16) at nodeResult.c:221
> #16 0x000000008022c72c in ExecInitNode (node=node@entry=0xa8c4fb98, estate=estate@entry=0xa8c51bf0, eflags=eflags@entry=16) at execProcnode.c:164
> #17 0x000000008022675e in InitPlan (eflags=16, queryDesc=0xa8c4f7d0) at execMain.c:1020
> #18 standard_ExecutorStart (queryDesc=0xa8c4f7d0, eflags=16) at execMain.c:266
> #19 0x0000000080388868 in PortalStart (portal=portal@entry=0xa8c91c80, params=params@entry=0x0, eflags=eflags@entry=0, snapshot=snapshot@entry=0x0) at pquery.c:518
> #20 0x0000000080384b2e in exec_simple_query (query_string=query_string@entry=0xa8c06170 "select 5;") at postgres.c:1176
> #21 0x00000000803852e0 in PostgresMain (argc=<optimized out>, argv=argv@entry=0xa8c55db8, dbname=0xa8c55c80 "postgres", username=<optimized out>) at postgres.c:4247
> #22 0x000000008008007e in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4437
> #23 BackendStartup (port=0xa8c4dc10) at postmaster.c:4128
> #24 ServerLoop () at postmaster.c:1704
> #25 0x000000008030c89e in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0xa8c00cc0) at postmaster.c:1377
> #26 0x00000000800811f4 in main (argc=<optimized out>, argv=0xa8c00cc0) at main.c:228

Hi Tushar,

When testing this stuff on a few different platforms, I ran into a
switch statement in llvm that returned an empty std::function<> that
would throw std::bad_function_call like that, on architectures other
than (IIRC) x86 and ARM:

https://www.postgresql.org/message-id/CAEepm%3D39F_B3Ou8S3OrUw%2BhJEUP3p%3DwCu0ug-TTW67qKN53g3w%40mail.gmail.com

I'm not sure if you're seeing the same problem or another similar one,
but I know that Andres got a patch along those lines into llvm.  Maybe
you could try on a more recent llvm release?



Re: [IBM z Systems] Getting server crash when jit_above_cost =0

From: tushar
On 4/22/20 2:40 AM, Thomas Munro wrote:
> I'm not sure if you're seeing the same problem or another similar one,
> but I know that Andres got a patch along those lines into llvm.  Maybe
> you could try on a more recent llvm release?
Thanks a lot, Thomas. It is working fine with llvm-toolset-7.0, so it
looks like the issue is with llvm-toolset-6.0.
Yesterday, when we installed llvm-toolset-7 (yum install
llvm-toolset-7.0), there was no llvm-config available under
/opt/rh/llvm-toolset-7.0/root/usr/bin/,
so we chose llvm-toolset-6 with PG v12.
Today we ran the same yum command again, this time using an asterisk.
Now all the required files are in place under the llvm-toolset-7
directory and things look fine.

-- 
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company