Re: JIT compiling with LLVM v12.2 - Mailing list pgsql-hackers

From Thomas Munro
Subject Re: JIT compiling with LLVM v12.2
Date
Msg-id CAEepm=1W27iAV2cT-Lqhy8CH9FxT2P4B+5SWz2VLw+6dzJe+aA@mail.gmail.com
In response to Re: JIT compiling with LLVM v12.2  (Andres Freund <andres@anarazel.de>)
Responses Re: JIT compiling with LLVM v12.2  (Thomas Munro <thomas.munro@enterprisedb.com>)
Re: JIT compiling with LLVM v12.2  (Andres Freund <andres@anarazel.de>)
Re: JIT compiling with LLVM v12.2  (Thomas Munro <thomas.munro@enterprisedb.com>)
List pgsql-hackers
On Wed, Mar 21, 2018 at 4:07 PM, Andres Freund <andres@anarazel.de> wrote:
> Indeed. I've pushed a rebased version now, that basically just fixes the
> issue Thomas observed.

I set up a 32-bit i386 virtual machine and installed Debian 9.4.
The build produces these compiler warnings:

gcc -Wall -Wmissing-prototypes -Wpointer-arith
-Wdeclaration-after-statement -Wendif-labels
-Wmissing-format-attribute -Wformat-security -fno-strict-aliasing
-fwrapv -fexcess-precision=standard -g -O2  -fPIC
-D__STDC_LIMIT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_CONSTANT_MACROS
-D_GNU_SOURCE -I/usr/lib/llvm-3.9/include  -I../../../../src/include
-D_GNU_SOURCE   -c -o llvmjit.o llvmjit.c
llvmjit.c: In function ‘llvm_get_function’:
llvmjit.c:268:10: warning: cast to pointer from integer of different
size [-Wint-to-pointer-cast]
   return (void *) addr;
          ^
llvmjit.c:270:10: warning: cast to pointer from integer of different
size [-Wint-to-pointer-cast]
   return (void *) addr;
          ^
llvmjit.c: In function ‘llvm_resolve_symbol’:
llvmjit.c:842:10: warning: cast from pointer to integer of different
size [-Wpointer-to-int-cast]
   addr = (uint64_t) load_external_function(modname, funcname,
          ^
llvmjit.c:845:10: warning: cast from pointer to integer of different
size [-Wpointer-to-int-cast]
   addr = (uint64_t) LLVMSearchForAddressOfSymbol(symname);
          ^
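
All four warnings come from casting directly between a 64-bit integer and
a pointer, which differ in width on i386.  A minimal sketch of one common
way to avoid that is to go through uintptr_t, which always has pointer
width.  The helper names below are hypothetical and are not taken from the
patch; this only illustrates the casting pattern, not how llvmjit.c should
necessarily be changed:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helpers (not part of the patch): convert between a 64-bit
 * symbol address and a native pointer without casting directly between
 * types of different width.
 */
static inline void *
addr_to_pointer(uint64_t addr)
{
	/* Narrow to pointer width first; on 64-bit targets this is a no-op. */
	return (void *) (uintptr_t) addr;
}

static inline uint64_t
pointer_to_addr(const void *ptr)
{
	/* Convert at pointer width, then widen; zero-extends on 32-bit targets. */
	return (uint64_t) (uintptr_t) ptr;
}
```

With something like this, "return (void *) addr;" would become "return
addr_to_pointer(addr);", and the two assignments in llvm_resolve_symbol
would wrap their right-hand sides in pointer_to_addr().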

Then "make check" bombs:

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0xac233453 in llvm::SelectionDAG::getNode(unsigned int,
llvm::SDLoc const&, llvm::EVT, llvm::SDValue) () from
/usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
(gdb) bt
#0  0xac233453 in llvm::SelectionDAG::getNode(unsigned int,
llvm::SDLoc const&, llvm::EVT, llvm::SDValue) () from
/usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#1  0xac270c29 in llvm::TargetLowering::SimplifySetCC(llvm::EVT,
llvm::SDValue, llvm::SDValue, llvm::ISD::CondCode, bool,
llvm::TargetLowering::DAGCombinerInfo&, llvm::SDLoc const&) const ()
from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#2  0xac11d3a8 in ?? () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#3  0xac11ef0b in ?? () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#4  0xac12030e in llvm::SelectionDAG::Combine(llvm::CombineLevel,
llvm::AAResults&, llvm::CodeGenOpt::Level) () from
/usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#5  0xac24ccec in llvm::SelectionDAGISel::CodeGenAndEmitDAG() () from
/usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#6  0xac24d239 in
llvm::SelectionDAGISel::SelectBasicBlock(llvm::ilist_iterator<llvm::Instruction
const>, llvm::ilist_iterator<llvm::Instruction const>, bool&) () from
/usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#7  0xac25466f in
llvm::SelectionDAGISel::SelectAllBasicBlocks(llvm::Function const&) ()
from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#8  0xac25773c in
llvm::SelectionDAGISel::runOnMachineFunction(llvm::MachineFunction&)
() from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#9  0xad356414 in ?? () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#10 0xabf5a019 in
llvm::MachineFunctionPass::runOnFunction(llvm::Function&) () from
/usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#11 0xabdefaeb in llvm::FPPassManager::runOnFunction(llvm::Function&)
() from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#12 0xabdefe35 in llvm::FPPassManager::runOnModule(llvm::Module&) ()
from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#13 0xabdf019a in llvm::legacy::PassManagerImpl::run(llvm::Module&) ()
from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#14 0xabdf037f in llvm::legacy::PassManager::run(llvm::Module&) ()
from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#15 0xacb3c3de in
std::_Function_handler<llvm::object::OwningBinary<llvm::object::ObjectFile>
(llvm::Module&), llvm::orc::SimpleCompiler>::_M_invoke(std::_Any_data
const&, llvm::Module&) () from
/usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#16 0xacb37d00 in ?? () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#17 0xacb384f8 in ?? () from /usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#18 0xacb388d5 in LLVMOrcAddEagerlyCompiledIR () from
/usr/lib/i386-linux-gnu/libLLVM-3.9.so.1
#19 0xae7bb3e4 in llvm_compile_module (context=0x20858a0) at llvmjit.c:539
#20 llvm_get_function (context=0x20858a0, funcname=0x21da818
"evalexpr_2_3") at llvmjit.c:244
#21 0xae7c333e in ExecRunCompiledExpr (state=0x2119634,
econtext=0x211810c, isNull=0xbfdd138e "\207") at llvmjit_expr.c:2563
#22 0x00745e10 in ExecEvalExprSwitchContext (isNull=0xbfdd138e "\207",
econtext=<optimized out>, state=0x2119634) at
../../../src/include/executor/executor.h:305
#23 ExecQual (econtext=<optimized out>, state=0x2119634) at
../../../src/include/executor/executor.h:374
#24 ExecNestLoop (pstate=<optimized out>) at nodeNestloop.c:214
#25 0x00748ddd in ExecProcNode (node=0x2118080) at
../../../src/include/executor/executor.h:239
#26 ExecSort (pstate=0x2117ff4) at nodeSort.c:107
#27 0x0071e9d2 in ExecProcNode (node=0x2117ff4) at
../../../src/include/executor/executor.h:239
#28 ExecutePlan (execute_once=<optimized out>, dest=0x0,
direction=NoMovementScanDirection, numberTuples=<optimized out>,
sendTuples=<optimized out>, operation=CMD_SELECT,
use_parallel_mode=<optimized out>, planstate=0x2117ff4,
estate=0x2117ee8) at execMain.c:1729
#29 standard_ExecutorRun (queryDesc=0x207da50,
direction=ForwardScanDirection, count=0, execute_once=1 '\001') at
execMain.c:365
#30 0x00883e8d in PortalRunSelect (portal=portal@entry=0x20a7f58,
forward=forward@entry=1 '\001', count=0, count@entry=2147483647,
dest=0x21a8888) at pquery.c:932
#31 0x008856a0 in PortalRun (portal=0x20a7f58, count=2147483647,
isTopLevel=1 '\001', run_once=1 '\001', dest=0x21a8888,
altdest=0x21a8888, completionTag=0xbfdd1620 "") at pquery.c:773
#32 0x008808a7 in exec_simple_query
(query_string=query_string@entry=0x205a628 "SELECT '' AS tf_12_ff_4,
BOOLTBL1.*, BOOLTBL2.*\n   FROM BOOLTBL1, BOOLTBL2\n   WHERE
BOOLTBL2.f1 = BOOLTBL1.f1 or BOOLTBL1.f1 = bool 'true'\n   ORDER BY
BOOLTBL1.f1, BOOLTBL2.f1;")
    at postgres.c:1121
#33 0x0088270e in PostgresMain (argc=1, argv=0x2083c44,
dbname=<optimized out>, username=0x2083aa0 "munro") at postgres.c:4147
#34 0x00552cff in BackendRun (port=0x207d518) at postmaster.c:4409
#35 BackendStartup (port=0x207d518) at postmaster.c:4081
#36 ServerLoop () at postmaster.c:1754
#37 0x007fc68f in PostmasterMain (argc=<optimized out>,
argv=<optimized out>) at postmaster.c:1362
#38 0x0055475a in main (argc=<optimized out>, argv=<optimized out>) at
main.c:228
(gdb)

That's with clang-3.9 and llvm-3.9-dev installed, which configure
automagically found.

"make -C src/interfaces/ecpg/test check" consistently fails on my macOS machine:

test compat_oracle/char_array     ... stderr source FAILED

*** /Users/munro/projects/postgresql/src/interfaces/ecpg/test/expected/compat_oracle-char_array.stdout
 2018-03-21 09:46:33.000000000 +1300
--- /Users/munro/projects/postgresql/src/interfaces/ecpg/test/results/compat_oracle-char_array.stdout
  2018-03-21 19:13:43.000000000 +1300
***************
*** 1,10 ****
  Full Str.  :  Short  Ind.
! "          ": "    "  -1
! "AB        ": "AB  "  0
! "ABCD      ": "ABCD"  0
! "ABCDE     ": "ABCD"  5
! "ABCDEF    ": "ABCD"  6
! "ABCDEFGHIJ": "ABCD"  10

  GOOD-BYE!!

--- 1,10 ----
  Full Str.  :  Short  Ind.
! "": ""  0
! "AB": "AB"  0
! "ABCD": "ABCD"  0
! "ABCDE": "ABCDE"  0
! "ABCDEF": "ABCDE"  6
! "ABCDEFGHIJ": "ABCDE"  10

  GOOD-BYE!!


======================================================================

*** /Users/munro/projects/postgresql/src/interfaces/ecpg/test/expected/compat_oracle-char_array.stderr
 2018-03-21 16:27:05.000000000 +1300
--- /Users/munro/projects/postgresql/src/interfaces/ecpg/test/results/compat_oracle-char_array.stderr
  2018-03-21 19:13:43.000000000 +1300
***************
*** 90,96 ****
  [NO_PID]: sqlca: code: 0, state: 00000
  [NO_PID]: ecpg_get_data on line 50: RESULT: ABCDE offset: -1; array: no
  [NO_PID]: sqlca: code: 0, state: 00000
- Warning: At least one column was truncated
  [NO_PID]: ecpg_execute on line 50: query: fetch C; with 0
parameter(s) on connection ecpg1_regression
  [NO_PID]: sqlca: code: 0, state: 00000
  [NO_PID]: ecpg_execute on line 50: using PQexec
--- 90,95 ----

======================================================================

I couldn't immediately see what was going wrong there since I'm not
too familiar with ecpg...  That's with vendor cc/c++ and LLVM 5.0 and
6.0, using a couple of different clang versions.

While trying out many combinations of versions on different
OSes, I found another way to screw up that I wanted to report here.
It's obvious that this is doomed if you know what's going on, but I
thought the failure mode was interesting enough to mention.  There
is a hazard for people running systems where the vendor ships some
version (possibly a mystery version) of clang in the PATH but you have
to get LLVM separately (e.g. from ports/brew/whatever):

1.  If you use macOS High Sierra's current /usr/bin/clang ("9.0.0"),
i.e. the default if you didn't set CLANG to something else when you ran
./configure, and you build against LLVM 3.9, then llvm-lto gives this
message during "make install":

Invalid summary version 3, 1 expected
error: can't create ModuleSummaryIndexObjectFile for buffer: Corrupted bitcode

Then it segfaults!  Presumably clang "9.0.0" derives from a more
recent upstream version (why must they mess with the reported
version?!).  Apple's clang 9.0.0 bitcode works fine with LLVM 5.0.  I
don't have 4.0 to hand to test.

2.  If you use FreeBSD 11's current /usr/bin/clang (4.0) and you build
against LLVM 3.9 then it's the same:

Invalid summary version 3, 1 expected
error: can't create ModuleSummaryIndexObjectFile for buffer: Corrupted bitcode
gmake[3]: *** [Makefile:252: install-postgres-bitcode] Segmentation
fault (core dumped)

It works fine with 4.0 or 5.0, as expected.

Neither of these cases should be too surprising, and users of those
operating systems can easily get a newer LLVM or an older clang -- it was
just interesting to see exactly what goes wrong and exactly when.  I
suppose there could be a configure test to see if your $CLANG can play
nicely with your $LLVM_CONFIG.

--
Thomas Munro
http://www.enterprisedb.com

