Re: BUG #16971: Incompatible datalayout errors with llvmjit - Mailing list pgsql-bugs
From | Andres Freund |
---|---|
Subject | Re: BUG #16971: Incompatible datalayout errors with llvmjit |
Date | |
Msg-id | 20210420225228.qr4x6zv3hqjorh5t@alap3.anarazel.de |
In response to | BUG #16971: Incompatible datalayout errors with llvmjit (PG Bug reporting form <noreply@postgresql.org>) |
List | pgsql-bugs |
Hi,

On 2021-04-20 14:42:28 -0700, Tom Stellard wrote:
> On 4/20/21 12:29 PM, Andres Freund wrote:
> > On 2021-04-19 18:29:52 +0000, PG Bug reporting form wrote:
> > > In our Fedora builds, we are getting errors[1] in the postgresql tests due
> > > to incompatible datalayouts between the JIT engine and the LLVM modules
> > > being compiled. The problem is that the JIT engine is being created with
> > > host specific CPU and features, while the datalayout for the compiled module
> > > is being taken from llvmjit_types.bc which is compiled without any specified
> > > CPU type or features.
> >
> > It's very odd that features would change the data layout - analogizing
> > with plain C code that'd mean that you cannot link a binary compiled
> > with something like -mavx2 against a library compiled without. To me
> > this smells like a bug somewhere lower level.

> You are correct that is odd, and to be honest, I didn't think that LLVM
> targets were allowed to change the datalayout based on the CPU type.

That was my impression...

> > Reformatting the error yields:
> > ERROR: failed to JIT module: Added modules have incompatible data layouts:
> > E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64 (module) vs
> > E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64 (jit)
> >
> > The -v128:64 is about how to align vectors. Skimming the relevant LLVM
> > code I don't see why it'd be included in JIted code but not native code.

> The relevant code in LLVM is here:
> https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/SystemZ/SystemZTargetMachine.cpp#L88

Thanks for the pointer!

> I'm checking with upstream LLVM to see if this allowed or not. However,
> this behavior is present in at least LLVM 11 and LLVM 12 (I haven't
> checked earlier versions), so postgresql will have to deal with this
> somehow.

Yea, seems we need to add a workaround for the issue, given how much
longer LLVM releases tend to be used than they are maintained.
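
To make the mismatch in the error message concrete, here is a minimal sketch that splits two data layout strings on '-' and lists the components present in one but not the other. It is only an illustration: splitLayout, onlyIn and checkReportedLayouts are hypothetical names, not LLVM or PostgreSQL functions, and the code treats the layout as an opaque list of components rather than parsing it the way LLVM does.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>
#include <vector>

// Split a data layout string into its '-'-separated components.
// Empty components are skipped.
std::vector<std::string> splitLayout(const std::string &layout) {
    std::vector<std::string> parts;
    std::stringstream ss(layout);
    std::string item;
    while (std::getline(ss, item, '-'))
        if (!item.empty())
            parts.push_back(item);
    return parts;
}

// Return the components of `a` that do not appear anywhere in `b`,
// in the order they occur in `a`.
std::vector<std::string> onlyIn(const std::string &a, const std::string &b) {
    const std::vector<std::string> partsB = splitLayout(b);
    const std::set<std::string> inB(partsB.begin(), partsB.end());
    std::vector<std::string> result;
    for (const std::string &part : splitLayout(a))
        if (inB.count(part) == 0)
            result.push_back(part);
    return result;
}

// Usage example on the two layouts from the error message.
void checkReportedLayouts() {
    const std::string moduleLayout =
        "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64";
    const std::string jitLayout =
        "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64";

    // The JIT layout carries one extra component: the vector alignment.
    const std::vector<std::string> extra = onlyIn(jitLayout, moduleLayout);
    assert(extra.size() == 1 && extra[0] == "v128:64");

    // The module layout has nothing the JIT layout lacks.
    assert(onlyIn(moduleLayout, jitLayout).empty());
}
```

On the two strings from the report, the only difference is the v128:64 component on the JIT side, which matches the reading above that the disagreement is about vector alignment.
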

One simple hack would be to add "-vector" to the list of features on
s390x, which afaict should avoid the issue for now? In LLVM's main
branch the code is this:

    // Determine whether we use the vector ABI.
    static bool UsesVectorABI(StringRef CPU, StringRef FS) {
      // We use the vector ABI whenever the vector facility is avaiable.
      // This is the case by default if CPU is z13 or later, and can be
      // overridden via "[+-]vector" feature string elements.
      bool VectorABI = true;
      bool SoftFloat = false;
      if (CPU.empty() || CPU == "generic" ||
          CPU == "z10" || CPU == "z196" || CPU == "zEC12" ||
          CPU == "arch8" || CPU == "arch9" || CPU == "arch10")
        VectorABI = false;

      SmallVector<StringRef, 3> Features;
      FS.split(Features, ',', -1, false /* KeepEmpty */);
      for (auto &Feature : Features) {
        if (Feature == "vector" || Feature == "+vector")
          VectorABI = true;
        if (Feature == "-vector")
          VectorABI = false;
        if (Feature == "soft-float" || Feature == "+soft-float")
          SoftFloat = true;
        if (Feature == "-soft-float")
          SoftFloat = false;
      }

      return VectorABI && !SoftFloat;
    }

So appending -vector should be sufficient? But we'd have to do so only
after checking that there's a data layout mismatch, because otherwise
we'd just create a new problem if somebody compiles with -march=native
or such.

Greetings,

Andres Freund
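
The conditional workaround described above could be sketched roughly as follows. This is a standalone illustration, not PostgreSQL or LLVM code: usesVectorABI restates the quoted decision logic with std::string so that it builds without LLVM headers, and adjustFeatures and checkWorkaround are hypothetical helpers. A plain string comparison stands in for the real data layout check, on the assumption that any mismatch on s390x comes from the vector ABI.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Restatement of the quoted UsesVectorABI() logic using std::string
// instead of llvm::StringRef. Later feature entries override earlier ones.
bool usesVectorABI(const std::string &cpu, const std::string &features) {
    bool vectorABI = true;
    bool softFloat = false;
    if (cpu.empty() || cpu == "generic" || cpu == "z10" || cpu == "z196" ||
        cpu == "zEC12" || cpu == "arch8" || cpu == "arch9" || cpu == "arch10")
        vectorABI = false;

    std::stringstream ss(features);
    std::string feature;
    while (std::getline(ss, feature, ',')) {
        if (feature == "vector" || feature == "+vector")
            vectorABI = true;
        if (feature == "-vector")
            vectorABI = false;
        if (feature == "soft-float" || feature == "+soft-float")
            softFloat = true;
        if (feature == "-soft-float")
            softFloat = false;
    }
    return vectorABI && !softFloat;
}

// Hypothetical helper: choose the feature string to hand to the JIT.
// "-vector" is appended only when the module and JIT layouts disagree,
// so a module built with the vector ABI is left alone.
std::string adjustFeatures(const std::string &features,
                           const std::string &moduleLayout,
                           const std::string &jitLayout) {
    if (moduleLayout == jitLayout)
        return features;
    return features.empty() ? std::string("-vector") : features + ",-vector";
}

// Usage example with the layouts from the error message.
void checkWorkaround() {
    const std::string moduleLayout =
        "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64";
    const std::string jitLayout =
        "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64";

    // Host features on a z13 select the vector ABI ...
    assert(usesVectorABI("z13", "+vector"));

    // ... and the trailing "-vector" added on mismatch turns it off again.
    const std::string fixed = adjustFeatures("+vector", moduleLayout, jitLayout);
    assert(fixed == "+vector,-vector");
    assert(!usesVectorABI("z13", fixed));

    // Matching layouts: the feature string is passed through unchanged.
    assert(adjustFeatures("+vector", jitLayout, jitLayout) == "+vector");
}
```

Because the feature list is processed in order, a "-vector" entry at the end wins over an earlier "+vector", which is why appending is enough. Guarding the append behind the layout comparison keeps the vector ABI in place when the module itself was compiled with it, as with -march=native.
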