Re: BUG #16971: Incompatible datalayout errors with llvmjit - Mailing list pgsql-bugs

From Andres Freund
Subject Re: BUG #16971: Incompatible datalayout errors with llvmjit
Date
Msg-id 20210420192937.3zu4wpdemxwfvo4u@alap3.anarazel.de
In response to BUG #16971: Incompatible datalayout errors with llvmjit  (PG Bug reporting form <noreply@postgresql.org>)
Responses Re: BUG #16971: Incompatible datalayout errors with llvmjit  (Tom Stellard <tstellar@redhat.com>)
List pgsql-bugs
Hi,

Thanks to tgl for pointing me to this thread...

On 2021-04-19 18:29:52 +0000, PG Bug reporting form wrote:
> In our Fedora builds, we are getting errors[1] in the postgresql tests due
> to incompatible datalayouts between the JIT engine and the LLVM modules
> being compiled.  The problem is that the JIT engine is being created with
> host specific CPU and features, while the datalayout for the compiled module
> is being taken from llvmjit_types.bc which is compiled without any specified
> CPU type or features.

It's very odd that features would change the data layout. By analogy
with plain C code, that'd mean that you cannot link a binary compiled
with something like -mavx2 against a library compiled without it. To me
this smells like a bug somewhere at a lower level.

Reformatting the error yields:
ERROR:  failed to JIT module: Added modules have incompatible data layouts:
E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-        a:8:16-n32:64 (module) vs
E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64 (jit)

The -v128:64 is about how to align vectors. Skimming the relevant LLVM
code, I don't see why it'd be included in JITed code but not native code.
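
To make the difference between the two layouts explicit rather than
eyeballing the aligned strings, something like the following could be
used. It's only a sketch: it assumes that a datalayout string can be
treated as a '-'-separated list of specifications, and the function
names are made up for illustration, they are not part of PG or LLVM.

```python
def parse_datalayout(layout):
    """Split a datalayout string into its '-'-separated specifications."""
    return [spec.strip() for spec in layout.split("-") if spec.strip()]


def datalayout_diff(module_layout, jit_layout):
    """Return (specs only in the module, specs only in the JIT)."""
    module_specs = parse_datalayout(module_layout)
    jit_specs = parse_datalayout(jit_layout)
    only_module = [s for s in module_specs if s not in jit_specs]
    only_jit = [s for s in jit_specs if s not in module_specs]
    return only_module, only_jit


# The two layouts from the error message above.
module_layout = "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-a:8:16-n32:64"
jit_layout = "E-m:e-i1:8:16-i8:8:16-i64:64-f128:64-v128:64-a:8:16-n32:64"

print(datalayout_diff(module_layout, jit_layout))
```

For the strings above, the only differing specification is the v128:64
one, on the JIT side.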

Just to be sure, I take it that
checking for llvm-config... /usr/bin/llvm-config
checking for clang... /usr/bin/clang
refer to an llvm & clang compiled from the same code?

I temporarily had access to s390x to debug an unrelated issue in the
past, and there I did hit this problem, but IIRC only because of clang
vs llvm version mismatches.
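
A quick way to rule out the obvious case is to compare the versions the
two tools report. The sketch below assumes that both `llvm-config
--version` and `clang --version` print a dotted x.y.z version somewhere
in their output. The helper names and the sample strings are
hypothetical, and matching version numbers do not prove that both were
built from the same source; a mismatch is the only conclusive result.

```python
import re
import subprocess


def tool_output(cmd):
    """Run a command and return its stdout (raises if the command fails)."""
    return subprocess.run(cmd, capture_output=True, text=True, check=True).stdout


def extract_version(text):
    """Return the first x.y.z version found in text, or None."""
    match = re.search(r"\d+\.\d+\.\d+", text)
    return match.group(0) if match else None


def same_reported_version(llvm_config_out, clang_out):
    """True if both outputs contain a version and the versions are equal."""
    llvm_version = extract_version(llvm_config_out)
    clang_version = extract_version(clang_out)
    return llvm_version is not None and llvm_version == clang_version


# Hypothetical sample outputs, for illustration only.
print(same_reported_version("12.0.0\n", "clang version 12.0.0\n"))
print(same_reported_version("12.0.0\n", "clang version 11.1.0\n"))
```

On a real system the inputs would come from
tool_output(["/usr/bin/llvm-config", "--version"]) and
tool_output(["/usr/bin/clang", "--version"]).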

Unfortunately it is a bit hard to debug without access to an s390x
box... Can you provide that? If not I might ask the Debian folks for
access to one of their porter machines.

FWIW, the Debian s390x build seems to succeed at the moment:
https://buildd.debian.org/status/fetch.php?pkg=postgresql-13&arch=s390x&ver=13.2-1&stamp=1613044202&raw=0


> One way to fix this would be to add -march=native to the %.bc rules in
> src/Makefile.global.in.  However, this will only work when the build system
> and the run system are the same.

> I think to fix this correctly, the llvmjit_types.bc file will need to be
> compiled when the JIT engine is initialized at runtime, so that it can use
> the same datalayout as the JIT engine.

That'd require headers to be present, which I don't think we should
require... And more importantly, it'd not go very far, because we also
have a lot of other .bc files that will have the layout embedded (for
inlining functions/operators into JITed code). Which'd then require all
the source code to be present and to be compiled into bitcode. Not an,
uh, satisfying option.



> [1] https://kojipkgs.fedoraproject.org//work/tasks/2182/66082182/build.log

Random thing I noticed while scrolling through the log:
> configure: WARNING: unrecognized options: --disable-dependency-tracking

PG requires dependency tracking to be explicitly enabled, and it's a
different flag name (--enable-depend).

Greetings,

Andres Freund


