Thread: Segfault in jit tuple deforming on arm64 due to LLVM issue

Segfault in jit tuple deforming on arm64 due to LLVM issue

From
Anthonin Bonnefoy
Date:
Hi!

I have an instance that started to consistently crash with segfault or
bus error and most of the generated coredumps had corrupted stacks.
Some salvageable frames showed the error happening within
ExecRunCompiledExpr. Sure enough, running the query with jit disabled
stopped the crashes. The issue happens with the following setup:

Ubuntu jammy on arm64, 30G
postgresql-14 14.12-1.pgdg22.04+1
libllvm15 1:15.0.7-0ubuntu0.22.04.3

I was able to isolate the impacted database (a pg_dump of the
table was not enough, a base backup had to be used) and reproduce the
issue on a debug build of PostgreSQL. This time there was no crash, but
the backend was stuck in an infinite loop within jit tuple deforming:

#0  0x0000ec53660aa14c in deform_0_1 ()
#1  0x0000ec53660aa064 in evalexpr_0_0 ()
#2  0x0000ab8f9b322948 in ExecEvalExprSwitchContext
(isNull=0xfffff47c3c87, econtext=0xab8fd0f13878, state=0xab8fd0f13c50)
at executor/./build/../src/include/executor/executor.h:342
#3  ExecProject (projInfo=0xab8fd0f13c48) at
executor/./build/../src/include/executor/executor.h:376

Looking at the generated assembly, the infinite loop happens between
deform_0_1+140 and deform_0_1+188

// Store address page in x11 register
0xec53660aa130 <deform_0_1+132> adrp    x11, 0xec53fd308000
// Start of the infinite loop
0xec53660aa138 <deform_0_1+140> adr     x8, 0xec53660aa138 <deform_0_1+140>
// Load the content of 0xec53fd308000[x12] in x10, x12 was 0 at that time
0xec53660aa13c <deform_0_1+144> ldrsw   x10, [x11, x12, lsl #2]
// Add the loaded offset to x8
0xec53660aa140 <deform_0_1+148> add     x8, x8, x10
...
// Branch to address in x8. Since x10 was 0, x8 has the value
// deform_0_1+140, creating the infinite loop
0xec53660aa168 <deform_0_1+188> br      x8

Looking at the content of 0xec53fd308000, we see only zeros stored
at that address.

x/6 0xec53fd308000
0xec53fd308000: 0x00000000      0x00000000      0x00000000      0x00000000
0xec53fd308010: 0x00000000      0x00000000

The assembly matches the code for the find_start switch case in
llvmjit_deform[1]. The content at the address 0xec53fd308000 should
contain the offset table from the PC to branch to the correct
attcheckattnoblocks block. As a comparison, if I execute a query not
impacted by the issue (the size of the jit compiled module seems to be
a factor), I can see that the offset table was correctly filled.

x/6 0xec55fd307000
0xec55fd307000: 0x00000060      0x00000098      0x000000e8      0x00000170
0xec55fd307010: 0x0000022c      0x000002e8
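
To make the dispatch concrete, here is a small model of what the loop does: the branch target is the address loaded by adr plus the offset read from the table. This is only an illustrative sketch. The function name branch_target is made up, and the populated offsets come from the unaffected query (a different module), so they are reused here purely to show what a filled table looks like.

```python
# Address loaded by `adr x8, .` at deform_0_1+140 (from the gdb session above).
LOOP_START = 0xEC53660AA138


def branch_target(table, index):
    """Address that `br x8` jumps to: the adr base plus the table entry."""
    return LOOP_START + table[index]


# Table as found in the broken module: never filled in, all zeros.
zeroed_table = [0, 0, 0, 0, 0, 0]
# Offsets dumped from the unaffected query, reused only as an illustration.
filled_table = [0x60, 0x98, 0xE8, 0x170, 0x22C, 0x2E8]

for name, table in (("zeroed", zeroed_table), ("filled", filled_table)):
    target = branch_target(table, 0)
    print(f"{name}: br -> {target:#x}, self-loop: {target == LOOP_START}")
```

With a zero offset the branch lands back on the adr instruction, which is the infinite loop observed in the debug build.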

I suspected something was erasing the content of the offset table,
so I checked with rr. However, the memory was only initialized and
nothing was ever written to this address. I then started to suspect a
possible LLVM issue and ran the query against a debug build of
llvm_jit. It immediately triggered the following assertion[2]:

void llvm::RuntimeDyldELF::resolveAArch64Relocation(const
llvm::SectionEntry &, uint64_t, uint64_t, uint32_t, int64_t):
Assertion `isInt<33>(Result) && "overflow check failed for
relocation"' failed.

This happens when LLVM is resolving relocations.

#5  __GI___assert_fail (assertion=0xf693f214771a "isInt<33>(Result) &&
\"overflow check failed for relocation\"", file=0xf693f2147269
"/var/lib/postgresql/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp",
line=507, function=0xf693f214754f "void
llvm::RuntimeDyldELF::resolveAArch64Relocation(const
llvm::SectionEntry &, uint64_t, uint64_t, uint32_t, int64_t)") at
./assert/assert.c:101
#6  llvm::RuntimeDyldELF::resolveAArch64Relocation () at
/var/lib/postgresql/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:507
#7  llvm::RuntimeDyldELF::resolveRelocation () at
/var/lib/postgresql/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:1044
#8  llvm::RuntimeDyldELF::resolveRelocation () at
/var/lib/postgresql/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp:1026
#9  llvm::RuntimeDyldImpl::resolveRelocationList () at
/var/lib/postgresql/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:1112
#10 llvm::RuntimeDyldImpl::resolveLocalRelocations () at
/var/lib/postgresql/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:157
#11 llvm::RuntimeDyldImpl::finalizeAsync() at
/var/lib/postgresql/llvm-project/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyld.cpp:1247

During the assertion failure, I have the following values:
Value: 0xfbc84fab9000
FinalAddress: 0xfbc5b9cea12c
Addend: 0x0
Result: 0x295dcf000

The result does not fit in a signed 33-bit integer (a +/-4GB range),
triggering the assert.
Looking at the sections created by LLVM in allocateSection[3], we have
3 sections created:
.text     {Address = 0xfbc5b9cea000, AllocatedSize = 90112}
.rodata   {Address = 0xfbc84fab9000, AllocatedSize = 4096}
.eh_frame {Address = 0xfbc84fab7000, AllocatedSize = 8192}
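
The values above can be checked by hand. The sketch below assumes the relocation is the page-relative kind used by ADRP (difference between the 4KB page of the target and the 4KB page of the instruction), which matches the reported Result. The helpers page and is_int are illustrative names that mirror the idea of LLVM's isInt<N>, not LLVM's actual code.

```python
def page(addr):
    """Round an address down to its 4KB page, as ADRP works on pages."""
    return addr & ~0xFFF


def is_int(bits, value):
    """True if value fits in a signed integer of the given width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


# Values captured during the assertion failure.
VALUE = 0xFBC84FAB9000          # address of the .rodata section
FINAL_ADDRESS = 0xFBC5B9CEA12C  # address of the ADRP instruction in .text
ADDEND = 0x0

result = page(VALUE + ADDEND) - page(FINAL_ADDRESS)
print(hex(result), is_int(33, result))  # prints: 0x295dcf000 False
```

The distance is roughly 11GB, well outside the +/-4GB that a signed 33-bit value can express.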

When resolving relocation, the difference between the rodata section
and the PC is computed and stored in the ADRP instruction. However,
when a new section is allocated, LLVM will request a new memory block
from the memory allocator[4]. The MemGroup.Near is passed as the start
hint of mmap, but that's only a hint and the kernel doesn't guarantee
that the newly allocated block will be nearby. With the impacted
query, there is a gap of more than 10GB between the .text section and
the .rodata section, making it impossible for the code in the .text
section to correctly fetch data from the .rodata section, as the
address in ADRP is limited to a +/-4GB range.

The ABI mentions that the GOT section should be within 4GB of the
text section[5]. In this case there's no GOT section, as the offsets
are stored in the .rodata section, but the constraint is similar.
This is a known LLVM issue[6] that impacted Impala, Numba and Julia.
There's an open PR[7] to fix the issue by allocating all sections as
a single memory block, avoiding the gaps between sections. There's
also a related discussion on the LLVM Discourse[8].

A possible mitigation is to switch from RuntimeDyld to JITLink, but
this requires at least LLVM 15, as LLVM 14 doesn't have any significant
relocation support for aarch64[9]. I did test using JITLink on my
impacted db and it seems to fix the issue. JITLink has no exposed C
interface though, so it requires additional wrapping.

I don't necessarily have a good answer for this issue. I've tried to
tweak relocation settings or the jit code to avoid relocation without
much success. Ideally, the LLVM fix will be merged and backported,
but the PR has been open for some time now. I've seen multiple
segfault reports that look similar to this issue (examples: [10], [11])
but I don't think they were linked to the LLVM bug, so I figured I would
at least share my findings.

[1] https://github.com/postgres/postgres/blob/REL_14_STABLE/src/backend/jit/llvm/llvmjit_deform.c#L364-L382
[2]
https://github.com/llvm/llvm-project/blob/release/14.x/llvm/lib/ExecutionEngine/RuntimeDyld/RuntimeDyldELF.cpp#L501-L513
[3]
https://github.com/llvm/llvm-project/blob/release/14.x/llvm/lib/ExecutionEngine/SectionMemoryManager.cpp#L41C32-L41C47
[4] https://github.com/llvm/llvm-project/blob/release/14.x/llvm/lib/ExecutionEngine/SectionMemoryManager.cpp#L94-L110
[5] https://github.com/ARM-software/abi-aa/blob/main/sysvabi64/sysvabi64.rst#7code-models
[6] https://github.com/llvm/llvm-project/issues/71963
[7] https://github.com/llvm/llvm-project/pull/71968
[8] https://discourse.llvm.org/t/llvm-rtdyld-aarch64-abi-relocation-restrictions/74616
[9] https://github.com/llvm/llvm-project/blob/release/14.x/llvm/lib/ExecutionEngine/JITLink/ELF_aarch64.cpp#L75-L84
[10]
https://www.postgresql.org/message-id/flat/CABa%2BnRvwZy_5t1QF9NJNGwAf03tv_PO_Sg1FsN1%2B-3Odb1XgBA%40mail.gmail.com
[11]
https://www.postgresql.org/message-id/flat/CADAf1kavcN-kY%3DvEm3MYxhUa%2BrtGFs7tym5d7Ee6Ni2cwwxGqQ%40mail.gmail.com

Regards,
Anthonin Bonnefoy



Re: Segfault in jit tuple deforming on arm64 due to LLVM issue

From
Thomas Munro
Date:
On Thu, Aug 22, 2024 at 7:22 PM Anthonin Bonnefoy
<anthonin.bonnefoy@datadoghq.com> wrote:
> Ideally, the llvm fix will be merged and backported
> in llvm but the PR has been open for some time now.

I fear that back-porting, for the LLVM project, would mean "we fix it
in main/20.x, and also back-port it to 19.x".  Do distros back-port
further?

Nice detective work!

The JITLink change sounds interesting, and like something we need to
do sooner or later.



Re: Segfault in jit tuple deforming on arm64 due to LLVM issue

From
Anthonin Bonnefoy
Date:
On Thu, Aug 22, 2024 at 12:33 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> I fear that back-porting, for the LLVM project, would mean "we fix it
> in main/20.x, and also back-port it to 19.x".  Do distros back-port
> further?

That's also my fear. I'm not familiar with distros' back-port policies,
but eyeballing the Ubuntu package changelog[1], it seems to be mostly
build fixes.

Given that there's no visible way to fix the relocation issue, I
wonder if jit shouldn't be disabled for arm64 until either the
RuntimeDyld fix is merged or the switch to JITLink is done. Disabling
jit tuple deforming may be enough but I'm not confident the issue
won't happen in a different part.

[1] https://launchpad.net/ubuntu/+source/llvm-toolchain-16/+changelog



Re: Segfault in jit tuple deforming on arm64 due to LLVM issue

From
Thomas Munro
Date:
On Sat, Aug 24, 2024 at 12:22 AM Anthonin Bonnefoy
<anthonin.bonnefoy@datadoghq.com> wrote:
> On Thu, Aug 22, 2024 at 12:33 PM Thomas Munro <thomas.munro@gmail.com> wrote:
> > I fear that back-porting, for the LLVM project, would mean "we fix it
> > in main/20.x, and also back-port it to 19.x".  Do distros back-port
> > further?
>
> That's also my fear, I'm not familiar with distros back-port policy
> but eyeballing ubuntu package changelog[1], it seems to be mostly
> build fixes.
>
> Given that there's no visible way to fix the relocation issue, I
> wonder if jit shouldn't be disabled for arm64 until either the
> RuntimeDyld fix is merged or the switch to JITLink is done. Disabling
> jit tuple deforming may be enough but I'm not confident the issue
> won't happen in a different part.

We've experienced something a little similar before: In the early days
of PostgreSQL LLVM, it didn't work at all on ARM or POWER.  We sent a
trivial fix[1] upstream that landed in LLVM 7; since it was a small
and obvious problem and it took a long time for some distros to ship
LLVM 7, we even contemplated hot-patching that LLVM function with our
own copy (but, ugh, only for about 7 nanoseconds).  That was before we
turned JIT on by default, and was also easier to deal with because it
was an obvious consistent failure in basic tests, so packagers
probably just disabled the build option on those architectures.  IIUC
this one is a random and rare crash depending on malloc() and perhaps
also the working size of your virtual memory dart board.  (Annoyingly,
I had tried to reproduce this quite a few times on small ARM systems
when earlier reports came in, d'oh!).

This degree of support window mismatch is probably what triggered RHEL
to develop their new rolling LLVM version policy.  Unfortunately, it's
the other distros that tell *us* which versions to support, and not
the reverse (for example CF #4920 is about to drop support for LLVM <
14, but that will only be for PostgreSQL 18+).

Ultimately, if it doesn't work, and doesn't get fixed, it's hard for
us to do much about it.  But hmm, this is probably madness... I wonder
if it would be feasible to detect address span overflow ourselves at a
useful time, as a kind of band-aid defence...
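
A rough sketch of what such a check could look like, using the three section addresses reported earlier in the thread. Everything here is hypothetical: the function name, the idea of collecting (address, size) pairs after allocation, and the conservative rule that the whole span must stay under 4GB. It says nothing about where such a hook would live in llvmjit or what the fallback would be.

```python
ADRP_RANGE = 1 << 32  # ADRP can reach +/-4GB from the instruction's page


def sections_within_adrp_range(sections):
    """Conservative check: the lowest and highest bytes of all sections
    must be less than 4GB apart, so any section can reach any other."""
    lowest = min(addr for addr, _size in sections)
    highest = max(addr + size for addr, size in sections)
    return highest - lowest < ADRP_RANGE


# (address, allocated size) for .text, .rodata and .eh_frame as reported.
reported = [
    (0xFBC5B9CEA000, 90112),
    (0xFBC84FAB9000, 4096),
    (0xFBC84FAB7000, 8192),
]
print(sections_within_adrp_range(reported))  # prints: False
```

A module failing this kind of test could, in principle, be rejected before its code is ever run.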

[1] https://www.postgresql.org/message-id/CAEepm%3D39F_B3Ou8S3OrUw%2BhJEUP3p%3DwCu0ug-TTW67qKN53g3w%40mail.gmail.com



Re: Segfault in jit tuple deforming on arm64 due to LLVM issue

From
Anthonin Bonnefoy
Date:
On Mon, Aug 26, 2024 at 4:33 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> IIUC this one is a random and rare crash depending on malloc() and
> perhaps also the working size of your virtual memory dart board.
> (Annoyingly, I had tried to reproduce this quite a few times on small ARM
> systems when earlier reports came in, d'oh!).

allocateMappedMemory, used when creating sections, will eventually call
mmap[1], not malloc. So the amount of shared memory configured may be
a factor in triggering the issue.

My first attempts to reproduce the issue from scratch weren't
successful either. However, trying again with different values of
shared_buffers, I've managed to trigger the issue somewhat reliably.

On a clean Ubuntu jammy, I've compiled the current PostgreSQL
REL_14_STABLE (6bc2bfc3) with the following options:
CLANG=clang-14 ../configure --enable-cassert --enable-debug \
    --prefix ~/.local/ --with-llvm

Set "shared_buffers = '4GB'" in the configuration. More may be needed
but 4GB was enough for me.

Create a table with multiple partitions with pgbench. The goal is to
have a jit module big enough to trigger the issue.
pgbench -i --partitions=64

Then run the following query with jit forcefully enabled:
psql options=-cjit_above_cost=0 -c 'SELECT count(bid) from pgbench_accounts;'

If the issue was successfully triggered, it should segfault or be
stuck in an infinite loop.

> Ultimately, if it doesn't work, and doesn't get fixed, it's hard for
> us to do much about it.  But hmm, this is probably madness... I wonder
> if it would be feasible to detect address span overflow ourselves at a
> useful time, as a kind of band-aid defence...

There's a possible alternative, but it's definitely in the same
category as the hot-patching idea. llvmjit uses
LLVMOrcCreateRTDyldObjectLinkingLayerWithSectionMemoryManager to
create the ObjectLinkingLayer and it will be created with the default
SectionMemoryManager[2]. It should be possible to provide a modified
SectionMemoryManager with the change to allocate sections in a single
block and it could be restricted to arm64 architecture. A part of me
tells me this is probably a bad idea but on the other hand, LLVM
provides this way to plug a custom allocator and it would fix the
issue...
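
As an illustration of why a single block avoids the problem, the sketch below lays out the three sections from the earlier report back to back inside one reservation. It is only a model of the address arithmetic, not of the SectionMemoryManager interface: layout_in_single_block and align_up are made-up names, the 4KB page size is an assumption, and the base address is simply the .text address from the report.

```python
PAGE_SIZE = 4096  # assumed page size


def align_up(n, alignment=PAGE_SIZE):
    """Round n up to the next multiple of alignment."""
    return (n + alignment - 1) // alignment * alignment


def layout_in_single_block(base, sizes):
    """Give each section a page-aligned address inside one contiguous
    reservation starting at base. Returns (addresses, total size)."""
    layout = {}
    cursor = base
    for name, size in sizes.items():
        layout[name] = cursor
        cursor += align_up(size)
    return layout, cursor - base


sizes = {".text": 90112, ".rodata": 4096, ".eh_frame": 8192}
layout, total = layout_in_single_block(0xFBC5B9CEA000, sizes)
for name, addr in layout.items():
    print(f"{name:10} {addr:#x}")
print("total bytes:", total)
```

Since every section lives inside the same reservation, the distance between any two of them is bounded by the total size, here about 100KB instead of more than 10GB.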

[1] https://github.com/llvm/llvm-project/blob/release/14.x/llvm/lib/Support/Unix/Memory.inc#L115-L117
[2] https://github.com/llvm/llvm-project/blob/release/14.x/llvm/lib/ExecutionEngine/Orc/OrcV2CBindings.cpp#L967-L973



Re: Segfault in jit tuple deforming on arm64 due to LLVM issue

From
Thomas Munro
Date:
On Tue, Aug 27, 2024 at 11:32 AM Thomas Munro <thomas.munro@gmail.com> wrote:
> SectorMemoryManager

Erm, "Section".  (I was working on some file system stuff at the
weekend, and apparently my fingers now auto-complete "sector".)



Re: Segfault in jit tuple deforming on arm64 due to LLVM issue

From
Thomas Munro
Date:
Thanks!  And that's great news.  Do you want to report this experience
to the PR, in support of committing it?  That'd make it seem easier to
consider shipping a back-ported copy...



Re: Segfault in jit tuple deforming on arm64 due to LLVM issue

From
Anthonin Bonnefoy
Date:
On Tue, Aug 27, 2024 at 12:01 PM Thomas Munro <thomas.munro@gmail.com> wrote:
>
> Thanks!  And that's great news.  Do you want to report this experience
> to the PR, in support of committing it?  That'd make it seem easier to
> consider shipping a back-ported copy...

Yes, I will do that.